Instruction tuning is a crucial technique for optimizing large language models (LLMs) to generate human-aligned responses. Nonetheless, gathering diverse, high-quality instruction data for such tuning presents notable obstacles, especially in privacy-sensitive domains. Federated instruction tuning (FedIT) has emerged as a promising solution: it coordinates collaborative training across multiple data owners, yielding a privacy-enhancing learning model. Existing FedIT studies assume that clients have sufficient training data. In reality, however, many clients have only few-shot samples, which leads to overfitting or degraded performance in the federated LLM. At the same time, this federated few-shot setting increases the risk of training data extraction attacks, because the LLM may memorize the limited training data. To address these issues, this paper proposes a novel federated algorithm, PPFedIT, designed to improve both privacy protection and model performance in federated few-shot learning. PPFedIT comprises three steps on the client side: (1) synthetic data generation, which uses the strong generation capacity of LLMs to produce synthetic data that diversifies and enriches the local data; (2) parameter isolation training, which updates the parameters of a shared global LLM on the synthetic data and the parameters of local LLMs on the local data, respectively, thereby mitigating the noise introduced by the synthetic data; and (3) a local-aggregation-then-sharing mechanism, which first mixes the parameters of the global LLM with those of a local LLM and only then uploads them to a server for aggregation, which effectively mitigates data extraction attacks. Extensive experiments on three open-source datasets demonstrate that PPFedIT significantly enhances model performance (by 8.4% on average) and reduces the risk of data extraction attacks (by approximately 20%) in practical and challenging federated few-shot scenarios.
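As a rough illustration of step (3), the sketch below mixes a client's global-model and local-model parameters before upload and then averages the uploads on the server. The abstract does not specify the mixing rule or the server aggregation, so everything here is an assumption made for illustration: the convex combination with coefficient `alpha`, the weighted (FedAvg-style) server average, the function names `mix_params` and `server_average`, and the representation of parameters as plain dictionaries of float lists. It should not be read as the authors' implementation.

```python
from typing import Dict, List

# Hypothetical representation: parameter name -> flat list of floats.
Params = Dict[str, List[float]]


def mix_params(global_params: Params, local_params: Params, alpha: float = 0.5) -> Params:
    """Mix global and local parameters on the client before upload.

    Assumption: a convex combination with weight `alpha` on the global
    parameters. The abstract only says the two sets are "mixed".
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return {
        name: [alpha * g + (1.0 - alpha) * l
               for g, l in zip(global_params[name], local_params[name])]
        for name in global_params
    }


def server_average(uploads: List[Params], weights: List[float]) -> Params:
    """Weighted average of client uploads (assumed FedAvg-style aggregation)."""
    total = sum(weights)
    first = uploads[0]
    return {
        name: [
            sum(w * u[name][i] for u, w in zip(uploads, weights)) / total
            for i in range(len(first[name]))
        ]
        for name in first
    }


# Two toy clients, each holding a global copy (trained on synthetic data)
# and a local copy (trained on private few-shot data).
client_a = mix_params({"w": [1.0, 3.0]}, {"w": [3.0, 1.0]}, alpha=0.5)
client_b = mix_params({"w": [4.0, 0.0]}, {"w": [4.0, 0.0]}, alpha=0.5)

# The server only ever sees the mixed parameters, never the raw local ones.
new_global = server_average([client_a, client_b], weights=[1.0, 1.0])
```

In this toy setup the server receives only the mixed vectors, which is the property the mechanism relies on: the parameters fitted directly to the private few-shot data are never uploaded on their own.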
Zhang et al. (Tue,) studied this question.