Instruction tuning has been identified as a crucial technique for optimizing large language models (LLMs) to generate human-aligned responses. Nonetheless, gathering diverse, high-quality instruction data for such tuning presents notable obstacles, especially in privacy-sensitive domains. Federated instruction tuning (FedIT) has emerged as a promising solution: it coordinates collaborative training across multiple data owners, yielding a privacy-enhancing learning model. Existing FedIT studies assume that clients have sufficient training data. In reality, however, many clients have only few-shot samples, which leads to overfitting or degraded performance in the federated LLM. This federated few-shot setting also increases the risk of training data extraction attacks, because the LLM may memorize the limited training data. To address these issues, this paper proposes a novel federated algorithm, PPFedIT, designed to enhance both privacy protection and model performance in federated few-shot learning. PPFedIT comprises three steps on the client side: (1) synthetic data generation, which uses the strong generation capacity of LLMs to produce synthetic data that diversifies and enriches the local data; (2) parameter isolation training, which updates the parameters of a shared global LLM on the synthetic data and the parameters of local LLMs on the local data, thereby mitigating the noise introduced by the synthetic data; (3) a local-aggregation-then-sharing mechanism, which first mixes the parameters of the global LLM with those of a local LLM and only then uploads them to a server for aggregation. This effectively mitigates data extraction attacks. Extensive experiments on three open-source datasets demonstrate that PPFedIT significantly improves model performance (by 8.4% on average) and reduces the risk of data extraction attacks (by approximately 20%) in practical and challenging federated few-shot scenarios.
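The local-aggregation-then-sharing step can be illustrated with a minimal sketch. The abstract does not state how the parameters are mixed or how the server aggregates them, so everything here is an assumption made for illustration: the function names `mix_local` and `server_average`, the linear interpolation with a hypothetical coefficient `alpha`, and the unweighted FedAvg-style mean on the server. Parameters are represented as plain lists of floats rather than real LLM weights.

```python
from typing import Dict, List

# Toy stand-in for model weights: parameter name -> flat list of values.
Params = Dict[str, List[float]]


def mix_local(global_params: Params, local_params: Params, alpha: float = 0.5) -> Params:
    """Client side: blend the global-model copy (trained on synthetic data)
    with the local model (trained on private data) before uploading.

    ASSUMPTION: linear interpolation with weight `alpha`; the abstract only
    says the two parameter sets are "mixed".
    """
    return {
        name: [
            alpha * g + (1.0 - alpha) * l
            for g, l in zip(global_params[name], local_params[name])
        ]
        for name in global_params
    }


def server_average(uploads: List[Params]) -> Params:
    """Server side: unweighted mean over client uploads.

    ASSUMPTION: plain FedAvg-style averaging; the paper's actual
    aggregation rule is not given in the abstract.
    """
    n = len(uploads)
    return {
        name: [sum(vals) / n for vals in zip(*(u[name] for u in uploads))]
        for name in uploads[0]
    }


if __name__ == "__main__":
    # Two hypothetical clients, each holding a global copy and a local model.
    client_a = mix_local({"w": [1.0, 3.0]}, {"w": [3.0, 1.0]}, alpha=0.5)
    client_b = mix_local({"w": [4.0, 0.0]}, {"w": [4.0, 0.0]}, alpha=0.5)
    print(server_average([client_a, client_b]))
```

Under these assumptions, the server only ever receives the blended parameters, never the purely locally trained ones, which is the intuition behind the abstract's claim that mixing before upload makes data extraction harder.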
Z G Zhang
Harbin Institute of Technology
Jingyuan Zhang
MediaTek (Taiwan)
Jintao Huang
Harbin Institute of Technology
ACM Transactions on Intelligent Systems and Technology
Monash University
Fudan University
Harbin Institute of Technology
Zhang et al. (Tue,) studied this question.
synapsesocial.com/papers/6a16d3dd2fcf950e000554e7 — DOI: https://doi.org/10.1145/3806196