May 26, 2026

PPFedIT: Towards Privacy-Preserving Federated Instruction Tuning with Few-shot Local Examples


Abstract

Instruction tuning has been identified as a crucial technique for optimizing large language models (LLMs) to generate human-aligned responses. Nonetheless, gathering diverse, high-quality instruction data for such tuning presents notable obstacles, especially in privacy-sensitive domains. Federated instruction tuning (FedIT) has emerged as a promising solution: it enables collaborative training across multiple data owners, resulting in a privacy-enhancing learning model. Existing FedIT studies assume that clients have sufficient training data; in reality, however, many clients have only few-shot samples, which leads to overfitting or degraded performance in the federated LLM. At the same time, this federated few-shot environment increases the risk of training data extraction attacks, because the LLM is likely to memorize the limited training data. To address these issues, this paper proposes a novel federated algorithm, PPFedIT, designed to enhance both privacy protection and model performance in federated few-shot learning. PPFedIT comprises three vital steps on the client side: (1) synthetic data generation, which uses the strong generation capacity of LLMs to produce synthetic data that diversifies and enriches the local data; (2) parameter isolation training, which updates the parameters of a shared global LLM on the synthetic data and the parameters of local LLMs on the local data, thereby mitigating the noise introduced by the synthetic data; (3) a local-aggregation-then-sharing mechanism, which first mixes the parameters of the global LLM with those of a local LLM and only then uploads them to a server for aggregation, which effectively mitigates data extraction attacks. Extensive experiments on three open-source datasets demonstrate that PPFedIT significantly enhances model performance (by 8.4% on average) and reduces the risk of data extraction attacks (by approximately 20%) in practical and challenging federated few-shot scenarios.
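As a rough illustration of step (3), the sketch below mixes a client's global-branch and local-branch parameters before upload and then averages the uploads on the server. The abstract does not specify the mixing rule, so the linear interpolation, the coefficient `alpha`, the uniform server average, and the function names `mix_params` and `server_average` are all assumptions made for illustration, not the authors' method.

```python
from typing import Dict, List

# Toy parameter container: layer name -> flat list of weights.
Params = Dict[str, List[float]]


def mix_params(global_params: Params, local_params: Params, alpha: float = 0.5) -> Params:
    """Client side: blend global-branch and local-branch parameters.

    ASSUMPTION: linear interpolation with a hypothetical coefficient `alpha`;
    the source only says the two parameter sets are "mixed" before upload.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return {
        name: [
            alpha * g + (1.0 - alpha) * l
            for g, l in zip(global_params[name], local_params[name])
        ]
        for name in global_params
    }


def server_average(uploads: List[Params]) -> Params:
    """Server side: uniform average of client uploads.

    ASSUMPTION: plain unweighted averaging stands in for whatever
    aggregation rule the paper actually uses.
    """
    n = len(uploads)
    return {
        name: [sum(vals) / n for vals in zip(*(u[name] for u in uploads))]
        for name in uploads[0]
    }


# Two clients each mix their own branches, then the server aggregates.
client_a = mix_params({"w": [1.0, 3.0]}, {"w": [3.0, 1.0]}, alpha=0.5)
client_b = mix_params({"w": [4.0, 0.0]}, {"w": [4.0, 0.0]}, alpha=0.5)
new_global = server_average([client_a, client_b])
```

In this sketch the server only ever sees the blended parameters, never the branch trained directly on the few-shot local data, which is the intuition the abstract gives for the reduced extraction risk.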


Authors

Z G Zhang

Harbin Institute of Technology

Jingyuan Zhang

MediaTek (Taiwan)

Jintao Huang

Harbin Institute of Technology

Journals

ACM Transactions on Intelligent Systems and Technology


Institutions

Monash University

Fudan University

Harbin Institute of Technology
