June 7, 2024

Differential privacy to mathematically secure fine-tuned large language models for linguistic steganography

Abstract

Maintaining Differential Privacy (DP) while fine-tuning a Large Language Model (LLM), without losing the increased functionality that makes fine-tuned LLMs appealing, is a major challenge. In this paper, we explore the use of an LLM that has been modified to transmit encoded messages through its text output, a task that falls under Linguistic Steganography. By examining the impact that DP-preserving fine-tuning has on an LLM intended for such a specific and technical function, we evaluate the performance cost it imposes. Our experiments use a modified implementation of Differentially Private Stochastic Gradient Descent to fine-tune an LLM on curated data taken from the ConvoKit Reddit dataset. We were able to securely fine-tune the LLM while maintaining a relatively strict DP privacy budget, and still benefit from the domain-specific performance gains that LLM fine-tuning provides.
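The abstract does not describe how the authors modified Differentially Private Stochastic Gradient Descent. For orientation only, the following is a minimal sketch of a single step of the standard, unmodified algorithm: clip each example's gradient, sum, add Gaussian noise, then average. The function name `dp_sgd_step` and the parameters `clip_norm` and `noise_multiplier` are illustrative assumptions and are not taken from the paper.

```python
import numpy as np

def dp_sgd_step(params, per_example_grads, lr, clip_norm, noise_multiplier, rng):
    """One generic DP-SGD update (illustrative; not the paper's modified variant)."""
    # L2 norm of each example's gradient (one row per example).
    norms = np.linalg.norm(per_example_grads, axis=1, keepdims=True)
    # Scale down gradients whose norm exceeds clip_norm; leave the rest unchanged.
    scale = np.minimum(1.0, clip_norm / np.maximum(norms, 1e-12))
    clipped = per_example_grads * scale
    # Gaussian noise with standard deviation proportional to the clipping bound.
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=params.shape)
    # Average the noisy sum over the batch and take a gradient step.
    noisy_mean = (clipped.sum(axis=0) + noise) / per_example_grads.shape[0]
    return params - lr * noisy_mean

# Hypothetical toy usage: two examples, two parameters.
rng = np.random.default_rng(0)
grads = np.array([[3.0, 4.0], [0.3, 0.4]])
new_params = dp_sgd_step(np.zeros(2), grads, lr=0.1, clip_norm=1.0,
                         noise_multiplier=1.0, rng=rng)
```

A larger `noise_multiplier` generally corresponds to a stricter privacy budget at the cost of noisier updates, which is the kind of trade-off the abstract evaluates.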

Cite This Study

Coffey et al. studied this question.

synapsesocial.com/papers/68e65baeb6db6435875e9eab https://doi.org/10.1117/12.3013129