This paper proposes a built-in framework that embeds a dedicated “Crypto Expert” directly into the architecture of large language models (LLMs). As an initial attempt, we design a differentiable proxy tailored to the Advanced Encryption Standard (AES) algorithm, built from customized neuron units: SoftXOR, SoftLUT, and GF-conv neurons. These units are functionally equivalent to AES on the Boolean domain while providing stable gradients for backpropagation. By integrating this differentiable proxy as a specialized expert within a Mixture-of-Experts (MoE) LLM, the LLM learns during training to autonomously route sensitive tokens to the expert and encrypt them. After training, the differentiable proxy is swapped for a standard discrete AES implementation to guarantee provable security at inference. Our empirical evaluations demonstrate that our approach significantly reduces neuron counts and latency compared to prior ReLU-based representations, mitigates continuous differential attacks, and enforces end-to-end data protection without degrading downstream task utility. We expect this attempt to serve as a catalyst for future research into the seamless fusion of formal cryptographic guarantees and deep learning computation graphs.
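To illustrate what "functionally equivalent on the Boolean domain while differentiable" can mean, the following is a minimal sketch of a soft XOR unit. The abstract does not give the definition of SoftXOR, so the polynomial relaxation a + b - 2ab used here is an assumption chosen for illustration, and the names `soft_xor`, `soft_xor_grad`, and `hard_xor` are hypothetical.

```python
def soft_xor(a: float, b: float) -> float:
    """Assumed relaxation of XOR on [0, 1]: a + b - 2ab.

    On the Boolean corners {0, 1} x {0, 1} this equals a XOR b exactly;
    in between it is a smooth polynomial, so gradients exist everywhere.
    """
    return a + b - 2.0 * a * b


def soft_xor_grad(a: float, b: float) -> tuple[float, float]:
    """Analytic partial derivatives of soft_xor with respect to a and b."""
    return (1.0 - 2.0 * b, 1.0 - 2.0 * a)


def hard_xor(a: int, b: int) -> int:
    """Discrete XOR, standing in for the real operation used at inference."""
    return a ^ b


# The relaxation agrees with discrete XOR on every Boolean input pair,
# which is the property that would allow swapping the proxy for the
# discrete operation after training.
for a in (0, 1):
    for b in (0, 1):
        assert soft_xor(float(a), float(b)) == float(hard_xor(a, b))

# Away from the corners the unit stays differentiable.
grad_a, grad_b = soft_xor_grad(0.25, 0.75)
```

In this sketch the gradient vanishes at a = b = 0.5, which is one reason an actual design may prefer a different relaxation; the example only shows the equivalence-at-the-corners property, not the paper's construction.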
Weng et al. (Fri,) studied this question.