Large language models (LLMs) deployed in production undergo continuous behavioral drift: fine-tuning, RLHF updates, jailbreak degradation, and distributional shift alter output token distributions in ways that periodic benchmark evaluation does not capture. We present KA-LLM, the first streaming behavioral drift detection framework for LLMs grounded in the Karimov–Alekberli (KA) thermodynamic framework. KA-LLM monitors four channels computed over the entropy of per-domain output distributions: C1 (domain-level causal entropy deviation), C2 (cross-domain coupling covariance), C3 (calibration residual z-score), and CB (response-pattern correlation break). Each alert carries attribution identifying which domain channel and which drift mechanism triggered detection, a capability absent from all baseline methods. Validation on a 14-domain, 300-day simulation (30 drift events spanning capability, alignment, calibration, jailbreak, and distributional shift, with published MMLU/TruthfulQA statistics as proxy baselines) demonstrates a detection rate (DR) of 57% with a false positive rate (FPR) of 0.00/month at θ=2.5, matching CUSUM and Isolation Forest on detection rate while uniquely providing per-domain attribution. The C2 coupling channel detects 5 silent multi-domain drift events (alignment tax, coordinated capability shift) in which no single domain exceeds the individual threshold, making them invisible to all per-domain baselines. This paper is the fifth in the KA Framework series.
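The abstract does not state the channel formulas, so the following is only a minimal sketch of the general idea, not the KA-LLM method. It assumes each domain yields one entropy value per time step, scores it as a z-score against a baseline window, and raises a per-domain alert when the magnitude exceeds θ. To illustrate how a multi-domain drift can stay below every individual threshold, it also computes a Stouffer-style combined z-score across domains; this is a stand-in chosen for simplicity and is not the C2 coupling covariance. All function names, the baseline windows, and the reading of θ=2.5 as a z-score threshold are assumptions.

```python
import math
from statistics import mean, stdev


def shannon_entropy(probs):
    """Shannon entropy (in nats) of a discrete probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def entropy_z(baseline, current):
    """z-score of the current entropy against a baseline window of entropies."""
    mu, sigma = mean(baseline), stdev(baseline)
    return (current - mu) / sigma if sigma > 0 else 0.0


def detect(baselines, current, theta=2.5):
    """Return (per_domain_alerts, combined_z).

    baselines: dict mapping domain -> list of baseline entropy values
    current:   dict mapping domain -> current entropy value
    theta:     assumed z-score threshold for a single domain
    """
    z = {d: entropy_z(baselines[d], current[d]) for d in current}
    # Per-domain alerts: only domains whose own deviation exceeds theta.
    alerts = {d: v for d, v in z.items() if abs(v) > theta}
    # Stouffer-style combination (assumes independent domains): small shifts
    # in the same direction add up even when no single domain alerts.
    combined = sum(z.values()) / math.sqrt(len(z))
    return alerts, combined


if __name__ == "__main__":
    window = [1.0, 1.1, 0.9, 1.0, 1.0]  # hypothetical baseline entropies
    domains = ["math", "law", "medicine", "code"]  # hypothetical domain names
    baselines = {d: window for d in domains}
    # Every domain shifts slightly; none crosses theta on its own.
    current = {d: 1.12 for d in domains}
    alerts, combined = detect(baselines, current)
    print("per-domain alerts:", alerts)
    print("combined z:", round(combined, 2))
```

In this toy setup each domain moves by less than two baseline standard deviations, so the per-domain check stays silent while the combined score crosses the threshold. That mirrors, in a much simplified form, the kind of silent multi-domain event the abstract attributes to the C2 channel.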
Hikmat Karimov
Rahid Alekberli
Azerbaijan Technical University
www.synapsesocial.com/papers/69fa8eca04f884e66b5311b4 — DOI: https://doi.org/10.5281/zenodo.20029517