A single first-order transition model simultaneously detects regime-change anomalies (AUC ≈ 0.841), compresses data (≈ 3.3 bits/token below fixed rate), and forecasts the next token.
A single first-order transition model can simultaneously perform lossless entropy coding, anomaly detection, and forecasting on electrocardiographic and inertial data streams.
A pipeline that compresses a continuous signal into a stream of class-discriminant codebook tokens typically stores or transmits that stream at a fixed rate of log₂K bits per token, even though consecutive tokens are strongly correlated in time. A normal-regime transition model fit over the same stream for anomaly detection is already a probabilistic model of the stream, yet it is used for only one purpose. We show that the same normal-regime first-order transition model also serves as a lossless entropy model: an arithmetic coder operating against its conditional probabilities compresses the stream to its conditional entropy, which on real data is well below both the fixed per-token rate and the memoryless (marginal) token entropy. On real electrocardiographic and inertial data the achievable lossless rate is ≈ 2.68 / 0.57 / 3.73 bits per token versus a fixed 6 bits (codebook size K = 64); in every case it is below the marginal entropy, so the gain comes from temporal correlation. We verify exact lossless coding: an integer arithmetic coder reconstructs every tested sequence bit-for-bit at ≈ 2.039 bits per token, matching the model cross-entropy of ≈ 2.038. A single fitted transition matrix simultaneously detects a regime-change anomaly (AUC ≈ 0.841), compresses (≈ 3.3 bits/token below the fixed rate), and forecasts the next token (top-1 accuracy ≈ 0.662, ≈ 42× chance): one model, three jobs. Moreover, the instantaneous code length used as the anomaly score equals the transition-surprisal detector exactly (AUC 0.889), unifying compression and detection in one quantity.
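The relationship between the fixed rate, the marginal entropy, and the transition-model code length can be illustrated with a minimal sketch. It is not the authors' implementation: the synthetic "sticky" token stream, the smoothing constant `alpha`, and all function names are assumptions made for illustration, and the code computes the ideal code length −log₂ P(cur | prev) that an arithmetic coder would approach rather than running an actual coder. The same per-token code length doubles as an anomaly score, and the row-wise argmax gives a next-token forecast.

```python
import math
import random
from collections import Counter

def fit_transition_model(tokens, K, alpha=1.0):
    """Fit a first-order transition matrix with additive smoothing.
    alpha is a hypothetical smoothing constant, not a value from the paper."""
    counts = [[alpha] * K for _ in range(K)]
    for prev, cur in zip(tokens, tokens[1:]):
        counts[prev][cur] += 1.0
    return [[c / sum(row) for c in row] for row in counts]

def code_lengths(tokens, P):
    """Ideal per-token code length -log2 P[prev][cur] in bits.
    This is also the transition surprisal used as an anomaly score.
    The first token has no predecessor and is not counted here."""
    return [-math.log2(P[prev][cur]) for prev, cur in zip(tokens, tokens[1:])]

def marginal_entropy(tokens):
    """Empirical memoryless (zeroth-order) entropy in bits per token."""
    n = len(tokens)
    return -sum((c / n) * math.log2(c / n) for c in Counter(tokens).values())

def predict_next(prev, P):
    """Top-1 next-token forecast: the most probable successor of prev."""
    row = P[prev]
    return max(range(len(row)), key=row.__getitem__)

def sticky_stream(n, K, stay=0.9, seed=0):
    """Synthetic stand-in for a temporally correlated token stream (assumption)."""
    rng = random.Random(seed)
    tokens = [rng.randrange(K)]
    for _ in range(n - 1):
        tokens.append(tokens[-1] if rng.random() < stay else rng.randrange(K))
    return tokens

K = 8
train = sticky_stream(20000, K, seed=0)
test = sticky_stream(5000, K, seed=1)
P = fit_transition_model(train, K)

bits = code_lengths(test, P)
rate = sum(bits) / len(bits)      # model cross-entropy, bits per token
fixed_rate = math.log2(K)         # fixed per-token rate
h_marginal = marginal_entropy(test)

pairs = list(zip(test, test[1:]))
top1 = sum(predict_next(p, P) == c for p, c in pairs) / len(pairs)

# A "regime change": uncorrelated uniform tokens scored under the normal model.
rng = random.Random(2)
anomalous = [rng.randrange(K) for _ in range(5000)]
anomalous_bits = code_lengths(anomalous, P)
anomalous_rate = sum(anomalous_bits) / len(anomalous_bits)

print(f"fixed {fixed_rate:.2f}, marginal {h_marginal:.2f}, model {rate:.2f} bits/token")
print(f"top-1 {top1:.3f}, anomalous-stream rate {anomalous_rate:.2f} bits/token")
```

On this toy stream the model rate falls well below the marginal entropy because nearly all the redundancy is temporal, and the uncorrelated stream costs far more bits per token under the same matrix, which is why the code length can be thresholded as an anomaly score.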
We extend this with: an adaptive online coder that maintains compression under drift (≈ 0.86 vs ≈ 3.98 bits/token for the static coder) while remaining exactly decodable (25/25 round-trips); per-channel and multi-token coding; a framed coder that confines a transmission bit error to one frame (≈ 13.8 vs ≈ 868 tokens corrupted) at ≈ 0.19 bits/token overhead; decoder-side detection at no extra cost (AUC 0.825, identical to the encoder's); multi-step forecasting via matrix powers (inertial top-1 accuracy ≈ 0.84 at five steps); and a privacy property: entropy coding under a population model lowers subject re-identification from the compressed bitstream from ≈ 0.996 to ≈ 0.658, toward chance. We report one negative result openly: a small learned recurrent sequence model compresses better (≈ 2.04 vs ≈ 2.54 bits/token), so the first-order model is not rate-optimal. It is chosen because it is the same model already computed for detection, is interpretable, and runs in constant time per token (decode ≈ 8.2 µs/token; next-token expected calibration error ≈ 0.008). The contribution is the reuse of one fitted transition model as a verified lossless entropy coder and the resulting single-model multi-use architecture, a read-only addition to an existing class-discriminant token pipeline.

Keywords / index terms: lossless compression; entropy coding; arithmetic coding; Markov model; class-discriminant codebook; token stream; anomaly detection; forecasting; multi-use model; conditional entropy; privacy; channel-error resilience; electrocardiogram; human activity recognition; pre-registration.

References:
1. I. H. Witten, R. M. Neal, and J. G. Cleary, "Arithmetic coding for data compression," Communications of the ACM, 1987.
2. J. Rissanen and G. G. Langdon, "Arithmetic coding," IBM Journal of Research and Development, 1979.
3. T. M. Cover and J. A. Thomas, Elements of Information Theory, 2nd ed., Wiley, 2006.
4. L. R. Rabiner, "A tutorial on hidden Markov models...," Proceedings of the IEEE, 1989.
5. K. Cho et al., "Learning phrase representations using RNN encoder-decoder...," EMNLP, 2014.
6. P. Wagner et al., "PTB-XL, a large publicly available electrocardiography dataset," Scientific Data, 2020.
7. G. Moody and R. Mark, "The impact of the MIT-BIH arrhythmia database," IEEE EMB Magazine, 2001.
8. D. Anguita et al., "A public domain dataset for human activity recognition using smartphones," ESANN, 2013.
9. R. J. Ferlic and K. K. Ferlic, companion deposits (Papers 19, 21, 22), Zenodo, 2026.

Companion deposits in this Zenodo Community (spiral-domain-encoder-campaign):
· Paper 19: 10.5281/zenodo.20788187
· Paper 20: 10.5281/zenodo.20802759
· Paper 21: 10.5281/zenodo.20802826
· Paper 22: 10.5281/zenodo.20805321
· Paper 23: 10.5281/zenodo.20821668
Ferlic et al. studied electrocardiographic and inertial data, comparing transition-model entropy coding against a fixed per-token rate on lossless compression rate and anomaly-detection AUC.