Discriminant-subspace residual vector quantization outperformed ambient-feature-space RVQ for ECG classification (macro-AUC 0.8101 vs 0.7692; 95% CI +0.029 to +0.059; P≈0.000).
Performing residual vector quantization within a class-discriminant subspace significantly improves downstream ECG classification accuracy over ambient-space RVQ at the same bit rate, and the number of tokens offers a tunable accuracy-privacy-rate tradeoff.
Effect estimate: difference +0.041 macro-AUC (95% CI +0.029 to +0.059)
Absolute values (macro-AUC): 0.8101 vs 0.7692
p-value: p≈0.000
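The effect estimate, interval, and p-value reported here are described in the abstract as paired-bootstrap quantities on macro-AUC. The source does not give the resampling details, so the following is only a minimal sketch of one common way to compute such numbers, under stated assumptions: records are resampled with replacement, both quantizers are scored on the same resample, the interval is the percentile interval, and the p-value is the one-sided fraction of resamples in which the difference is not positive. All function names and the toy data are hypothetical.

```python
import random

def binary_auc(scores, labels):
    """AUC as P(score_pos > score_neg), with ties counted as 1/2."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        return None  # undefined when one class is absent
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def macro_auc(score_rows, y, n_classes):
    """Unweighted mean of one-vs-rest AUCs; None if any class is undefined."""
    aucs = []
    for c in range(n_classes):
        a = binary_auc([row[c] for row in score_rows], [int(t == c) for t in y])
        if a is None:
            return None
        aucs.append(a)
    return sum(aucs) / n_classes

def paired_bootstrap(scores_a, scores_b, y, n_classes, n_boot=1000, seed=0):
    """Paired bootstrap of macro-AUC(A) - macro-AUC(B) over records."""
    rng = random.Random(seed)
    n = len(y)
    observed = macro_auc(scores_a, y, n_classes) - macro_auc(scores_b, y, n_classes)
    diffs = []
    while len(diffs) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]  # same records for A and B
        yb = [y[i] for i in idx]
        a = macro_auc([scores_a[i] for i in idx], yb, n_classes)
        b = macro_auc([scores_b[i] for i in idx], yb, n_classes)
        if a is None or b is None:
            continue  # this resample lost a class; draw again
        diffs.append(a - b)
    diffs.sort()
    lo = diffs[(25 * n_boot) // 1000]                     # 2.5th percentile
    hi = diffs[min(n_boot - 1, (975 * n_boot) // 1000)]   # 97.5th percentile
    p = sum(d <= 0 for d in diffs) / n_boot               # one-sided (assumption)
    return observed, (lo, hi), p

# Hypothetical two-class toy data: model A separates perfectly, model B does not.
y_toy = [0, 0, 0, 0, 1, 1, 1, 1]
scores_a_toy = [[1 - s, s] for s in [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]]
scores_b_toy = [[1 - s, s] for s in [0.1, 0.6, 0.3, 0.7, 0.4, 0.2, 0.8, 0.9]]
```

Calling `paired_bootstrap(scores_a_toy, scores_b_toy, y_toy, 2)` returns the observed difference, the percentile interval, and the one-sided p-value. Pairing matters because both models see the same resampled records, so record-level difficulty cancels in the difference.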
A single discrete token (≈10 bits) compresses an information-rich signal record by ~1000× but saturates: one token cannot close the gap to an uncompressed classifier. The companion work (Paper 19, Parent N) showed that building the single-token codebook within a supervised class-discriminant subspace recovers a significant fraction of that "compression tax." Here we extend the construction to multiple tokens by residual vector quantization (RVQ) performed within the discriminant subspace: the first token is the nearest-centroid index of the projected feature vector, and each further token quantizes the successive in-subspace residual. On a balanced real 12-lead clinical ECG cohort (PTB-XL, five superclasses, n≈6,380), discriminant-subspace RVQ outperforms an otherwise-identical ambient-feature-space RVQ (the EnCodec/SoundStream-style prior-art baseline) at a matched 18-bit budget by +0.041 macro-AUC (0.8101 vs 0.7692; paired-bootstrap p≈0.000, 95% CI +0.029 to +0.059), and the gain replicates on a second dataset (MIT-BIH, +0.035). A subspace ablation isolates the mechanism: a *dimensionality-matched* unsupervised 12-component PCA subspace is worse than the 120-dimensional ambient space (−0.031), while the discriminant 12-dimensional subspace is +0.072 above the PCA subspace, so the discriminant axes, not dimensionality reduction, drive the gain. The accuracy gain concentrates in the first residual token and saturates at a small token depth; discriminant-RVQ Pareto-dominates ambient-RVQ at every depth. The number of tokens is a tunable operating point on a frontier relating downstream accuracy, re-identification privacy, and bit rate, with the single-token point being privacy-preferred. The normalized shape of this frontier is dataset-invariant (a normalized privacy-approach fraction of ≈1.2 at two tokens on both PTB-XL and MIT-BIH), so an operating point may be calibrated from two measured anchors. Residual depth also reaches resolutions at which a single flat codebook of equal bits cannot feasibly be trained.
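The multi-token construction described above (project, take the nearest centroid, then quantize the successive in-subspace residual) can be sketched in a few lines. This is an illustrative sketch, not the authors' implementation: the projection basis is assumed to be given (for example, from a fitted discriminant analysis), one shared codebook is reused at every depth, and the function names, the 2-D toy basis, and the four-entry codebook are all hypothetical.

```python
import math

def project(x, basis):
    """Project feature vector x onto the rows of `basis` (d x D, assumed given)."""
    return [sum(b * v for b, v in zip(row, x)) for row in basis]

def nearest(v, codebook):
    """Index of the centroid closest to v in squared Euclidean distance."""
    return min(range(len(codebook)),
               key=lambda k: sum((a - c) ** 2 for a, c in zip(v, codebook[k])))

def rvq_encode(x, basis, codebook, depth):
    """Emit `depth` tokens; each one quantizes the current in-subspace residual."""
    residual = project(x, basis)
    tokens = []
    for _ in range(depth):
        k = nearest(residual, codebook)
        tokens.append(k)
        residual = [r - c for r, c in zip(residual, codebook[k])]
    return tokens

def rvq_decode(tokens, codebook):
    """Reconstruct the subspace vector as the sum of the selected centroids."""
    z = [0.0] * len(codebook[0])
    for k in tokens:
        z = [a + c for a, c in zip(z, codebook[k])]
    return z

def bit_budget(depth, codebook_size):
    """Bits spent by `depth` tokens drawn from a codebook of the given size."""
    return depth * math.log2(codebook_size)

# Hypothetical toy setup: a 2-D subspace of a 3-D feature space.
toy_basis = [[1.0, 0.0, 0.0],
             [0.0, 1.0, 0.0]]
toy_codebook = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.25, 0.25]]
toy_x = [1.2, 0.3, 5.0]

toy_tokens = rvq_encode(toy_x, toy_basis, toy_codebook, depth=2)  # [1, 3]
toy_recon = rvq_decode(toy_tokens, toy_codebook)                  # [1.25, 0.25]
```

Because each token depends only on the residual left by the tokens before it, the depth-1 code is a prefix of the depth-2 code, which is the property an embedded, progressively decodable stream relies on. As a worked budget under these assumptions, two tokens from a 64-entry codebook cost `bit_budget(2, 64)` = 12 bits.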
We report a recommended recipe (two tokens, codebook size ≈64, a single shared codebook, an in-subspace residual), a kernel-discriminant-subspace variant that stacks with residual depth (highest observed single-record macro-AUC 0.8175), a centroid-noise privacy mitigation that returns the multi-token stream to single-token re-identification resistance, an embedded/progressively-decodable token stream served by one downstream model across rates, and a built-in Mahalanobis/conformal novelty monitor that strengthens with depth. We report honestly that the multi-token *depth* gain over the single token is realized by lookup and linear downstream heads but not by higher-capacity heads, and that the multi-token mode trades re-identification privacy for accuracy. Every threshold was frozen before data examination; negatives are reported verbatim.

Keywords / index terms: residual vector quantization; multi-token compression; class-discriminant subspace; kernel discriminant analysis; rate–relevance frontier; accuracy–privacy tradeoff; progressive/embedded coding; novelty detection; electrocardiogram; pre-registration; spiral-domain encoder; H-pipeline.

References:
1. Y. Linde, A. Buzo, and R. Gray, "An algorithm for vector quantizer design," IEEE Trans. Communications, 1980.
2. A. van den Oord, O. Vinyals, and K. Kavukcuoglu, "Neural discrete representation learning (VQ-VAE)," NeurIPS, 2017.
3. N. Zeghidour et al., "SoundStream: an end-to-end neural audio codec," IEEE/ACM TASLP, 2021.
4. A. Défossez, J. Copet, G. Synnaeve, and Y. Adi, "High fidelity neural audio compression (EnCodec)," 2022.
5. R. A. Fisher, "The use of multiple measurements in taxonomic problems," Annals of Eugenics, 1936.
6. G. Baudat and F. Anouar, "Generalized discriminant analysis using a kernel approach," Neural Computation, 2000.
7. N. Tishby, F. Pereira, and W. Bialek, "The information bottleneck method," 1999.
8. V. Vovk, A. Gammerman, and G. Shafer, Algorithmic Learning in a Random World, Springer, 2005.
9. P. Wagner et al., "PTB-XL, a large publicly available electrocardiography dataset," Scientific Data, 2020.
10. G. Moody and R. Mark, "The impact of the MIT-BIH arrhythmia database," IEEE EMB Magazine, 2001.
11. B. Efron and R. J. Tibshirani, An Introduction to the Bootstrap, Chapman

Parent N, U.S. Provisional Application No. 64/095,354, filed 2026-06-21 (the single-token foundation). Both build on the spiral-domain H-pipeline applications (Parents H/I/J/K/L/M).

Licensing inquiries: Randolph James Ferlic, M.D., randolphf@fieldstoneanalyticsllc.com. Reproducibility archive released under CC-BY 4.0.
Ferlic et al. evaluated discriminant-subspace residual vector quantization (RVQ) against ambient-feature-space RVQ on ECG signals (n≈6,380), with macro-AUC as the outcome (difference +0.041, 95% CI +0.029 to +0.059, p≈0.000).