What is the clinical evidence from this study?

Study design: pre-registered computational study. Population: 12-lead clinical ECG recordings (PTB-XL, n≈6,380). Intervention: Nyström kernel discriminant analysis. Comparator: linear discriminant subspace. Primary outcome: single-token macro-AUC (ΔAUC +0.019, p≈0.000).

What question did this study set out to answer?

This research aims to enhance the classification and compression of ECG data through a refined single-token codebook.

June 24, 2026 · Open Access

Hardening and Generality of a Class-Discriminant Single-Token Codebook: Nonlinear Discriminant Subspaces, Class-Count and Cohort-Size Robustness, Feature-Family Invariance, Calibration, and a Calibrated-False-Positive Novelty Monitor

Key result

Nyström kernel discriminant analysis raised single-token macro-AUC from 0.7835 to 0.8029 (ΔAUC +0.019, p≈0.000) compared to a linear discriminant subspace.
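
To make the key result concrete, the following is a minimal sketch of a Nyström-approximated kernel discriminant subspace, written in NumPy on toy data. It is not the study's implementation: the RBF kernel, the `gamma` value, the 50 landmarks, the ridge term `reg`, the concentric-ring data, and every function name are assumptions chosen for illustration.

```python
import numpy as np

def rbf_kernel(A, B, gamma):
    """Gaussian (RBF) kernel matrix between the rows of A and the rows of B."""
    sq = (A ** 2).sum(1)[:, None] + (B ** 2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(sq, 0.0))

def fit_nystroem(landmarks, gamma, rel_tol=1e-6):
    """Build a Nystrom feature map phi with phi(x) . phi(x') ~= k(x, x')."""
    w, V = np.linalg.eigh(rbf_kernel(landmarks, landmarks, gamma))
    keep = w > rel_tol * w.max()          # drop numerically null directions
    W = V[:, keep] / np.sqrt(w[keep])     # V diag(w)^(-1/2): K_mm^(-1/2) up to a rotation
    return lambda X: rbf_kernel(X, landmarks, gamma) @ W

def fit_discriminant(F, y, reg=1e-3):
    """Regularized Fisher discriminant directions for features F and labels y."""
    classes = np.unique(y)
    mu, d = F.mean(0), F.shape[1]
    Sw, Sb = np.zeros((d, d)), np.zeros((d, d))
    for c in classes:
        Fc = F[y == c]
        mc = Fc.mean(0)
        Sw += (Fc - mc).T @ (Fc - mc)                 # within-class scatter
        Sb += len(Fc) * np.outer(mc - mu, mc - mu)    # between-class scatter
    Sw = Sw / len(F) + reg * np.eye(d)
    Sb = Sb / len(F)
    # Solve Sb v = lambda Sw v by whitening with the Cholesky factor of Sw.
    L_inv = np.linalg.inv(np.linalg.cholesky(Sw))
    evals, U = np.linalg.eigh(L_inv @ Sb @ L_inv.T)
    top = np.argsort(evals)[::-1][: len(classes) - 1]
    P = L_inv.T @ U[:, top]
    return lambda G: (G - mu) @ P

def nearest_centroid_accuracy(Z_tr, y_tr, Z_te, y_te):
    """Accuracy of a nearest-centroid rule inside the projected subspace."""
    classes = np.unique(y_tr)
    centroids = np.stack([Z_tr[y_tr == c].mean(0) for c in classes])
    d2 = ((Z_te[:, None, :] - centroids[None]) ** 2).sum(-1)
    return float((classes[d2.argmin(1)] == y_te).mean())

def rings(n, rng):
    """Toy two-class data: concentric rings, which no linear direction separates."""
    y = rng.integers(0, 2, n)
    r = np.where(y == 0, 1.0, 3.0) + 0.1 * rng.standard_normal(n)
    t = rng.uniform(0.0, 2.0 * np.pi, n)
    return np.c_[r * np.cos(t), r * np.sin(t)], y

rng = np.random.default_rng(0)
X_tr, y_tr = rings(400, rng)
X_te, y_te = rings(400, rng)

# Linear discriminant subspace on the raw inputs (baseline).
lin = fit_discriminant(X_tr, y_tr)
linear_accuracy = nearest_centroid_accuracy(lin(X_tr), y_tr, lin(X_te), y_te)

# Nystrom features followed by the same discriminant fit (kernel variant).
phi = fit_nystroem(X_tr[rng.choice(len(X_tr), 50, replace=False)], gamma=0.5)
kda = fit_discriminant(phi(X_tr), y_tr)
kda_accuracy = nearest_centroid_accuracy(kda(phi(X_tr)), y_tr, kda(phi(X_te)), y_te)

print(f"linear subspace accuracy: {linear_accuracy:.3f}")
print(f"kernel subspace accuracy: {kda_accuracy:.3f}")
```

The design point the sketch illustrates is that the discriminant fit is unchanged between the two runs; only the feature map in front of it differs. A codebook (for example, k-means centroids) would then be built on the projected coordinates, a step omitted here.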

Key points

This research aims to enhance the classification and compression of ECG data through a refined single-token codebook.
Conducted a pre-registered Modal study on a real 12-lead clinical ECG cohort (PTB-XL, n≈6,380) with paired-bootstrap statistics.
Used Nyström kernel discriminant analysis to build a nonlinear discriminant subspace and tested the lever along seven pre-registered axes.
Implemented a Mahalanobis nearest-centroid distance for novelty monitoring and checked that the effect reproduces under a disjoint spectral (FFT band-power) feature family.
Increased single-token macro-AUC from 0.7835 to 0.8029 (ΔAUC +0.019, p≈0.000), indicating improved discrimination.
Showed that the gain grows as the task narrows: ΔAUC +0.046 (five-class), +0.057 (three-class), and +0.087 (binary screening).
Attained an empirical false-positive rate of 0.054 at a nominal 5% with a split-conformal threshold on the refined novelty monitor.
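
The novelty monitor in the last key point can be sketched as follows: score each sample by its Mahalanobis distance to the nearest class centroid, then set the alarm threshold by split-conformal calibration on held-out in-distribution data. This is a minimal illustration on synthetic Gaussian clusters, not the study's code; the pooled within-class covariance, the ridge term `reg`, the data, and all function names are assumptions.

```python
import numpy as np

def fit_centroids(Z, y, reg=1e-6):
    """Class centroids plus the inverse of the pooled within-class covariance."""
    classes = np.unique(y)
    centroids = np.stack([Z[y == c].mean(0) for c in classes])
    resid = Z - centroids[np.searchsorted(classes, y)]
    cov = resid.T @ resid / len(Z) + reg * np.eye(Z.shape[1])
    return centroids, np.linalg.inv(cov)

def novelty_score(Z, centroids, precision):
    """Mahalanobis distance from each row of Z to its nearest centroid."""
    diff = Z[:, None, :] - centroids[None, :, :]
    d2 = np.einsum("nkd,de,nke->nk", diff, precision, diff)
    return np.sqrt(d2.min(axis=1))

def conformal_threshold(cal_scores, alpha=0.05):
    """Split-conformal threshold: the ceil((n+1)(1-alpha))-th smallest score.

    Flagging scores strictly above it keeps the false-positive rate at or
    below alpha for data exchangeable with the calibration set.
    """
    n = len(cal_scores)
    k = int(np.ceil((n + 1) * (1.0 - alpha)))
    return np.inf if k > n else np.sort(cal_scores)[k - 1]

rng = np.random.default_rng(1)
means = np.array([[0.0, 0.0], [4.0, 0.0], [0.0, 4.0]])
mix = np.array([[1.0, 0.6], [0.0, 0.5]])   # makes the clusters anisotropic

def sample(n):
    y = rng.integers(0, 3, n)
    return means[y] + rng.standard_normal((n, 2)) @ mix, y

Z_fit, y_fit = sample(1000)     # fits centroids and covariance
Z_cal, _ = sample(1000)         # held out, used only for the threshold
Z_test, _ = sample(2000)        # fresh in-distribution data
Z_novel = np.array([4.0, 4.0]) + rng.standard_normal((500, 2)) @ mix

centroids, precision = fit_centroids(Z_fit, y_fit)
tau = conformal_threshold(novelty_score(Z_cal, centroids, precision), alpha=0.05)

empirical_fpr = float((novelty_score(Z_test, centroids, precision) > tau).mean())
detection_rate = float((novelty_score(Z_novel, centroids, precision) > tau).mean())
print(f"empirical false-positive rate at nominal 5%: {empirical_fpr:.3f}")
print(f"detection rate on the shifted cluster: {detection_rate:.3f}")
```

Because the threshold is an order statistic of a finite calibration set, the empirical false-positive rate on new data fluctuates around the nominal level rather than matching it exactly, which is the kind of gap the reported 0.054 versus 5% reflects.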

Structured PICO

Population

Real 12-lead clinical ECG cohort (PTB-XL dataset, five superclasses, n≈6,380)

Intervention

Single ≈10-bit signal-compression codebook within a supervised class-discriminant subspace using Nyström kernel discriminant analysis

Comparator

Linear discriminant subspace and unsupervised reconstruction-error subspace

Outcome

Single-token macro-AUC (surrogate outcome)

Building a single-token signal-compression codebook within a nonlinear supervised class-discriminant subspace significantly improves ECG classification performance over linear or unsupervised methods.

Numerical result

Effect estimate: ΔAUC +0.019

Absolute values (single-token macro-AUC): 0.8029 vs 0.7835

p-value: p≈0.000
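
For readers unfamiliar with how a ΔAUC and its p-value can come out of paired-bootstrap statistics, here is a minimal sketch. It resamples the same test indices for both models, recomputes one-vs-rest macro-AUC for each, and summarizes the distribution of differences. The synthetic scores, the one-sided p-value definition, the number of resamples, and all names are assumptions; the abstract does not state the study's exact procedure.

```python
import numpy as np

def binary_auc(pos_mask, scores):
    """Rank-based (Mann-Whitney) AUC, using average ranks for tied scores."""
    _, inv, counts = np.unique(scores, return_inverse=True, return_counts=True)
    ranks = (np.cumsum(counts) - (counts - 1) / 2.0)[inv]
    n_pos = int(pos_mask.sum())
    n_neg = len(scores) - n_pos
    return (ranks[pos_mask].sum() - n_pos * (n_pos + 1) / 2.0) / (n_pos * n_neg)

def macro_auc(y, S):
    """One-vs-rest AUC averaged over the columns of the score matrix S."""
    return float(np.mean([binary_auc(y == c, S[:, c]) for c in range(S.shape[1])]))

def paired_bootstrap_delta(y, S_a, S_b, n_boot=500, seed=0):
    """Observed AUC(b) - AUC(a), a 95% percentile interval, and a one-sided p."""
    rng = np.random.default_rng(seed)
    n, k = S_a.shape
    observed = macro_auc(y, S_b) - macro_auc(y, S_a)
    deltas = []
    while len(deltas) < n_boot:
        idx = rng.integers(0, n, n)            # same indices for both models
        if len(np.unique(y[idx])) < k:         # skip resamples missing a class
            continue
        deltas.append(macro_auc(y[idx], S_b[idx]) - macro_auc(y[idx], S_a[idx]))
    deltas = np.array(deltas)
    p_one_sided = float((deltas <= 0).mean())  # share of resamples with no gain
    return observed, np.percentile(deltas, [2.5, 97.5]), p_one_sided

# Synthetic three-class scores: model b is less noisy than model a.
rng = np.random.default_rng(2)
y = rng.integers(0, 3, 1200)
signal = np.eye(3)[y]
S_a = signal + 2.0 * rng.standard_normal((1200, 3))
S_b = signal + 1.0 * rng.standard_normal((1200, 3))

delta, ci, p_value = paired_bootstrap_delta(y, S_a, S_b)
print(f"delta macro-AUC: {delta:+.3f}, 95% interval: [{ci[0]:+.3f}, {ci[1]:+.3f}], p: {p_value:.3f}")
```

Under an estimate of this kind, a p-value reported as ≈0.000 would mean that essentially no resample reversed the sign of the difference; its resolution is limited to one over the number of resamples.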

Abstract

This pre-registered Modal study hardens and generalizes a previously established lever (companion Paper 19, Parent N): building a single ≈10-bit signal-compression codebook within a supervised class-discriminant subspace, rather than an unsupervised reconstruction-error subspace, recovers a statistically significant fraction of the single-token "compression tax" at no architectural cost. Using the identical encoder, downstream-decoupled interface, real 12-lead clinical ECG cohort (PTB-XL, five superclasses, n≈6,380), and paired-bootstrap statistics, we test the lever along seven pre-registered axes.

(1) Nonlinear discriminant subspace: a Nyström kernel discriminant analysis raises single-token macro-AUC from 0.7835 (linear) to 0.8029 (ΔAUC +0.019, p≈0.000), confirming the discriminant subspace, not its linearity, as the operative element.
(2) Class-count robustness: the lever grows as the task narrows, with ΔAUC +0.046 (five-class), +0.057 (three-class), and +0.087 (binary screening); it is strongest in the most commercially common case.
(3) Cohort-size robustness: the effect is positive at every training size down to n=500 (+0.022), provided codebook size is tied to the training support.
(4) Feature-family invariance: the effect reproduces under a disjoint spectral (FFT band-power) feature family (ΔAUC +0.038 spectral vs +0.046 statistical), so it is not an artifact of the hand-crafted feature set.
(5) Calibration neutrality: the discriminant token is calibration-neutral (expected calibration error 0.042 vs 0.031 unsupervised; both well-calibrated), so accuracy costs no confidence quality.
(6) Calibrated-false-positive novelty monitor: a Mahalanobis nearest-centroid distance is a safe refinement of the Euclidean monitor (AUC 0.717 vs 0.709), with a split-conformal threshold attaining an empirical false-positive rate of 0.054 at a nominal 5%.
(7) Complete dual-channel anomaly monitor: fusing in-subspace centroid distance with an orthogonal reconstruction-residual distance detects both in-subspace and off-axis anomalies that either channel alone misses (fused AUC 0.866 vs best single channel 0.741).

We additionally report an honest negative: re-clustering within the discriminant subspace by a direct mutual-information objective does not beat plain k-means (−0.029), because k-means in the discriminant subspace is already near mutual-information-optimal. The unifying finding is that the discriminant subspace is the operative element: nonlinearizing it helps; changing the within-subspace clustering objective does not. Every threshold was frozen before data examination and negatives are reported verbatim.

Keywords / index terms: single-token compression; class-discriminant codebook; kernel discriminant analysis; Nyström approximation; calibration; novelty detection; Mahalanobis distance; conformal prediction; anomaly fusion; mutual information; electrocardiogram; pre-registration; spiral-domain encoder; H-pipeline.

References:
1. R. A. Fisher, "The use of multiple measurements in taxonomic problems," Annals of Eugenics, 1936.
2. B. Schölkopf, A. Smola, and K.-R. Müller, "Nonlinear component analysis as a kernel eigenvalue problem," Neural Computation, 1998.
3. C. Williams and M. Seeger, "Using the Nyström method to speed up kernel machines," NeurIPS, 2001.
4. G. Baudat and F. Anouar, "Generalized discriminant analysis using a kernel approach," Neural Computation, 2000.
5. C. Guo, G. Pleiss, Y. Sun, and K. Q. Weinberger, "On calibration of modern neural networks," ICML, 2017.
6. V. Vovk, A. Gammerman, and G. Shafer, Algorithmic Learning in a Random World, Springer, 2005.
7. N. Tishby, F. Pereira, and W. Bialek, "The information bottleneck method," 1999.
8. P. Wagner et al., "PTB-XL, a large publicly available electrocardiography dataset," Scientific Data, 2020.
9. B. Efron and R. J. Tibshirani, An Introduction to the Bootstrap, Chapman

Parent O, U.S. Provisional Application No. 64/096,004, filed 2026-06-22 (the Mahalanobis + split-conformal novelty monitor of §4.6). Both build on the spiral-domain H-pipeline applications (Parents H/I/J/K/L/M).

Licensing inquiries: Randolph James Ferlic, M.D., randolphf@fieldstoneanalyticsllc.com. Reproducibility archive released under CC-BY 4.0.
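
The calibration figures quoted in the abstract are expected calibration errors. As a point of reference, the following is a minimal sketch of the standard binned estimator in the style of Guo et al. (reference 5): group predictions by confidence, then average the gap between accuracy and mean confidence, weighted by bin size. The ten equal-width bins and the toy inputs are assumptions; the abstract does not give the study's binning.

```python
import numpy as np

def expected_calibration_error(probs, labels, n_bins=10):
    """Binned ECE: sum over bins of (bin share) * |accuracy - mean confidence|."""
    conf = probs.max(axis=1)                              # top-class confidence
    correct = (probs.argmax(axis=1) == labels).astype(float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    # Bin b holds confidences in (edges[b], edges[b + 1]].
    bin_id = np.digitize(conf, edges[1:-1], right=True)
    ece = 0.0
    for b in range(n_bins):
        in_bin = bin_id == b
        if in_bin.any():
            gap = abs(correct[in_bin].mean() - conf[in_bin].mean())
            ece += in_bin.mean() * gap
    return float(ece)

labels = np.array([0] * 7 + [1] * 3)

# Overconfident: always predicts class 0 at 0.9, but is right only 70% of the time.
probs_over = np.tile([0.9, 0.1], (10, 1))
ece_over = expected_calibration_error(probs_over, labels)   # |0.7 - 0.9|, about 0.2

# Calibrated: predicts class 0 at 0.7 and is right 70% of the time.
probs_ok = np.tile([0.7, 0.3], (10, 1))
ece_ok = expected_calibration_error(probs_ok, labels)       # about 0.0

print(f"ECE overconfident: {ece_over:.3f}, ECE calibrated: {ece_ok:.3f}")
```

Both toy models have the same accuracy, which is the sense in which calibration is a separate axis from discrimination.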
