What question did this study set out to answer?

The study aims to explore whether double descent occurs in survival analysis and how it interacts with censoring and performance metrics.

June 6, 2026 | Open Access

The survival double descent: generalization dynamics of deep neural networks in time-to-event analysis

Key points

The study aims to explore whether double descent occurs in survival analysis and how it interacts with censoring and performance metrics.
Synthetic survival data generated from Weibull hazards with controlled censoring.
Investigation of model capacity variations from the underparameterized to the overparameterized regime.
Validation using METABRIC breast cancer cohort and SUPPORT study data.
Double descent was confirmed in survival models with calibration plateauing while discrimination varied.
Cox partial likelihood focuses on rankings, leading to extreme risk scores that misestimate survival probabilities.
IBS saturates at a constant value as model width increases, revealing limitations of discrimination-based model selection.
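As an illustration of the synthetic setup named in the key points, the following is a minimal sketch of drawing survival data from a Weibull proportional-hazards model with a tunable censoring level. The function name, the single Gaussian covariate, the exponential censoring mechanism, and all parameter values are assumptions made for this example; the study's actual generator is not described here.

```python
import math
import random


def simulate_weibull_survival(n, shape=1.5, scale=10.0, beta=0.8,
                              censor_rate=0.05, seed=0):
    """Return a list of (covariate, observed_time, event_indicator) triples.

    Assumed model: cumulative hazard H(t | x) = (t / scale)**shape * exp(beta * x).
    Assumed censoring: independent exponential times with rate `censor_rate`;
    a larger rate censors more subjects.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        u = 1.0 - rng.random()  # uniform on (0, 1], avoids log(0)
        # Inverse-transform sampling: solve exp(-H(t | x)) = u for t.
        t_event = scale * (-math.log(u) / math.exp(beta * x)) ** (1.0 / shape)
        t_censor = rng.expovariate(censor_rate)
        observed = min(t_event, t_censor)
        event = 1 if t_event <= t_censor else 0  # 1 = event seen, 0 = censored
        rows.append((x, observed, event))
    return rows


light = simulate_weibull_survival(500, censor_rate=0.01)
heavy = simulate_weibull_survival(500, censor_rate=0.5)
events_light = sum(e for _, _, e in light)
events_heavy = sum(e for _, _, e in heavy)
```

Raising `censor_rate` shortens the censoring times, so fewer true event times are observed; this is one simple way to control the censoring fraction while leaving the event-time distribution unchanged.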

Abstract

Recent work on double descent has challenged the classical bias-variance tradeoff, showing that test error can decrease, increase sharply near the interpolation threshold, and then decrease again as model capacity grows. This phenomenon has been documented in regression and classification, but its relevance to survival analysis remains unclear. Survival data are subject to censoring, which obscures true event times, and widely used models such as the Cox proportional hazards model are optimized via partial likelihoods that emphasize ranking rather than calibrated risk estimation. It is therefore unknown whether double descent occurs in this setting, how censoring influences its manifestation, or how it interacts with standard performance metrics. We investigate these questions using synthetic survival data generated from Weibull hazards with controlled censoring, allowing systematic variation of model capacity from the underparameterized to the overparameterized regime. We verify that double descent occurs in survival models, but calibration plateaus and decouples from discrimination, even under strong [Formula: see text] regularization. This decoupling arises because the Cox partial likelihood optimizes rankings rather than magnitudes, producing extreme risk scores that break the Breslow estimator used to estimate survival probabilities. Validation on two real-world clinical datasets, the METABRIC breast cancer cohort and the SUPPORT study of seriously ill hospitalized adults, confirms the calibration-discrimination decoupling: the integrated Brier score (IBS) saturates at a constant value as network width grows while concordance varies, reproducing the primary synthetic finding across distinct clinical domains and sample sizes. These results highlight the limitations of discrimination-based model selection in survival analysis and underscore the need for calibration-aware evaluation in high-capacity prognostic models.
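The ranking-versus-magnitude argument can be made concrete with a small sketch. The toy data, the scaling factor of 10, and the helper names below are assumptions chosen for illustration, not taken from the study. The code computes Harrell's concordance index and Breslow-based survival probabilities for a set of risk scores, then repeats the computation after multiplying every score by 10: the ranking, and therefore concordance, is unchanged, while the predicted survival probabilities are pushed toward 0 and 1.

```python
import math


def concordance_index(times, events, scores):
    """Harrell's C: share of comparable pairs where the earlier event has the higher score."""
    concordant, comparable = 0.0, 0
    for i in range(len(times)):
        if not events[i]:
            continue  # a pair is comparable only if the earlier time is an observed event
        for j in range(len(times)):
            if times[i] < times[j]:
                comparable += 1
                if scores[i] > scores[j]:
                    concordant += 1.0
                elif scores[i] == scores[j]:
                    concordant += 0.5
    return concordant / comparable


def breslow_cumulative_hazard(times, events, scores, t):
    """Breslow estimate of the baseline cumulative hazard H0(t)."""
    h0 = 0.0
    for i in range(len(times)):
        if events[i] and times[i] <= t:
            # Risk set: everyone still under observation at the event time.
            risk = sum(math.exp(scores[j]) for j in range(len(times))
                       if times[j] >= times[i])
            h0 += 1.0 / risk
    return h0


def survival_probabilities(times, events, scores, t):
    """S(t | x_k) = exp(-H0(t) * exp(score_k)) for every subject k."""
    h0 = breslow_cumulative_hazard(times, events, scores, t)
    return [math.exp(-h0 * math.exp(s)) for s in scores]


# Hypothetical toy cohort (event indicator 0 means censored).
times = [2, 3, 5, 7, 11, 13]
events = [1, 1, 0, 1, 1, 0]
scores = [1.2, 0.1, 0.9, 0.3, -0.4, -1.0]
scaled = [10.0 * s for s in scores]  # same ranking, far more extreme magnitudes

c_base = concordance_index(times, events, scores)
c_scaled = concordance_index(times, events, scaled)
s_base = survival_probabilities(times, events, scores, t=6)
s_scaled = survival_probabilities(times, events, scaled, t=6)
```

Comparing `c_base` with `c_scaled` and `s_base` with `s_scaled` shows why a model selected on concordance alone can look unchanged while its probability estimates degrade; a calibration-aware metric such as the IBS is sensitive to the second effect.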
