TP53 is the most frequently mutated gene in human cancers, and germline mutations in TP53 cause Li-Fraumeni syndrome (LFS), a hereditary predisposition to diverse cancers. Accurate annotation of TP53 mutations based on their survival effects is critical for informed LFS patient management. Motivated by this need, we develop Survival-based Clustering of Predictors (SCP), a new approach that groups predictors by identifying homogeneous coefficients in Cox regression. We formulate this task as a fusion-penalized Cox regression problem and provide an efficient computational algorithm. A nonconvex distance-to-set penalty is adopted to facilitate parameter tuning and improve estimation accuracy. To overcome data limitations, we further develop TL-SCP, a transfer learning extension that borrows coefficient ranking information from a source dataset under the assumption that the source and target share similar ranking patterns. TL-SCP integrates this ranking information through weighted rank averaging, which accommodates cohort heterogeneity while keeping the model simple. Simulation studies demonstrate TL-SCP's superior performance over SCP in clustering recovery and coefficient estimation. In an application to TP53 mutation annotation, where non-LFS germline TP53 mutation carriers serve as the source cohort for the target LFS cohort, TL-SCP identifies biologically meaningful TP53 mutation clusters and offers improved clinical interpretability compared to experiment-based annotations.
Liu et al. (Mon,) studied this question.
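As an illustration of the weighted rank averaging idea mentioned in the abstract, the following is a minimal sketch, not the authors' implementation. It assumes that preliminary coefficient estimates are available for the same predictors in both cohorts and that a single weight in [0, 1] controls how much the source ranking is trusted; the function names, the weight parameterization, and the toy coefficient values are all hypothetical.

```python
import numpy as np


def ranks(values):
    """Return 1-based ranks of `values` (ties broken by order of appearance)."""
    values = np.asarray(values, dtype=float)
    order = np.argsort(values, kind="stable")
    r = np.empty(len(values), dtype=float)
    r[order] = np.arange(1, len(values) + 1)
    return r


def weighted_rank_average(target_coef, source_coef, weight):
    """Blend target and source coefficient ranks.

    `weight` is the (assumed) share given to the source ranking:
    0 uses the target ranking only, 1 uses the source ranking only.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    return (1.0 - weight) * ranks(target_coef) + weight * ranks(source_coef)


# Toy, made-up coefficient estimates for four predictors.
target = [0.9, 0.1, 0.5, 0.4]
source = [1.2, 0.2, 0.3, 0.8]

blended = weighted_rank_average(target, source, weight=0.5)
# Predictor indices sorted by blended rank, smallest first.
ordering = np.argsort(blended, kind="stable")
```

A weight near 0 keeps the target ordering essentially unchanged, which is one simple way to limit the influence of a heterogeneous source cohort; how the weight is actually chosen in TL-SCP is not specified in the abstract.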