Abstract

Background: Serum total testosterone (TT) interacts with multiple physiological systems and is implicated in heterogeneous aging processes in men. However, aging-related phenotypes associated with TT are unlikely to be captured by single biomarkers or conventional clinical categories. This study aims to identify data-driven aging phenotypes based on TT and related clinical biomarkers using an unsupervised analytical framework.

Methods: Clinical laboratory data from 5,877 Japanese male patients undergoing routine health evaluations are analyzed. After restricting the cohort to individuals with complete age and body mass index data, missing values in other variables are imputed using column-wise mean imputation. Unsupervised clustering is performed using K-means on standardized biomarkers related to endocrine, metabolic, inflammatory, and renal function. Principal component analysis and correlation network analysis are used for visualization. External validation uses cancer prevalence data from the NHANES dataset.

Results: Four physiological clusters are identified. One cluster shows low TT levels, elevated inflammatory markers, impaired renal function, and higher cancer prevalence in external validation, indicating a high-risk aging profile. The other clusters show preserved hormonal and metabolic profiles. Network analysis reveals cluster-specific differences in the centrality of TT within physiological networks. Principal component analysis shows overlapping cluster distributions, reflecting continuous aging-related variation.

Conclusions: Unsupervised clustering of TT-related biomarkers reveals aging phenotypes beyond conventional clinical classifications. TT functions as part of an integrated physiological network rather than as an isolated marker. These findings support a systems-level perspective on male aging and demonstrate the utility of data-driven phenotyping, although the analysis is descriptive and cross-sectional.
Okui et al. studied this question.
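The preprocessing and clustering steps named in the Methods (column-wise mean imputation, standardization, K-means) can be illustrated with a minimal sketch. It is not the authors' code: the column choice (TT, CRP, eGFR), the toy values, the function names, the random seed, and the use of two clusters instead of the four reported are all assumptions made for illustration. It uses only the Python standard library so that each step is visible; in practice a library such as scikit-learn would likely be used.

```python
import math
import random


def impute_column_means(rows):
    """Replace None entries with the mean of the observed values in that column."""
    n_cols = len(rows[0])
    means = []
    for j in range(n_cols):
        observed = [r[j] for r in rows if r[j] is not None]
        means.append(sum(observed) / len(observed))
    return [[means[j] if r[j] is None else r[j] for j in range(n_cols)]
            for r in rows]


def standardize(rows):
    """Z-score each column using the population standard deviation."""
    n = len(rows)
    out = [list(r) for r in rows]
    for j in range(len(rows[0])):
        col = [r[j] for r in rows]
        mu = sum(col) / n
        sd = math.sqrt(sum((v - mu) ** 2 for v in col) / n) or 1.0
        for i in range(n):
            out[i][j] = (rows[i][j] - mu) / sd
    return out


def kmeans(rows, k, n_iter=100, seed=0):
    """Plain Lloyd's algorithm with random initial centers drawn from the data."""
    rng = random.Random(seed)
    centers = [list(r) for r in rng.sample(rows, k)]
    labels = [None] * len(rows)
    for _ in range(n_iter):
        # Assignment step: each row goes to its nearest center.
        new_labels = [min(range(k), key=lambda c: math.dist(r, centers[c]))
                      for r in rows]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: each center moves to the mean of its members.
        for c in range(k):
            members = [r for r, lab in zip(rows, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers


# Synthetic toy rows (hypothetical columns: TT, CRP, eGFR); None marks a missing value.
toy = [
    [6.1, 0.05, 85.0],
    [5.8, None, 90.0],
    [6.4, 0.04, 88.0],
    [2.1, 0.90, 45.0],
    [2.4, 1.10, None],
    [1.9, 0.80, 50.0],
]

imputed = impute_column_means(toy)
scaled = standardize(imputed)
labels, centers = kmeans(scaled, k=2)  # the study reports four clusters; two suit this toy set
print(labels)
```

Standardizing before clustering matters here because K-means relies on Euclidean distance, so a biomarker measured on a large numeric scale would otherwise dominate the cluster assignments.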