• Day-1 chest radiograph predicts later lung disease risk (area under curve ≈ 0.78).
• Chest-x-ray pretraining beats natural-image pretraining for neonatal scans.
• Progressive layer freezing with a brief linear probe works best.
• Routine respiratory distress grades only weakly predict later lung disease.

Bronchopulmonary dysplasia (BPD) is a chronic lung disease affecting 35% of extremely low-birth-weight infants and is defined by oxygen dependence at 36 weeks postmenstrual age. Preventive interventions carry severe risks, so early prediction is crucial to avoid unnecessary toxicity in low-risk infants. Admission radiographs of extremely preterm infants are routinely acquired within 24 h of life and could serve as a non-invasive prognostic tool.

We developed a deep learning approach using day-1 chest X-rays from 163 extremely low-birth-weight infants (≤32 weeks gestation, 401-999 g). We fine-tuned a ResNet-50 pretrained specifically on adult chest radiographs, employing progressive layer freezing with discriminative learning rates to prevent overfitting, and evaluated CutMix augmentation and linear probing. Complementing prior work that compares architectures and acquisition timing, we ablate the effects of initialization domain and compute-light fine-tuning choices on performance in small day-1 neonatal CXR cohorts, yielding practical training guidance for site-level and federated deployment.

For moderate/severe BPD outcome prediction, our best-performing model, combining progressive freezing, linear probing, and CutMix, achieved an AUROC of 0.78 ± 0.10, a balanced accuracy of 0.69 ± 0.10, and an F1-score of 0.67 ± 0.11. In-domain pretraining significantly outperformed ImageNet initialization (p = 0.031), highlighting the importance of domain-specific pretraining. Routine infant respiratory distress syndrome (IRDS) grades showed limited prognostic value (AUROC 0.57 ± 0.11), motivating learned image markers.
Our approach demonstrates that domain-specific pretraining enables accurate BPD prediction from routine day-1 radiographs. Through progressive freezing and linear probing, the method remains computationally feasible for site-level implementation.
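The combination of a brief linear probe, progressive freezing, and discriminative learning rates can be illustrated with a small schedule planner. This is a minimal sketch under stated assumptions, not the authors' implementation: the abstract does not give the schedule, so the sketch uses one plausible reading (gradual unfreezing from the classifier head toward the input after a head-only probe phase). The function name `progressive_unfreeze_plan`, the group name `stem`, and all default values (`probe_epochs`, `unfreeze_every`, `head_lr`, `lr_decay`) are hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass

# ResNet-50 parameter groups, ordered from input to output.
# "layer1".."layer4" and "fc" match torchvision's attribute names;
# "stem" is an illustrative label for the initial conv/batch-norm block.
LAYER_GROUPS = ["stem", "layer1", "layer2", "layer3", "layer4", "fc"]


@dataclass
class EpochPlan:
    epoch: int
    trainable: list[str]              # groups that receive gradients
    learning_rates: dict[str, float]  # per-group learning rate


def progressive_unfreeze_plan(
    total_epochs: int,
    probe_epochs: int = 2,     # assumed length of the head-only linear probe
    unfreeze_every: int = 3,   # assumed epochs between unfreezing steps
    head_lr: float = 1e-3,     # assumed learning rate for the classifier head
    lr_decay: float = 0.3,     # assumed per-group decay toward the input
) -> list[EpochPlan]:
    """Plan which groups train at each epoch and with which learning rate."""
    plans = []
    for epoch in range(total_epochs):
        if epoch < probe_epochs:
            # Linear probe: only the classifier head is trained.
            n_unfrozen = 1
        else:
            # After the probe, unfreeze one more group every `unfreeze_every` epochs.
            steps = (epoch - probe_epochs) // unfreeze_every
            n_unfrozen = min(len(LAYER_GROUPS), 2 + steps)
        trainable = LAYER_GROUPS[-n_unfrozen:]
        # Discriminative learning rates: the head uses head_lr, and each
        # group closer to the input is scaled by a further factor lr_decay.
        rates = {
            name: head_lr * lr_decay ** depth
            for depth, name in enumerate(reversed(trainable))
        }
        plans.append(EpochPlan(epoch, trainable, rates))
    return plans


if __name__ == "__main__":
    for plan in progressive_unfreeze_plan(total_epochs=10):
        print(plan.epoch, plan.trainable, plan.learning_rates)
```

In a PyTorch training loop, such a plan could be applied by setting `requires_grad` on the parameters of each group and passing one parameter group per entry of `learning_rates` to the optimizer. Keeping most of the backbone frozen for most epochs is what keeps the compute cost low on a cohort of this size.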
Goedicke-Fritz et al. (Sun,) studied this question.