Abstract
Background
Machine-learning methods are increasingly used to predict medical events, but their added value over other methods remains to be determined. We compared the performance of clinical prediction models for bronchopulmonary dysplasia (BPD) or death in very preterm infants developed with logistic regression and with random forests.
Methods
Two population-based cohorts of very preterm infants were used: EPIPAGE-2 (France, 2011) for development and internal validation, and EPICE (Europe, 2011) for external validation. Eligible infants were born before 30 weeks’ gestation and admitted to neonatal units. BPD was defined as any respiratory support at 36 weeks’ postmenstrual age. Candidate predictors were those available shortly after birth or at day 3. The performance of the logistic regression and random forest models was assessed in terms of discrimination (c-statistic) and calibration (calibration plots).
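As an illustration of the two performance measures named above, the following minimal sketch computes a c-statistic and the points of a grouped calibration plot from predicted risks. It is not the authors' code: the function names `c_statistic` and `calibration_table`, the number of risk groups, and the toy data are all assumptions made for this example.

```python
from itertools import product


def c_statistic(y_true, y_prob):
    """Share of (event, non-event) pairs in which the event has the higher
    predicted risk; tied predictions count as one half."""
    events = [p for y, p in zip(y_true, y_prob) if y == 1]
    non_events = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not events or not non_events:
        raise ValueError("need at least one event and one non-event")
    concordant = 0.0
    for p_event, p_non_event in product(events, non_events):
        if p_event > p_non_event:
            concordant += 1.0
        elif p_event == p_non_event:
            concordant += 0.5
    return concordant / (len(events) * len(non_events))


def calibration_table(y_true, y_prob, n_groups=2):
    """Mean predicted risk and observed event rate in groups of increasing
    predicted risk (the points of a grouped calibration plot)."""
    pairs = sorted(zip(y_prob, y_true))
    size = len(pairs) // n_groups
    if size == 0:
        raise ValueError("more groups than observations")
    rows = []
    for g in range(n_groups):
        start = g * size
        # The last group absorbs any remainder.
        end = (g + 1) * size if g < n_groups - 1 else len(pairs)
        chunk = pairs[start:end]
        mean_predicted = sum(p for p, _ in chunk) / len(chunk)
        observed_rate = sum(y for _, y in chunk) / len(chunk)
        rows.append((mean_predicted, observed_rate))
    return rows


# Toy data invented for illustration only (not taken from either cohort).
toy_outcomes = [0, 0, 1, 0, 1, 1, 0, 1]
toy_risks = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.5, 0.9]

print(c_statistic(toy_outcomes, toy_risks))
print(calibration_table(toy_outcomes, toy_risks))
```

A c-statistic of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation. In a well-calibrated model, the mean predicted risk and the observed rate are close in every group.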
Results
The prevalence of BPD/death was 32.1% (668/1923) in EPIPAGE-2 and 41.0% (1368/3335) in EPICE. At both time points, the logistic regression and random forest models showed similar performance during internal validation. For the models using predictors available at birth, external validation in EPICE showed good discrimination (logistic regression: c-statistic 0.81, 95% CI 0.80–0.83; random forest: c-statistic 0.80, 95% CI 0.79–0.81), but both models underestimated the probability of BPD/death. Model performance was heterogeneous across European regions.
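Underestimation of risk in a validation cohort can be summarised by the observed/expected (O/E) ratio, and one common remedy is to update the model intercept. The sketch below illustrates both under stated assumptions: the function names and the toy data are hypothetical, and the abstract does not say whether the authors applied any such recalibration.

```python
import math


def observed_expected_ratio(y_true, y_prob):
    """Observed events divided by the sum of predicted risks.
    A ratio above 1 means the model underestimates risk on average."""
    return sum(y_true) / sum(y_prob)


def _logit(p):
    return math.log(p / (1.0 - p))


def _expit(x):
    return 1.0 / (1.0 + math.exp(-x))


def shift_risks(y_prob, delta):
    """Add a constant to every prediction on the logit scale."""
    return [_expit(_logit(p) + delta) for p in y_prob]


def intercept_update(y_true, y_prob, low=-10.0, high=10.0, tol=1e-10):
    """Bisection search for the logit shift at which the mean predicted risk
    equals the observed event rate. This is the condition satisfied by the
    maximum-likelihood intercept when the original logit is used as an offset."""
    target = sum(y_true) / len(y_true)
    while high - low > tol:
        mid = (low + high) / 2.0
        mean_predicted = sum(shift_risks(y_prob, mid)) / len(y_prob)
        if mean_predicted < target:
            low = mid
        else:
            high = mid
    return (low + high) / 2.0


# Toy data invented for illustration only: 5 events in 10 infants,
# with a mean predicted risk of 0.30.
toy_y = [1, 1, 0, 1, 0, 1, 0, 0, 1, 0]
toy_p = [0.5, 0.4, 0.2, 0.4, 0.1, 0.5, 0.2, 0.2, 0.4, 0.1]

oe_before = observed_expected_ratio(toy_y, toy_p)
delta = intercept_update(toy_y, toy_p)
oe_after = observed_expected_ratio(toy_y, shift_risks(toy_p, delta))
print(oe_before, delta, oe_after)
```

An intercept update changes only the average predicted risk. It leaves the ranking of infants, and therefore the c-statistic, unchanged.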
Conclusions
The two modelling methods performed similarly in predicting BPD/death shortly after birth in very preterm infants.
Impact
- Whether machine-learning methods predict short-term respiratory outcomes in very preterm infants better than logistic regression models is debated.
- Random forest-based prediction models did not perform better than logistic regression in predicting bronchopulmonary dysplasia or death shortly after birth in very preterm infants.
- Calibration performance varied among European countries.
- While offering the same performance, regression models are easier to understand, to disseminate and to apply to different populations.
Data availability
The datasets analysed during the current study are available from the principal investigators of both cohorts (Jennifer Zeitlin and Pierre-Yves Ancel) on reasonable request.
Acknowledgements
We thank the physicians, nurses, and other healthcare providers at the participating institutions for their contributions, and the babies and their parents who allowed us to gather data related to their children. The EPIPAGE-2 Study was supported by the French Institute of Public Health Research, French Health Ministry, National Institute of Health and Medical Research, National Institute of Cancer, National Solidarity Fund for Autonomy, the National Research Agency through the French Equipex Program of Investments in the Future (grant no. ANR-11-AQPX-0038) and the PremUp Foundation. The EPICE cohort received funding from the European Union’s Seventh Framework Programme [FP7/2007-2013] under grant agreement n°259882. Additional funding: Poland (2012–2015 allocation of funds for international projects from the Polish Ministry of Science and Higher Education); Sweden (Stockholm County Council: ALF-project and Clinical Research Appointment and Department of Neonatal Medicine, Karolinska University Hospital); UK (funding for The Neonatal Survey from Neonatal Networks for East Midlands and Yorkshire & Humber regions). H.T. received grants from the French Society of Neonatology and from the French paediatric pneumology and allergology society. P.D. was supported by the NIHR Biomedical Research Centre, Oxford. G.S.C. was supported by the NIHR Biomedical Research Centre, Oxford, and Cancer Research UK (programme grant: C49297/A27294).
Author contributions
Heloise Torchin, Paula Dhiman and Gary S Collins conceptualised and designed the study, carried out the initial analyses, drafted the initial manuscript, and critically reviewed and revised the manuscript. Pierre-Yves Ancel and Jennifer Zeitlin obtained funding, collected data, made substantial contributions to the interpretation of data, and critically reviewed and revised the manuscript. Xavier Durrmeyer, Pierre-Henri Jarreau, Alexandra Nuytten and Patrick Truffert made substantial contributions to the interpretation of data and critically reviewed and revised the manuscript for important intellectual content.
Ethics declarations
Competing interests
The authors declare no competing interests.
Consent statement
Parental consent was obtained for all children participating in the EPIPAGE-2 and EPICE cohorts. No additional consent was required for this study.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Torchin, H., Dhiman, P., Ancel, PY. et al. Early prediction of bronchopulmonary dysplasia: comparison of modelling methods, development and validation studies. Pediatr Res 99, 88–95 (2026). https://doi.org/10.1038/s41390-025-04170-2
DOI: https://doi.org/10.1038/s41390-025-04170-2