  • Methodology

Early prediction of bronchopulmonary dysplasia: comparison of modelling methods, development and validation studies

Abstract

Background

Machine-learning methods are gaining in popularity for predicting medical events, but their added value over other methods is still to be determined. We compared the performance of clinical prediction models for bronchopulmonary dysplasia (BPD) or death in very preterm infants developed using logistic regression and random forest methods.

Methods

Two population-based cohorts of very preterm infants were used: EPIPAGE-2 (France, 2011) for development and internal validation, and EPICE (Europe, 2011) for external validation. Eligible infants were born before 30 weeks’ gestation and admitted to neonatal units. BPD was defined as any respiratory support at 36 weeks’ postmenstrual age. Candidate predictors were available shortly after birth or at day 3. The performance of the logistic regression and random forest models was assessed in terms of discrimination (c-statistic) and calibration (calibration plots).
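To make the discrimination measure concrete, the following is a minimal sketch of a c-statistic computed as pairwise concordance. It is not the authors' code: the function name `c_statistic` and the toy data are assumptions introduced purely for illustration, and the study itself may have used different software and handling of ties.

```python
from itertools import product


def c_statistic(y_true, y_prob):
    """Concordance (c-statistic) for a binary outcome.

    Returns the proportion of (event, non-event) pairs in which the event
    received the higher predicted risk; tied predictions count as 0.5.
    """
    events = [p for y, p in zip(y_true, y_prob) if y == 1]
    non_events = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not events or not non_events:
        raise ValueError("need at least one event and one non-event")

    score = 0.0
    for p_event, p_non_event in product(events, non_events):
        if p_event > p_non_event:
            score += 1.0
        elif p_event == p_non_event:
            score += 0.5
    return score / (len(events) * len(non_events))


# Hypothetical toy data (1 = BPD/death, 0 = survival without BPD).
outcomes = [0, 0, 1, 1]
predicted_risks = [0.10, 0.40, 0.35, 0.80]
print(c_statistic(outcomes, predicted_risks))
```

A value of 0.5 corresponds to no discrimination and 1.0 to perfect separation of events from non-events. The pairwise loop is quadratic in sample size, which is acceptable for a sketch but would usually be replaced by a rank-based computation on cohorts of several thousand infants.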

Results

Prevalence of BPD/death was 32.1% (668/1923) in EPIPAGE-2 and 41.0% (1368/3335) in EPICE. At both time points, logistic regression and random forest models showed similar performance during internal validation. At birth, external validation in EPICE showed good discrimination (logistic regression: c-statistic 0.81, 95% CI 0.80–0.83; random forest: c-statistic 0.80, 95% CI 0.79–0.81), but both models underestimated the probability of BPD/death. Model performance was heterogeneous across European regions.
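Underestimation of risk in a validation cohort can be quantified in several ways. The sketch below shows two common ones under stated assumptions: an observed/expected (O/E) ratio, where a value above 1 means the model predicts fewer events than actually occurred, and a binned comparison of mean predicted risk against observed proportion, which is the raw material of a calibration plot. The function names, the equal-width binning and the toy data are illustrative assumptions, not the method or data reported by the authors.

```python
def observed_expected_ratio(y_true, y_prob):
    """Observed number of events divided by the sum of predicted risks."""
    expected = sum(y_prob)
    if expected == 0:
        raise ValueError("sum of predicted risks is zero")
    return sum(y_true) / expected


def calibration_bins(y_true, y_prob, n_bins=10):
    """Group predictions into equal-width risk bins.

    Returns one (mean predicted risk, observed proportion, n) tuple per
    non-empty bin, ordered from lowest to highest risk.
    """
    bins = [[] for _ in range(n_bins)]
    for y, p in zip(y_true, y_prob):
        index = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in last bin
        bins[index].append((y, p))

    summary = []
    for members in bins:
        if members:
            n = len(members)
            mean_predicted = sum(p for _, p in members) / n
            observed = sum(y for y, _ in members) / n
            summary.append((mean_predicted, observed, n))
    return summary


# Hypothetical toy data in which predicted risks are too low.
outcomes = [1, 1, 0, 1]
predicted_risks = [0.2, 0.3, 0.1, 0.4]
print(observed_expected_ratio(outcomes, predicted_risks))
print(calibration_bins(outcomes, predicted_risks, n_bins=2))
```

In this toy example three events are observed while the predicted risks sum to about one, so the O/E ratio is well above 1, the pattern described as underestimation. Points lying above the diagonal in a calibration plot convey the same information bin by bin.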

Conclusions

Both modelling methods performed similarly in predicting BPD/death shortly after birth in very preterm infants.

Impact

  • Whether machine-learning methods predict short-term respiratory outcomes in very preterm infants better than logistic regression models is debated.

  • Random forest-based prediction models did not perform better than logistic regression in predicting bronchopulmonary dysplasia or death shortly after birth in very preterm infants.

  • Calibration performance varied among European countries.

  • While offering the same performance, regression models are easier to understand, to disseminate and to apply to different populations.


Fig. 1: Calibration plots of the logistic regression model for the outcome BPD/death in the EPICE cohort (external validation).
Fig. 2: External validation of the logistic regression model at NICU admission in each of the countries participating in the EPICE cohort.
Fig. 3: Decision curve analysis comparing the logistic regression model, the random forest model and gestational age as the only variable for the prediction of BPD/death in the EPICE cohort.
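Decision curve analysis, as named in the caption of Fig. 3, compares strategies by their net benefit across risk thresholds. The sketch below uses the standard definition, net benefit = TP/n − (FP/n) × pt/(1 − pt), where pt is the threshold probability. It is an illustration under assumptions: the function names and toy data are hypothetical and are not taken from the article.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of classifying as high risk when predicted risk >= threshold."""
    if not 0 < threshold < 1:
        raise ValueError("threshold must be strictly between 0 and 1")
    n = len(y_true)
    true_pos = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    false_pos = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    # False positives are weighted by the odds of the threshold probability.
    return true_pos / n - (false_pos / n) * threshold / (1 - threshold)


def net_benefit_treat_all(y_true, threshold):
    """Net benefit of the reference strategy that labels everyone high risk."""
    if not 0 < threshold < 1:
        raise ValueError("threshold must be strictly between 0 and 1")
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


# Hypothetical toy data (1 = BPD/death).
outcomes = [1, 0, 1, 0]
predicted_risks = [0.9, 0.6, 0.3, 0.1]
for pt in (0.2, 0.5):
    print(pt, net_benefit(outcomes, predicted_risks, pt),
          net_benefit_treat_all(outcomes, pt))
```

Plotting net benefit against the threshold for each model, alongside the "treat all" and "treat none" (net benefit of zero) strategies, yields a decision curve. A model is clinically useful at a given threshold only if its net benefit exceeds both reference strategies.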

Data availability

The datasets analysed during the current study are available from the principal investigators of both cohorts (Jennifer Zeitlin and Pierre-Yves Ancel) on reasonable request.

Acknowledgements

We thank the physicians, nurses, and other healthcare providers at the participating institutions for their contributions, as well as the babies and their parents who allowed us to gather data related to their children. The EPIPAGE-2 Study was supported by the French Institute of Public Health Research, French Health Ministry, National Institute of Health and Medical Research, National Institute of Cancer, National Solidarity Fund for Autonomy, the National Research Agency through the French Equipex Program of Investments in the Future (grant no. ANR-11-EQPX-0038) and the PremUp Foundation. The EPICE cohort received funding from the European Union’s Seventh Framework Programme [FP7/2007-2013] under grant agreement n°259882. Additional funding: Poland (2012–2015 allocation of funds for international projects from the Polish Ministry of Science and Higher Education); Sweden (Stockholm County Council: ALF-project and Clinical Research Appointment and Department of Neonatal Medicine, Karolinska University Hospital); UK (funding for The Neonatal Survey from Neonatal Networks for East Midlands and Yorkshire & Humber regions). H.T. received grants from the French Society of Neonatology and from the French paediatric pneumology and allergology society. P.D. was supported by the NIHR Biomedical Research Centre, Oxford. G.S.C. was supported by the NIHR Biomedical Research Centre, Oxford, and Cancer Research UK (programme grant: C49297/A27294).

Author contributions

Heloise Torchin, Paula Dhiman and Gary S Collins conceptualised and designed the study, carried out the initial analyses, drafted the initial manuscript, and critically reviewed and revised the manuscript. Pierre-Yves Ancel and Jennifer Zeitlin obtained funding, collected data, made substantial contributions to the interpretation of data, and critically reviewed and revised the manuscript. Xavier Durrmeyer, Pierre-Henri Jarreau, Alexandra Nuytten and Patrick Truffert made substantial contributions to the interpretation of data and critically reviewed and revised the manuscript for important intellectual content.

Author information

Corresponding author

Correspondence to Heloise Torchin.

Ethics declarations

Competing interests

The authors declare no competing interests.

Consent statement

Parental consent was obtained for all children participating in the EPIPAGE-2 and EPICE cohorts. No additional consent was required for this study.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Torchin, H., Dhiman, P., Ancel, PY. et al. Early prediction of bronchopulmonary dysplasia: comparison of modelling methods, development and validation studies. Pediatr Res 99, 88–95 (2026). https://doi.org/10.1038/s41390-025-04170-2
