Abstract
Preterm birth increases the risk of adverse neurodevelopmental outcomes, including autism spectrum disorder (ASD). Separately, polygenic risk scores (PRS) evaluate genetic liability to ASD. While both elevated PRS and preterm birth contribute to ASD risk, the extent to which gestational age interacts with genetic liability is unknown. We analyzed ASD PRS in 2387 individuals from the Center for Applied Genomics database, stratified by both ASD diagnosis (531 with ASD, 1856 without ASD) and gestational age (588 preterm, 1799 term). Term-born children with ASD had significantly higher ASD PRS compared to term-born controls (OR 1.17, 95% CI 1.03–1.32, p = 0.017), replicating prior studies. In contrast, no significant genetic difference was observed within our preterm cohort when stratifying by ASD diagnosis. We further identified a novel signal: preterm children overall had significantly lower ASD PRS compared to term-born children, even after controlling for ASD prevalence (coefficient: −0.14, 95% CI −0.21 to −0.06, p = 0.00095). These findings suggest the predictive utility of ASD PRS may be contingent on gestational context. Lower genetic signal in preterm ASD could reflect stronger environmental stress, increased contributions from rare de novo variants, or undiscovered preterm-specific genetic risk loci.
Introduction
An estimated 11% of babies worldwide are born prematurely,1 defined by WHO as delivery before 37 weeks of gestation2. Yet, as survival rates for premature infants improve and the limits of viability are pushed earlier, gains in long-term outcomes have lagged3. Survivors remain disproportionately affected by neurodevelopmental differences when compared to their term counterparts4.
Predicting an individual patient’s risk for a poor outcome is challenging, with experienced neonatologists predicting significantly different patient outcomes based on the same standardized patient scenario5. Many studies have classified preterm neonates at highest risk for poor outcomes based on clinical and epidemiological predictors6,7,8,9. While several well-established prognostic models exist, none integrate genetic predictors,9 representing an untapped opportunity to improve individualized risk assessment.
Autism spectrum disorder (ASD) is a neurodevelopmental disorder linked to prematurity. In a national Swedish cohort of over 4 million people, 6.1% of those born extremely preterm (22–27 weeks) vs. 1.4% of those born at term (39–41 weeks) were diagnosed with autism, with each additional week of gestation associated with a ~5% lower prevalence of ASD10. Similarly, in a US-based cohort of extremely premature infants (ELGAN), autism prevalence at 10 years of age was 4 times higher than in the general population11. Prior research demonstrates that some significant early-life clinical events, such as intraventricular hemorrhage,12 postnatal steroid exposure,13 and intrauterine growth restriction,14 are associated with developing autism in premature infants.
Separately, there is also a strong genetic component to the development of ASD. Based on a meta-analysis of twin studies, the heritability of autism is estimated to be 64–91%15. While genome-wide association studies (GWAS) have identified numerous loci contributing to ASD susceptibility,16 the cumulative contribution of common variants, quantified as polygenic risk scores (PRS), has emerged as a tool for assessing individual-level genetic liability to ASD. Higher autism PRS is associated with ASD traits in cohort studies17. While both higher PRS and preterm birth are independent risk factors for ASD, the extent to which preterm birth interacts with or modifies underlying genetic liability remains unknown.
A recent study by Zhang et al. analyzed large cohorts of children with ASD and reported that children with ASD who were born preterm exhibited more severe phenotypic profiles despite showing comparable levels of ASD polygenic risk as term-born individuals with ASD18. These findings suggest that preterm birth may act as an independent risk modifier or influence phenotypic expression without significantly altering polygenic burden. In a separate study, autism risk from a history of mental disorders in the immediate family was not explained by individual PRS, indicating that family history and PRS are best viewed as complementary measures of family-based ASD risk19.
To further explore the relationship between genetic liability and gestational age at birth, we conducted an independent analysis of ASD polygenic risk scores across four groups stratified by both ASD diagnosis and gestational age: preterm with ASD, preterm without ASD, term with ASD, and term without ASD. We hypothesized that polygenic liability would differ by diagnosis status, with the highest PRS average observed in our cohort of term children diagnosed with autism, replicating prior findings.
Method
Subjects and genotyping
Study participants were recruited through the Center for Applied Genomics (CAG) at Children’s Hospital of Philadelphia (CHOP), using the CHOP Health Care Network. All protocols were approved by the CHOP Institutional Review Board, and written informed consent was obtained from participants or their legal guardians by trained medical personnel under physician supervision. Inclusion criteria for control subjects required the absence of any major medical conditions, with exclusion of individuals with a personal history or current diagnosis of cancer. Only participants with genetic similarity to a European referent group (as determined by the first two components of principal components analysis) were included to match the patient population in which the existing autism PRS was originally validated. Birth history, including gestational age at delivery, was documented through clinical records, allowing classification into preterm (gestational age < 37 weeks) and term (≥ 37 weeks) subgroups. ICD-10 codes (defined as any code containing F84) in the medical chart were used to identify individuals with an ASD diagnosis. We included a total of 588 preterm (70 with ASD and 518 without ASD) and 1,799 term (461 with ASD and 1,338 without ASD) children in our study (Table 1).
Genomic DNA from enrolled individuals was genotyped using high-density SNP arrays, either the Illumina HumanHap550 or HumanHap610 platforms. Rigorous quality control procedures were implemented: samples with genotype call rates below 95% were excluded, and SNPs were removed if they exhibited a minor allele frequency (MAF) less than 1%, call rate below 98%, or deviated significantly from Hardy–Weinberg equilibrium (P < 1 × 10⁻⁶). Genotype imputation was performed using the TOPMed Imputation Server20 and the minimac4 algorithm, referencing a comprehensive panel derived from over 100,000 whole-genome sequences. Post-imputation, only variants with MAF > 1% and imputation quality score (Rsq) > 0.5 were retained to ensure robust downstream analyses.
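The quality control thresholds above can be expressed as simple predicates. The following is a minimal sketch only: the thresholds are those quoted in the text, while the `VariantStats` container and the function names are hypothetical and are not taken from the study's pipeline (such filters are typically applied with dedicated genotype tools rather than hand-written code).

```python
from dataclasses import dataclass

# Thresholds quoted in the Methods text.
MIN_SAMPLE_CALL_RATE = 0.95   # samples below this are excluded
MIN_SNP_CALL_RATE = 0.98      # SNPs below this are removed
MIN_MAF = 0.01                # minor allele frequency threshold (1%)
HWE_P_THRESHOLD = 1e-6        # SNPs with HWE P below this are removed
MIN_IMPUTATION_RSQ = 0.5      # post-imputation quality threshold


@dataclass
class VariantStats:
    """Hypothetical per-SNP summary; field names are illustrative."""
    maf: float
    call_rate: float
    hwe_p: float


def sample_passes_qc(call_rate: float) -> bool:
    # Keep samples whose genotype call rate is not below 95%.
    return call_rate >= MIN_SAMPLE_CALL_RATE


def variant_passes_qc(v: VariantStats) -> bool:
    # Keep genotyped SNPs that fail none of the three removal criteria.
    return (
        v.maf >= MIN_MAF
        and v.call_rate >= MIN_SNP_CALL_RATE
        and v.hwe_p >= HWE_P_THRESHOLD
    )


def imputed_variant_passes(maf: float, rsq: float) -> bool:
    # Post-imputation: retain only variants with MAF > 1% and Rsq > 0.5.
    return maf > MIN_MAF and rsq > MIN_IMPUTATION_RSQ
```

For example, a SNP with MAF 0.05, call rate 0.99, and HWE P of 0.5 would be retained, whereas the same SNP with MAF 0.005 would be removed.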
Polygenic risk score analysis
To quantify individual-level common variant burden for ASD, we computed PRS using the PRS-CS (Polygenic Risk Score-Continuous Shrinkage) method21. PRS-CS is a Bayesian regression framework that infers posterior SNP effect sizes under continuous shrinkage priors, incorporating linkage disequilibrium (LD) patterns from a reference panel to improve the accuracy of effect size estimation.
ASD GWAS summary statistics used for PRS calculation were derived from the Psychiatric Genomics Consortium (PGC) ASD meta-analysis (2017 release),16 comprising 46,351 individuals. PRS-CS was run with default parameters using the European LD reference panel from the 1000 Genomes Project (Phase 3). The resulting posterior SNP effect sizes were used to compute individual PRS via the --score function in PLINK v1.9.
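Conceptually, a polygenic score is a weighted sum of effect-allele dosages, with the posterior effect sizes serving as weights. The sketch below illustrates that idea only; it is not a reimplementation of PLINK, whose handling of missing genotypes and score scaling may differ, and the function name and SNP identifiers are hypothetical.

```python
def polygenic_score(dosages, weights):
    """Weighted sum of effect-allele dosages for one individual.

    dosages: dict mapping SNP ID -> effect-allele dosage (0 to 2), or None if missing
    weights: dict mapping SNP ID -> posterior effect size (e.g. from PRS-CS)

    SNPs that are missing for this individual are skipped here; this is a
    simplifying assumption, not necessarily what the study's tooling does.
    """
    total = 0.0
    for snp, beta in weights.items():
        dosage = dosages.get(snp)
        if dosage is None:
            continue
        total += beta * dosage
    return total


# Illustrative values only (hypothetical SNP IDs and effect sizes).
example_weights = {"rsA": 0.10, "rsB": -0.20}
example_dosages = {"rsA": 1, "rsB": 0}
example_score = polygenic_score(example_dosages, example_weights)
```

Because scores are standardized before analysis, any constant rescaling of the raw score across individuals would not change the downstream regression results.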
To account for population structure, we performed principal component analysis (PCA) on genotyped variants using PLINK. The top 10 genetic-similarity principal components and sex assigned at birth were included as covariates in regression models.
Statistical analysis
PRS values were converted into z-scores prior to visualization and statistical testing. We visualized the PRS distribution across four stratified groups (preterm children with ASD, preterm children without ASD, term children with ASD, and term children without ASD) using a violin plot, which displays the median, interquartile range, and kernel density of PRS within each group. We then fit logistic regression models to evaluate differences in common variant burden across groups, including the top 10 ancestry principal components and sex assigned at birth as covariates. Our threshold for statistical significance was p < 0.05.
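Standardizing PRS to z-scores is what allows a regression coefficient to be read as an effect per one standard deviation of PRS. A minimal sketch of that conversion follows; the choice of the sample standard deviation and of the whole cohort as the reference distribution are assumptions, since the text does not specify either.

```python
import statistics


def standardize(scores):
    """Convert raw scores to z-scores: (x - mean) / SD.

    Uses the sample standard deviation (n - 1 denominator); this choice is
    an assumption and is not stated in the Methods.
    """
    mean = statistics.fmean(scores)
    sd = statistics.stdev(scores)
    return [(s - mean) / sd for s in scores]


# Illustrative raw scores only; not data from the study.
raw_scores = [1.0, 2.0, 3.0, 4.0]
z_scores = standardize(raw_scores)
```

After this transformation the scores have mean 0 and standard deviation 1, so an odds ratio from a logistic model on `z_scores` corresponds to a one-SD increase in PRS.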
Finally, we performed an exploratory linear regression of standardized PRS on gestational age category (preterm vs. term). Autism diagnosis status was not included as a predictor in this model; instead, we equalized the proportion of individuals with ASD across the term and preterm cohorts by including only a random sample of 214 term-born individuals with ASD.
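The downsampling step can be sketched as drawing a fixed-size random subset without replacement. In this sketch the sample size of 214 and the pool of 461 term-born children with ASD come from the text; the integer identifiers, the seed, and the function name are hypothetical, and the authors' actual sampling procedure is not described.

```python
import random


def downsample(ids, n, seed=0):
    """Draw n identifiers without replacement, reproducibly.

    The fixed seed is an assumption added so the draw can be repeated;
    the source does not report how the random sample was generated.
    """
    rng = random.Random(seed)
    return rng.sample(list(ids), n)


# Hypothetical identifiers standing in for the 461 term-born children with ASD.
term_asd_ids = range(461)
sampled_term_asd = downsample(term_asd_ids, 214)
```

Because the result depends on which individuals happen to be drawn, repeating the analysis across several seeds would be one way to check that the estimate is not sensitive to a particular sample.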
Ethics statement
All methods were performed in accordance with the relevant guidelines and regulations. The study was approved by the Institutional Review Board of the Children’s Hospital of Philadelphia, and written informed consent (or assent with parental consent, as appropriate) was obtained from all participants or their legal guardians in accordance with the Declaration of Helsinki.
Results
We examined a cohort of 2,387 individuals to compare genetic liability for ASD stratified by gestational age and autism diagnosis status (Table 1). Figure 1 shows a violin plot of PRS distributions across cohorts, displaying the median, IQR, and density. Overall, individuals with ASD exhibited higher scores than their counterparts without ASD. Notably, a modest statistically significant difference in PRS was observed between term ASD and term non-ASD groups (logistic regression; OR 1.17, 95% CI 1.03–1.32, p = 0.017), indicating that common genetic variant burden is increased in individuals diagnosed with autism when born at term. Every increase in autism PRS score by 1 standard deviation was associated with a 17% increase in odds of autism diagnosis.
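The link between a logistic regression coefficient and the reported odds ratio can be made explicit: the odds ratio is the exponential of the coefficient, and a Wald 95% confidence interval is obtained by exponentiating the coefficient plus or minus 1.96 standard errors. In the sketch below, the coefficient and standard error are back-calculated from the rounded OR and CI reported above; they are assumptions for illustration, not values reported by the study.

```python
import math


def odds_ratio_with_ci(beta, se, z=1.96):
    """Return (OR, lower, upper) for a log-odds coefficient and its standard error."""
    return (
        math.exp(beta),
        math.exp(beta - z * se),
        math.exp(beta + z * se),
    )


# Back-calculated from the rounded values OR 1.17, 95% CI 1.03-1.32 (assumption).
beta = math.log(1.17)
se = (math.log(1.32) - math.log(1.03)) / (2 * 1.96)

odds_ratio, ci_lower, ci_upper = odds_ratio_with_ci(beta, se)

# Percentage change in odds per 1-SD increase in PRS.
percent_increase = (odds_ratio - 1.0) * 100.0
```

Under these assumptions, `percent_increase` is approximately 17, matching the interpretation that each one-SD increase in PRS is associated with about 17% higher odds of an autism diagnosis among term-born children.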
Figure 1. Violin plot showing the distribution of standardized polygenic risk scores (PRS) for ASD across four groups defined by ASD status and birth term: Preterm with ASD, Preterm without ASD, Term with ASD, and Term without ASD. Statistical significance was assessed using generalized linear models to control for ancestry. A significant PRS elevation was observed in the Term-ASD group compared to Term-Non-ASD (P = 0.017), whereas comparisons involving preterm subgroups were not significant (P > 0.25).
In contrast, PRS did not significantly differ between preterm ASD and preterm non-ASD groups (OR 1.18, 95% CI 0.88–1.60, p = 0.26) or between preterm ASD and term ASD groups (OR 0.84, 95% CI 0.62–1.14, p = 0.28) in logistic regression models. These findings align with previous observations from Zhang et al.,18 who reported that ASD PRS is not significantly elevated in preterm-born ASD individuals compared to their non-ASD counterparts, despite marked phenotypic severity in the former group.
Interestingly, we found that PRS was lower overall in the preterm cohort than in the term cohort when controlling for ASD prevalence (linear regression model; coefficient for preterm: −0.14, 95% CI −0.21 to −0.06, p = 0.00095). Being born prematurely was associated with a standardized ASD PRS that was lower by 0.14 standard deviations, indicating lower genetic liability from identified common variants. Similarly, unaffected preterm individuals had significantly lower PRS compared to unaffected term-born individuals (logistic regression; OR 0.83, 95% CI 0.74–0.94, p = 0.0026) (Fig. 2).
Figure 2. Forest plot of regression model results by comparison. (A) Results of logistic regression from the four-group analysis. Of note, PRS reached statistical significance in two out of four models: the model comparing term-born children with ASD to term-born children without ASD and the model comparing preterm-born children without ASD to term-born children without ASD. (B) Results from an exploratory linear regression demonstrating overall lower ASD PRS in the preterm cohort compared to the term cohort after adjusting for ASD prevalence.
Discussion
In this study, we explored the distribution of ASD PRS scores across four subgroups stratified by both gestational age and ASD diagnosis. Our findings contribute to the growing body of literature that seeks to clarify the relationship between genetic predisposition and clinical factors, particularly prematurity, in the etiology of ASD18.
We found that PRS for ASD is significantly higher among our cohort of term children diagnosed with ASD compared to our cohort of term children not diagnosed with ASD. Interestingly, this genetic signal was not observed among preterm individuals: neither the comparison between preterm ASD and preterm non-ASD, nor that between preterm ASD and term ASD, yielded significant differences. These results indicate that the polygenic architecture of ASD may be more detectable in term-born populations, whereas the genetic liability in preterm-born individuals may be obscured or modified by other mechanisms.
Conversely, there was also a small but statistically significant difference in ASD PRS between preterm vs. term-born individuals without an ASD diagnosis (OR 0.83, 95% CI 0.74–0.94), indicating that common variant burden is lower in preterm individuals without a diagnosis compared to term individuals without a diagnosis. This lower polygenic burden observed in unaffected preterm individuals compared to unaffected term individuals may reflect a protective role of low PRS score in this environmentally vulnerable population, though studies with larger sample sizes are needed to validate this effect.
Our observations align with the findings of Zhang et al.,18 who reported that preterm-born individuals with ASD exhibited greater phenotypic severity and multimorbidity despite showing similar levels of PRS as term ASD individuals. They also align with Cullen et al., who found that ASD PRS did not have a significant interaction effect on cognition in preterm individuals22. The utility of adding PRS, gestational age, and male gender into models to predict autism risk with 90% success, as described by Zhang et al.,18 may be explained by a protective effect of low PRS burden in preterm individuals at highest risk for ASD.
One possible explanation for the lack of genetic signal in our preterm cohort with ASD is that the etiological basis of ASD in preterm individuals may involve a distinct contribution from rare de novo variants; one study demonstrated an elevated prevalence of pathogenic copy number variants in preterm patients compared to their parents or population databases23. In another study, genomes of preterm-born individuals had a significant increase in de novo mutation burden, and many of the genes affected were involved in fetal brain development24. As such, preterm individuals may be more likely to derive their genetic propensity for autism from rare variation, as opposed to the common variation quantified by ASD PRS.
Furthermore, early-life complications associated with prematurity, including hypoxia, neuroinflammation,25 postnatal steroid treatment,13 and neonatal intensive care interventions such as high-frequency ventilation,26 may independently elevate risk for ASD in premature patients. In such cases, common variant burden as captured by PRS may play a comparatively minor role in driving neurodevelopmental outcomes. Finally, we applied an autism PRS derived from a presumed term-born cohort. It is possible that different common variation drives ASD risk in premature children than in term populations.
In line with possible differences in common variation distribution by gestational age, we found that preterm birth was associated with a standardized ASD PRS that was lower by 0.14, indicating that the distribution of PRS in the preterm cohort was shifted toward lower values relative to the term cohort. This difference may reflect higher autism PRS protecting against preterm birth through other associated traits. For example, the positive association of autism PRS with educational attainment16 may make mothers with higher genetic liability less likely to deliver prematurely, given that lower maternal educational level is a well-known risk factor for preterm birth27. Further research to understand the differences in common variant distribution in preterm vs. term populations is needed to inform the application of term-derived PRS for predictive modeling in preterm populations.
Overall, our findings suggest that the predictive utility of PRS in ASD risk stratification may be contingent on gestational age context. While PRS could serve as a useful risk indicator among term-born neonates, it appears less informative in preterm populations. This highlights the importance of integrating genetic and clinical context when interpreting polygenic scores in neurodevelopmental research and risk modeling. It also highlights the opportunity to develop novel PRS models in preterm cohorts to better capture genetic liability unique to this population.
Future studies with larger sample sizes are needed to further examine potential interaction effects between polygenic burden, gestational age, and environmental stressors. Longitudinal follow-up of preterm cohorts with and without ASD will also be critical to disentangle the timing and nature of neurodevelopmental divergence. Integrating PRS with rare variant burden, epigenetic marks, clinical comorbidity, and brain imaging may provide a more comprehensive understanding of ASD liability across different birth conditions.
Data availability
The genotype data generated in this study have been deposited in the Database of Genotypes and Phenotypes (dbGaP) under accession number phs000607.v1.p1. The dataset is available through controlled access at https://www.ncbi.nlm.nih.gov/projects/gap/cgi-bin/study.cgi?study_id=phs000607.v1.p1, subject to dbGaP approval and compliance with data use agreements.
References
Vogel, J. P. et al. The global epidemiology of preterm birth. Best Pract. Res. Clin. Obstet. Gynaecol. 52, 3–12. https://doi.org/10.1016/j.bpobgyn.2018.04.003 (2018).
WHO. Recommended definitions, terminology and format for statistical tables related to the perinatal period and use of a new certificate for cause of perinatal deaths. Modifications recommended by FIGO as amended October 14, 1976. Acta Obstet. Gynecol. Scand. 56, 247–253 (1977).
Kaempf, J. W., Guillen, U., Litt, J. S., Zupancic, J. A. F. & Kirpalani, H. Change in neurodevelopmental outcomes for extremely premature infants over time: a systematic review and meta-analysis. Arch. Dis. Child. Fetal Neonatal Ed. 108, 458–463. https://doi.org/10.1136/archdischild-2022-324457 (2023).
Simmons, L. E., Rubens, C. E., Darmstadt, G. L. & Gravett, M. G. Preventing preterm birth and neonatal mortality: exploring the epidemiology, causes, and interventions. Semin Perinatol. 34, 408–415. https://doi.org/10.1053/j.semperi.2010.09.005 (2010).
Tucker Edmonds, B., McKenzie, F., Panoch, J. E. & Frankel, R. M. Comparing neonatal morbidity and mortality estimates across specialty in periviable counseling. J. Matern Fetal Neonatal Med. 28, 2145–2149. https://doi.org/10.3109/14767058.2014.981807 (2015).
Schmidt, B. et al. Prediction of late death or disability at age 5 years using a count of 3 neonatal morbidities in very low birth weight infants. J. Pediatr. 167, 982–986. https://doi.org/10.1016/j.jpeds.2015.07.067 (2015).
Faramarzi, R., Darabi, A., Emadzadeh, M., Maamouri, G. & Rezvani, R. Predicting neurodevelopmental outcomes in preterm infants: A comprehensive evaluation of neonatal and maternal risk factors. Early Hum. Dev. 184, 105834. https://doi.org/10.1016/j.earlhumdev.2023.105834 (2023).
Medlock, S., Ravelli, A. C., Tamminga, P., Mol, B. W. & Abu-Hanna, A. Prediction of mortality in very premature infants: a systematic review of prediction models. PLoS One. 6, e23441. https://doi.org/10.1371/journal.pone.0023441 (2011).
Crilly, C. J., Haneuse, S. & Litt, J. S. Predicting the outcomes of preterm neonates beyond the neonatal intensive care unit: what are we missing? Pediatr. Res. 89, 426–445. https://doi.org/10.1038/s41390-020-0968-5 (2021).
Crump, C., Sundquist, J. & Sundquist, K. Preterm or early term birth and risk of autism. Pediatrics https://doi.org/10.1542/peds.2020-032300 (2021).
Joseph, R. M. et al. Prevalence and associated features of autism spectrum disorder in extremely low gestational age newborns at age 10 years. Autism Res. 10, 224–232. https://doi.org/10.1002/aur.1644 (2017).
Shehzad, I. et al. Evaluation of autism spectrum disorder risk in infants with intraventricular hemorrhage. Cureus 15, e45541. https://doi.org/10.7759/cureus.45541 (2023).
Davidovitch, M. et al. Postnatal steroid therapy is associated with autism spectrum disorder in children and adolescents of very low birth weight infants. Pediatr. Res. 87, 1045–1051. https://doi.org/10.1038/s41390-019-0700-5 (2020).
Sacchi, C. et al. Neurodevelopmental outcomes following intrauterine growth restriction and very preterm birth. J. Pediatr. 238, 135–144e110. https://doi.org/10.1016/j.jpeds.2021.07.002 (2021).
Tick, B., Bolton, P., Happe, F., Rutter, M. & Rijsdijk, F. Heritability of autism spectrum disorders: a meta-analysis of twin studies. J. Child. Psychol. Psychiatry. 57, 585–595. https://doi.org/10.1111/jcpp.12499 (2016).
Grove, J. et al. Identification of common genetic risk variants for autism spectrum disorder. Nat. Genet. 51, 431–444. https://doi.org/10.1038/s41588-019-0344-8 (2019).
Takahashi, N. et al. Association of genetic risks with autism spectrum disorder and early neurodevelopmental delays among children without intellectual disability. JAMA Netw. Open. 3, e1921644. https://doi.org/10.1001/jamanetworkopen.2019.21644 (2020).
Zhang, Y., Yahia, A., Sandin, S., Aden, U. & Tammimies, K. Prematurity and genetic liability for autism spectrum disorder. MedRxiv https://doi.org/10.1101/2024.11.20.24317613 (2024).
Schendel, D. et al. Evaluating the interrelations between the autism polygenic score and psychiatric family history in risk for autism. Autism Res. 15, 171–182. https://doi.org/10.1002/aur.2629 (2022).
Kowalski, M. H. et al. Use of > 100,000 NHLBI Trans-Omics for precision medicine (TOPMed) consortium whole genome sequences improves imputation quality and detection of rare variant associations in admixed African and Hispanic/Latino populations. PLoS Genet. 15, e1008500. https://doi.org/10.1371/journal.pgen.1008500 (2019).
Ge, T., Chen, C. Y., Ni, Y., Feng, Y. A. & Smoller, J. W. Polygenic prediction via bayesian regression and continuous shrinkage priors. Nat. Commun. 10, 1776. https://doi.org/10.1038/s41467-019-09718-5 (2019).
Cullen, H., Selzam, S., Dimitrakopoulou, K., Plomin, R. & Edwards, A. D. Greater genetic risk for adult psychiatric diseases increases vulnerability to adverse outcome after preterm birth. Sci. Rep. 11, 11443. https://doi.org/10.1038/s41598-021-90045-5 (2021).
Wong, H. S. et al. Contribution of de Novo and inherited rare CNVs to very preterm birth. J. Med. Genet. 57, 552–557. https://doi.org/10.1136/jmedgenet-2019-106619 (2020).
Li, J., Oehlert, J., Snyder, M., Stevenson, D. K. & Shaw, G. M. Fetal de Novo mutations and preterm birth. PLoS Genet. 13, e1006689. https://doi.org/10.1371/journal.pgen.1006689 (2017).
Bokobza, C. et al. Neuroinflammation in preterm babies and autism spectrum disorders. Pediatr. Res. 85, 155–165. https://doi.org/10.1038/s41390-018-0208-4 (2019).
Kuzniewicz, M. W. et al. Prevalence and neonatal factors associated with autism spectrum disorders in preterm infants. J. Pediatr. 164, 20–25. https://doi.org/10.1016/j.jpeds.2013.09.021 (2014).
Granes, L., Tora-Rocamora, I., Palacio, M., De la Torre, L. & Llupia, A. Maternal educational level and preterm birth: exploring inequalities in a hospital-based cohort study. PLoS One. 18, e0283901. https://doi.org/10.1371/journal.pone.0283901 (2023).
Acknowledgements
We gratefully acknowledge the Center for Applied Genomics participants for their contributions, without whom this research would not have been possible.
Funding
This research was supported by the Center for Applied Genomics Institutional Development Fund Award and the Endowed Chair in Genomic Research grant awarded to Dr. Hakon Hakonarson by the Children’s Hospital of Philadelphia. Dr. Barbara Chaiyachati’s time is supported by an NIMH K08 career development award. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.
Author information
Contributions
B.M.W. conceptualized the project, developed and ran regression models, analyzed results, prepared Table 1, prepared Figure 2, and wrote the main manuscript text. X.C. generated standardized polygenic risk scores, prepared Figure 1, and drafted the Methods section. F.D.M. curated patient cohorts, generating inclusion/exclusion criteria. B.H.C. and S.B.D. contributed clinical expertise in neonatology and neonatal outcomes. H.H. supervised the project as the senior author, contributing to conceptualization, funding acquisition, and providing the large patient database from which cohorts were generated. All authors reviewed, revised, and approved the final manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Wenger, B.M., Chang, X., Mentch, F.D. et al. Polygenic risk for autism spectrum disorder based on four group comparison across term and preterm birth. Sci Rep 16, 2693 (2026). https://doi.org/10.1038/s41598-025-32440-w
DOI: https://doi.org/10.1038/s41598-025-32440-w

