Abstract
Chronic obstructive pulmonary disease (COPD) is a leading cause of death and disability globally. While genetic and environmental risk factors are known, these insights have not yet resulted in individualized prevention strategies. Polygenic scores (PS) have enhanced our understanding of genetic risk and early-life lung function differences, but their applicability across diverse populations remains unclear. We therefore aimed to evaluate whether a COPD PS captures population-wide risk or disproportionately affects sub-populations defined by ancestry or sex. This observational study assessed the association between a previously validated COPD PS and lung function in children with asthma, using spirometry results from the Center for Applied Genomics biorepository. Mixed-effects linear regression models were used to examine the relationship between PS and spirometry measures (FEV1 and FEV1/FVC), with stratified analyses by sex and ancestry. In total, 25,477 spirometry results from 6,336 patients were included. Population-wide, children in the high PS group had significantly lower FEV1/FVC z-scores compared to the low PS group (P = 0.0037, beta = − 0.11, 95% confidence interval [CI] − 0.18 to − 0.036). When stratified by sex, high PS was associated with lower FEV1/FVC z-scores in males (P = 0.0088, beta = − 0.14, 95% CI − 0.24 to − 0.035), but no significant effect was found in females. Ancestry-specific analysis revealed that males of European (EUR) ancestry with high PS had significantly lower FEV1 (P = 0.0035, beta = − 0.37, 95% CI − 0.63 to − 0.12) and FEV1/FVC z-scores (P = 0.00076, beta = − 0.35, 95% CI − 0.55 to − 0.15). No significant associations were observed in females with EUR ancestry or in African ancestry groups. Logistic regression analysis showed that high PS in males of EUR ancestry increased the odds of having an FEV1/FVC ratio below 0.7, a marker for COPD (P = 0.0074, OR = 2.01, 95% CI 1.19 to 3.31).
This study demonstrates sex-specific effects of a COPD PS in pediatric asthma patients, with males of EUR ancestry at highest risk. The findings underscore the need to include diverse populations in genomic studies to improve the generalizability of PS for COPD prevention.
Background
Chronic obstructive pulmonary disease (COPD) remains one of the most significant causes of death and disability worldwide1,2. While the utility of primary prevention through smoking cessation and avoidance of inhaled particulates and toxins is well established, significant gaps in prevention of COPD remain3. Specifically, while both environmental and genetic risk factors for COPD have been identified, these insights have not yet resulted in individualized prevention4,5,6,7.
This effort has recently been aided by assessing COPD risk using polygenic scores (PS) instead of individual genetic variants. Specifically, a PS developed by Moll et al. succeeded in predicting COPD risk earlier than the clinical risk factors of age and cigarette smoking, with stronger effects in patients of European ancestry (EUR)8. This PS used data from a genome-wide association study of lung function from the UK Biobank and SpiroMeta, with the final PS including 1.7 million variants associated with forced expiratory volume in one second (FEV1) and 1.2 million variants associated with FEV1/forced vital capacity (FVC). The same PS was found to be associated with reduced lung growth patterns in children with asthma and lower lung function attainment in former premature infants8,9,10. The stronger effects in patients of EUR ancestry may be due to EUR ancestry being the predominant ancestry in both the UK Biobank and SpiroMeta datasets, which limits the generalizability of PS performance in non-European populations.
These promising findings raise the question of whether this PS constitutes a population-wide measure of genetic lung function potential or instead captures genetic risk in specific sub-populations. Addressing this question is essential to translating previous findings into COPD prevention. If this PS captures population-wide genetic lung function potential, the effect on COPD risk from the interaction between PS and harmful exposures may be additive. Conversely, if the PS captures population-specific risk, sub-populations may experience exponentially higher COPD risk, potentially amenable to intervention. In this study, we address this question by examining the effect of this PS on lung function in a large cohort of pediatric patients with asthma, specifically exploring sex- and ancestry-specific effects.
Methods
Population
Subjects were drawn from The Children’s Hospital of Philadelphia (CHOP) biorepository at the Center for Applied Genomics (CAG). All subjects provided informed consent, and the study protocol was approved by the Children’s Hospital of Philadelphia institutional review board (IRB protocol 16-013278)11. A previously validated reproducible phenotype was used to define subjects with asthma12. This phenotype defines asthma cases as patients above the age of four with at least two asthma-related ICD-10 codes in at least two different and independent clinical encounters, and at least one asthma-related prescription. Patients with other significant pulmonary diseases, including cystic fibrosis and bronchopulmonary dysplasia, were excluded.
Spirometry data
For patients with asthma, all spirometry results obtained at ages six to 21 years between 01/01/2000 and 12/31/2022 were collected. These tests were completed as part of routine clinical care. Data obtained included FEV1 in liters, FEV1/FVC, sex at birth, height at time of spirometry, and age in years. Only tests with both FEV1 and FVC values available were included in the study. The race-neutral Global Lung Function Initiative calculator was used to calculate FEV1 and FEV1/FVC z-scores13. A COPD phenotype was defined as having an FEV1/FVC ratio below 0.7. During quality control, spirometry measurements recorded as occurring on a weekend or associated with physiologically improbable z-scores (z-score < − 5 or > 5) were removed. Similarly, patients were excluded if their height was not updated, defined as three or more spirometry occurrences with the same recorded height in patients unlikely to have reached adult height (recorded height < 140 cm).
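The quality-control rules above can be illustrated with a short sketch. This is not the authors' pipeline: the record layout and field names (`patient_id`, `test_date`, `height_cm`, `fev1_z`, `ratio_z`) are hypothetical, and the order in which the patient-level and test-level rules are applied is an assumption.

```python
from collections import Counter, defaultdict
from datetime import date

def passes_test_qc(test):
    """Test-level QC: drop weekend-dated tests and z-scores outside [-5, 5]."""
    if test["test_date"].weekday() >= 5:  # 5 = Saturday, 6 = Sunday
        return False
    return all(-5 <= test[key] <= 5 for key in ("fev1_z", "ratio_z"))

def stale_height_patients(tests):
    """Patients with three or more tests sharing one recorded height < 140 cm."""
    heights = defaultdict(Counter)
    for t in tests:
        heights[t["patient_id"]][t["height_cm"]] += 1
    return {
        pid for pid, counts in heights.items()
        if any(n >= 3 and h < 140 for h, n in counts.items())
    }

def apply_qc(tests):
    """Assumed order: exclude stale-height patients, then filter single tests."""
    excluded = stale_height_patients(tests)
    return [t for t in tests
            if t["patient_id"] not in excluded and passes_test_qc(t)]

def rec(pid, day, height, fev1_z, ratio_z):
    """Build one hypothetical spirometry record dated in March 2021."""
    return {"patient_id": pid, "test_date": date(2021, 3, day),
            "height_cm": height, "fev1_z": fev1_z, "ratio_z": ratio_z}

example = [
    rec("A", 1, 150.0, -0.4, -0.9),   # Monday, plausible values: kept
    rec("A", 6, 150.0, -0.3, -0.8),   # Saturday: removed
    rec("B", 2, 130.0, -1.0, -0.5),   # patient B: same height three times
    rec("B", 3, 130.0, -1.1, -0.6),
    rec("B", 4, 130.0, -0.9, -0.4),
    rec("C", 1, 160.0, -5.5, 0.1),    # improbable FEV1 z-score: removed
]
kept = apply_qc(example)
```

In this toy data only patient A's weekday test survives; patient B is excluded entirely, whereas patient C only loses the single improbable measurement.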
Genetic data
Genotype data were generated on four major genotyping array families from Illumina (HumanHapMap550/610Q, OMNI2.5M, OmniExpress, and the GSA array). Array versions within families were merged on common SNPs and filtered for genotype missingness (geno 0.1), individual missingness (mind 0.02), and minor allele frequency (MAF ≥ 0.01), in that order, using PLINK v1.9 (ref. 14). Data were imputed using the TOPMed v2 reference panel on the TOPMed Imputation Server15,16,17. Each imputed file set was filtered for imputation quality on a combination of R-squared (R2) and MAF (SNPs with MAF ≥ 0.05 were kept if R2 ≥ 0.3; SNPs with MAF < 0.05 were kept if R2 ≥ 0.5). File sets were merged, and variants present in 95% of samples were retained. Ancestry was assigned based on the results of principal component analysis (PCA). PCA was performed using flashpca on approximately 2.4 million imputed SNPs with MAF > 0.05 that had been pruned for linkage disequilibrium (LD) using PLINK v1.9 (refs. 14, 18). The first three principal components were plotted, and ancestry designation was performed by comparison to the reference genotypes from the HapMap consortium19. After splitting of ancestries, ancestry-specific PCAs were performed using SNPs heavily pruned for LD and filtered for MAF ≥ 0.05. The African ancestry-specific PCA contained approximately 180,000 variants and the European ancestry-specific PCA included approximately 130,000 variants.
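The MAF-dependent imputation-quality rule can be written as a small predicate. The sketch below is illustrative only; the variant dictionaries and their keys (`maf`, `r2`, `call_rate`) are hypothetical, and reading "present in 95% of samples" as a call rate of at least 0.95 is an assumption.

```python
def passes_imputation_quality(maf, r2):
    """Common variants (MAF >= 0.05) need R2 >= 0.3; rarer ones need R2 >= 0.5."""
    if maf >= 0.05:
        return r2 >= 0.3
    return r2 >= 0.5

def filter_variants(variants, min_call_rate=0.95):
    """Keep variants passing the quality rule and the assumed call-rate rule."""
    return [
        v for v in variants
        if passes_imputation_quality(v["maf"], v["r2"])
        and v["call_rate"] >= min_call_rate
    ]

# Hypothetical variants; identifiers are placeholders, not real SNPs.
variants = [
    {"id": "var1", "maf": 0.10, "r2": 0.35, "call_rate": 0.99},  # kept
    {"id": "var2", "maf": 0.02, "r2": 0.35, "call_rate": 0.99},  # R2 too low
    {"id": "var3", "maf": 0.02, "r2": 0.60, "call_rate": 0.99},  # kept
    {"id": "var4", "maf": 0.20, "r2": 0.90, "call_rate": 0.90},  # call rate
]
retained_ids = [v["id"] for v in filter_variants(variants)]
```

The stricter R2 threshold for low-frequency variants reflects that rare alleles are generally imputed less reliably, so the same R2 value carries less assurance for them.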
Polygenic score
PS was calculated separately for FEV1 and FEV1/FVC, and the final composite PS was calculated as (0.43847 × PS FEV1) + (0.58833 × PS FEV1/FVC), as described by Moll et al.8. All samples used for PS calculation had the PS SNPs successfully genotyped or imputed. Allelic scoring was done using PLINK v1.9 (refs. 14, 18).
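As a worked illustration of the weighted sum above, the following sketch combines two component scores using the weights stated in the text. The function and argument names are hypothetical, and the input values are made-up numbers rather than study data.

```python
# Weights for the two component scores, as given in the text.
WEIGHT_FEV1 = 0.43847
WEIGHT_RATIO = 0.58833

def composite_ps(ps_fev1, ps_ratio):
    """Weighted sum of the FEV1 score and the FEV1/FVC score."""
    return WEIGHT_FEV1 * ps_fev1 + WEIGHT_RATIO * ps_ratio

# Made-up component scores for one hypothetical subject.
example_score = composite_ps(2.0, -1.0)
```

Because the FEV1/FVC component carries the larger weight, a one-unit change in that component shifts the composite more than a one-unit change in the FEV1 component.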
Statistical analysis
To account for ancestry-specific differences in PS distribution, we standardized the PS within each ancestry group by computing ancestry-specific z-scores. This was done by calculating the mean and standard deviation of the PS within each ancestry group and then converting individual scores to z-scores. This approach allowed for within-ancestry comparisons while preserving the original PS structure. To optimize power, we analyzed the PS as a categorical variable, using a cut-off of 1 on the ancestry-specific PS z-score to divide the population into patients with a ‘high’ and a ‘low’ PS. T-tests were used to assess whether spirometry results differed between these two groups. If a difference was significant (P < 0.05), mixed-effects linear regression models were used to assess whether the effect of PS remained significant after accounting for clustering at the subject level and including ancestry as a covariate. For spirometry measures associated with PS at a population level, we further examined sub-population effects with sub-population-specific regression models. Of note, in this study, ‘subpopulation’ refers to any distinct group within the total study population, including stratifications by sex, genetic ancestry, or PS group.
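The within-ancestry standardization and the high/low split can be sketched as follows. This is a minimal illustration under stated assumptions: the tuple layout is hypothetical, the sample standard deviation is used (the text does not say which estimator was applied), and a z-score of exactly 1 is assigned to the low group.

```python
from collections import defaultdict
from statistics import mean, stdev

def ancestry_z_scores(scores):
    """scores: list of (patient_id, ancestry, ps). Returns {patient_id: z}."""
    by_group = defaultdict(list)
    for _, ancestry, ps in scores:
        by_group[ancestry].append(ps)
    # Mean and sample SD computed separately within each ancestry group.
    stats = {a: (mean(v), stdev(v)) for a, v in by_group.items()}
    return {pid: (ps - stats[a][0]) / stats[a][1] for pid, a, ps in scores}

def ps_group(z, cutoff=1.0):
    """'high' if the ancestry-specific z-score exceeds the cut-off, else 'low'."""
    return "high" if z > cutoff else "low"

# Made-up raw scores: the two groups differ in mean but not in spread.
scores = (
    [(f"e{i}", "EUR", float(i)) for i in range(1, 6)]
    + [(f"a{i}", "AFR", float(i + 10)) for i in range(1, 6)]
)
z = ancestry_z_scores(scores)
groups = {pid: ps_group(value) for pid, value in z.items()}
```

In this toy example the top-ranked subject in each ancestry group receives the same z-score and lands in the high group, even though the raw scores of the two groups do not overlap. That is the purpose of standardizing within ancestry before applying a single cut-off.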
Finally, in the most sensitive subpopulation we used a logistic regression model to predict if patients met criteria for COPD on at least one spirometry test. This model included age at last recorded spirometry in years, ancestry, and PS group.
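The patient-level outcome and age covariate for this model can be derived from test-level data as sketched below. The field names (`patient_id`, `age_years`, `fev1_fvc`) are hypothetical and the values are made up; the sketch only illustrates collapsing repeated tests into one row per patient and does not fit the regression itself.

```python
def patient_outcomes(tests, threshold=0.7):
    """Collapse tests to one record per patient.

    'copd' is True if any test has FEV1/FVC below the threshold;
    'age_last' is the age at the most recent (oldest-age) test.
    """
    out = {}
    for t in tests:
        rec = out.setdefault(
            t["patient_id"], {"copd": False, "age_last": t["age_years"]}
        )
        rec["copd"] = rec["copd"] or t["fev1_fvc"] < threshold
        rec["age_last"] = max(rec["age_last"], t["age_years"])
    return out

# Made-up tests for two hypothetical patients.
tests = [
    {"patient_id": "P1", "age_years": 8.0, "fev1_fvc": 0.82},
    {"patient_id": "P1", "age_years": 12.0, "fev1_fvc": 0.68},
    {"patient_id": "P2", "age_years": 10.0, "fev1_fvc": 0.85},
    {"patient_id": "P2", "age_years": 9.0, "fev1_fvc": 0.90},
]
outcomes = patient_outcomes(tests)
```

Defining the outcome as "at least one test below 0.7" means a patient with more recorded tests has more opportunities to meet it, which is one reason an age covariate is a natural adjustment.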
The vif function from the car package was used to test for multicollinearity throughout these models, and the variance inflation factor for all variables was below 1.5 in all models20.
Results
Population
There were 6,336 patients with asthma, including 3,441 with recorded male sex and 2,895 with female sex, that had spirometry and PS data available (Table 1). At a genetic ancestry subpopulation level, this population included 3,713 patients with African (AFR) ancestry, 1,536 with EUR ancestry, 107 with South Asian (SAS) ancestry, 81 with East Asian (EAS) ancestry, 23 with Admixed American (AMR) ancestry, and 876 with ‘Other’ ancestry (Table 2)21. For these patients, a total of 25,477 spirometry tests met inclusion criteria. Mean age at time of spirometry was 12.33 ± 3.65 years old. Using the GLI Global race-neutral reference equations the mean FEV1 z-score was − 0.51 ± 1.60 and mean FEV1/FVC z-score was − 0.68 ± 1.25.
Population wide effects
Population-wide, patients in the high PS group had significantly lower FEV1 and FEV1/FVC z-scores than those in the low PS group (Fig. 1). The relationship between FEV1/FVC z-scores and PS remained significant in a linear regression model including PS as a categorical variable and adjusting for sex, age, and race as recorded during spirometry as confounders (P = 0.0018, beta = − 0.12, 95% confidence interval [CI] − 0.19 to − 0.044).
Fig. 1. Population-wide relationship between polygenic score (PS) and spirometry results. Boxplots of median and interquartile range (IQR) FEV1 (A) or FEV1/FVC (B) z-score by PS group (defined by PS z-score above or below 1). Whiskers extend to 1.5 times the IQR; datapoints outside this range are shown as points. The P-value reported here was calculated using a two-sided t-test. While a statistically significant difference was observed in both measures, the magnitude of the difference in FEV1 z-score was small and unlikely to be clinically meaningful. Please refer to the text for P-values from the regression model adjusting for sex, age, and self-reported race as confounders.
Effects in sub-populations
When stratified by sex, no differences by age were detected (Supplemental Fig. 1). Conversely, high as opposed to low PS was associated with lower FEV1 and FEV1/FVC z-scores in patients with male sex and with lower FEV1/FVC z-scores in patients with female sex (Fig. 2). When using a mixed-effects linear regression model including age and self-reported race as covariates, only the relationship between PS and FEV1/FVC z-score in male patients remained significant (P = 0.00824, beta = − 0.14, 95% CI − 0.24 to − 0.036). To examine ancestry-specific effects, we explored these same associations in our two largest ancestry groups (‘AFR’ and ‘EUR’). In ancestry-specific linear regression models including age and self-reported race as covariates, there was a significant relationship between a high PS and lower FEV1 and FEV1/FVC z-scores in males of EUR ancestry (P = 0.01, beta = − 0.34, 95% CI − 0.59 to − 0.079 and P = 0.00085, beta = − 0.35, 95% CI − 0.54 to − 0.14, respectively; Fig. 3). No significant relationships were found in females with EUR ancestry or patients with AFR ancestry (Supplemental Fig. 2). The lack of significant associations in patients with AFR ancestry is unsurprising given that the cohorts used to create the PS contained mostly subjects of EUR ancestry.
Fig. 2. Sex-specific relationship between polygenic score (PS) and spirometry results. Boxplots of median and interquartile range (IQR) FEV1 (A) or FEV1/FVC (B) z-score by PS group (defined by PS z-score above or below 1). Whiskers extend to 1.5 times the IQR; datapoints outside this range are shown as points. The P-value reported here was calculated using a two-sided t-test. Please refer to the text for P-values from the regression model adjusting for age and self-reported race as confounders.
Fig. 3. Relationship between age and spirometry results by polygenic score (PS) and sex for patients with European ancestry. Scatterplots showing the relationship between age and FEV1 z-score (A) and FEV1/FVC ratio (B), with points colored by the combination of sex and PS group. Trendlines indicate linear relationships between age and results within each group, with confidence intervals around the trendlines included. Colors represent different combinations of sex and PS group: black for males with high PS, orange for males with low PS, coral for females with high PS, and blue for females with low PS.
Finally, we explored if having a high PS increased the odds of having a FEV1/FVC ratio below 0.7. To this end, we created population wide, male only, and male of EUR ancestry logistic regression models (Supplemental Table 2). While significant effects were found when using population wide and male only data (P = 0.024, odds ratio [OR] = 1.26, 95% CI 1.03 to 1.54 and P = 0.017, OR = 1.37, 95% CI 1.05 to 1.77 respectively), the strongest effect was identified in males of EUR ancestry (P = 0.007, OR = 2.01, 95% CI 1.19 to 3.32).
Discussion
Our data are the first to demonstrate that a previously validated COPD PS appears to have sex-specific effects in pediatric patients with asthma. We also redemonstrated that this PS has stronger effects in patients of EUR ancestry, likely due to their overrepresentation in the UK Biobank and SpiroMeta consortium cohorts. Additionally, it is important to note that our study population, which consists predominantly of children with AFR ancestry, differs in ancestry from the EUR-dominated cohorts used in the original GWAS of lung function in COPD patients. This ancestry mismatch may contribute to the observed differences in PS performance and underscores the need for developing ancestry-specific polygenic scores to improve accuracy and equity in genetic risk prediction. We observed that males with a high PS and European ancestry had significantly lower FEV1 and FEV1/FVC than patients with a low PS. For patients in this group, having a high PS increased the odds of having ‘COPD’ as defined by a FEV1/FVC < 0.7.
Sex differences in disease genetics have been well described and understanding these differences is a prerequisite to operationalizing a PS as a biomarker22,23. In our study, the association between COPD PS and spirometry values was specific to males. This is consistent with the previous association between this PS and the male predominant (71% of patients) ‘reduced lung growth’ category in the CAMP pediatric asthma cohort8,10.
Specifically in COPD, sex has been associated with disease severity and phenotype, with women with COPD being generally younger and with better lung function than men, but having more frequent dyspnea and exacerbations24. Some of these differences may be associated with differential gene regulatory patterns, especially of genes involved in the extracellular matrix25. In this context, it is worth noting that sex hormones have been previously linked with pulmonary inflammation and COPD26,27. However, a recent meta-analysis and Mendelian randomization study does appear to support an indirect, rather than direct relationship between sex-hormones and COPD28.
When genetic effects manifest is always an intriguing question. In Fig. 3, we explored the relationship between PS and spirometry stratified by age to assess whether children with a high PS start off with lower lung function early in life or fail to keep up with expected lung function growth over time. Our findings suggest the latter, but prospective studies will be needed to confirm this more definitively.
Our study has some limitations. Specifically, retrospective data collected during routine clinical care may result in selection bias. While it appears unlikely that associations between genetic risk and spirometry values were driven by this, future studies validating our results in longitudinal cohorts are required. Furthermore, our center is a quaternary referral center located in an urban center in the North-Eastern United States. Replication studies in lower acuity settings and geographically diverse populations are needed to ensure generalizability of our findings. Additionally, our study relies on an EMR-derived asthma phenotype, which is validated as a binary ‘Yes/No’ variable and does not capture potential changes in asthma status over time. This limits our ability to assess the impact of temporal variations in asthma diagnosis, symptoms, or treatment on lung function trajectories.
Conclusion
Our study demonstrates that a specific sub-population of pediatric patients with asthma, identified by ancestry, sex and PS, appeared to have significantly lower FEV1 and FEV1/FVC. Further studies are needed to assess if and how this early-life sensitivity relates to development of COPD and COPD mortality later in life. Additionally, future studies should incorporate a more dynamic assessment of asthma status over time, utilizing repeated clinical evaluations or EMR-based longitudinal tracking. Furthermore, exploring sex-specific trajectories in lung function with advanced longitudinal modeling approaches may help uncover subtle differences not detectable in our cross-sectional analysis. If follow-up studies support our initial findings, examining how medical and environmental factors interact with lung function in this especially vulnerable sub-set of patients may open up COPD prevention pathways.
Data availability
The data that support the findings of this study are available from the corresponding author, JK, upon reasonable request.
Abbreviations
- AFR: African ancestry
- AMR: Admixed American ancestry
- CAG: Center for Applied Genomics
- CHOP: The Children’s Hospital of Philadelphia
- CI: Confidence interval
- COPD: Chronic obstructive pulmonary disease
- EAS: East Asian ancestry
- EUR: European ancestry
- FEV1: Forced expiratory volume in one second
- FVC: Forced vital capacity
- LD: Linkage disequilibrium
- MAF: Minor allele frequency
- PCA: Principal component analysis
- PS: Polygenic score
- R2: R-squared
- SAS: South Asian ancestry
References
Safiri, S. et al. Burden of chronic obstructive pulmonary disease and its attributable risk factors in 204 countries and territories, 1990–2019: Results from the Global Burden of Disease Study 2019. BMJ (Clinical research ed.) e069679 (2022).
Institute for Health Metrics and Evaluation. Global Burden of Disease 2021: Findings from the GBD 2021 Study (IHME, Seattle, WA, 2024).
Drummond, M. B., Buist, A. S., Crapo, J. D., Wise, R. A. & Rennard, S. I. Chronic obstructive pulmonary disease: NHLBI workshop on the primary prevention of chronic lung diseases. Ann. Am. Thorac. Soc. 11(Supplement 3), S154–S160 (2014).
Rennard, S. I. & Vestbo, J. Natural histories of chronic obstructive pulmonary disease. Proc. Am. Thorac. Soc. 5(9), 878–883 (2008).
Rennard, S. I. & Vestbo, J. COPD: The dangerous underestimate of 15%. Lancet (London, England). 367(9518), 1216–1219 (2006).
Wilk, J. B. et al. Evidence for major genes influencing pulmonary function in the NHLBI family heart study. Genet. Epidemiol. 19(1), 81–94 (2000).
Palmer, L. J. et al. Familial aggregation and heritability of adult lung function: Results from the Busselton Health Study. Eur. Respir. J. 17(4), 696–702 (2001).
Moll, M. et al. Chronic obstructive pulmonary disease and related phenotypes: polygenic risk scores in population-based and case-control cohorts. Lancet Respir. Med. 8(7), 696–708 (2020).
Nissen, G. et al. Lung function of preterm children parsed by a polygenic risk score for adult COPD. NEJM Evid. 2(3), 279 (2023).
McGeachie, M. J. et al. Patterns of growth and decline in lung function in persistent childhood asthma. N. Engl. J. Med. 374(19), 1842–1852 (2016).
Gottesman, O. et al. The Electronic Medical Records and Genomics (eMERGE) Network: Past, present, and future. Genet. Med. 15(10), 761–771 (2013).
Vazquez, L. & Connolly, J. CHOP. Asthma. PheKB (2013). Available from: https://phekb.org/phenotype/146.
Global Lung Function Initiative. Global Lung Function Initiative calculators for Spirometry, TLCO and Lung volume. Available from: https://gli-calculator.ersnet.org/index.html.
Chang, C. C. et al. Second-generation PLINK: Rising to the challenge of larger and richer datasets. GigaScience. 4, 7 (2015).
Das, S. et al. Next-generation genotype imputation service and methods. Nat. Genet. 48(10), 1284–1287 (2016).
Fuchsberger, C., Abecasis, G. R. & Hinds, D. A. minimac2: Faster genotype imputation. Bioinformatics (Oxford, England). 31(5), 782–784 (2015).
Taliun, D. et al. Sequencing of 53,831 diverse genomes from the NHLBI TOPMed Program. Nature 590(7845), 290–299 (2021).
Abraham, G. & Inouye, M. Fast principal component analysis of large-scale genome-wide data. PLoS ONE 9(4), e93766 (2014).
Altshuler, D. M. et al. Integrating common and rare genetic variation in diverse human populations. Nature 467(7311), 52–58 (2010).
Fox, J. & Weisberg, S. An R Companion to Applied Regression 3rd edn (Sage, 2019).
Gaspar, H. A. & Breen, G. Probabilistic ancestry maps: A method to assess and visualize population substructures in genetics. BMC Bioinf. 20(1), 116 (2019).
Gilks, W. P., Abbott, J. K. & Morrow, E. H. Sex differences in disease genetics: Evidence, evolution, and detection. Trends Genet. 30(10), 453–463 (2014).
DeMeo, D. L. Sex and gender omic biomarkers in men and women with COPD: Considerations for precision medicine. Chest 160(1), 104–113 (2021).
Perez, T. A. et al. Sex differences between women and men with COPD: A new analysis of the 3CIA study. Respir. Med. 171, 106105 (2020).
Lopes-Ramos, C. M. et al. Sex-biased regulation of extracellular matrix genes in COPD. Am. J. Respir. Cell Mol. Biol. 72, 72–81 (2024).
Karadag, F., Ozcan, H., Karul, A. B., Yilmaz, M. & Cildag, O. Sex hormone alterations and systemic inflammation in chronic obstructive pulmonary disease. Int. J. Clin. Pract. 63(2), 275–281 (2009).
Reyes-García, J., Montaño, L. M., Carbajal-García, A. & Wang, Y.-X. Sex hormones and lung inflammation. In Lung Inflammation in Health and Disease Vol. II (ed. Wang, Y.-X.) 259–321 (Springer International Publishing, 2021).
Du, D. et al. Sex hormones and chronic obstructive pulmonary disease: A cross-sectional study and Mendelian randomization analysis. Int. J. Chron. Obstruct. Pulmon. Dis. 19, 1649–1660 (2024).
Funding
This work was supported by the Parker B. Francis Fellowship Program. The study was funded by an Institute Development Fund and a K-readiness pilot grant from The Children’s Hospital of Philadelphia. Research reported in this publication was supported by the National Institute of Environmental Health Sciences of the National Institutes of Health under grant number P30ES013508 and by the National Heart, Lung, and Blood Institute under grant number R01HL169859. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.
Author information
Contributions
JK: Conceptualization, Investigation, Writing- Original draft preparation, Visualization. JMC, HQ, FM: Conceptualization, Investigation, Writing- Reviewing and Editing. SAM, HH: Conceptualization, Supervision, Writing- Reviewing and Editing.
Ethics declarations
Competing interests
The authors declare no competing interests.
Ethics approval
All subjects have provided informed consent to both genomic analysis and EMR mining as approved by the CHOP IRB (IRB protocol 16-013278). All research was conducted in accordance with the Declaration of Helsinki.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Kelchtermans, J., Collaco, J.M., Qu, H. et al. Sex-specific spirometry effects of adult COPD polygenic score in children with asthma. Sci Rep 15, 11258 (2025). https://doi.org/10.1038/s41598-025-94804-6
DOI: https://doi.org/10.1038/s41598-025-94804-6





