Introduction

Chronic obstructive pulmonary disease (COPD) presents a major public health challenge. According to the Global Initiative for Chronic Obstructive Lung Disease (GOLD) 2024, COPD is a heterogeneous pulmonary disorder characterized by persistent respiratory symptoms and structural abnormalities in the airways and/or alveoli, leading to sustained airflow limitation1. Despite advancements in healthcare, COPD remains the third leading cause of mortality worldwide as of 20192. A global analysis of COPD burden across 204 countries revealed that China has one of the highest prevalence rates of the disease. Furthermore, China ranks first in both the absolute number of COPD-related deaths and the economic burden associated with the disease. In Kashi, an epidemiological survey conducted by our team in 2019 identified a COPD prevalence rate of 17.01% among individuals over the age of 403, exceeding the national average of 13.7% reported in 20184. Consequently, COPD represents a significant disease burden in Kashi.

Previously, our team identified bactericidal/permeability-increasing fold-containing family B member 4 (BPIFB4) by whole-exome sequencing and bioinformatics analysis of COPD families in Kashi5. BPIFB4 rs4339026 A > G is located in an exon region on human chromosome 20. Studies have shown that BPIFB4 is highly concentrated in the upper respiratory tract and proximal trachea6, and is also highly expressed in monocytes7. Recent research has increasingly focused on the relationship between BPIFB4 and longevity, particularly the longevity-associated variant (LAV)-BPIFB4. Additionally, BPIFB4 may regulate macrophage-mediated inflammatory responses8, but its role in COPD pathogenesis remains unclear and requires further investigation.

In this study, we evaluated the association between BPIFB4 rs4339026 A > G and COPD susceptibility in a case-control study involving 1,075 individuals. Bioinformatics analyses were performed to predict BPIFB4 expression in COPD patients, specifically in human peripheral blood and bronchoalveolar lavage fluid (BALF), and to identify potential mechanistic pathways involved in COPD pathogenesis. In addition, the expression of BPIFB4 and key pathway proteins was validated in a COPD mouse model.

Results

Clinical characteristics of COPD patients and healthy controls

A cohort of 1,075 individuals, including 541 unrelated COPD patients and 534 healthy controls (HCs), was recruited for this study (Table 1). The mean age was 61.11 ± 12.26 years in the COPD group and 54.86 ± 10.73 years in the HC group. The COPD group comprised 280 males (51.75%) and 261 females (48.24%), while the HC group included 234 males (43.82%) and 300 females (56.18%). The mean BMI was significantly lower in COPD patients (23.56 ± 4.22) than in HCs (25.55 ± 4.17). Regarding smoking status, 124 individuals (22.92%) in the COPD group were current or former smokers, compared to 69 (12.92%) in the HC group. Analysis of biomass fuel exposure revealed that 513 COPD patients (94.82%) and 516 HCs (96.63%) reported coal use, while 519 COPD patients (95.93%) and 500 HCs (93.63%) reported wood use. Pulmonary function tests indicated that FEV1% and FEV1/FVC were lower in COPD patients than in HCs. Significant differences between the two groups were observed in age, sex, BMI, smoking status, FEV1%, and FEV1/FVC (P < 0.05 for all). However, no significant differences were found in coal or wood use.

Table 1 Clinical characteristics of the COPD group and healthy controls.

Table 1 was adapted from Xu et al.5 under a CC-BY license.

Hardy-Weinberg equilibrium of BPIFB4 rs4339026 A > G

BPIFB4 rs4339026 A > G met the criteria for the Hardy-Weinberg equilibrium (P > 0.05, Table 2). Therefore, BPIFB4 rs4339026 A > G could be analyzed further.
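As a rough illustration of the check described above, the following sketch runs a chi-square goodness-of-fit test for Hardy-Weinberg equilibrium with one degree of freedom. The genotype counts are hypothetical placeholders and are not the values in Table 2. The function name `hwe_chi_square` is introduced here for illustration only. The study itself may have used a different (for example, exact) test.

```python
import math

def hwe_chi_square(n_aa, n_ag, n_gg):
    """Chi-square goodness-of-fit test for Hardy-Weinberg equilibrium (1 df)."""
    n = n_aa + n_ag + n_gg
    p = (2 * n_aa + n_ag) / (2 * n)  # estimated frequency of allele A
    q = 1.0 - p                      # estimated frequency of allele G
    observed = (n_aa, n_ag, n_gg)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Upper-tail probability of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value

# Hypothetical genotype counts (A/A, A/G, G/G), for illustration only.
chi2, p_value = hwe_chi_square(n_aa=360, n_ag=150, n_gg=23)
print(f"chi2 = {chi2:.3f}, P = {p_value:.3f}")
```

A P value above 0.05 is conventionally read as no evidence of departure from equilibrium, which is the criterion stated in the text.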

Table 2 Hardy-Weinberg equilibrium of rs4339026 A > G.

Genotypic analysis of BPIFB4 rs4339026 A > G in the case-control study

We performed genetic model analysis (genotype, dominant, recessive, allele, and additive) of BPIFB4 rs4339026 A > G (Fig. 1a). The call rate for rs4339026 A > G was 99.91% (1,074/1,075), possibly because de novo mutations in the primer-binding region9 occurred in a few subjects. After adjusting for sex, age, BMI, smoking status, FEV1% and FEV1/FVC, the G/G genotype of BPIFB4 rs4339026 A > G significantly elevated the risk of COPD under multiple genetic models: genotype model [adjusted odds ratio (aOR) = 2.52, corresponding 95% confidence interval (95% CI): 1.34–4.71], recessive model (aOR = 2.32, 95% CI: 1.25–4.31), and dominant model (aOR = 1.39, 95% CI: 1.07–1.81). In the allele model, the “G” allele was associated with an increased risk of COPD (aOR = 1.42, 95% CI: 1.13–1.77). In the additive model, COPD risk also tended to increase (aOR = 1.40, 95% CI: 1.12–1.75).
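To show how the reported intervals relate to statistical significance, the sketch below back-calculates an approximate standard error and Wald z statistic from each aOR and its 95% CI. It assumes the intervals were built symmetrically on the log-odds scale, which is the usual output of logistic regression but is not stated in the text. The resulting P values are approximations for illustration, not figures reported by the study. The function name `wald_from_ci` is hypothetical.

```python
import math

def wald_from_ci(odds_ratio, ci_low, ci_high, z_crit=1.959964):
    """Approximate SE, Wald z and two-sided P from an OR and its 95% CI."""
    log_or = math.log(odds_ratio)
    # Assumption: CI = exp(log_or +/- z_crit * se), so the log-scale width is 2 * z_crit * se.
    se = (math.log(ci_high) - math.log(ci_low)) / (2.0 * z_crit)
    z = log_or / se
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal tail
    return se, z, p_value

# aOR and 95% CI values quoted in the paragraph above.
reported = {
    "genotype": (2.52, 1.34, 4.71),
    "recessive": (2.32, 1.25, 4.31),
    "dominant": (1.39, 1.07, 1.81),
    "allele": (1.42, 1.13, 1.77),
    "additive": (1.40, 1.12, 1.75),
}

for model, (aor, low, high) in reported.items():
    se, z, p = wald_from_ci(aor, low, high)
    print(f"{model:9s} SE={se:.3f} z={z:.2f} approx. P={p:.4f}")
```

Because every interval excludes 1, each approximate P value falls below 0.05, in line with the significance claimed for these models.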

Stratified analysis of BPIFB4 rs4339026 A > G in the case-control study

A stratified analysis was conducted to evaluate the association between BPIFB4 rs4339026 A > G and COPD risk based on smoking status (Fig. 1b,c) and FEV1%.

Among smokers, the adverse effect of “G/G + A/G” in rs4339026 A > G was more pronounced in the dominant model (aOR = 2.52, 95% CI: 1.23–5.15) and the additive model (aOR = 2.61, 95% CI: 1.37–4.97). In the allele model, the presence of the “G” allele was associated with an increased risk of COPD (aOR = 2.68, 95% CI: 1.41–5.08).

Similarly, in the non-smokers group, the “G/G” or “G/G + A/G” genotypes were also associated with a higher risk of COPD. Specifically, individuals carrying these genotypes showed a significant risk increase in the genotype model (aOR = 1.99, 95% CI: 1.03–3.85) and additive model (aOR = 1.28, 95% CI: 1.01–1.62). The allele model also demonstrated a trend toward increased COPD risk (aOR = 1.29, 95% CI: 1.01–1.64).

However, rs4339026 A > G did not show a significant association with FEV1% (Supplementary Table S1).

Fig. 1

Analysis of genotypes of BPIFB4 rs4339026 A > G. (a) Case-control study of BPIFB4 rs4339026 A > G. The case group comprised 541 COPD patients and the control group comprised 533 healthy individuals. The call rate for rs4339026 of BPIFB4 was 99.91% (1,074/1,075). a Logistic regression: Corrected for sex, age, BMI, smoking status, FEV1% and FEV1/FVC. *P < 0.05, **P < 0.01. (b) Stratified analysis of smokers in the case–control study. b Logistic regression: Corrected for sex, age, BMI, FEV1% and FEV1/FVC. *P < 0.05, **P < 0.01. (c) Stratified analysis of non-smokers in the case–control study. b Logistic regression: Corrected for sex, age, BMI, FEV1% and FEV1/FVC. *P < 0.05, **P < 0.01. OR Odds Ratio, 95%CI 95% confidence interval, NA Not Available.

Prediction of BPIFB4 expression in human peripheral blood and BALF

In the GSE42057 dataset (gene expression data from 136 human peripheral blood samples), 42 healthy smokers and 94 COPD smokers were analyzed (Fig. 2a). Compared to healthy smokers, BPIFB4 expression was significantly reduced in COPD smokers (P < 0.05).

In the GSE13896 dataset (BALF gene expression data), 12 COPD smokers, 34 healthy smokers and 24 healthy non-smokers were included (Fig. 2b). The findings revealed a significant decrease in BPIFB4 expression in both healthy smokers and COPD smokers compared to healthy non-smokers (P < 0.05). Additionally, BPIFB4 expression was significantly lower in COPD smokers compared to healthy smokers (P < 0.05). These findings collectively indicated that BPIFB4 expression was downregulated in COPD patients and smokers, suggesting its potential role as a key gene in COPD pathogenesis.

Analysis of BPIFB4-related proteins and intersection proteins between BPIFB4 and COPD

Using the GeneCards and DrugBank databases, we identified 1,503 COPD-associated proteins and 51 BPIFB4-related proteins (Fig. 2c). Among these, 20 overlapping proteins were associated with both BPIFB4 and COPD pathogenesis: HSP90AA1, HSPA4, TP53, HIF1A, AKT1, STAT3, NR3C1, RAF1, AR, ERBB2, EGFR, ESR1, SRC, HSPA1A, CFTR, HSPD1, LRRK2, NOS3, TERT, and STK11. Protein-protein interaction (PPI) network analysis of these overlapping proteins revealed that HSP90AA1, AKT1, TP53, HSPA4, and HIF1A play pivotal roles in BPIFB4-mediated regulation of COPD pathogenesis (Fig. 2d). KEGG pathway enrichment analysis further demonstrated that these overlapping proteins were functionally associated with key pathways, including PI3K/AKT and JAK/STAT (Fig. 2e,f).

In summary, BPIFB4 likely modulated PI3K/AKT and JAK/STAT pathways through interactions with key proteins such as AKT1, HSP90AA1, and TP53, contributing to the pathogenesis of COPD.

Fig. 2

Results of bioinformatics analysis for BPIFB4. (a) BPIFB4 expression in human peripheral blood. (b) BPIFB4 expression in human BALF. The horizontal axis represented different groups, and the vertical axis represented BPIFB4 expression. The upper left corner represented the statistical test method used to assess significance. *P < 0.05, **P < 0.01. (c) Venn diagram of overlapping proteins associated with both BPIFB4 and COPD. (d) PPI network of overlapping proteins. Circles represented nodes (proteins), and straight lines indicated protein-protein interactions. The size of each node was positively correlated with its degree (the more connections a node had, the higher its degree and the larger its size). The node color was also positively correlated with its degree (the redder the node, the greater its degree; the greener the node, the smaller its degree). (e, f) BPIFB4 participated in key pathways and functional classification in COPD. CC represented cell component, MF represented molecular function, BP represented biological process.

Validation of protein expression in COPD mice

We examined the expression of BPIFB4 and key proteins in the PI3K/AKT pathway in the lung tissues of COPD and control mice. Compared to controls, both BPIFB4 mRNA (Fig. 3d) and protein expression (Fig. 3b,e) were markedly decreased in COPD mice (P < 0.0001). Conversely, the expression of PI3K, p-PI3K, and p-AKT1 was significantly increased in COPD mice compared to controls (P < 0.001). Although AKT1 expression was also elevated, the difference between the two groups was not statistically significant (Fig. 3c,f).

Fig. 3

Expression of BPIFB4 and key proteins in the PI3K/AKT pathway in COPD mice. (a) Illustration for COPD mouse models. (b, c) Western blot analysis of BPIFB4 (n = 10/group), PI3K, p-PI3K, AKT1, and p-AKT1 (n = 3/group) in the lung tissues of COPD mice and controls. (d) BPIFB4 mRNA expression assessed by RT-qPCR (n = 10/group). (e, f) Statistical significance: ns represented not significant, ***P < 0.001, and ****P < 0.0001, versus controls.

Discussion

COPD is a highly prevalent respiratory disease worldwide, placing a significant burden on patients. It is primarily caused by complex gene-environment interactions10. In this study, we identified a significant association between the BPIFB4 rs4339026 A > G polymorphism and an increased risk of COPD in the Kashi population across multiple genetic models. The association was particularly pronounced among smokers. Bioinformatics analysis indicated a significant reduction in BPIFB4 expression in both the peripheral blood and BALF of COPD patients. Additionally, screening of the proteins shared between BPIFB4 and COPD revealed that BPIFB4 might participate in the inflammatory responses of COPD by regulating pathways such as PI3K/AKT and JAK/STAT. In COPD mouse models, BPIFB4 expression was decreased, while the levels of p-PI3K and p-AKT1 were elevated, indicating potential involvement of the PI3K/AKT pathway in BPIFB4-mediated COPD pathogenesis.

Cigarette smoke exposure is one of the primary risk factors for COPD11. In this study, after adjusting for population-related confounding factors (including sex, age, BMI, smoking status, FEV1%, and FEV1/FVC), BPIFB4 rs4339026 A > G was found to significantly increase the risk of COPD in the Kashi population. Further stratified analysis based on smoking status revealed that the risk of COPD was higher among smokers (aOR > 2) compared to non-smokers (1 < aOR < 2). The relatively lower number of smokers (current and former) in our study might be attributed to the nearly equal proportion of males and females (51.75% vs. 48.24%) in the study population. However, this distribution was consistent with previous epidemiological findings on COPD in Kashi3. Additionally, studies had indicated that the prevalence of COPD was nearly equal between males and females12. Moreover, passive smoking (second-hand smoke exposure) is also a form of cigarette exposure and a significant risk factor for COPD13. Although non-smokers do not actively smoke, they are still at risk of cigarette smoke exposure14. Notably, compared to COPD patients who actively smoked, non-smoking female COPD patients were more frequently observed1. However, due to the difficulty in accurately defining the extent, frequency, and duration of passive smoking, individuals exposed to second-hand smoke were not classified as smokers in this study. Nevertheless, BPIFB4 rs4339026 A > G significantly increased the risk of COPD in Kashi, regardless of smoking status. More importantly, both active and passive exposure to cigarette smoke can induce oxidative stress and inflammatory responses, leading to airway and alveolar epithelial damage15,16.

BPIFB4 is one of the most abundant proteins in respiratory secretions6 and plays a role in host defense through its antimicrobial, surfactant, and immunomodulatory properties17. In this study, we found that BPIFB4 expression was reduced in both the peripheral blood and BALF of COPD patients. Similarly, BPIFB4 expression was also decreased in lung tissues of COPD mice. Previous studies had demonstrated that high BPIFB4 expression helped suppress inflammation. BPIFB4 could alleviate inflammation by regulating macrophages and reducing the release of pro-inflammatory cytokines17,18. Additionally, elevated BPIFB4 expression could reduce macrophage infiltration and mitigate endothelial cell damage caused by oxidative stress19. These findings suggested that BPIFB4 might have significant potential in macrophage regulation. It is well established that macrophages play a critical role in the chronic inflammation of COPD20,21. Studies had shown that macrophage numbers were significantly increased in the sputum, BALF, and lung tissue of COPD patients, leading to the release of inflammatory mediators such as TNF-α, CXCL8, and reactive oxygen species, which further exacerbated the inflammatory responses in COPD22. Thus, BPIFB4 might have a key role in the occurrence and development of COPD by regulating macrophages.

To further investigate the role of BPIFB4 in the development and progression of COPD, we performed a screening analysis of BPIFB4 and COPD-associated proteins. The analysis suggested a potential association with the PI3K/AKT pathway. Then we validated key proteins of the PI3K/AKT pathway in lung tissues of COPD mice, and observed increased expression of p-PI3K and p-AKT1, indicating that the PI3K/AKT pathway was activated in COPD. Previous studies had demonstrated that activation of the PI3K/AKT pathway promoted macrophage accumulation and was associated with pulmonary inflammation23,24. Additionally, research had shown that activation of the PI3K/AKT pathway in alveolar macrophages of COPD mice led to a significant increase in inflammatory cytokines such as TNF-α, IL-1β, and IL-6, thereby exacerbating the inflammatory responses25. Based on these findings, we hypothesized that BPIFB4 might regulate pulmonary inflammation in COPD through the PI3K/AKT pathway.

We note some limitations of our study. First, the sample size of the smoking subgroup was relatively small. Additionally, although BPIFB4 expression was reduced in both COPD patients and mouse models, while p-PI3K and p-AKT1 levels were elevated, further studies are needed to confirm whether BPIFB4 directly regulates COPD inflammation through the PI3K/AKT pathway.

Conclusion

This study is the first to identify an association between BPIFB4 rs4339026 A > G and COPD in the Kashi population. Our findings suggest that BPIFB4 rs4339026 A > G is a significant risk factor for COPD, particularly among smokers. Furthermore, BPIFB4 may contribute to COPD pathogenesis through the PI3K/AKT pathway.

Methods

Study cohorts

This study included a total of 1,075 individuals, comprising 541 unrelated COPD patients and 534 healthy controls (HCs). The inclusion and exclusion criteria, lung function assessment procedures, and peripheral blood sample collection methods were described previously by Gong et al.26. All participants provided written informed consent prior to their enrollment in the study.

SNV genotyping

BPIFB4 rs4339026 A > G was genotyped using SNPscan™ (Center for Genetic & Genomic Analysis, Genesky Biotechnologies, China). Genotyping and analyses were conducted using a DNA analyzer (ABI3730XL, Applied Biosystems, USA) and GeneMapper™ 4.1 (Applied Biosystems).

Prediction of BPIFB4 expression

We searched the GEO database (www.ncbi.nlm.nih.gov/geo/) using the keywords “COPD,” “cigarettes,” and “tobacco,” with “Homo sapiens” as the species filter. Two gene expression datasets were identified: GSE13896 (n = 70) and GSE42057 (n = 136), both generated using the Affymetrix HG-U133 Plus 2.0 (GPL570) microarray platform. To ensure data consistency, all expression values were log2-transformed and normalized using the normalize.quantiles function in the preprocessCore package in R. Probe IDs were then mapped to gene symbols, and for genes with multiple probes, the average expression value was calculated. Common gene symbols across the datasets were extracted, and batch effects were corrected using the removeBatchEffect function in the limma package in R. Finally, BPIFB4 expression levels were compared across different groups.
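The preprocessing steps above can be sketched in plain Python to make the order of operations concrete. This is a simplified stand-in and not the R code used in the study: it ignores ties during quantile normalization, omits batch correction, and uses hypothetical intensities and probe-to-gene labels (`GENE_A`, `GENE_B`).

```python
import math

def log2_transform(column):
    """Log2-transform one sample's raw intensities."""
    return [math.log2(v) for v in column]

def quantile_normalize(columns):
    """Give every sample (column) the same distribution: the mean of the sorted columns."""
    n_rows = len(columns[0])
    sorted_cols = [sorted(col) for col in columns]
    rank_means = [sum(col[i] for col in sorted_cols) / len(columns) for i in range(n_rows)]
    normalized = []
    for col in columns:
        order = sorted(range(n_rows), key=lambda i: col[i])  # probe indices by ascending value
        out = [0.0] * n_rows
        for rank, idx in enumerate(order):
            out[idx] = rank_means[rank]
        normalized.append(out)
    return normalized

def average_probes(values, probe_genes):
    """Average the values of probes that map to the same gene symbol."""
    sums, counts = {}, {}
    for value, gene in zip(values, probe_genes):
        sums[gene] = sums.get(gene, 0.0) + value
        counts[gene] = counts.get(gene, 0) + 1
    return {gene: sums[gene] / counts[gene] for gene in sums}

# Hypothetical raw intensities: two samples, three probes each.
raw = [[4.0, 16.0, 64.0], [8.0, 32.0, 2.0]]
probe_genes = ["GENE_A", "GENE_A", "GENE_B"]  # hypothetical probe-to-gene mapping

logged = [log2_transform(col) for col in raw]
normalized = quantile_normalize(logged)
per_gene = [average_probes(col, probe_genes) for col in normalized]
print(per_gene)
```

After normalization every sample contains the same set of values in a sample-specific order, which is what makes expression levels comparable across arrays before probes are collapsed to genes.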

Prediction of BPIFB4-related proteins and COPD intersection proteins

We searched the GeneCards (www.genecards.org/) and DrugBank (www.drugbank.com/) databases using “obstructive pulmonary emphysema” as the keyword and restricted the search to Homo sapiens. To identify BPIFB4-related proteins, we performed an additional search using “BPIFB4” as the keyword with the same species restriction. A Venn analysis was conducted to identify overlapping proteins between BPIFB4-related proteins and COPD-associated proteins. Protein-protein interaction (PPI) networks of the intersecting proteins were constructed using STRING (www.STRING-db.org/), and hub genes were identified using Cytoscape 3.10.1. Functional pathway enrichment analysis was performed using the clusterProfiler R package27.
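The Venn step and the degree-based ranking of hub nodes can be illustrated with a few lines of Python. The protein labels (`P1`, `P2`, ...) and the edge list below are hypothetical placeholders, not database or STRING output. The study used STRING and Cytoscape for this step, so this is only a sketch of the underlying idea.

```python
from collections import Counter

def overlap(set_a, set_b):
    """Proteins present in both lists (the Venn intersection)."""
    return sorted(set_a & set_b)

def degree_ranking(edges, nodes):
    """Rank nodes by the number of interactions they have within the node set."""
    node_set = set(nodes)
    degree = Counter()
    for u, v in edges:
        if u in node_set and v in node_set and u != v:
            degree[u] += 1
            degree[v] += 1
    return sorted(((n, degree[n]) for n in node_set), key=lambda kv: (-kv[1], kv[0]))

# Hypothetical protein lists and interactions, for illustration only.
copd_proteins = {"P1", "P2", "P3", "P4", "P9"}
bpifb4_proteins = {"P2", "P3", "P4", "P7"}
edges = [("P2", "P3"), ("P2", "P4"), ("P3", "P9")]

shared = overlap(copd_proteins, bpifb4_proteins)
ranking = degree_ranking(edges, shared)
print(shared)
print(ranking)
```

Nodes with the highest degree in the restricted network are the candidates for hub proteins. Edges that reach outside the intersection are ignored.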

Construction of COPD mouse models

The animal experiment was approved by the Ethics Committee of Xinjiang Medical University (IACUC-20231010-05) and conducted in compliance with the ARRIVE guidelines and other relevant protocols. All procedures adhered to relevant ethical regulations.

In this study, 8-week-old, healthy, male, clean-grade adult C57BL/6 mice were obtained from the Experimental Animal Center of Hangzhou Medical College [production license number SCXK (Zhejiang) 2019-0002]. All experiments were performed in the SPF-grade animal laboratory at the Hangzhou Medical College Laboratory Animal Center.

The COPD mouse model was established as illustrated in Fig. 3a. Briefly, C57BL/6 mice were randomly assigned to either the control group or the COPD group. Mice in the COPD group were exposed to cigarette smoke (9 cigarettes/hour, 2 h/session, twice daily, 6 days/week) from day 1 to day 42. In addition, lipopolysaccharide (LPS) was administered intratracheally at a dose of 750 ng/kg in 50 µL saline on day 21 and day 35. In the control group, mice were maintained in a normal air environment and received an intratracheal instillation of 50 µL saline on day 21 and day 35. All mice were sacrificed on day 43 using carbon dioxide overdose, and relevant assessments were performed.

Verification of BPIFB4 and key proteins in PI3K/AKT pathway

Real-time quantitative polymerase chain reaction (RT-qPCR)

Lung tissues from COPD mice were washed with ice-cold phosphate-buffered saline (PBS) and lysed in TRIzol™ reagent for RNA extraction. Total RNA was reverse-transcribed into complementary DNA (cDNA). The expression of BPIFB4 was analyzed using the LightCycler® 480 SYBR Green I Master Mix (Roche, Basel, Switzerland). The primer sequences (forward and reverse, respectively) were as follows:

GAPDH: 5’-AGCCCAAGATGCCCTTCAGT-3’ and 5’-CCGTGTTCCTACCCCCAATG-3’. BPIFB4: 5’-GTGTGGGTGTCTACCTGAGC-3’ and 5’-AAGTTGTCCACCAGGTTGGG-3’.

Relative gene expression was quantified using the 2^(-ΔΔCt) method, with GAPDH as the internal reference gene.
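For readers unfamiliar with the calculation, a minimal sketch follows. It assumes the calibrator is the mean ΔCt of the control group, which is a common convention but is not specified in the text. All Ct values are hypothetical, and the function name `fold_changes` is introduced for illustration.

```python
def fold_changes(sample_cts, control_cts):
    """Relative expression by the 2^(-ddCt) method.

    Each item is a (ct_target, ct_reference) pair, e.g. (BPIFB4 Ct, GAPDH Ct).
    """
    control_dct = [target - ref for target, ref in control_cts]
    calibrator = sum(control_dct) / len(control_dct)  # assumed: mean control dCt
    return [2.0 ** -((target - ref) - calibrator) for target, ref in sample_cts]

# Hypothetical Ct values, for illustration only.
controls = [(24.0, 18.0), (24.5, 18.5)]
copd = [(26.0, 18.0), (25.0, 18.0)]
print(fold_changes(copd, controls))
```

A value below 1 means lower expression of the target gene relative to the control group after normalization to the reference gene.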

Western blot

Total protein was extracted from lung tissues using RIPA lysis buffer supplemented with protease and phosphatase inhibitors. Protein concentration was measured using the BCA assay. Equal amounts of protein were separated by SDS-PAGE and transferred onto PVDF membranes. Membranes were blocked with 5% non-fat milk or BSA for 1 h at room temperature, followed by overnight incubation at 4 °C with primary antibodies targeting BPIFB4 (ab168171, Abcam, UK), PI3K (20584-1-AP, Proteintech, China), p-PI3K (20584-1-AP, Proteintech, China), AKT1 (2938, CST, USA), and p-AKT1 (4060, CST, USA). After washing, membranes were incubated with HRP-conjugated secondary antibodies for 1 h at room temperature. Protein bands were detected using the Servicebio imaging system (SCG-W2000, China).

Statistical analysis

Statistical analyses were conducted using GraphPad Prism 9.0 and PLINK v1.07 (pngu.mgh.harvard.edu/purcell/plink/index.shtml). Quantitative data were expressed as mean ± standard deviation (SD) or median (interquartile range), depending on data distribution. Independent t-tests were used to compare age and BMI between groups, while chi-square tests assessed the associations of sex, smoking status, coal consumption, and wood consumption. For variables that did not follow a normal distribution (FEV1% and FEV1/FVC), the Mann-Whitney U test was applied.
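As an illustration of the chi-square comparisons, the sketch below applies a Pearson chi-square test without continuity correction to the group counts quoted in the clinical characteristics paragraph. Whether a continuity correction was applied in the study is not stated, so the statistics here are approximations and not the reported values. The function name `chi_square_2x2` is hypothetical.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    p_value = math.erfc(math.sqrt(chi2 / 2.0))  # upper tail, 1 degree of freedom
    return chi2, p_value

# Rows: COPD group (n = 541) and HC group (n = 534).
tables = {
    "sex (male/female)": (280, 261, 234, 300),
    "smoking (yes/no)": (124, 541 - 124, 69, 534 - 69),
    "coal use (yes/no)": (513, 541 - 513, 516, 534 - 516),
    "wood use (yes/no)": (519, 541 - 519, 500, 534 - 500),
}

results = {name: chi_square_2x2(*cells) for name, cells in tables.items()}
for name, (chi2, p) in results.items():
    print(f"{name:18s} chi2={chi2:.2f} P={p:.4f}")
```

Under these assumptions, sex and smoking status differ between groups at P < 0.05 while coal and wood use do not, matching the pattern described in the Results.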

The Hardy-Weinberg equilibrium for rs4339026 A > G was evaluated in the case-control study. Akaike’s information criterion (AIC) was used to determine the most suitable genetic model for rs4339026 A > G, including genotype, dominant, recessive, allele, and additive models. Multivariable logistic regression analysis was performed to calculate aORs with 95% CIs, adjusting for sex, age, BMI, smoking status, FEV1%, and FEV1/FVC, to examine the association between rs4339026 A > G and COPD risk. The forestplot package in R was used to generate the forest plot.
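The genetic models differ only in how a genotype is turned into a predictor for the regression. The sketch below shows one common coding, with G as the assumed risk allele, together with the standard AIC formula. The function names and the example log-likelihood are hypothetical. The allele model is omitted because it counts alleles rather than individuals.

```python
def encode(genotype, model, risk_allele="G"):
    """Code a genotype such as "A/G" as a regression predictor under a genetic model."""
    n_risk = genotype.split("/").count(risk_allele)  # copies of the risk allele: 0, 1 or 2
    if model == "dominant":    # G/G + A/G versus A/A
        return 1 if n_risk >= 1 else 0
    if model == "recessive":   # G/G versus A/G + A/A
        return 1 if n_risk == 2 else 0
    if model == "additive":    # per-allele effect
        return n_risk
    if model == "genotype":    # two indicators: (heterozygote, risk homozygote)
        return (int(n_risk == 1), int(n_risk == 2))
    raise ValueError(f"unknown model: {model}")

def aic(log_likelihood, n_params):
    """Akaike's information criterion: 2k - 2 ln(L). Lower values indicate a better trade-off."""
    return 2 * n_params - 2 * log_likelihood

for g in ("A/A", "A/G", "G/G"):
    print(g, {m: encode(g, m) for m in ("dominant", "recessive", "additive", "genotype")})
```

In practice each coding would be fitted in a separate logistic regression with the same covariates, and the model with the lowest AIC would be preferred.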

The Wilcoxon test was applied to assess BPIFB4 expression differences (GSE42057 dataset) in peripheral blood between COPD smokers and healthy smokers. The Kruskal-Wallis test was used to evaluate BPIFB4 expression differences (GSE13896 dataset) in BALF among COPD smokers, healthy smokers, and healthy non-smokers.

Experimental data from RT-qPCR and Western blot analyses were presented as mean ± SD, and differences were analyzed using an unpaired t-test. Each experiment was performed in at least three independent replicates. P < 0.05 was considered significant.