Introduction

Benign prostatic hyperplasia (BPH) is a prevalent chronic urological condition marked by nonmalignant prostate enlargement, resulting in lower urinary tract symptoms (LUTS) that considerably impair patients’ quality of life and daily activities1. BPH is a major global health concern because of its high prevalence and the substantial health-service resources it requires. According to estimates from the Global Burden of Disease Study 2019, published in The Lancet, BPH is among the leading causes of disability-adjusted life years (DALYs) among elderly males2. The prevalence of BPH increases with age, affecting roughly 50% of men between 50 and 60 years and nearly 90% of men over 80 years of age3,4. The condition imposes considerable economic costs due to the need for medical treatment and surgical interventions, and it also affects patients’ psychological well-being and social interactions, further underlining the importance of effective management and therapeutic strategies5.

A study on the genetic susceptibility to BPH revealed that the lifetime cumulative risk of prostatectomy for BPH among male relatives of patients with early-onset BPH is as high as 66%. The autosomal dominant model indicates that 7% of men with early-onset BPH possess a gene with an 89% lifetime penetrance6. A polygenic effect is also considered likely, in which many genetic variants with individually small effects sum up to influence the disorder. Recent genome-wide association studies (GWAS) estimated that about 60% of the variation in phenotypes related to BPH risk is accounted for by genetic factors7. Whereas specific mutations may account for rare hereditary forms of BPH, most cases are mediated by the interplay of many genetic factors. Recent GWAS identified 14 risk loci associated with BPH8. However, many of the identified loci reside in non-coding regions, which makes it difficult to determine their functional roles9. Furthermore, complex linkage disequilibrium (LD) patterns may mask the causal variants responsible for those associations10.

Transcriptome-wide association studies (TWAS) integrate expression quantitative trait locus (eQTL) information with GWAS summary statistics to nominate candidate genes and probe gene-phenotype associations11. A significant recent development in this field is the Unified Test for Molecular Signatures (UTMOST), a novel gene-level association analysis across multiple tissues12. Unlike single-tissue approaches, UTMOST uses a “group-lasso penalty” to improve the precision and efficiency of the imputation models. This model allows shared eQTL effects across tissues while still identifying significant tissue-specific eQTL effects. In BPH, besides the major effect attributed to prostate tissue, there are associations with other tissue types. Fine-mapping of causal gene sets (FOCUS) is a statistical approach that identifies putative causal genes by leveraging eQTL weights, LD patterns, and GWAS summary statistics through null model prioritization. This method is particularly effective at detecting disease-associated genes when regional genetic elements influence downstream phenotypes13.

Herein, we conducted a cross-tissue TWAS in which the summary statistics of the BPH GWAS were combined with eQTL data sourced from the Genotype-Tissue Expression Project (GTEx) v8. In addition, we conducted tissue-specific association tests using FUSION14 and FOCUS, and further replication was conducted with Multi-marker Analysis of Genomic Annotation (MAGMA)15. Candidate genes were then examined by Mendelian randomization (MR) and colocalization, followed by bioinformatic investigations.

Materials and methods

The study flow is illustrated schematically in Fig. 1.

Fig. 1 The flowchart of this study.

Data sources of BPH

The GWAS summary statistics for BPH were retrieved from the FinnGen R11 dataset (https://r11.finngen.fi/). The eQTL data were obtained from GTEx v8, which covers the expression of genes across 49 distinct tissues16. To verify the stability of the results, the study of Jiang et al. was used as a validation set17.

Cross-tissue TWAS analyses

UTMOST was applied to multi-tissue data to estimate overall gene-trait associations at the organismal level. This approach allows more gene discoveries in tissues enriched for trait heritability and/or with higher imputation accuracy of the underlying signal12. To integrate the associations across tissues, we applied the generalized Berk-Jones (GBJ) test, which uses the covariance obtained from single-tissue summary statistics18. Genes with FDR < 0.05 were considered significant.
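As an illustration of the FDR step, the following minimal Python sketch applies the Benjamini-Hochberg procedure to gene-level P values. The text does not state which FDR procedure was used, so Benjamini-Hochberg is an assumption here, and the gene names and P values are hypothetical placeholders rather than study results.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    m = len(pvalues)
    # Indices of the P values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical gene-level P values (placeholders, not study results).
gene_pvalues = {"GENE_A": 0.0001, "GENE_B": 0.004, "GENE_C": 0.03, "GENE_D": 0.6}

q_values = benjamini_hochberg(list(gene_pvalues.values()))
significant = [gene for gene, q in zip(gene_pvalues, q_values) if q < 0.05]
print(significant)
```

Genes whose adjusted value falls below 0.05 would be carried forward, mirroring the threshold described above.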

Single-tissue TWAS analyses

To interrogate disease-gene associations in BPH, we performed a comprehensive TWAS using the FUSION method, integrating the BPH GWAS with eQTL data from 49 tissues in GTEx V819. We first characterized the linkage disequilibrium (LD) between the single nucleotide polymorphisms (SNPs) of the prediction model at each GWAS locus using 1,000 Genomes Project European population data. For each tissue and gene, gene expression was predicted with multiple statistical models: BLUP, BSLMM, LASSO, and Elastic Net. We also considered the top 1 model for each tissue and gene. To compute gene expression weights, we used the model with the best prediction performance. Z-scores from the BPH GWAS and the weights derived from the gene expression predictions were then combined to perform the TWAS. We also conducted a TWAS for migraine to look for genetic associations with this disease. Candidate genes were required to have an FDR < 0.05 in the cross-tissue TWAS and in at least one single-tissue TWAS. This multi-step selection draws on the cross-referencing of genes previously associated with different diseases and hence provides new insight into the genetic basis of complex diseases.
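The final step, combining GWAS Z-scores with expression weights, can be sketched as follows. The sketch assumes the standard summary-statistic TWAS statistic, z_TWAS = w'z / sqrt(w'Σw), where w are the SNP weights, z the GWAS Z-scores, and Σ the SNP LD matrix. All numbers are hypothetical, and the function is an illustration rather than the FUSION code itself.

```python
import math


def twas_z(weights, gwas_z, ld):
    """Summary-statistic TWAS Z-score: w'z / sqrt(w' LD w)."""
    n = len(weights)
    numerator = sum(w * z for w, z in zip(weights, gwas_z))
    # Variance of the weighted sum of correlated Z-scores.
    variance = sum(
        weights[i] * ld[i][j] * weights[j] for i in range(n) for j in range(n)
    )
    return numerator / math.sqrt(variance)


def two_sided_p(z):
    """Two-sided P value of a standard normal Z-score."""
    return math.erfc(abs(z) / math.sqrt(2.0))


# Hypothetical locus with two SNPs in moderate LD (r = 0.6).
weights = [0.5, 0.5]
gwas_z = [2.0, 4.0]
ld = [[1.0, 0.6], [0.6, 1.0]]

z = twas_z(weights, gwas_z, ld)
print(f"TWAS Z = {z:.3f}, P = {two_sided_p(z):.2e}")
```

Accounting for LD in the denominator matters: positively correlated SNPs inflate the variance of the weighted sum, so ignoring Σ would overstate the evidence for the gene.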

Conditional and joint analysis

In FUSION, identifying multiple features within a genetic locus necessitates determining their conditional independence. We employed the COJO analysis module in FUSION for post-processing to isolate independent genetic signals19. The COJO analysis improves our comprehension of genetic architecture by considering LD among markers, providing a more detailed perspective on trait variation20. After conducting this analysis, genes that maintained their significance were classified as jointly significant, indicating strong independent associations. Conversely, genes that lost significance were deemed marginally significant, reflecting their conditional dependency on other genetic factors.
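The logic of the conditional test can be illustrated for the simplest case of two genes at one locus. The sketch below assumes that the two TWAS Z-scores are jointly normal with correlation r, taken as the correlation of their predicted expression, so that the Z-score of gene 1 conditional on gene 2 is (z1 - r*z2) / sqrt(1 - r^2). It is a two-gene simplification with hypothetical values, not the FUSION post-processing routine.

```python
import math


def conditional_z(z_target, z_other, r):
    """Z-score of the target gene after conditioning on a correlated gene."""
    if abs(r) >= 1.0:
        raise ValueError("r must lie strictly between -1 and 1")
    return (z_target - r * z_other) / math.sqrt(1.0 - r * r)


# Hypothetical marginal TWAS Z-scores for two genes with correlated
# predicted expression (r = 0.9).
z_gene1, z_gene2, r = 5.0, 6.0, 0.9

z1_given_2 = conditional_z(z_gene1, z_gene2, r)  # gene 1 after conditioning on gene 2
z2_given_1 = conditional_z(z_gene2, z_gene1, r)  # gene 2 after conditioning on gene 1
print(f"gene 1 | gene 2: {z1_given_2:.2f}")
print(f"gene 2 | gene 1: {z2_given_1:.2f}")
```

In this toy example the signal of gene 1 falls below the usual |Z| > 1.96 threshold once gene 2 is accounted for, while gene 2 remains significant; in the terminology above, gene 2 would be jointly significant and gene 1 only marginally significant.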

FOCUS for precise gene location

FOCUS software was employed to fine-map transcriptome-wide associations between genetic variants and disease risk regions. The platform integrates GWAS summary statistics with multi-tissue eQTL weights, including GTEx v8 PrediXcan data, to identify probable causal genes13. Risk genes were identified using dual criteria: marginal posterior inclusion probability (PIP) > 0.8 and P < 5 × 10⁻⁸.

MAGMA for gene annotation

We used MAGMA with default parameters to aggregate SNP-level association statistics into gene scores in all gene analyses. In this way, the strength of association between each gene and the investigated phenotype could be assessed21,22. For detailed parameter settings and methodology, see the original MAGMA documentation15.

Bayesian colocalization and MR

Key genes identified from the convergence of MAGMA, UTMOST, and FUSION analyses underwent further investigation through MR, SMR, FOCUS and Bayesian colocalization. The MR study employed cis-eQTL SNPs as instrumental variables (IVs) under the conditions of p < 5 × 10⁻⁸, R² < 0.001, and a 10,000 kb window size23. When only a single IV was available, the causal effect was estimated using the Wald ratio method, with significance determined at p < 0.05. Summary data-based Mendelian randomization (SMR) offers greater statistical power than traditional MR analyses by leveraging top-ranked cis-QTLs. We selected the optimal cis-QTL within a 1,000 kb window centered on the gene of interest, using a stringent significance threshold of p < 5.0 × 10⁻⁸. Furthermore, to mitigate the potential confounding effects of pleiotropy and linkage, we excluded SNPs with an allele frequency difference greater than 0.2 in paired datasets and employed the heterogeneity in dependent instruments (HEIDI) test. A p-value less than 0.01 was considered indicative of pleiotropic associations24. Bayesian colocalization analysis was performed to estimate the posterior probability (PP) of the five possible hypotheses (H0 to H4)25. A posterior probability for H4 (PPH4) exceeding 0.7 suggests a shared causal variant between GWAS and eQTL25,26. All analyses were carried out using the “TwoSampleMR” and “coloc” R packages27.
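For the single-instrument case, the Wald ratio can be written out in a few lines. The sketch below assumes the common first-order approximation in which the uncertainty of the SNP-exposure estimate is ignored, so that the standard error is the SNP-outcome standard error divided by the absolute SNP-exposure effect. The effect sizes are hypothetical, and the function is an illustration in Python rather than a call into the R packages named above.

```python
import math


def wald_ratio(beta_exposure, beta_outcome, se_outcome):
    """Wald ratio estimate, first-order standard error, and two-sided P value."""
    if beta_exposure == 0:
        raise ValueError("the instrument must have a non-zero effect on the exposure")
    estimate = beta_outcome / beta_exposure
    se = se_outcome / abs(beta_exposure)
    # Two-sided P value from the standard normal distribution.
    p_value = math.erfc(abs(estimate / se) / math.sqrt(2.0))
    return estimate, se, p_value


# Hypothetical cis-eQTL: effect on expression (exposure) and on BPH (outcome).
estimate, se, p_value = wald_ratio(
    beta_exposure=0.5, beta_outcome=-0.1, se_outcome=0.02
)
print(f"beta = {estimate:.3f}, SE = {se:.3f}, P = {p_value:.2e}")
```

A negative estimate in this toy example would correspond to higher predicted expression being associated with lower disease risk.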

Gene expression omnibus (GEO) analysis

To identify differentially expressed genes (DEGs) between BPH and normal prostate tissues, we analyzed the GSE3868 dataset from the Gene Expression Omnibus (GEO) database. The analysis included two BPH samples and two matched normal prostate samples. Following data normalization, DEGs were determined using the GEO2R online tool (https://www.ncbi.nlm.nih.gov/geo/geo2r/). The cut-off criteria were an adjusted P value < 0.05 and |log2 fold change (FC)| > 1.
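The DEG cut-off amounts to a simple filter over the result table. The sketch below is a minimal illustration with hypothetical genes and values; the field names are placeholders rather than the exact GEO2R column headers, and applying the fold-change threshold to the absolute value (so that both up- and down-regulated genes are retained) is an assumption.

```python
def select_degs(rows, max_adj_p=0.05, min_abs_log_fc=1.0):
    """Return genes passing the adjusted P value and |log2 FC| thresholds."""
    return [
        row["gene"]
        for row in rows
        if row["adj_p"] < max_adj_p and abs(row["log_fc"]) > min_abs_log_fc
    ]


# Hypothetical differential-expression results (not taken from GSE3868).
rows = [
    {"gene": "GENE_A", "log_fc": -1.8, "adj_p": 0.010},  # down-regulated, passes
    {"gene": "GENE_B", "log_fc": 0.4, "adj_p": 0.001},   # effect too small
    {"gene": "GENE_C", "log_fc": 2.3, "adj_p": 0.200},   # not significant
    {"gene": "GENE_D", "log_fc": 1.2, "adj_p": 0.030},   # up-regulated, passes
]

degs = select_degs(rows)
print(degs)
```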

GeneMANIA analysis

The GeneMANIA platform28 (https://genemania.org/) integrates various datasets, including genetic interactions, pathways, and co-expression data, to provide insights into target genes and their functions. By incorporating these diverse gene-function relationships, GeneMANIA enhances our understanding of the biological roles and interactions of the target genes29.

Results

TWAS analyses results of BPH

A total of 314 genes reached significance in the cross-tissue TWAS analysis at P < 0.05 (Supplemental Table 1), of which 28 remained significant after FDR correction (PFDR < 0.05) (Table 1). In the validation dataset, the cross-tissue analysis identified 4 significant genes after FDR correction (PFDR < 0.05) (Supplemental Table 2). In the single-tissue TWAS analysis, 493 genes reached PFDR < 0.05 in at least one tissue (Supplemental Table 3), whereas the single-tissue analysis of the validation dataset identified 54 significant genes after FDR correction (Supplemental Table 4). Given that there were no intersecting genes in the validation set, subsequent analyses were carried out in the discovery cohort. In the discovery set, 12 candidate genes passed the strict screening thresholds in both the cross-tissue and single-tissue analyses. These comprised eleven protein-coding genes (FGFR3, INO80B, TACC3, PTPN13, OXER1, SPARCL1, SMIM43, MTMR3, BCL11A, FEZ2, NT5C1B) and one non-coding gene (CHKB-DT) (Supplementary Table 5).

Table 1 The significant genes for BPH risk in cross-tissue UTMOST analysis.

COJO analysis

COJO analysis was performed on the 12 candidate genes, located mainly on chromosomes 2, 4, and 22, in their respective tissues to eliminate false positives caused by linkage disequilibrium (LD) (Supplementary Table 6). In Adipose_Subcutaneous tissue, conditioning on FGFR3 expression resulted in a significant reduction of the TWAS signal for TACC3, a result that was validated in Artery_Coronary tissue. Meanwhile, BCL11A, SMIM43, OXER1, CHKB-DT, INO80B, and SPARCL1 did not exhibit false-positive results (Supplementary Fig. 1). Because they were significant in the TWAS results of only a single tissue and were potentially affected by LD, NT5C1B, PTPN13, FEZ2, and MTMR3 were excluded from further analyses.

Gene analysis of MAGMA

The gene-based MAGMA analysis identified 443 genes associated with the BPH phenotype after FDR correction (p < 0.05) (Supplemental Table 7). The ten most significant genes after Bonferroni correction are labeled on the Manhattan plot (Fig. 2A). In the tissue-specific enrichment analysis, 6 tissues showed positive results after FDR correction (p < 0.05), including bladder, prostate and urethra. MAGMA pathway enrichment analysis revealed significant enrichment in developmental processes (prostate gland branching and mammary gland formation), transcriptional regulation (RNA biosynthesis and nucleobase metabolism), and cellular processes (Golgi vesicle tethering and biosynthetic regulation) (Supplementary Fig. 2).

Fig. 2 (A) Manhattan plot of the MAGMA results for BPH; (B) Venn diagram.

MR and colocalization results

The intersection of nominally significant genes from the three analytical approaches identified eight key genes: BCL11A, FGFR3, NT5C1B, INO80B, MTMR3, SMIM43, PTPN13 and TACC3 (Fig. 2B). MR analyses supported a causal relationship between the expression of these eight genes and BPH (Fig. 3 and Supplemental Table S8). In addition, we conducted SMR analysis on the eight genes in their corresponding tissues; the HEIDI test showed no obvious heterogeneity or pleiotropy, and the SMR results were consistent with those of MR (Supplemental Table S9). Colocalization provided stronger evidence of causality for the INO80B gene (posterior probability PPH4 > 0.7) (Fig. 4 and Supplemental Table S10). The INO80B gene is located on human chromosome 2, at position 2p13.1. The rs11695896 variant was identified as the most significant colocalization locus for BPH across 18 tissues.

Fig. 3 Forest plot of the MR results.

Fig. 4 Colocalization results obtained in Adipose_Subcutaneous tissue.

The results of FOCUS precision positioning

Using FOCUS software, we performed fine-mapped TWAS analysis on European ancestry data across 49 tissues. With threshold criteria of model credible parameter = 1 and PIP > 0.8, we identified 321 candidate genes (Supplementary Table S11). FOCUS generated regional plots showing predicted expression correlations, with TWAS statistics and PIPs for each gene illustrated in Supplementary Fig. 3.

DEGs and GeneMANIA analysis

After normalizing the dataset GSE3868, we found that all eight previously identified genes were down-regulated in BPH tissues, with significant differences in INO80B (Supplemental Table S12 and Figure S4). Figure 5 illustrates the gene interaction network with INO80B as the central linker. INO80B is mainly associated with the formation of the INO80-type complex, DNA helicase complex, and SWI/SNF superfamily-type complex within the gene network (Supplementary Table 13).

Fig. 5 GeneMANIA gene network.

Discussion

We integrated the GWAS summary statistics for BPH with eQTL summary data obtained from GTEx V8 to investigate how genetic susceptibility influences gene expression in the development of BPH. TWAS analysis, followed by MR and co-localization results, suggested that INO80B is a susceptibility gene for BPH.

Giri et al.30 explored the genetic links between metabolism-related traits and BPH. They used the Illumina Cardio-MetaboChip platform for genotyping and constructed genetic risk scores (GRS) for height, BMI, and waist-to-hip ratio (WHR). This research highlighted how metabolic disorders might influence BPH development. Additionally, a separate GWAS employed the Sequenom iPLEX system to genotype specific SNPs. Following Bonferroni correction, a significant association was established between GATA3 (rs17144046) and the mechanisms driving BPH31. The present study uses UTMOST’s cross-tissue TWAS to improve the identification of genes associated with complex traits. Integrating gene expression data from multiple tissues provides a comprehensive view of gene-trait relationships, enhancing the detection of associations that single-tissue analyses might overlook. Importantly, the study identifies INO80B as a gene linked to BPH risk, a novel association not previously documented.

The INO80 protein complex is integral to chromatin remodeling, where it regulates chromatin structure by repositioning or reorganizing nucleosomes. This action directly influences gene expression by modulating the accessibility of chromatin32. Furthermore, INO80 is essential for repairing DNA double-strand breaks (DSBs). It facilitates the recruitment and localization of repair proteins to the site of damage, thereby aiding the homologous recombination repair (HRR) process. Additionally, the INO80 complex is vital in managing replication stress at the replication fork, ensuring the continuity of the replication process and preventing the accumulation of DNA damage33,34. Previous studies have found that certain INO80 subunits are overexpressed in various cancers, including breast cancer35, neuroendocrine prostate cancer36, and melanoma37. Moreover, elevated expression levels of these subunits are positively correlated with poor prognosis38.

On the other hand, INO80B is a crucial subunit of the INO80 chromatin remodeling complex, which plays essential roles in transcription regulation, DNA replication, and DNA repair39. The INO80 complex has been implicated in various cellular processes, including cell growth and proliferation control, making its potential involvement in prostatic hyperplasia biologically plausible. The negative correlation observed suggests that decreased INO80B expression might contribute to BPH development. This aligns with previous studies showing that chromatin remodeling complexes can act as regulatory factors in prostate tissue homeostasis. For instance, the SWI/SNF chromatin remodeling complex has been shown to influence prostate development and disease40.

Interestingly, our results suggest that INO80B may be involved in the inhibition of BPH. The development of BPH is typically associated with various factors, including abnormal androgen signaling, inflammatory responses, and overactivation of cell proliferation signals41. BPH is characterized by the excessive proliferation of prostate cells and the abnormal accumulation of extracellular matrix42. The INO80 complex potentially curbs excessive cell proliferation by modulating the expression of cell cycle-related genes, especially those linked to the androgen receptor43. Further research is needed to validate these findings.

Protein-protein interaction network analysis demonstrates that INO80B functions within a complex molecular framework, involving multiple cellular processes relevant to prostatic tissue regulation. The INO80B interaction network reveals several key functional modules that may explain its protective effect against BPH. At the core of this network, INO80B strongly interacts with other INO80 complex components (INO80, INO80E) and the ATP-dependent DNA helicases RUVBL1 and RUVBL2, suggesting its fundamental role in chromatin remodeling activities39. The presence of YY1, a crucial transcription factor, indicates INO80B’s involvement in promoter regulation and enhancer-promoter interactions, potentially influencing prostate-specific gene expression patterns.

The network also reveals connections to growth factor signaling through IGFBP2, suggesting a role in cellular proliferation control, which is particularly relevant to BPH pathogenesis. The interaction with ZNHIT family proteins (ZNHIT1, ZNHIT2, ZNHIT3, ZNHIT6) implies involvement in histone modification and nucleosome positioning, potentially affecting global gene expression patterns in prostatic tissue. These interactions align with previous observations of chromatin remodeling complexes influencing prostate development and disease40. The presence of DDX59 and ACTR5 in the network suggests INO80B’s involvement in RNA metabolism, DNA repair, and nuclear organization, processes crucial for maintaining tissue homeostasis. This comprehensive interaction network supports the observed negative correlation with BPH by demonstrating INO80B’s role as a master regulator of multiple cellular processes relevant to prostate tissue maintenance. The interaction with the SWI/SNF chromatin remodeling complex components further supports its role in prostate tissue regulation, as these complexes have been shown to influence prostate development and disease progression44. The complex also interacts with androgen receptor (AR) signaling pathways, as demonstrated by previous studies showing chromatin remodeling complexes modulating AR-dependent transcription45. The inflammatory response regulation through chromatin remodeling, as suggested by Wu et al.46, may represent another mechanism by which INO80B influences BPH development.

Several limitations of our study should be noted. First, the samples were obtained from European populations, which may reduce the generalizability of our conclusions. The discrepant results between the discovery and validation sets should also be interpreted with caution: there are genetic background differences between the UK Biobank (UKB) and Finnish populations, and LD patterns and allele frequencies may differ substantially between populations. Second, although we employed FDR correction to reduce the false-positive rate, the lack of independent validation limits the verification of our findings. Further in vitro and in vivo experiments are also needed to explain the underlying mechanisms. Despite its limitations, our study provides new perspectives and insights into the pathophysiological mechanisms of BPH.

Conclusion

In summary, our cross-tissue TWAS analyses revealed an association between INO80B expression and the risk of BPH, offering new insights into the genetic architecture of this condition. However, this is only a preliminary exploratory finding, and further functional studies are necessary to elucidate the potential biological activity of these significant signals.