SR-TWAS: leveraging multiple reference panels to improve transcriptome-wide association study power by ensemble machine learning

Parrish, Randy L.; Buchman, Aron S.; Tasaki, Shinya; Wang, Yanling; Avey, Denis; Xu, Jishu; De Jager, Philip L.; Bennett, David A.; Epstein, Michael P.; Yang, Jingjing

doi:10.1038/s41467-024-50983-w

Published: 05 August 2024

Nature Communications volume 15, Article number: 6646 (2024)


Abstract

Multiple reference panels of a given tissue or multiple tissues often exist, and multiple regression methods could be used for training gene expression imputation models for transcriptome-wide association studies (TWAS). To leverage expression imputation models (i.e., base models) trained with multiple reference panels, regression methods, and tissues, we develop a Stacked Regression based TWAS (SR-TWAS) tool which can obtain optimal linear combinations of base models for a given validation transcriptomic dataset. Both simulation and real studies show that SR-TWAS improves power, due to increased training sample sizes and borrowed strength across multiple regression methods and tissues. Leveraging base models across multiple reference panels, tissues, and regression methods, our real studies identify 6 independent significant risk genes for Alzheimer’s disease (AD) dementia for supplementary motor area tissue and 9 independent significant risk genes for Parkinson’s disease (PD) for substantia nigra tissue. Relevant biological interpretations are found for these significant risk genes.

Introduction

Two-stage transcriptome-wide association studies (TWAS) have been widely used in genetic studies of complex traits due to the convenience of using publicly available transcriptomic reference panels and summary-level genome-wide association study (GWAS) datasets^1,2,3,4,5. The standard two-stage TWAS method^6,7 first trains gene expression imputation models (per gene per tissue) using a transcriptomic reference panel in Stage I, taking quantitative gene expression traits as response variables and nearby (cis-) or genome-wide (cis- and trans-) genetic variants as predictors. The non-zero genetic effect sizes estimated in the gene expression imputation models are considered to be effect sizes of a broad sense of expression quantitative trait loci (eQTL), which are taken as variant weights to conduct gene-based association tests with GWAS data (individual-level or summary-level) in Stage II.

Various TWAS techniques have been developed, employing diverse regression methods to train models for imputing gene expression. Additionally, multiple transcriptomic reference panels are publicly available and could be used in TWAS. Consequently, it is possible to train multiple gene expression imputation models by employing distinct regression methods, employing multiple transcriptomic reference panels of the same tissue type, or utilizing transcriptomic data from multiple tissues within a given reference panel. For example, multiple regression methods, such as penalized regression with Elastic-Net penalty (used by PrediXcan^7,8) and nonparametric Bayesian Dirichlet process regression (DPR) model (used by TIGAR⁹), have been used to train gene expression imputation models with the same Genotype-Tissue Expression (GTEx)¹⁰ V8 reference data of 48 human tissue types. The Religious Orders Study (ROS)¹¹, Rush Memory and Aging Project (MAP)¹¹, and the GTEx¹⁰ V8 project all profile transcriptomic data of prefrontal cortex (PFC) brain tissue and genome-wide genetic data of the same samples, providing multiple reference panels of PFC tissue for TWAS. Thus, leveraging multiple trained gene expression imputation models of the same target gene across multiple regression methods, multiple reference panels, and multiple tissue types is expected to improve TWAS power by modeling the unknown genetic architecture of the target gene expression more robustly with multiple regression models, by increasing the training sample size with multiple reference panels, or by borrowing strength across multiple tissue types with correlated gene expression.

Multiple approaches that can take advantage of transcriptomic reference data for multiple tissues and/or multiple reference panels have been developed. For example, UTMOST uses group LASSO-penalized multivariate regression to impute cross-tissue expression¹². TisCoMM uses the same multivariate regression model to predict gene expression across multiple tissues, but utilizes a unified probabilistic model to test the overall and tissue-specific gene-trait associations¹³. SWAM estimates a vector of weights for input expression imputation models such that the weighted average of the input models gives the lowest mean squared error with respect to individual-level reference expression of the target tissue¹⁴. However, these approaches have drawbacks, such as requiring individual-level reference data, being computationally expensive, and being user-unfriendly. For example, UTMOST and TisCoMM require individual-level reference data for all tissues to train gene expression imputation models. In order to control for multicollinearity, SWAM considers a regularization parameter which requires fine-tuning based on the covariance structure of Genetically Regulated gene eXpression (GReX) of all considered tissues, which must be derived using individual-level reference transcriptomic data¹⁴. Additionally, SWAM requires that the trained gene expression imputation models be provided in the same SQL database format as used for PrediXcan output¹⁴.

To fill this gap, we develop a novel TWAS method that uses the ensemble machine learning technique of stacked regression^15,16 to leverage multiple summary-level gene expression imputation models (i.e., base models) trained for the same target gene. We refer to this novel TWAS method as stacked regression-based TWAS (i.e., SR-TWAS). SR-TWAS first uses a validation transcriptomic dataset of the target tissue type to optimally train a set of weights for the multiple expression imputation base models per target gene (Stage I), by optimizing the gene expression prediction R² (i.e., the squared correlation between observed and predicted gene expression levels) in the validation dataset. Then SR-TWAS takes the weighted average eQTL effect sizes as the corresponding variant weights for gene-based association tests in Stage II. The expression imputation models trained by SR-TWAS are specific to the tissue type of the validation data, and the identified TWAS risk genes are interpreted as having potential genetic effects mediated through their gene expression in the validation tissue type.

In this work, we present a novel TWAS tool (SR-TWAS) for leveraging multiple gene expression imputation base models. By simulation studies, we show that SR-TWAS has higher power than TWAS based on gene expression imputation base models, and that the average of the SR-TWAS model and base models of the validation data has the highest power in most scenarios. By validation studies with real transcriptomic data, we show that SR-TWAS achieves higher accuracy for predicting gene expression than base models. By real application studies of Alzheimer’s disease (AD) dementia and Parkinson’s disease (PD), we demonstrate that the average of the SR-TWAS model and base models of the validation transcriptomic data detects the greatest number of independently significant risk genes with disease-relevant biological interpretations. In the following sections, we first briefly describe the stacked regression method used by SR-TWAS and the GTEx V8 and ROS/MAP reference transcriptomic datasets used in this study. Then we describe the results of our simulation studies, validation studies using the real transcriptomic data, and application TWAS of AD dementia and PD. Last, we end with a discussion.

Results

Overview of SR-TWAS

In the framework of TWAS^7,9,17, a multiple linear regression model is assumed for training gene expression imputation models, taking quantitative gene expression levels ${{{\bf{E}}}}_{g}$ of the target gene and tissue as the response variable and cis-acting genetic variants near the target gene region (genotype matrix ${{\bf{G}}}$) as predictors, as shown in the following formula:

$$\mathbf{E}_{g}=\mathbf{G}\mathbf{w}+\boldsymbol{\epsilon},\quad \boldsymbol{\epsilon} \sim N\left(\mathbf{0},\mathbf{I}\right). \tag{1}$$

The eQTL effect sizes ${{\bf{w}}}$ could be trained by different regression methods and/or using different reference panels with matched expression and genotype data (${{{\bf{E}}}}_{g}, \, {{\bf{G}}}$).

Assume there are a total of $K$ base gene expression imputation models trained for the same target gene, with ${\widehat{{{\bf{w}}}}}_{k},k=1,\ldots,K$. Let ${{{\bf{E}}}}_{{vg}}$ denote the gene expression levels of the target gene $g$ for the target tissue type in the validation data, and ${{{\bf{G}}}}_{v}$ denote the genotype matrix of genetic predictors in the validation data. Then the predicted GReX of the validation samples by the kth base model are given by ${{{\bf{G}}}}_{v}{\widehat{{{\bf{w}}}}}_{k}$. The stacked regression method^15,16 will solve for a set of optimal base model weights ${\zeta }_{1},\ldots,{\zeta }_{K}$, by maximizing the regression ${R}^{2}$ between the profiled gene expression ${{{\bf{E}}}}_{{vg}}$ and the weighted average GReX, ${\sum }_{k=1}^{K}{\zeta }_{k}{{{\bf{G}}}}_{v}{\widehat{{{\bf{w}}}}}_{k}$, of K base models, i.e., minimizing the following loss function of $1-{R}^{2}$:

$$\underset{\zeta_{k};\,k=1,\ldots,K}{\mathrm{minimize}}\ \frac{\left\|\mathbf{E}_{vg}-\sum_{k=1}^{K}\zeta_{k}\mathbf{G}_{v}\widehat{\mathbf{w}}_{k}\right\|^{2}}{\left\|\mathbf{E}_{vg}-\bar{E}_{vg}\right\|^{2}},\quad \text{s.t. } \sum_{k=1}^{K}\zeta_{k}=1,\ \zeta_{k}\in\left[0,1\right]. \tag{2}$$

As a result, we will obtain a set of model weights ${\zeta }_{k}$ for $k=1,\ldots,K$ base models, and a set of eQTL effect sizes $\widetilde{{{\bf{w}}}}$ given by the weighted average of the eQTL effect sizes of the $K$ base models, $\widetilde{{{\bf{w}}}}={\sum }_{k=1}^{K}{\zeta }_{k}{\widehat{{{\bf{w}}}}}_{k}$ (Stage I). Then, the final predicted GReX for test genotype data ${{{\bf{G}}}}_{t}$ is given by ${\widehat{{\mbox{GReX}}}}_{g}={{{\bf{G}}}}_{t}\widetilde{{{\bf{w}}}}$, and $\widetilde{{{\bf{w}}}}$ will be taken as variant weights in the gene-based association tests by SR-TWAS in Stage II. Genes with fivefold cross-validation (CV) ${R}^{2} \, > \, 0.5\%$ in the validation dataset by SR-TWAS are considered as having a valid imputation model and will be tested in Stage II⁹. Here, $\widetilde{{{\bf{w}}}}$ denotes the eQTL effect sizes trained by SR-TWAS (Stage I) for the target gene in the tissue of the validation data, and significant genes identified in Stage II have potential genetic effects mediated through the transcriptome of the tissue of the validation data.
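The constrained problem in Eq. (2) can be illustrated with a minimal sketch. The code below is not the SR-TWAS implementation: it assumes a projected-gradient solver, and the function names (`project_to_simplex`, `stack_weights`, `stacking_loss`, `combine_effect_sizes`) are hypothetical. Because the denominator of Eq. (2) does not depend on the weights, the sketch minimizes only the residual sum of squares over the simplex.

```python
def project_to_simplex(v):
    """Euclidean projection of v onto {z : z_k >= 0, sum_k z_k = 1}."""
    u = sorted(v, reverse=True)
    cumsum, theta = 0.0, 0.0
    for j, u_j in enumerate(u, start=1):
        cumsum += u_j
        t = (cumsum - 1.0) / j
        if u_j - t > 0:  # keep the shift from the largest index that qualifies
            theta = t
    return [max(x - theta, 0.0) for x in v]


def stack_weights(expr, preds, n_iter=5000):
    """Estimate zeta for Eq. (2) by projected gradient descent (assumed solver).

    expr  : observed expression E_vg (length n)
    preds : K lists, each the base-model prediction G_v w_hat_k (length n)
    """
    k = len(preds)
    gram = [[sum(a * b for a, b in zip(preds[i], preds[j])) for j in range(k)]
            for i in range(k)]
    cross = [sum(a * b for a, b in zip(preds[i], expr)) for i in range(k)]
    trace = sum(gram[i][i] for i in range(k))
    if trace == 0.0:
        return [1.0 / k] * k  # all predictions are zero: weights not identifiable
    step = 1.0 / (2.0 * trace)  # 1/L, with the Lipschitz constant bounded by 2*trace
    zeta = [1.0 / k] * k
    for _ in range(n_iter):
        grad = [2.0 * (sum(gram[i][j] * zeta[j] for j in range(k)) - cross[i])
                for i in range(k)]
        zeta = project_to_simplex([z - step * g for z, g in zip(zeta, grad)])
    return zeta


def stacking_loss(expr, preds, zeta):
    """The 1 - R^2 loss of Eq. (2) for a given weight vector."""
    mean_e = sum(expr) / len(expr)
    fitted = [sum(z * p[i] for z, p in zip(zeta, preds)) for i in range(len(expr))]
    rss = sum((e - f) ** 2 for e, f in zip(expr, fitted))
    tss = sum((e - mean_e) ** 2 for e in expr)
    return rss / tss


def combine_effect_sizes(zeta, base_weights):
    """w_tilde = sum_k zeta_k * w_hat_k, for base models defined on the same SNPs."""
    n_snps = len(base_weights[0])
    return [sum(z * w[j] for z, w in zip(zeta, base_weights)) for j in range(n_snps)]
```

Since prediction is linear in the effect sizes, weighting the base-model predictions by $\zeta_k$ gives the same GReX as predicting with the combined vector returned by `combine_effect_sizes`, which is why only summary-level base-model weights are needed once the $\zeta_k$ are known.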

GTEx V8 reference transcriptomic data

The Genotype-Tissue Expression project version 8 (GTEx V8) release provides a comprehensive reference dataset of Whole Genome Sequencing (WGS) genotype data with matched RNA-seq transcriptomic data from 54 non-diseased tissues of 838 postmortem donors of European ancestry¹⁰. For our real data analysis, we used publicly available GReX imputation models trained by TIGAR-V2⁹ and PrediXcan^7,8,18 with GTEx V8 reference data for European subjects as base models. Supplementary Table 1 shows the disease status of GTEx subjects for tissues used in the AD or PD TWAS. Subjects in the GTEx V8 cohort are generally healthy. AD or other dementia was reported for only 4% of subjects, with missing data for ~1% of subjects. The majority of GTEx subjects used in PD TWAS were reported to be without PD, with <1% reported with PD and with missing data for 13% of training and 6% of validation subjects.

ROS/MAP reference transcriptomic data

The Religious Orders Study (ROS) and Rush Memory and Aging Project (MAP) are two longitudinal prospective clinical-pathologic cohort studies of aging and AD, which are collectively referred to as ROS/MAP¹¹. ROS recruits participants from religious orders across the United States while MAP recruits lay persons in the northeastern Illinois area¹¹. All participants in both studies are recruited without known dementia and agree to annual clinical evaluations and brain donation upon death¹¹. Only subjects of European ancestry were used in this study. The disease status of ROS/MAP participants used in this study is shown in Supplementary Table 1. Of the ROS/MAP subjects used in the training data of AD TWAS, 57% had normal cognitive function, 40% had AD or other dementia, and 4% had missing disease status. Of the ROS/MAP subjects used in the validation data of AD TWAS, 75% had normal cognitive function and 25% had AD or other types of dementia.

Simulation study design

We used the real genotype data of gene ABCA7 from ROS/MAP and GTEx V8 to simulate gene expression and phenotypes, and considered multiple scenarios with varying proportions of causal SNPs (${p}_{{causal}}=({\mathrm{0.001,0.01,0.05,0.1}})$) and gene expression heritability (i.e., the proportion of gene expression variation due to genetics, ${h}_{e}^{2}=({\mathrm{0.1,0.2,0.5}})$). We randomly selected n = 465 training samples with whole genome sequencing (WGS) genotype data from each of the ROS/MAP and GTEx V8 cohorts. We randomly selected n = 400 and n = 800 samples with WGS genotype data from ROS/MAP as our validation and test cohorts, respectively. ROS/MAP training, validation, and test samples were simulated with the same causal SNPs (i.e., eQTL), while training samples from the GTEx V8 cohort were simulated with true causal SNPs that overlapped 50% with those for the ROS/MAP samples. The simulated expression heritability was the same for both ROS/MAP and GTEx V8 samples.
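A minimal sketch of how gene expression could be simulated for a given $p_{causal}$ and $h_e^2$ is shown below. It rests on stated assumptions rather than on the paper's procedure: genotypes are drawn at random instead of taken from ABCA7, causal effect sizes are standard normal, and the genetic component is rescaled to have variance exactly $h_e^2$ before adding noise with variance $1-h_e^2$. The helper names are hypothetical.

```python
import random


def simulate_genotypes(n, m, maf, rng):
    """Toy genotype matrix (n samples x m SNPs) of minor-allele counts 0/1/2.
    Assumption: independent SNPs with a common minor allele frequency."""
    return [[sum(rng.random() < maf for _ in range(2)) for _ in range(m)]
            for _ in range(n)]


def simulate_expression(genotypes, p_causal, h2_e, rng):
    """Return (expression, causal SNP indices, genetic component)."""
    n, m = len(genotypes), len(genotypes[0])
    n_causal = max(1, round(p_causal * m))
    causal = sorted(rng.sample(range(m), n_causal))
    effects = {j: rng.gauss(0.0, 1.0) for j in causal}  # assumed effect distribution
    genetic = [sum(row[j] * b for j, b in effects.items()) for row in genotypes]
    mean_g = sum(genetic) / n
    var_g = sum((g - mean_g) ** 2 for g in genetic) / n
    if var_g == 0.0:
        raise ValueError("causal SNPs carry no genetic variation")
    scale = (h2_e / var_g) ** 0.5
    genetic = [(g - mean_g) * scale for g in genetic]  # variance is now h2_e
    noise_sd = (1.0 - h2_e) ** 0.5
    expression = [g + rng.gauss(0.0, noise_sd) for g in genetic]
    return expression, causal, genetic
```

A 50% overlap of causal SNPs between cohorts could be mimicked by reusing half of the returned `causal` indices and drawing the other half afresh for the second cohort.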

We compared the performance of SR-TWAS with a Naïve approach (see Methods) which takes the average of base models as the trained gene expression imputation model, that is, taking ${\zeta }_{k}=\frac{1}{K},\, k=1,\ldots,\, K$. Two base models per gene were trained: one by PrediXcan (penalized regression with Elastic-Net penalty; PrediXcan-GTEx) with the GTEx training samples (n = 465), and one by TIGAR (nonparametric Bayesian Dirichlet process regression; TIGAR-ROSMAP) with the ROS/MAP training samples (n = 465). SR-TWAS and Naïve models were then obtained by using these trained base models. Validation data ($n=400$) were used to train SR-TWAS models and an additional TIGAR base model (TIGAR-ROSMAP_valid). The SR-TWAS and TIGAR-ROSMAP_valid models were then averaged to create a new model (Avg-valid + SR, see Methods). Gene expression imputation models (by SR-TWAS and Naïve methods) with fivefold cross-validation ${R}^{2} \, > \, 0.5\%$ in the validation cohort were considered valid models, which were used to produce the Avg-valid + SR models and used in the follow-up gene-based association tests. Test data (n = 800) were used for assessing GReX prediction performance and TWAS power, with 1000 repeated simulations per scenario. We compared the performance of three base models of the training and validation data (PrediXcan-GTEx, TIGAR-ROSMAP, and TIGAR-ROSMAP_valid), the Naïve method, SR-TWAS, and Avg-valid + SR models.
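The Naïve and Avg-valid + SR models described above are both simple averages of SNP-weight vectors, which can be sketched as follows. This is an illustration under assumptions: models are represented as dictionaries from SNP ID to eQTL weight, a SNP missing from a model is treated as having weight 0, and Avg-valid + SR is taken to be an equal-weight average (the exact rule is given in Methods). All function names are hypothetical.

```python
def weighted_model(models, zeta):
    """w_tilde[snp] = sum_k zeta_k * w_k[snp]; absent SNPs contribute 0 (assumption)."""
    snps = set().union(*models)
    return {s: sum(z * m.get(s, 0.0) for z, m in zip(zeta, models))
            for s in sorted(snps)}


def naive_model(models):
    """Naive approach: equal weights zeta_k = 1/K for all K base models."""
    k = len(models)
    return weighted_model(models, [1.0 / k] * k)


def avg_valid_sr(sr_model, valid_models):
    """Equal-weight average of the SR-TWAS model and validation base model(s)."""
    return naive_model([sr_model] + list(valid_models))
```

Replacing the equal weights in `weighted_model` with the $\zeta_k$ estimated from the validation data gives the SR-TWAS combination instead.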

Simulation study results

As shown in Fig. 1, Avg-valid + SR obtained the highest test ${R}^{2}$ for gene expression imputation in 11 of 12 scenarios, and the SR-TWAS models had the second-best performance. SR-TWAS obtained the highest test ${R}^{2}$ for gene expression imputation in the ${p}_{{\mbox{causal}}}=0.01$ and ${h}_{e}^{2}=0.5$ scenario, where it slightly outperformed the averaged models. The base models trained by TIGAR with ROS/MAP training and validation samples (TIGAR-ROSMAP, TIGAR-ROSMAP_valid) performed similarly well, because the test data were generated under the same model assumptions as the ROS/MAP training and validation data. Here, the Avg-valid + SR models performed best by leveraging the predictive information provided by all three base models. The PrediXcan-GTEx base models did not perform well because they were trained using GTEx training data, which shared only half of the true causal SNPs with the validation and test data. The Naïve approach of taking averages of the PrediXcan-GTEx and TIGAR-ROSMAP base models also performed poorly because of the heterogeneous genetic architecture between the GTEx training cohort and the test cohort.

Fig. 1: Boxplots of gene expression prediction R² for simulations with varying proportions of true causal SNPs p_causal = (0.001, 0.01, 0.05, 0.1) and true expression heritability $h_{e}^{2}=(0.1, 0.2, 0.5)$.

As expected, model performance improved with increasing true expression heritability ${h}_{e}^{2}$ with the same training sample size. For all considered scenarios, the highest test ${R}^{2}$ values were obtained under a sparse causality model with ${p}_{{\mbox{causal}}}=0.001$, where true causal SNP effect sizes would be relatively larger given the same ${h}_{e}^{2}$. The comparison of CV ${R}^{2}$ and training ${R}^{2}$ for Naïve and SR-TWAS approaches (Supplementary Figs. 1, 2) also showed that SR-TWAS outperformed the Naïve approach under all scenarios. Because the averaging step used to obtain Avg-valid + SR models does not include training and cross-validation steps, no training ${R}^{2}$ or CV ${R}^{2}$ values are available for comparison.

In order to assess TWAS power, phenotypes were simulated with a certain proportion of variance due to simulated gene expression (${h}_{p}^{2}$). We considered a series of ${h}_{p}^{2}$ values in the range of $({\mathrm{0.05,0.875}})$. The TWAS power comparison of three base models of the training and validation data (PrediXcan-GTEx, TIGAR-ROSMAP, and TIGAR-ROSMAP_valid), the Naïve method, SR-TWAS, and Avg-valid + SR models is shown in Fig. 2, where the results are consistent with the test ${R}^{2}$ comparison in Fig. 1. The Avg-valid + SR approach performed best, followed by SR-TWAS, TIGAR-ROSMAP training base models, and TIGAR-ROSMAP_valid validation base models, while the Naïve method and the PrediXcan-GTEx training base models performed poorly in comparison. In the ${p}_{{\mbox{causal}}}=0.01$ and ${h}_{e}^{2}=0.5$ scenario, SR-TWAS slightly outperformed Avg-valid + SR and had a noticeable advantage over the TIGAR-ROSMAP training base models and TIGAR-ROSMAP_valid validation models. The results showed that the SR-TWAS approach indeed gained power by leveraging base models trained with multiple reference panels and by multiple statistical methods.
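The power assessment can be sketched in a few lines, under assumptions that are not taken from the paper: the phenotype is built by scaling expression to variance $h_p^2$ and adding normal noise with variance $1-h_p^2$, the association test is a simple regression of phenotype on predicted GReX with a normal approximation to the test statistic, and power is the fraction of repeated simulations with p value below $2.5\times 10^{-6}$. The function names are hypothetical.

```python
import math
import random


def simulate_phenotype(expression, h2_p, rng):
    """Phenotype with a fraction h2_p of variance explained by expression."""
    n = len(expression)
    mean_e = sum(expression) / n
    sd_e = math.sqrt(sum((e - mean_e) ** 2 for e in expression) / n)
    signal = [(e - mean_e) / sd_e * math.sqrt(h2_p) for e in expression]
    return [s + rng.gauss(0.0, math.sqrt(1.0 - h2_p)) for s in signal]


def association_pvalue(grex, phenotype):
    """Two-sided p value for the GReX-phenotype correlation.
    Assumption: normal approximation to the t statistic (reasonable for large n)."""
    n = len(grex)
    mx, my = sum(grex) / n, sum(phenotype) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(grex, phenotype))
    sxx = sum((a - mx) ** 2 for a in grex)
    syy = sum((b - my) ** 2 for b in phenotype)
    r = sxy / math.sqrt(sxx * syy)
    r2 = min(r * r, 1.0 - 1e-12)  # guard against division by zero when |r| = 1
    t = r * math.sqrt((n - 2) / (1.0 - r2))
    return math.erfc(abs(t) / math.sqrt(2.0))


def empirical_power(pvalues, alpha=2.5e-6):
    """Fraction of repeated simulations that reach significance."""
    return sum(p < alpha for p in pvalues) / len(pvalues)
```

In a full simulation, `association_pvalue` would be applied to the GReX predicted in the test samples by each model, once per repetition, and `empirical_power` would summarize the resulting p values per scenario.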

Fig. 2: Power comparison for simulations with varying proportions of true causal SNPs p_causal = (0.001, 0.01, 0.05, 0.1), true expression heritability $h_{e}^{2}=(0.1, 0.2, 0.5)$, and phenotype heritability $h_{p}^{2}\in(0.05, 0.875)$.

Although desirable TWAS power (~80%) was only obtained in simulation scenarios with a relatively high ${h}_{p}^{2}$ that might exceed the values seen in real studies, simulation power would increase with larger test sample sizes. Because real GWAS test data would have a larger sample size than the 800 considered in our simulations, we expect desirable power for our SR-TWAS method in real studies.

Additional simulation studies

Additionally, we conducted similar simulation studies for two other settings, where samples from ROS/MAP and GTEx cohorts have the same set of true causal SNPs (i.e., the same genetic architecture), and (i) the expression heritability was the same for both ROS/MAP and GTEx V8 cohorts or (ii) the expression heritability for the GTEx V8 cohort was half that of the ROS/MAP cohort. The results of these settings were similar to those of the previously described setting in which the GTEx V8 cohort had the same heritability and a 50% overlap of true causal SNPs compared to the ROS/MAP cohort. In these two additional simulation settings, Avg-valid + SR still had the best performance, while SR-TWAS, TIGAR-ROSMAP training base models, and TIGAR-ROSMAP_valid validation base models outperformed the PrediXcan-GTEx training base models and Naïve models.

Comparisons of CV ${R}^{2}$ and training ${R}^{2}$ for Naïve and SR-TWAS approaches for these scenarios (Supplementary Figs. 3–6) showed that SR-TWAS outperformed the Naïve approach under all scenarios. For all considered scenarios, again the highest test ${R}^{2}$ was obtained under a sparse causality model with high expression heritability (Supplementary Figs. 7, 8). Power comparison results show that the Avg-valid + SR models obtained the highest power in most scenarios, while SR-TWAS, TIGAR-ROSMAP training base models, and TIGAR-ROSMAP_valid validation base models generally outperformed the PrediXcan-GTEx training base models and the Naïve models (Supplementary Figs. 9, 10). The SR-TWAS approach gained more power in the setting in which the expression heritability for GTEx V8 cohort was only half that of ROS/MAP (Supplementary Fig. 10). SR-TWAS once again had the best performance in the ${p}_{{\mbox{causal}}}=0.01$ and ${h}_{e}^{2}=0.5$ scenario under both of these additional settings.

Simulation study model weights estimated by SR-TWAS

Plots of the weights (zeta values) of base models that were estimated by SR-TWAS in all three simulation settings and 12 scenarios (Supplementary Fig. 11) showed that the SR-TWAS training consistently estimated higher weights for the TIGAR-ROSMAP training base models compared to the PrediXcan-GTEx training base models, with many models selecting only the TIGAR-ROSMAP training base model (i.e., models in which the zeta value estimate for the TIGAR-ROSMAP training base model is 1 and the zeta value estimate for the PrediXcan-GTEx model is 0). This makes sense because the validation data were generated under the same model assumptions as the ROSMAP training data. In particular, when the GTEx training data were also generated under the same model as the validation data (orange bars in Supplementary Fig. 11), zeta value estimates for the TIGAR-ROSMAP models were more evenly distributed in [0, 1] for scenarios with a sparse causality model (p_causal = 0.001) and high expression heritability (${h}_{e}^{2}$ = 0.2, 0.5). When the GTEx training data were generated under a different setting where only half of the true causal eQTL in the validation data were also causal in the GTEx training data (black bars in Supplementary Fig. 11), the zeta value for the TIGAR-ROSMAP base model was more frequently estimated to be 1.

Even when both ROSMAP and GTEx training data were generated under the same model assumptions, SR-TWAS still gained power because the two base models trained by TIGAR and PrediXcan have complementary properties. In particular, PrediXcan uses a parametric penalized regression model with Elastic-Net penalty, which is preferred for a sparse genetic architecture of gene expression quantitative traits, whereas TIGAR uses a nonparametric Bayesian Dirichlet process regression model, which assumes an infinitesimal model for the underlying genetic architecture of gene expression quantitative traits. As shown in previous studies⁹, PrediXcan will perform better when the true causal eQTL are sparse, while TIGAR will perform better as the true causal eQTL proportion increases. Our simulation studies showed that SR-TWAS had improved performance across various scenarios (p_causal = 0.001, 0.01, 0.05, 0.1; ${h}_{e}^{2}$ = 0.1, 0.2, 0.5; Supplementary Fig. 7).
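For reference, the preference of the Elastic-Net penalty for sparse architectures can be read off its objective. The form below is the standard parameterization of the Elastic-Net, written with the symbols of Eq. (1); the sample size $n$, the penalty strength $\lambda$, and the mixing parameter $\alpha$ are notation introduced here, and the specific values used by PrediXcan are not stated in this section.

$$\widehat{\mathbf{w}}=\underset{\mathbf{w}}{\arg\min}\ \frac{1}{2n}\left\|\mathbf{E}_{g}-\mathbf{G}\mathbf{w}\right\|_{2}^{2}+\lambda \left(\alpha \left\|\mathbf{w}\right\|_{1}+\frac{1-\alpha }{2}\left\|\mathbf{w}\right\|_{2}^{2}\right),\quad \lambda \ge 0,\ \alpha \in \left[0,1\right].$$

The $\ell_1$ term sets many effect sizes exactly to zero, which suits a small number of causal eQTL, while the $\ell_2$ term stabilizes estimates among correlated variants.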

Type I error assessment simulation study

We also assessed type I error under the example scenario with ${p}_{{\mbox{causal}}}=0.1$, ${h}_{e}^{2}=0.1$. Base model weights for TIGAR-ROSMAP, PrediXcan-GTEx, and TIGAR-ROSMAP_valid were permuted ${10}^{6}$ times. The TIGAR-ROSMAP and PrediXcan-GTEx training base models with permuted weights were then used to obtain SR-TWAS and Naïve models. The Avg-valid + SR models were obtained by averaging the SR-TWAS models and the TIGAR-ROSMAP_valid validation base models with permuted weights. All models were then used to conduct gene-based association tests with a phenotype generated randomly from $N({\mathrm{0,1}})$. As shown in Supplementary Table 2, all models controlled type I error well at significance thresholds $({10}^{-4},\, {10}^{-5},\, 2.5\times {10}^{-6},\, {10}^{-6})$. The Quantile-Quantile (QQ) plots of the TWAS p values in these null simulations are also shown in Supplementary Fig. 12.
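The tallying step of such a null simulation is simple to sketch. The code below only illustrates how empirical type I error rates are computed from null p values; it does not reproduce the weight-permutation procedure, and it uses standard-normal test statistics as a stand-in for the null TWAS statistics. The function names are hypothetical.

```python
import math
import random


def simulate_null_pvalues(n_tests, rng):
    """Two-sided p values of standard-normal null statistics (stand-in assumption)."""
    return [math.erfc(abs(rng.gauss(0.0, 1.0)) / math.sqrt(2.0))
            for _ in range(n_tests)]


def type1_error_rates(null_pvalues, thresholds):
    """Empirical type I error: fraction of null p values below each threshold."""
    n = len(null_pvalues)
    return {t: sum(p < t for p in null_pvalues) / n for t in thresholds}
```

A well-calibrated test gives rates close to the thresholds themselves, so with $10^6$ null tests roughly 100 p values are expected below $10^{-4}$ and about one below $10^{-6}$.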

Real validation studies

To compare the GReX prediction accuracy with real gene expression data, we considered three sets of base models: models trained by TIGAR with ROS samples (n = 237, TIGAR_ROS_DLPFC) of dorsolateral prefrontal cortex (DLPFC) tissue, models trained by TIGAR with GTEx V8 data of brain frontal cortex tissue (n = 157, TIGAR_GTEx_BRNCTXB)⁹, and models trained by PrediXcan with the same GTEx reference data of brain frontal cortex tissue (n = 157, PrediXcan_GTEx_BRNCTXB)^7,8,18. SR-TWAS (SR-TWAS_MAP_DLPFC) and Naïve (Naive_MAP_DLPFC) models were trained from these three sets of base models with respect to a validation dataset with half of the MAP samples (n = 114, randomly selected) of DLPFC tissue. Valid gene expression imputation models trained by SR-TWAS and Naïve methods with fivefold CV ${R}^{2} \, > \, 0.5\%$ in validation data were tested using the other half of the MAP samples (n = 114) of DLPFC tissue.
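The evaluation steps in this design (a random half split, the squared correlation used as $R^2$, and the 0.5% validity threshold) can be sketched as follows. The helper names are hypothetical, and the sketch takes CV $R^2$ values as given rather than computing the fivefold cross-validation itself.

```python
import random


def squared_correlation(observed, predicted):
    """R^2 as the squared Pearson correlation between observed and predicted values."""
    n = len(observed)
    mo, mp = sum(observed) / n, sum(predicted) / n
    sop = sum((o - mo) * (p - mp) for o, p in zip(observed, predicted))
    soo = sum((o - mo) ** 2 for o in observed)
    spp = sum((p - mp) ** 2 for p in predicted)
    if soo == 0.0 or spp == 0.0:
        return 0.0  # a constant vector carries no predictive information
    return sop * sop / (soo * spp)


def split_half(sample_ids, rng):
    """Randomly split samples into validation and test halves."""
    shuffled = list(sample_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


def valid_genes(cv_r2_by_gene, threshold=0.005):
    """Genes whose CV R^2 exceeds 0.5% are kept as having valid imputation models."""
    return sorted(g for g, r2 in cv_r2_by_gene.items() if r2 > threshold)
```

Test $R^2$ for a valid gene would then be `squared_correlation` applied to observed expression and predicted GReX in the held-out half.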

By comparing test ${R}^{2}$ obtained by SR-TWAS, Naïve, and three sets of base models in the test MAP samples (Supplementary Table 3), we showed that PrediXcan_GTEx_BRNCTXB^7,8,18 had the highest median (0.070) and mean (0.113) test ${R}^{2}$, but for only 867 valid gene expression imputation models. SR-TWAS had the second highest median (0.026) and mean (0.068) test ${R}^{2}$ for 8425 valid gene expression imputation models, and the Naïve model performed similarly to SR-TWAS but with a slightly lower median (0.025) and mean (0.065) test ${R}^{2}$ and fewer valid gene expression imputation models (8360). By pair-wise comparison of test ${R}^{2}$ for all genes with valid expression imputation models (Supplementary Fig. 13), SR-TWAS (y-axis) performed noticeably better than the Naïve and three sets of base models (x-axis).

Training expression imputation models of SMA tissue by SR-TWAS

We considered four sets of base models: TIGAR and PrediXcan models trained with 465 ROS/MAP samples of DLPFC tissue (TIGAR_ROSMAP_DLPFC, PrediXcan_ROSMAP_DLPFC), and TIGAR and PrediXcan models trained with 157 GTEx V8 samples of prefrontal cortex tissue (TIGAR_GTEx_BRNCTXB⁹, PrediXcan_GTEx_BRNCTXB^7,8,18). An additional 76 ROS/MAP samples of the supplementary motor area (SMA) brain tissue were used as the validation dataset to train SR-TWAS models and to calculate the fivefold CV ${R}^{2}$ that was used to select genes with valid imputation models. Plots of zeta weights estimated by SR-TWAS for each set of training base models are presented in Supplementary Fig. 14. We observed that the weights of the TIGAR_GTEx_BRNCTXB⁹ training base models were estimated to be 1 more often than those of the other training base models. These results showed that the base models trained by TIGAR with the GTEx data of BRNCTXB tissue (TIGAR_GTEx_BRNCTXB⁹) contributed solely to SR-TWAS models for almost half of the genes, which was consistent with the numbers of valid gene expression imputation models shown in Table 1.

Table 1 Comparison of CV R² of SMA tissue for valid gene expression imputation models given by training base models with ROS/MAP and GTEx V8 reference panels of DLPFC tissue, validation base models, SR-TWAS models, and Avg-valid+SR models

PrediXcan and TIGAR models trained using the 76 ROS/MAP samples of the SMA validation dataset (PrediXcan_ROSMAP_SMA, TIGAR_ROSMAP_SMA) were also included for comparison and were averaged with SR-TWAS models to obtain an additional set of models (Avg-valid + SR). Here, we compared Avg-valid+SR and SR-TWAS models to training base models, as well as validation base models trained by PrediXcan and TIGAR using the validation data of the target SMA brain tissue, to show the advantages of SR-TWAS for leveraging multiple regression models, multiple reference panels, and multiple tissues.

By comparing the CV ${R}^{2}$ and numbers of genes with valid expression imputation models (Table 1), we found that gene expression imputation models trained by SR-TWAS for the SMA tissue (SR-TWAS_ROSMAP_SMA) had the highest median CV ${R}^{2}$ (0.072) and second highest mean CV ${R}^{2}$ (0.09) for ~20 K genes with valid expression imputation models. Although the PrediXcan_GTEx_BRNCTXB^7,8,18 training base models had the third highest median CV ${R}^{2}$ (0.061) and highest mean CV ${R}^{2}$ (0.10), only 4563 genes had valid expression imputation models. These results showed that SR-TWAS obtained improved CV ${R}^{2}$ in a real validation cohort of SMA tissue by leveraging multiple regression methods from two reference panels of multiple relevant tissues.

TWAS results of AD dementia

By using the eQTL weights obtained by training base models, validation base models, SR-TWAS models, and Avg-valid + SR models, we conducted TWAS with the summary-level data of the recent GWAS of AD dementia (n = ~762 K)¹⁹. Since Avg-valid + SR models were shown to have the best performance in our simulation studies, we focused on summarizing the results by Avg-valid + SR here. In particular, Avg-valid + SR identified a total of 89 significant TWAS risk genes of AD dementia with p values <$2.5\times {10}^{-6}$. Of these, 19 are known GWAS risk genes, 70 are within 1 Mb of a known GWAS risk gene, and 14 have been previously observed as TWAS risk genes of AD dementia^{20,21,22,23,24,25} (Supplementary Table 4).
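To illustrate how eQTL weights and summary-level GWAS data can be combined in Stage II, the sketch below computes a burden-type gene Z-score of the form used by S-PrediXcan, $Z_g=\sum_l w_l\sigma_l Z_l/\sqrt{\mathbf{w}^{\top}\mathbf{V}\mathbf{w}}$, where $Z_l$ are single-variant GWAS Z-scores, $\mathbf{V}$ is a reference genotype covariance matrix, and $\sigma_l=\sqrt{V_{ll}}$. This section does not state which test statistic was used, so the choice of statistic is an assumption, and the function names are hypothetical.

```python
import math


def burden_z(weights, gwas_z, cov):
    """Burden-type TWAS Z-score from eQTL weights, GWAS Z-scores, and a
    reference genotype covariance matrix (S-PrediXcan-style form, assumed here)."""
    m = len(weights)
    numerator = sum(weights[l] * math.sqrt(cov[l][l]) * gwas_z[l] for l in range(m))
    grex_var = sum(weights[i] * cov[i][j] * weights[j]
                   for i in range(m) for j in range(m))
    return numerator / math.sqrt(grex_var)


def two_sided_p(z):
    """Two-sided p value of a standard-normal Z-score."""
    return math.erfc(abs(z) / math.sqrt(2.0))
```

A gene would be declared significant when `two_sided_p(burden_z(...))` falls below the genome-wide threshold of $2.5\times 10^{-6}$ used above.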

Validation of significant TWAS risk genes of AD by PMR-Egger

In order to account for potential horizontal pleiotropy effects (genetic effects on phenotype that are not mediated by the considered GReX), we applied the PMR-Egger tool to the 89 significant TWAS risk genes of AD obtained by the Avg-valid + SR models. This analysis was performed using the same validation transcriptomic dataset and GWAS summary data as in the application TWAS of AD dementia. Of these 89 analyzed genes, 61 (68.5%) had a significant causal p value by PMR-Egger after Bonferroni correction for multiple testing.
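The Bonferroni step amounts to dividing the family-wise error level by the number of tested genes. The sketch below assumes a family-wise level of 0.05, which is not stated in this section, and uses a hypothetical function name and placeholder gene labels.

```python
def bonferroni_significant(pvalues, alpha=0.05):
    """Return the genes passing a Bonferroni threshold and the threshold itself.

    pvalues : dict mapping gene name to p value
    alpha   : family-wise error level (assumed 0.05)
    """
    threshold = alpha / len(pvalues)
    passing = {gene for gene, p in pvalues.items() if p < threshold}
    return passing, threshold
```

Under this assumption, testing 89 genes gives a per-gene threshold of 0.05/89, about $5.6\times 10^{-4}$.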

Independent significant TWAS risk genes of AD dementia

Because TWAS considers genotype data within a ±1 Mb region of the test gene, nearby significant TWAS genes with overlapping test regions often have correlated GReX values and might not represent independent associations. We curated 6 independent TWAS risk genes of AD from the 61 significant genes that were validated by PMR-Egger (Table 2 and Fig. 3) by selecting the most significant gene in each cluster of significant genes with overlapping test regions as the independent risk gene. We found that one of these independent risk genes, HLA-DRA, was a known GWAS^22,26 risk gene and was also previously observed as a TWAS risk gene²¹. The other five independent risk genes were near known GWAS risk genes^{19,21,22,26,27,28} and near previously observed TWAS risk genes^20,22. Compared to the TWAS results using training base models (Supplementary Fig. 15) and validation base models (Fig. 3), Avg-valid + SR models identified the greatest number of independent risk genes.
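The curation rule described above can be sketched as an interval-merging pass. The code is an illustration under assumptions: each gene's test region is its start and end extended by 1 Mb on both sides, clusters are formed by chaining any regions that overlap on the same chromosome, and the gene records use hypothetical field names.

```python
def independent_genes(genes, window=1_000_000):
    """Pick the most significant gene per cluster of overlapping test regions.

    genes : list of dicts with keys "name", "chrom", "start", "end", "p"
            (hypothetical record layout)
    """
    ordered = sorted(genes, key=lambda g: (g["chrom"], g["start"]))
    clusters = []
    for g in ordered:
        lo, hi = g["start"] - window, g["end"] + window
        last = clusters[-1] if clusters else None
        if last and last["chrom"] == g["chrom"] and lo <= last["hi"]:
            # Test region overlaps the current cluster: extend it.
            last["hi"] = max(last["hi"], hi)
            last["members"].append(g)
        else:
            clusters.append({"chrom": g["chrom"], "hi": hi, "members": [g]})
    return [min(c["members"], key=lambda g: g["p"])["name"] for c in clusters]
```

Because overlapping regions are chained, a long run of neighboring significant genes collapses into a single cluster represented by its smallest p value.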

Table 2 Independent TWAS risk genes of AD dementia identified by Avg-valid + SR models of SMA tissue

Full size table

**Fig. 3: Manhattan plots of TWAS results by validation base models, SR-TWAS models, and Avg-valid + SR models of SMA tissue for studying AD dementia.**

Protein–protein interaction network and enrichment analysis with risk genes of AD dementia

To investigate the underlying biological mechanisms of our identified TWAS risk genes of AD, we conducted protein–protein interaction network and enrichment analysis with our identified TWAS risk genes by the STRING²⁹ tool (Methods). As shown in Fig. 4A, we identified a major network consisting of 23 TWAS risk genes, including the well-known AD risk genes TOMM40³⁰, APOC1³¹, APOC2³¹, and TNF³². Our identified TWAS risk genes are enriched with known risk genes for AD-related phenotypes such as family history of AD, lipoprotein measurements, mental or behavioral disorder biomarkers, inflammatory biomarker measurement, and beta-amyloid 1–42 measurement (Fig. 4B).

**Fig. 4: Protein–protein interaction network and enrichment analyses with TWAS risk genes of AD dementia by Avg-valid + SR models.**

Training expression imputation models of brain substantia nigra tissue by SR-TWAS

We considered six sets of base models trained by TIGAR on six different tissues from GTEx V8––brain anterior cingulate cortex BA24 (BRNACC) (n = 136), brain caudate basal ganglia (BRNCDT) (n = 173), brain cortex (BRNCTXA) (n = 184), brain nucleus accumbens basal ganglia (BRNNCC) (n = 182), brain putamen basal ganglia (BRNPTM) (n = 154), and whole blood (BLOOD) (n = 574). With these six sets of training base models, an additional 101 GTEx samples of brain substantia nigra (BRNSNG) tissue were used as the validation data to train SR-TWAS models. We presented the plots of zeta weights of training base models that were estimated by SR-TWAS in Supplementary Fig. 16. For all six sets of considered training base models, zeta weights were similarly distributed with 0’s for most genes and other values distributed over (0, 1). Fivefold CV ${R}^{2}$ was calculated and used to select genes with valid expression imputation models for TWAS. PrediXcan and TIGAR models trained on the validation data of brain substantia nigra tissue (PrediXcan_GTEx_BRNSNG^7,8,18, TIGAR_GTEx_BRNSNG⁹) were used to obtain average models (Avg-valid+SR) of these two sets of validation base models and SR-TWAS models.

By comparing the CV ${R}^{2}$ and number of genes with valid expression imputation models (Table 3), we found that gene expression imputation models trained by SR-TWAS for the brain substantia nigra tissue (SR-TWAS_GTEx-BRNSNG⁹) had the highest median CV ${R}^{2}$ (0.068) and highest mean CV ${R}^{2}$ (0.094) for ~23 K genes with valid expression imputation models. These results showed that SR-TWAS obtained improved CV ${R}^{2}$ in a real validation cohort of substantia nigra tissue by leveraging multiple regression methods from two reference panels of multiple relevant tissues.

Table 3 Comparison of CV R² of BRNSNG tissue for valid gene expression imputation models given by training base models with GTEx V8 reference panel of multiple tissues, validation base models, SR-TWAS models, and Avg-valid + SR models

Full size table

TWAS results of PD

We conducted TWAS using GWAS summary statistics by the recent GWAS of PD (n = ~33K cases, ~18K UK Biobank proxy-cases, and ~828K controls)³³, using eQTL weights estimated by the above six sets of training base models of multiple tissues, validation base models of the BRNSNG tissue, SR-TWAS models, and Avg-valid + SR models (Supplementary Fig. 17). Here, we also focused on the results by using the Avg-valid + SR models (Fig. 5; Table 4; and Supplementary Table 5), including a total of 60 significant TWAS risk genes of PD. Of these, 11 are known GWAS risk genes, 47 are within 1 Mb of a known GWAS risk gene, and 11 have been previously observed as TWAS risk genes of PD.

**Fig. 5: Manhattan plots of TWAS results by validation base models, SR-TWAS models, and Avg-valid + SR models of substantia nigra tissue for studying Parkinson’s disease.**

Table 4 Independent TWAS risk genes of Parkinson’s disease identified by Avg-valid + SR models of substantia nigra tissue

Full size table

Validation of significant TWAS risk genes of PD by PMR-Egger

We applied the PMR-Egger tool to the 60 significant TWAS risk genes of PD obtained by the Avg-valid + SR models. This analysis was performed using the same validation transcriptomic dataset and GWAS summary data as in the application TWAS of PD. Of these genes, 46 (76.7%) had a significant causal p value by PMR-Egger after Bonferroni correction for multiple testing.

Independent significant PD TWAS risk genes

Similarly, from these 46 replicated risk genes with significant causal p values by PMR-Egger, we curated nine independent TWAS risk genes of PD (Fig. 5 and Table 4), including six novel TWAS risk genes (LA16c-431H6.7, ADORA2B, AC005082.12, MAPK8IP1P2, MYLPF, and PARL). Of these novel TWAS risk genes, four (AC005082.12, MAPK8IP1P2, MYLPF, and PARL) are near known GWAS risk genes (GPNMB^33,34, MAPT³³, MCCC1³³, SETD1A³³, ZSWIM7³⁵). The other 3 previously observed TWAS risk genes were also known GWAS risk genes (CD38^34,35, MMRN1^33,35, and NDUFAF2³³). Compared to the TWAS results using these six training base models (Supplementary Fig. 17) and validation base models (Fig. 5), Avg-valid + SR models still identified the greatest number of independent risk genes that were validated by PMR-Egger.

Protein–protein interaction network and enrichment analysis with risk genes of PD

Similarly, we conducted protein–protein interaction network and enrichment analysis with our identified TWAS risk genes of PD by the STRING²⁹ tool (Methods). As shown in Fig. 6A, we identified four networks with at least two connected genes, including a major one with nine genes connected to the well-known PD risk gene MAPT³³, and another network with six genes connected to gene PRSS53, a mapped PD risk gene in the GWAS Catalog³⁶. Interestingly, our identified TWAS risk genes of PD are enriched with known risk genes for mental traits such as anxiety, white matter microstructure measurement, handedness, and neuroticism measurement (Fig. 6B).

**Fig. 6: Protein–protein interaction network and enrichment analyses with TWAS risk genes of PD by Avg-valid + SR models.**

Discussion

We present a novel TWAS tool (SR-TWAS) using the ensemble machine learning technique of stacked regression^15,16,37 for leveraging multiple gene expression imputation models (i.e., base models) trained by different regression methods and/or using different transcriptomic reference panels of different tissue types. We also constructed a set of average models (Avg-valid + SR) of the SR-TWAS models and validation base models that were trained using the validation data. Different from existing methods such as UTMOST¹² and SWAM¹⁴, SR-TWAS requires only summary-level base models, providing the flexibility of using publicly available base models.

With comprehensive simulation studies, we compared the Avg-valid+SR models to SR-TWAS models, the naïve approach of averaging all training base models, training base models, and validation base models. We showed that the Avg-valid+SR expression imputation models had the best prediction accuracy and led to the best TWAS power across 11 out of 12 considered scenarios, and that SR-TWAS models performed the best in the remaining scenario.

In the real data validation and application studies using ROS/MAP and GTEx V8 reference panels and GWAS summary data of Alzheimer’s disease (AD) dementia and Parkinson’s disease (PD), Avg-valid + SR models also outperformed base models trained using single reference panels and tissue types. Avg-valid+SR models identified a greater number of total independent risk genes that were replicated by PMR-Egger than any of the base training or validation models. Besides known GWAS/TWAS risk genes that were identified by Avg-valid + SR models, we also found five novel independent TWAS risk genes for AD dementia and six novel independent TWAS risk genes for PD with known functions in respective disease pathology (Tables 2 and 4). Most of these novel findings are located within the 1 Mb region of previously known GWAS and TWAS risk genes. Additionally, we found interesting biological interpretations relevant to AD dementia and PD for our identified TWAS risk genes.

Important findings by application TWAS of AD dementia

The results of the TWAS of AD include genes with known associations with AD dementia, or with relevant biological processes like immune response and regulation of AD-associated genes. Among the six curated independent significant genes by Avg-valid + SR models (Table 2), HLA-DRA is located in the major histocompatibility complex region that is expressed in glial cells³⁸ and which has also been previously identified by eQTL analysis²². The other five independent significant TWAS risk genes of AD (AC092849.1, AL110118.2, FOSB, RN7SL225P, and SRD5A3P1) are within 1 Mb of known GWAS risk genes of AD^{19,21,22,26,27,28} and previously observed TWAS risk genes^20,21. FOSB was the most significant of a cluster of 38 identified significant TWAS risk genes whose test regions overlapped that of the well-known GWAS risk gene APOE^{19,21,22,28,39,40} (Fig. 3). An alternatively-spliced product of the FOSB gene has been implicated in the regulation of gene expression and cognitive dysfunction in mouse models of AD⁴¹. SRD5A3P1 is located in the AD-associated MS4A gene cluster²⁷, which contains multiple known GWAS risk genes^19,21,22,23 as well as TWAS risk gene MS4A2²⁰ of AD. The MS4A gene cluster is notable due to its role in the regulation of soluble TREM2, which is encoded by the known AD risk gene TREM2, in cerebrospinal fluid in AD²⁷.

Important findings by application TWAS of PD

Similarly, results of the TWAS of PD include genes with known associations with PD, with related conditions, and with relevant biological processes like inflammation. Among these nine curated independent significant genes (Table 4), PARL plays a role in regulating cellular processing of the mitochondrial kinase protein encoded by PINK1, mutations in which are a known cause of recessively-inherited, early-onset PD⁴². NDUFAF2 encodes for a component of mitochondrial complex I and loss of its functionality results in a rare mitochondrial encephalopathy with frequent substantia nigra pathology and motor symptoms⁴³. NDUFAF2 was also identified as a potential drug target in a Mendelian randomization study of potential drug targets for PD treatment⁴⁴. A study of PD-associated GPNMB found that it is upregulated with the lncRNA gene AC005082.12⁴⁵.

Additionally, ADORA2B encodes adenosine receptor A2B, which is an important cell receptor involved in numerous pathways and implicated in a broad variety of diseases including asthma, sepsis, inflammatory bowel disease, cancer, renal disease, diabetes, vascular diseases, and lung disease^46,47. Its immunomodulatory effects and role in inflammatory processes have made A2B a target for pharmacological therapeutics⁴⁶, and antagonists of the similar adenosine receptor A2A were the first non-dopaminergic drug therapy for PD⁴⁸. MAPK8IP1P2 is a pseudogene near a known TWAS PD risk gene LRRC37A2³⁵ (also identified by Avg-valid + SR models). CD38 is involved in neurodegeneration, neuroinflammation, and aging^35,49. MMRN1 encodes a carrier protein for platelet factor V and lies ~84 kb downstream of a well-established GWAS PD risk locus found in multiple populations³³.

Tool for implementing SR-TWAS

The SR-TWAS tool, including options for constructing SR-TWAS models, models by the Naïve method, and Avg-valid + SR models, is publicly available on GitHub. The SR-TWAS tool implements user-friendly features, including accepting genotype data in standard VCF format as input, enabling parallel computation, and using efficient computation strategies to reduce time and memory usage. The most computationally expensive part is training all base models with different reference panels, and its cost depends on the regression method. For example, with training sample size n = 465, PrediXcan (Elastic-Net) costs ~1 CPU minute, and TIGAR (DPR) costs ~3 CPU minutes on average per gene. Publicly available trained models can also be used as base models by the SR-TWAS tool. Training SR-TWAS models from base models and validation data is computationally efficient. For example, with the ROS/MAP SMA tissue validation dataset (n = 76) and four base models in our real studies, SR-TWAS model training costs ~15 CPU seconds per gene. With the GTEx substantia nigra tissue validation dataset (n = 101) and six base models in our real studies, SR-TWAS model training costs ~103 CPU seconds per gene.

Limitations

SR-TWAS still has its limitations. For example, SR-TWAS only considers cis-eQTL during model training, uses the standard two-stage TWAS, requires an additional validation dataset of the target tissue independent of those used for base model training¹⁶, and assumes samples of the reference panels and validation dataset are of the same ancestry⁵⁰.

First, previous studies have illustrated the importance of considering both cis- and trans- eQTL in TWAS⁵¹, and joint modeling of the gene expression imputation and the gene-based association test^52,53. The stacked regression technique used by SR-TWAS also applies to scenarios considering both cis- and trans- eQTL, when base models trained with both cis- and trans- eQTL are available.

Second, the standard two-stage TWAS framework (implemented by SR-TWAS) has limited power because it does not consider the uncertainty of estimated eQTL weights, and it may produce inflated false positives because it does not consider potential horizontal pleiotropy (i.e., genetic effects on the phenotype of interest that are not mediated by GReX). Alternatively, the collaborative mixed model implemented by the TWAS tools CoMM⁵² and CoMM-S2⁵⁴, which jointly models the reference transcriptomic and GWAS datasets (instead of using two separate stages), is an effective approach to improve TWAS power by considering the uncertainty of estimated eQTL weights. The recently proposed PMR-Egger⁵³ tool (for probabilistic Mendelian randomization), which can test the genetic effects mediated through the GReX term (equivalent to TWAS association tests) while controlling for horizontal pleiotropy, could be used to validate the findings by SR-TWAS.

Third, in this study we only evaluated scenarios where reference samples used to train base models are of the same ancestry as the validation data. Although the method can be generalized to scenarios where base and validation data are of different ancestries, having at least one set of training base models with the same ancestry as the validation data would be a requirement for promising TWAS results. Evaluating the performance of SR-TWAS with base models of multiple ancestries is beyond the scope of this study, and is part of our ongoing study.

Summary

Overall, the SR-TWAS tool provides a useful resource for researchers to take advantage of the publicly available gene expression imputation models by using multiple regression methods (e.g., PrediXcan^7,8, FUSION¹⁷, and TIGAR⁹) and different reference panels of multiple tissue types (e.g., ROS/MAP¹¹ and GTEx V8¹⁰). In particular, the final trained gene expression imputation model by SR-TWAS will be with respect to the same tissue type as the validation dataset. Because multiple base models would not only increase the robustness of the gene expression imputation model but also increase the total effective training sample size, SR-TWAS is expected to further increase TWAS power for studying complex human diseases. The approach of constructing average models of the SR-TWAS models and validation base models (Avg-valid + SR) provides a set of optimal gene expression imputation models that can leverage both training base models and validation base models to achieve the best TWAS performance.

Methods

SR-TWAS using stacked regression

Stacked regression is a machine learning method for forming optimal linear combinations of different predictors to improve prediction accuracy¹⁶. The theoretical background for combining predictors rather than selecting a single best predictor is well-established and has been developed since the 1970s^16,55,56. The stacking method of combining predictors originated in a 1992 paper¹⁵ by Wolpert, who described the concept as any scheme for feeding information from a set of cross-validated models to another before forming the final prediction in order to reduce prediction error¹⁵. The idea was further expanded with stacked regression, a specific framework for combining the initial predictors by weighted average with coefficient constraints to control for multicollinearity¹⁶.

In standard two-stage TWAS, we need to first fit a gene expression imputation model, which is assumed as a multivariable linear regression model, with quantitative gene expression levels E_g for the target gene and tissue type as the response variable, and genotype matrix G of nearby/genome-wide SNPs as predictors,

$$\begin{array}{cc}{{{\bf{E}}}}_{g}={{\bf{G}}}{{\bf{w}}}+{{\boldsymbol{\varepsilon }}},& {\varepsilon }_{i} \sim N\left(0,\, 1\right).\end{array}$$

(3)

This gene expression imputation model can be trained per gene per tissue type, using a transcriptomic reference panel which profiles both transcriptomic and genetic data of the same training cohort. SNPs with non-zero effect sizes w are referred to as a broad sense of eQTL. The eQTL effect sizes w will be estimated from each trained model by different regression methods and/or using different reference data of multiple tissue types.

Assume there are a total of K base gene expression imputation models that are trained for the same target gene and tissue type, with ${\widehat{{{\bf{w}}}}}_{k},k=1,\ldots,K,$ as the trained eQTL effect sizes per base model. Let ${{{\bf{E}}}}_{{vg}}$ denote the gene expression levels of the same target gene g and tissue type in the validation data, and ${{{\bf{G}}}}_{v}$ denote the genotype matrix of the same genetic predictors in the validation data. Then the predicted Genetically Regulated gene eXpression (GReX) of the validation samples are given by ${{{\bf{G}}}}_{v}{\widehat{{{\bf{w}}}}}_{k}$, by the kth base model. The stacked regression method^15,16 will solve for a set of optimal model weights ${\zeta }_{1},\ldots,{\zeta }_{K}$, by maximizing the regression ${R}^{2}$ between the profiled gene expression ${{{\bf{E}}}}_{{vg}}$ and the weighted average GReX, ${\sum }_{k=1}^{K}{\zeta }_{k}{{{\bf{G}}}}_{v}{\widehat{{{\bf{w}}}}}_{k}$, of K base models, i.e., minimizing the following loss function of $1-{R}^{2}$:

$$\begin{array}{cc}{{\mbox{minimize}}}_{\left({\zeta }_{k}{;k}=1,\ldots,K\right)}\frac{{{||}{{{\bf{E}}}}_{{vg}}-{\sum }_{k=1}^{K}{\zeta }_{k}{{{\bf{G}}}}_{v}{\widehat{{{\bf{w}}}}}_{k}{||}}^{2}}{{{||}{{{\bf{E}}}}_{{vg}}-{\bar{E}}_{{vg}}{||}}^{2}},& {\mbox{s.t.}}{\sum }_{k=1}^{K}{\zeta }_{k}=1,{\zeta }_{k}\in \left[{\mathrm{0,1}}\right].\end{array}$$

(4)

As a result, we will obtain a set of model weights ${\zeta }_{k}$ for $k=1,\ldots,K$ base models, and a set of eQTL effect sizes $\widetilde{{{\bf{w}}}}$ given by the weighted average of the eQTL effect sizes of K base models, $\widetilde{{{\bf{w}}}}={\sum }_{k=1}^{K}{\zeta }_{k}{\widehat{{{\bf{w}}}}}_{k}$ (Stage I). Then the final predicted GReX for test genotype data ${{{\bf{G}}}}_{t}$ is given by ${\widehat{{\mbox{GReX}}}}_{g}={{{\bf{G}}}}_{t}\widetilde{{{\bf{w}}}}$, and $\widetilde{{{\bf{w}}}}$ will be taken as variant weights in the gene-based association tests by SR-TWAS in Stage II.
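The optimization in Eq. (4) can be sketched in a few lines. The example below is an illustration only: it solves the constrained problem by projected gradient descent onto the probability simplex, which is a technique chosen here for self-containedness and is not claimed to be the solver used inside the SR-TWAS tool. The names `project_to_simplex` and `stack_weights` and the toy data are hypothetical.

```python
import random


def project_to_simplex(v):
    """Euclidean projection of v onto {z : z_k >= 0, sum_k z_k = 1}."""
    u = sorted(v, reverse=True)
    cumulative, theta = 0.0, 0.0
    for j, u_j in enumerate(u, start=1):
        cumulative += u_j
        t = (cumulative - 1.0) / j
        if u_j - t > 0:
            theta = t  # the last j satisfying the condition gives the shift
    return [max(x - theta, 0.0) for x in v]


def stack_weights(E, P, n_iter=500):
    """Minimize ||E - P zeta||^2 / ||E - mean(E)||^2 over the simplex.

    E: observed expression in the validation data (length n).
    P: n x K predicted GReX, column k being G_v times w_hat_k.
    """
    n, K = len(E), len(P[0])
    mean_E = sum(E) / n
    sst = sum((e - mean_E) ** 2 for e in E)
    # Upper bound on the gradient's Lipschitz constant (trace of the Hessian).
    lipschitz = 2.0 * sum(p * p for row in P for p in row) / sst
    zeta = [1.0 / K] * K  # start from the naive equal-weight average
    for _ in range(n_iter):
        resid = [E[i] - sum(zeta[k] * P[i][k] for k in range(K)) for i in range(n)]
        grad = [-2.0 * sum(P[i][k] * resid[i] for i in range(n)) / sst for k in range(K)]
        zeta = project_to_simplex([zeta[k] - grad[k] / lipschitz for k in range(K)])
    return zeta


# Toy data: expression is an exact 0.7/0.3 mix of two base-model predictions;
# the third base model is uninformative.
rng = random.Random(0)
P = [[rng.gauss(0.0, 1.0) for _ in range(3)] for _ in range(100)]
E = [0.7 * row[0] + 0.3 * row[1] for row in P]
zeta = stack_weights(E, P)
print([round(z, 3) for z in zeta])
```

Given the fitted weights, the stacked eQTL effect sizes are the zeta-weighted sum of the base-model effect sizes, as in the definition of $\widetilde{{{\bf{w}}}}$ above.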

Genes with fivefold CV ${R}^{2} \, > \,0.5\%$ in the validation dataset by SR-TWAS are considered as having a valid imputation model and will be tested in Stage II. That is, the validation dataset is randomly split into 5 folds. For each fold, the SR-TWAS model is trained using the other four folds and then used to calculate the prediction ${R}^{2}$ on the held-out fold. The average prediction ${R}^{2}$ across all 5 folds is taken as the fivefold CV ${R}^{2}$. Here, we use a more liberal threshold (0.005) than the threshold 0.01 used by previous studies^17,57,58 to allow more genes to be tested in follow-up TWAS. Because the follow-up gene-based association Z-score test statistic is essentially a weighted average of single variant GWAS Z-score statistics with variant weights provided by the eQTL effect sizes⁹, poorly estimated eQTL weights would only reduce power and would not increase the false positive rate under the null hypothesis.
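The fold logic of this cross-validation can be sketched as below. It is a minimal sketch under stated assumptions: prediction ${R}^{2}$ is computed as the squared Pearson correlation, and the `fit` callback is a deliberately simple stand-in (`best_single_model`, which puts all weight on the best base model in the training folds) rather than the constrained stacked-regression solver. All function names and the toy data are hypothetical.

```python
import random


def pearson_r2(x, y):
    """Squared Pearson correlation; returns 0.0 if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return 0.0 if sxx == 0 or syy == 0 else sxy * sxy / (sxx * syy)


def fivefold_cv_r2(E, P, fit, n_folds=5, seed=0):
    """Average held-out R^2 over folds; fit(E_train, P_train) returns model weights."""
    idx = list(range(len(E)))
    random.Random(seed).shuffle(idx)
    folds = [idx[f::n_folds] for f in range(n_folds)]
    scores = []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in idx if i not in held]
        zeta = fit([E[i] for i in train], [P[i] for i in train])
        pred = [sum(z * p for z, p in zip(zeta, P[i])) for i in held_out]
        scores.append(pearson_r2([E[i] for i in held_out], pred))
    return sum(scores) / n_folds


def best_single_model(E, P):
    """Stand-in fit: weight 1 on the base model with the highest training R^2."""
    K = len(P[0])
    r2 = [pearson_r2(E, [row[k] for row in P]) for k in range(K)]
    best = max(range(K), key=r2.__getitem__)
    return [1.0 if k == best else 0.0 for k in range(K)]


# Toy data: base model 0 tracks expression, base model 1 is noise.
rng = random.Random(0)
P = [[rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)] for _ in range(100)]
E = [row[0] + 0.5 * rng.gauss(0.0, 1.0) for row in P]
cv_r2 = fivefold_cv_r2(E, P, best_single_model)
is_valid_gene = cv_r2 > 0.005  # threshold from the text
print(round(cv_r2, 3), is_valid_gene)
```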

Naïve method

In this paper, we compared SR-TWAS to a Naïve approach which just takes the average of base models as the trained gene expression imputation model, that is, takes ${\zeta }_{k}=\frac{1}{K},\, k=1,\ldots,\, K$. Using a validation dataset, we can still evaluate the validation ${R}^{2}$ which can be used to select valid genes with validation ${R}^{2} \, > \, 0.5\%$.

Avg-valid + SR models

We further constructed average models of the SR-TWAS models and validation base models trained using the validation dataset, which are referred to as Avg-valid+SR models. Because SR-TWAS and validation base models are averaged directly, training ${R}^{2}$ and CV ${R}^{2}$ are not obtained for Avg-valid + SR models. We compared the Avg-valid + SR models to the SR-TWAS models and validation base models in both simulation and real studies.
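Both the Naïve method and the Avg-valid + SR construction reduce to averaging SNP effect sizes across models. The sketch below illustrates this under two assumptions that are not stated in the text: models are represented as dictionaries from SNP identifier to effect size, and a SNP absent from a model contributes an effect size of 0. The function name and toy weights are hypothetical.

```python
def average_models(models, zetas=None):
    """Weighted average of per-SNP effect sizes across models.

    models: list of dicts mapping SNP id -> effect size.
    zetas:  model weights; defaults to 1/K for each of the K models.
    SNPs missing from a model are treated as having effect size 0 (assumption).
    """
    K = len(models)
    if zetas is None:
        zetas = [1.0 / K] * K
    snps = sorted(set().union(*models))
    return {s: sum(z * m.get(s, 0.0) for z, m in zip(zetas, models)) for s in snps}


# Toy effect sizes: an SR-TWAS model and one validation base model.
sr_model = {"rs1": 0.2, "rs2": -0.1}
valid_model = {"rs1": 0.4, "rs3": 0.3}
avg_valid_sr = average_models([sr_model, valid_model])
print(avg_valid_sr)
```

Passing explicit `zetas` gives the stacked model itself, so the same helper covers the equal-weight and the optimized-weight cases.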

SR-TWAS tool framework

SR-TWAS tool was designed to be compatible with the TIGAR-V2 tool framework⁹; it accepts models trained by TIGAR-V2 as input, imports utility functions from TIGAR-V2, and outputs model files which can be used as input for TIGAR-V2 GReX prediction and summary-level TWAS. Much of the structure of the SR-TWAS code was derived from existing TIGAR-V2 scripts and it shares dependencies on TABIX⁵⁹ and the Python libraries of numpy^60,61, pandas⁶⁰, scipy⁶², statsmodels⁶³, and scikit-learn^64,65.

The SR-TWAS script utilizes scikit-learn’s consistent, extensible interfaces for defining estimators and predictors and for initializing objects⁶⁵. The script trains a stacked regression model using a modified version of scikit-learn’s StackingRegressor class, which trains a final estimator from cross-validated predictions from base estimators fitted on the full design matrix. The script defines two custom classes to be used as input for the stacking regressor object: a base estimator class (WeightEstimator) which converts trained GReX prediction models into scikit-learn-compatible estimator objects and a final estimator class (ZetasEstimator) which obtains the values of ${\zeta }_{1},\ldots,{\zeta }_{K}$ that minimize the loss function under the constraints ${\zeta }_{k}\ge 0$ and ${\sum }_{k=1}^{K}{\zeta }_{k}=1$¹⁶.

During the stacked regression, SNP minor allele frequencies and effect sizes for the specified target are first read from each of the K user-specified weight files. The SNPs are then matched to SNPs in the validation genotype data and filtered to exclude effect sizes of SNPs for which the difference between the MAF of the genotype data and the MAF from the corresponding weight file exceeds a user-specified MAF difference threshold. The effect sizes from each weight file are used to initialize K separate instances of the WeightEstimator class. These K WeightEstimator objects are used as base estimators and fit on genotype and expression data from the validation data.
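The SNP matching and MAF filtering step can be illustrated as follows. This is a sketch of the described logic only; the data layout, the function name, and the default threshold of 0.2 are assumptions made for the example and are not taken from the SR-TWAS code.

```python
def filter_by_maf_diff(weights, validation_maf, max_diff=0.2):
    """Keep effect sizes of SNPs whose MAF agrees between sources.

    weights:        dict SNP id -> (MAF in weight file, effect size).
    validation_maf: dict SNP id -> MAF in the validation genotype data.
    max_diff:       user-specified MAF difference threshold (0.2 is an arbitrary default).
    """
    kept = {}
    for snp, (maf_weight_file, effect) in weights.items():
        maf_validation = validation_maf.get(snp)
        if maf_validation is None:
            continue  # SNP not present in the validation genotype data
        if abs(maf_validation - maf_weight_file) > max_diff:
            continue  # allele frequencies disagree too much between the two sources
        kept[snp] = effect
    return kept


# Toy values for one base model.
toy_weights = {"rs1": (0.10, 0.25), "rs2": (0.45, -0.30), "rs3": (0.30, 0.05)}
toy_validation_maf = {"rs1": 0.12, "rs2": 0.05}
kept = filter_by_maf_diff(toy_weights, toy_validation_maf)
print(kept)
```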

A separate script allows users to specify one or more base models trained on the validation data to average with the SR-TWAS model produced in the previous step. The tool will read SNP effect sizes for the specified target from each of these models and output an Avg-valid + SR model averaged from the validation base model(s) and the model obtained by stacked regression.

Only SR-TWAS models trained from K = 2,4,6 base models are presented in this paper. The code was designed to accept any K ≥ 2, and while the stacked regression script has been primarily tested using K = 2,4,6 base models, preliminary testing with dummy weight files confirms it can train stacked regression models from K > 6 base models.

ROS/MAP reference panel

The Religious Orders Study (ROS) and Rush Memory and Aging Project (MAP) are two ongoing longitudinal, epidemiologic clinical-pathologic cohort studies of aging and Alzheimer’s disease collectively referred to as ROS/MAP¹¹. ROS enrolls Catholic nuns, priests, and brothers from religious groups across the United States, primarily from communal living settings¹¹. While the similar adult lifestyle of participants allows for more control of potential confounders such as education and socioeconomic status, it simultaneously limits the ability to study such variables¹¹.

MAP was designed to complement and extend studies like ROS by including subjects from a wider range of life experiences, socioeconomic status, and educational attainment and recruits participants primarily from retirement communities in the Chicago area, but also subsidized housing, retirement homes, and through organizations serving minorities and low-income elderly¹¹. All participants in both studies are without known dementia and agree to annual clinical evaluations and brain donation upon death¹¹. Similarity in study design and data collection procedures allows the ROS and MAP datasets to be merged for use in joint analyses^11,66.

Quality-controlled ROS/MAP WGS data for European subjects⁶⁶ were used for both the real data application and simulation studies. Transcriptomic data of ROS/MAP samples of brain PFC were profiled by RNA-sequencing (RNA-seq). Gene expression data of Transcripts Per Million (TPM) per sample were provided by Rush Alzheimer’s Disease Center. Genes with >0.1 TPM in ≥10 samples were considered. Raw gene expression data (TPM) were then log2 transformed and adjusted for age at death, sex, postmortem interval, study (ROS or MAP), batch effects, RNA integrity number scores, cell type proportions (with respect to oligodendrocytes, astrocytes, microglia, neurons), top five genotype principal components, and top probabilistic estimation of expression residuals (PEER) factors⁶⁷ by linear regression models. SNPs with minor allele frequency (MAF) >1%, Hardy–Weinberg p value > 10⁻⁵ were analyzed. For each gene, cis-SNPs within 1 Mb of the flanking 5’ and 3’ ends were used in the imputation models as predictors.

GTEx V8 reference panel

The genotype-tissue expression (GTEx) project V8 profiles both whole genome sequencing (WGS) genotype data and RNA-seq transcriptomic data of 54 human tissues¹⁰. The fully processed, filtered, and normalized transcriptomic data used in the GTEx eQTL analysis were downloaded from the GTEx portal and used in this study. For each tissue, samples with <10 million mapped RNA-seq reads were excluded. For samples with replicates, the replicate with the greatest number of reads was selected. Gene read counts from each sample were normalized using size factors calculated by DESeq2 and log-transformed with an offset of 1. Genes with log-transformed values >1 in >10% of samples were considered. The resulting gene expression values were centered with mean 0 and standardized with standard deviation 1. The resulting matrix was then hierarchically clustered (based on average and cosine distance), and a chi2 p value was calculated based on Mahalanobis distance. Clusters with ≥60% samples with Bonferroni-corrected p values <0.05 were marked as outliers, and their samples were excluded. Genetic variants with missing rate <20%, minor allele frequency >0.01, and Hardy–Weinberg equilibrium p value >10⁻⁵ were considered for fitting the gene expression prediction models.

The fully processed, filtered, and normalized transcriptomic data were adjusted for the top five genotype principal components, top probabilistic estimation of expression residuals (PEER) factors⁶⁷, sequencing protocol (PCR-based or PCR-free), sequencing platform (Illumina HiSeq 2000 or HiSeq X), and sex, as suggested by the GTEx eQTL data analysis guidelines¹⁰. The number of top PEER factors used to adjust the gene expression traits depends on the sample size (n) in the reference transcriptomic data cohort—15 factors for n < 150, 30 factors for 150 ≤ n < 250, 45 factors for 250 ≤ n < 350, and 60 factors for n ≥ 350. Only samples with complete data of these covariates were included in the analyses. Adjusted gene expression quantitative traits were then taken as response variables in the gene expression prediction model. For each gene, cis-SNPs within 1 Mb of the flanking 5’ and 3’ ends were used in the imputation models as predictors.
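The sample-size rule for the number of PEER factors is a simple step function, sketched below. The function name is hypothetical, and the example inputs reuse tissue sample sizes quoted earlier in the paper purely as illustration; the sample sizes actually used for covariate adjustment may differ.

```python
def n_peer_factors(n):
    """Number of top PEER factors to adjust for, given reference sample size n."""
    if n < 150:
        return 15
    if n < 250:
        return 30
    if n < 350:
        return 45
    return 60


# Example sample sizes (BRNACC, BRNCDT, and BLOOD values quoted in the Results).
for tissue, n in [("BRNACC", 136), ("BRNCDT", 173), ("BLOOD", 574)]:
    print(tissue, n, n_peer_factors(n))
```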

Simulation study design

We conducted in-depth simulation studies under various scenarios to assess the performance of SR-TWAS, Avg-valid + SR, a Naïve method, and training base models by PrediXcan and TIGAR. We used the real genotype data of gene ABCA7 from ROS/MAP and GTEx V8 to simulate gene expression and phenotypes. We considered three different settings: (i) Samples from ROS/MAP and GTEx cohorts have the same set of true causal SNPs (i.e., the same genetic architecture). The expression heritability was the same for both ROS/MAP and GTEx V8 cohorts. (ii) Samples from ROS/MAP and GTEx cohorts have the same set of true causal SNPs (i.e., the same genetic architecture). The expression heritability for GTEx V8 cohort is only half of the one for ROS/MAP. (iii) Samples from the ROS/MAP cohort were simulated with the same causal SNPs (i.e., eQTL), while samples from the GTEx V8 cohort were simulated with true causal SNPs that were 50% overlapped with the ones for ROS/MAP. The expression heritability was the same for both ROS/MAP and GTEx V8 cohorts.

Under each setting, we considered multiple scenarios with varying proportions of causal SNPs (${p}_{{causal}}=({\mathrm{0.001,0.01,0.05,0.1}})$) and gene expression heritability (i.e., the proportion of gene expression variation due to genetics, ${h}_{e}^{2}=({\mathrm{0.1,0.2,0.5}})$). We randomly selected n = 465 training samples with WGS genotype data from ROS/MAP and GTEx V8, respectively. We randomly selected n = 400 and n = 800 samples with WGS genotype data from ROS/MAP as our validation and test cohorts, respectively. We considered a series of ${h}_{p}^{2}$ values, the proportion of phenotype variance due to simulated gene expression, in the range of (0.05, 0.875).

For each scenario, gene expression ${{{\bf{E}}}}_{i}$ for the ith simulation iteration is generated using the following formula

$$\begin{array}{ccc}{{{\bf{E}}}}_{i}={\gamma }_{i}{{{\bf{G}}}}^{*}{{{\boldsymbol{\beta }}}}_{i}+{{{\boldsymbol{\varepsilon }}}}_{i},& {\gamma }_{i}=\sqrt{\frac{{h}_{e}^{2}}{{\mbox{Var}}\left({{{\bf{G}}}}^{*}{{{\boldsymbol{\beta }}}}_{\bullet i}\right)}},& {{{\boldsymbol{\varepsilon }}}}_{i}\sim {\mbox{N}}\left(0,\sqrt{1-{h}_{e}^{2}}\right),\end{array}$$

(5)

where ${{{\bf{G}}}}^{*}$ denotes the genotype matrix of ${N}_{{\mbox{causal}}}$ randomly chosen true causal SNPs for all samples, effect size vector ${{{\boldsymbol{\beta }}}}_{i}$ was generated from $N(0,I)$, and ${\gamma }_{i}$ is a scale factor chosen to ensure the targeted ${h}_{e}^{2}$ value. The phenotype vector ${{{\bf{Y}}}}_{i}$ for the ith simulation iteration was generated using the following formula

$$\begin{array}{ccc}{{{\bf{Y}}}}_{i}={\varphi }_{i}{{{\bf{E}}}}_{i}+{{{\boldsymbol{\varepsilon }}}}_{i},& {\varphi }_{i}=\sqrt{\frac{{h}_{p}^{2}}{{\mbox{Var}}\left({{{\bf{E}}}}_{i}\right)}},& {{{\boldsymbol{\varepsilon }}}}_{i}\sim {\mbox{N}}\left(0,\sqrt{1-{h}_{p}^{2}}\right)\end{array}$$

(6)

where ${{{\bf{E}}}}_{i}$ is the simulated gene expression, and ${\varphi }_{i}$ is a scale factor to ensure the targeted ${h}_{p}^{2}$ value.
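One simulation iteration following Eqs. (5) and (6) can be sketched as below. This is an illustration under stated assumptions: the second argument of N(·,·) is read as a standard deviation, so that the genetic and residual variances sum to about 1; genotypes are random toy dosages rather than real ROS/MAP or GTEx data; and all function names are hypothetical.

```python
import math
import random


def variance(x):
    """Population variance."""
    m = sum(x) / len(x)
    return sum((v - m) ** 2 for v in x) / len(x)


def simulate_gene(G_causal, h2_e, h2_p, rng):
    """Simulate expression E and phenotype Y from causal-SNP genotypes.

    G_causal: n x N_causal genotype dosages; h2_e, h2_p: target heritabilities.
    Returns (genetic component, E, Y).
    """
    m = len(G_causal[0])
    beta = [rng.gauss(0.0, 1.0) for _ in range(m)]           # effect sizes ~ N(0, I)
    raw = [sum(g * b for g, b in zip(row, beta)) for row in G_causal]
    gamma = math.sqrt(h2_e / variance(raw))                  # scale factor of Eq. (5)
    genetic = [gamma * r for r in raw]
    E = [g + rng.gauss(0.0, math.sqrt(1.0 - h2_e)) for g in genetic]
    phi = math.sqrt(h2_p / variance(E))                      # scale factor of Eq. (6)
    Y = [phi * e + rng.gauss(0.0, math.sqrt(1.0 - h2_p)) for e in E]
    return genetic, E, Y


# Toy genotypes: 500 samples, 5 causal SNPs, dosages 0/1/2 with allele frequency 0.3.
rng = random.Random(1)
G = [[(rng.random() < 0.3) + (rng.random() < 0.3) for _ in range(5)] for _ in range(500)]
genetic, E, Y = simulate_gene(G, h2_e=0.2, h2_p=0.1, rng=rng)
print(round(variance(genetic), 3), round(variance(E), 3))
```

By construction the genetic component has variance exactly equal to the target heritability, while the total variance of E is close to, but not exactly, 1 because of sampling noise in the residual.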

Two base models per gene were trained by PrediXcan with the GTEx training samples (n = 465) (PrediXcan-GTEx), and by TIGAR with the ROS/MAP training samples (n = 465) (TIGAR-ROSMAP). SR-TWAS and Naïve models were then obtained by using these trained base models. Validation data (n = 400) were used to train SR-TWAS models and filter out gene expression imputation models with fivefold cross-validation R² < 0.5% in the validation cohort for both SR-TWAS and Naïve models. A validation base model was trained by TIGAR on the validation data (TIGAR-ROSMAP_valid) to compare results with those of the ensemble models and to obtain a model from the average of the validation base model and the SR-TWAS model (Avg-valid + SR). Test data (n = 800) were used for assessing GReX prediction performance and TWAS power. Each simulation scenario was repeated 1000 times. We compared the performance of SR-TWAS, Avg-valid + SR, the Naïve method, training base models, and validation base models with respect to expression prediction ${R}^{2}$ in the test data and TWAS power.

The predicted ${\widehat{{{\bf{GReX}}}}}_{i}$ by each trained gene expression imputation model was used to calculate expression prediction ${R}^{2}$, which is equivalent to the regression ${R}^{2}$ between profiled and predicted gene expression, given by the squared Pearson correlation coefficient,

$${R}_{{{{\bf{E}}}}_{i}}^{2}={\mbox{Cor}}{\left({{{\bf{E}}}}_{i},{\widehat{{{\bf{GReX}}}}}_{i}\right)}^{2}$$

(7)

Power was given by the proportion of simulation iterations with TWAS p value < 2.5 × 10⁻⁶ out of a total of 1000 simulation iterations.
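The two evaluation metrics can be written down directly. The sketch below is a minimal, dependency-free illustration with hypothetical function names: `squared_pearson` computes the expression prediction ${R}^{2}$ of Eq. (7), and `empirical_power` computes the fraction of simulation iterations whose TWAS p value falls below the genome-wide significance threshold of 2.5 × 10⁻⁶ (a p value threshold must lie below 1). The toy inputs are made up.

```python
def squared_pearson(observed, predicted):
    """Squared Pearson correlation between observed and predicted values."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    sxy = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    sxx = sum((o - mean_o) ** 2 for o in observed)
    syy = sum((p - mean_p) ** 2 for p in predicted)
    return (sxy * sxy) / (sxx * syy)


def empirical_power(p_values, threshold=2.5e-6):
    """Proportion of iterations with a TWAS p value below the threshold."""
    return sum(p < threshold for p in p_values) / len(p_values)


# Toy usage: the correlation here is 0.8, so R^2 is 0.64.
r2 = squared_pearson([1, 2, 3, 4], [1, 3, 2, 4])
# Two of the four made-up p values pass the threshold.
power = empirical_power([1e-8, 0.5, 1e-7, 3e-6])
```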

Protein–protein interaction network and enrichment analysis

STRING (version 12.0)²⁹ is a bioinformatics web tool that provides information on protein–protein interactions and networks, as well as functional characterization of genes and proteins. The tool integrates different types of evidence from public databases, such as genomic context, high-throughput experiments, and previous knowledge from other databases, to generate reliable predictions of protein interactions and build networks and pathways. Provided with a list of gene names, STRING will construct networks based on the protein–protein interactions of the corresponding proteins, as well as identify phenotypes that have risk genes enriched in the provided list. Proteins corresponding to provided genes are considered nodes in the protein–protein interaction network. Protein–protein edges represent the predicted functional associations, and their color denotes one of seven different evidence categories: computational interaction predictions from co-expression, text-mining of scientific literature, databases of interaction experiments (biochemical/genetic data), known protein complexes or pathways from curated resources, gene co-occurrence, gene fusion, and gene neighborhood. Gene co-occurrence, fusion, and neighborhood represent association predictions based on whole-genome comparisons. Interactions from these resources are critically assessed, scored, and subsequently automatically transferred to less well-studied organisms using hierarchical orthology information²⁹.

In particular, the text-mining channel results from parsing full-text articles from the PMC Open Access Subset (up to April 2022), PubMed abstracts (up to August 2022), as well as summary texts from OMIM⁶⁸ and Saccharomyces genome database⁶⁹ entry descriptions. These texts are all parsed for co-mentions of protein pairs and assessed against the frequencies of all separate mentions of the respective proteins. STRING v12 uses an improved deep learning-based relation extraction text-mining model²⁹. The text-mining channel significantly increases the number of protein–protein interactions.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.

Data availability

All ROS/MAP data analyzed in this study are de-identified and available to any qualified investigator with application through the Rush Alzheimer’s Disease Center Research Resource Sharing Hub, [https://www.radc.rush.edu], which has descriptions of the studies and available data¹¹. GTEx V8 data are available from dbGaP with accession phs000424.v8.p2 and GTEx Portal [https://www.gtexportal.org/home/]¹⁰. TIGAR DPR base models trained from GTEx V8 are available from SYNAPSE with SynID syn16804296 [https://doi.org/10.7303/syn16804296]. PrediXcan Elastic-Net base models trained from GTEx V8 are available from the PredictDB Data Repository [https://predictdb.org/]¹⁸. GWAS summary data of AD are available from the Vrije Universiteit Research Drive, [https://vu.data.surfsara.nl/index.php/s/jVlyt1m9Bb2mAki]¹⁹. GWAS summary data of PD are available from [https://bit.ly/2ofzGrk]³³. TIGAR DPR and PrediXcan Elastic-Net base models of ROS/MAP tissues (DLPFC, SMA), SR-TWAS and Avg-valid+SR models trained from ROS/MAP SMA tissue and GTEx brain substantia nigra tissue in this study, and all TWAS summary statistics generated in this study are freely available from SYNAPSE with SynID syn53437281 [https://doi.org/10.7303/syn53437281].

Code availability

The SR-TWAS tool, including the Naïve and Avg-valid + SR methods, is publicly available on GitHub, [https://github.com/yanglab-emory/SR-TWAS], with DOI [https://zenodo.org/doi/10.5281/zenodo.12574019]. Code for replicating analyses described in this paper is available at [https://github.com/rndparr/SR-TWAS_analysis], with DOI.

References

Feng, H. et al. Transcriptome‐wide association study of breast cancer risk by estrogen‐receptor status. Genet. Epidemiol. 44, 442–468 (2020).
Kar, S. P. et al. Pleiotropy-guided transcriptome imputation from normal and tumor tissues identifies new candidate susceptibility genes for breast and ovarian cancer. HGG Adv. 2, 3 (2021).
Strunz, T., Lauwen, S., Kiel, C., den Hollander, A. & Weber, B. H. F. A transcriptome-wide association study based on 27 tissues identifies 106 genes potentially relevant for disease pathology in age-related macular degeneration. Sci. Rep. 10, 1584 (2020).
Wu, C. et al. Transcriptome-wide association study identifies susceptibility genes for rheumatoid arthritis. Arthritis Res. Ther. 23, 38 (2021).
Wainberg, M. et al. Opportunities and challenges for transcriptome-wide association studies. Nat. Genet. 51, 592–599 (2019).
Nagpal, S. et al. TIGAR: an improved Bayesian tool for transcriptomic data imputation enhances gene mapping of complex traits. Am. J. Human Genet. 105, 258–266 (2019).
Barbeira, A. N. et al. Exploring the phenotypic consequences of tissue specific gene expression variation inferred from GWAS summary statistics. Nat. Commun. 9, 1–20 (2018).
Gamazon, E. R. et al. A gene-based association method for mapping traits using reference transcriptome data. Nat. Genet. 47, 1091–1098 (2015).
Parrish, R. L., Gibson, G. C., Epstein, M. P. & Yang, J. TIGAR-V2: efficient TWAS tool with nonparametric Bayesian eQTL weights of 49 tissue types from GTEx V8. HGG Adv. 3, 100068 (2022).
GTEx Consortium. The GTEx Consortium atlas of genetic regulatory effects across human tissues. Science 369, 1318–1330 (2020).
Bennett, D. A. et al. Religious orders study and rush memory and aging project. J. Alzheimers Dis. 64, S161–S189 (2018).
Hu, Y. et al. A statistical framework for cross-tissue transcriptome-wide association analysis. Nat. Genet. 51, 568–576 (2019).
Shi, X. et al. A tissue-specific collaborative mixed model for jointly analyzing multiple tissues in transcriptome-wide association studies. Nucleic Acids Res. 48, e109 (2020).
Liu, A. E. & Kang, H. M. Meta-imputation of transcriptome from genotypes across multiple datasets by leveraging publicly available summary-level data. PLoS Genet. 18, e1009571 (2022).
Wolpert, D. H. Stacked generalization. Neural Netw. 5, 241–259 (1992).
Breiman, L. Stacked regressions. Mach. Learn. 24, 49–64 (1996).
Gusev, A. et al. Integrative approaches for large-scale transcriptome-wide association studies. Nat. Genet. 48, 245–252 (2016).
Barbeira, A. N. et al. Exploiting the GTEx resources to decipher the mechanisms at GWAS loci. Genome Biol. 22, 49 (2021).
Wightman, D. P. et al. A genome-wide association study with 1,126,563 individuals identifies new risk loci for Alzheimer’s disease. Nat. Genet. 53, 1276–1282 (2021).
Mancuso, N. et al. Integrating gene expression with summary association statistics to identify genes associated with 30 complex traits. Am. J. Hum. Genet. 100, 473–487 (2017).
Marioni, R. E. et al. GWAS on family history of Alzheimer’s disease. Transl. Psychiatry 8, 99 (2018).
Jansen, I. E. et al. Genome-wide meta-analysis identifies new loci and functional pathways influencing Alzheimer’s disease risk. Nat. Genet. 51, 404–413 (2019).
Nazarian, A., Yashin, A. I. & Kulminski, A. M. Genome-wide analysis of genetic predisposition to Alzheimer’s disease and related sex disparities. Alzheimers Res. Ther. 11, 1–21 (2019).
Gockley, J. et al. Multi-tissue neocortical transcriptome-wide association study implicates 8 genes across 6 genomic loci in Alzheimer’s disease. Genome Med. 13, 76 (2021).
Jing, Q. et al. A comprehensive analysis identified hub genes and associated drugs in Alzheimer’s disease. Biomed. Res. Int. 2021, e8893553 (2021).
Schwartzentruber, J. et al. Genome-wide meta-analysis, fine-mapping, and integrative prioritization implicate new Alzheimer’s disease risk genes. Nat. Genet. 53, 392–402 (2021).
Deming, Y. et al. The MS4A gene cluster is a key modulator of soluble TREM2 and Alzheimer’s disease risk. Sci. Transl. Med. 11, eaau2291 (2019).
Shigemizu, D. et al. Ethnic and trans-ethnic genome-wide association studies identify new loci influencing Japanese Alzheimer’s disease risk. Transl. Psychiatry 11, 1–10 (2021).
Szklarczyk, D. et al. The STRING database in 2023: protein–protein association networks and functional enrichment analyses for any sequenced genome of interest. Nucleic Acids Res. 51, D638–D646 (2022).
Honea, R. A. et al. Alzheimer’s disease cortical morphological phenotypes are associated with TOMM40’523-APOE haplotypes. Neurobiol. Aging 132, 131–144 (2023).
Guo, P. et al. Pinpointing novel risk loci for Lewy body dementia and the shared genetic etiology with Alzheimer’s disease and Parkinson’s disease: a large-scale multi-trait association analysis. BMC Med. 20, 214 (2022).
McCusker, S. M. et al. Association between polymorphism in regulatory region of gene encoding tumour necrosis factor α and risk of Alzheimer’s disease and vascular dementia: a case-control study. Lancet 357, 436–439 (2001).
Nalls, M. A. et al. Identification of novel risk loci, causal insights, and heritable risk for Parkinson’s disease: a meta-analysis of genome-wide association studies. Lancet Neurol. 18, 1091–1102 (2019).
Kia, D. A. et al. Identification of candidate Parkinson disease genes by integrating genome-wide association study, expression, and epigenetic data sets. JAMA Neurol. 78, 464–472 (2021).
Yao, S. et al. A transcriptome-wide association study identifies susceptibility genes for Parkinson’s disease. npj Parkinsons Dis. 7, 1–8 (2021).
Pankratz, N. et al. Meta-analysis of Parkinson disease: identification of a novel locus, RIT2. Ann. Neurol. 71, 370–384 (2012).
Sagi, O. & Rokach, L. Ensemble learning: a survey. WIREs Data Min. Knowl. Discov. 8, e1249 (2018).
Tang, H. & Harte, M. Investigating markers of the NLRP3 inflammasome pathway in Alzheimer’s disease: a human post-mortem study. Genes 12, 1753 (2021).
Lambert, J.-C. et al. Genome-wide association study identifies variants at CLU and CR1 associated with Alzheimer’s disease. Nat. Genet. 41, 1094–1099 (2009).
Kunkle, B. W. et al. Genetic meta-analysis of diagnosed Alzheimer’s disease identifies new risk loci and implicates Aβ, tau, immunity and lipid processing. Nat. Genet. 51, 414–430 (2019).
Corbett, B. F. et al. ΔFosB regulates gene expression and cognitive dysfunction in a mouse model of Alzheimer’s disease. Cell Rep. 20, 344–355 (2017).
Shi, G. et al. Functional alteration of PARL contributes to mitochondrial dysregulation in Parkinson’s disease. Hum. Mol. Genet. 20, 1966–1974 (2011).
Subrahmanian, N. & LaVoie, M. J. Is there a special relationship between complex I activity and nigral neuronal loss in Parkinson’s disease? A critical reappraisal. Brain Res. 1767, 147434 (2021).
Storm, C. S. et al. Finding genetically-supported drug targets for Parkinson’s disease using Mendelian randomization of the druggable genome. Nat. Commun. 12, 7342 (2021).
Murthy, M. N. et al. Increased brain expression of GPNMB is associated with genome wide significant risk for Parkinson’s disease on chromosome 7p15.3. Neurogenetics 18, 121–133 (2017).
Haskó, G., Linden, J., Cronstein, B. & Pacher, P. Adenosine receptors: therapeutic aspects for inflammatory and immune diseases. Nat. Rev. Drug Discov. 7, 759–770 (2008).
Sun, Y. & Huang, P. Adenosine A2B receptor: from cell biology to human diseases. Front. Chem. 4, 37 (2016).
Jenner, P. in International Review of Neurobiology (ed. Mori, A.) Ch. 3 (Academic Press, 2014).
Guerreiro, S., Privat, A.-L., Bressac, L. & Toulorge, D. CD38 in neurodegeneration and neuroinflammation. Cells 9, 471 (2020).
Mogil, L. S. et al. Genetic architecture of gene expression traits across diverse populations. PLoS Genet. 14, e1007586 (2018).
Luningham, J. M. et al. Bayesian genome-wide TWAS method to leverage both cis- and trans-eQTL information through summary statistics. Am. J. Hum. Genet. 107, 714–726 (2020).
Yang, C. et al. CoMM: a collaborative mixed model to dissecting genetic contributions to complex traits by leveraging regulatory information. Bioinformatics 35, 1644–1652 (2019).
Yuan, Z. et al. Testing and controlling for horizontal pleiotropy with probabilistic Mendelian randomization in transcriptome-wide association studies. Nat. Commun. 11, 3861 (2020).
Yang, Y. et al. CoMM-S2: a collaborative mixed model using summary statistics in transcriptome-wide association studies. Bioinformatics 36, 2009–2016 (2020).
Rao, J. N. K. & Subrahmaniam, K. Combining independent estimators and estimation in linear regression with unequal variances. Biometrics 27, 971–990 (1971).
Efron, B. & Morris, C. Combining possibly related estimation problems. J. R. Stat. Soc. 35, 379–421 (1973).
Wu, L. et al. A transcriptome-wide association study of 229,000 women identifies new candidate susceptibility genes for breast cancer. Nat. Genet. 50, 968–978 (2018).
Bhattacharya, A., Li, Y. & Love, M. I. MOSTWAS: multi-omic strategies for transcriptome-wide association studies. PLoS Genet. 17, e1009398 (2021).
Li, H. Tabix: fast retrieval of sequence features from generic TAB-delimited files. Bioinformatics 27, 718–719 (2011).
McKinney, W. Data structures for statistical computing in Python. In Proc. 9th Python in Science Conference (SciPy 2010) 56–61 (2010).
Harris, C. R. et al. Array programming with NumPy. Nature 585, 357–362 (2020).
Virtanen, P. et al. SciPy 1.0: fundamental algorithms for scientific computing in Python. Nat. Methods 17, 261–272 (2020).
Seabold, S. & Perktold, J. Statsmodels: econometric and statistical modeling with Python. In Proc. 9th Python in Science Conference (SciPy 2010) 92–96. (2010).
Pedregosa, F. et al. Scikit-learn: machine learning in python. J. Mach. Learn. Res. 12, 2825–2830 (2011).
Buitinck, L. et al. API design for machine learning software: experiences from the scikit-learn project. In Proc. European Conference on Machine Learning and Principles and Practices of Knowledge Discovery in Databases (ECMPKDD’13) 108–122 (2013).
De Jager, P. L. et al. A multi-omic atlas of the human frontal cortex for aging and Alzheimer’s disease research. Sci. Data 5, 180142 (2018).
Stegle, O., Parts, L., Piipari, M., Winn, J. & Durbin, R. Using probabilistic estimation of expression residuals (PEER) to obtain increased power and interpretability of gene expression analyses. Nat. Protoc. 7, 500–507 (2012).
Amberger, J. S., Bocchini, C. A., Scott, A. F. & Hamosh, A. OMIM.org: leveraging knowledge across phenotype–gene relationships. Nucleic Acids Res. 47, D1038–D1043 (2019).
Cherry, J. M. et al. Saccharomyces genome database: the genomics resource of budding yeast. Nucleic Acids Res. 40, D700–D705 (2012).

Acknowledgements

RLP and JY are supported by the National Institutes of Health (NIH/NIGMS) grant award R35GM138313. MPE was supported by NIH/NIGMS grant award R01GM117946 and NIH/NIA grant award RF1AG071170. ROS/MAP study data were provided by the Rush Alzheimer’s Disease Center, Rush University Medical Center, Chicago, IL. Data collection was supported through funding by NIA grants P30AG10161, R01AG15819, R01AG17917, R01AG30146, R01AG36836, R01AG56352, U01AG32984, U01AG46152, U01AG61356, the Illinois Department of Public Health, and the Translational Genomics Research Institute.

Author information

Authors and Affiliations

Center for Computational and Quantitative Genetics, Department of Human Genetics, Emory University School of Medicine, Atlanta, GA, 30322, USA
Randy L. Parrish, Michael P. Epstein & Jingjing Yang
Department of Biostatistics, Emory University School of Public Health, Atlanta, GA, 30322, USA
Randy L. Parrish
Rush Alzheimer’s Disease Center, Rush University Medical Center, Chicago, IL, 60612, USA
Aron S. Buchman, Shinya Tasaki, Yanling Wang, Denis Avey, Jishu Xu & David A. Bennett
Center for Translational and Computational Neuroimmunology, Department of Neurology and Taub Institute for Research on Alzheimer’s Disease and the Aging Brain, Columbia University Irving Medical Center, New York, NY, 10032, USA
Philip L. De Jager

Contributions

RLP wrote all source code for the tool, conducted all data analyses, and drafted the manuscript; JY conceived the idea, supervised the project, and edited the manuscript; ASB, ST, and MPE interpreted data analysis results and edited the manuscript; YW, DA, JX, PLDJ, and DAB generated the ROS/MAP transcriptomic data and edited the manuscript.

Corresponding author

Correspondence to Jingjing Yang.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature Communications thanks Adam Naj, and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. A peer review file is available.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Information

Peer Review File

Reporting Summary

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

About this article

Cite this article

Parrish, R.L., Buchman, A.S., Tasaki, S. et al. SR-TWAS: leveraging multiple reference panels to improve transcriptome-wide association study power by ensemble machine learning. Nat Commun 15, 6646 (2024). https://doi.org/10.1038/s41467-024-50983-w

Received: 03 July 2023
Accepted: 26 July 2024
Published: 05 August 2024
DOI: https://doi.org/10.1038/s41467-024-50983-w
