Fig. 1: Schematic of the PhenoPLIER framework.

a High-level schematic of PhenoPLIER (a gene module-based method) in the context of TWAS (single-gene) and GWAS (single-variant). In GWAS, we identify variants associated with traits. In TWAS, we first identify variants that are associated with gene expression levels (eQTLs); prediction models based on eQTLs are then used to impute gene expression, which is used to compute gene-trait associations. Resources such as LINCS L1000 provide information about how a drug perturbs gene expression; at the bottom-right corner, we show how a drug downregulates two genes (A and C). In PhenoPLIER, these data types are integrated using groups of genes co-expressed across one or more conditions (such as cell types), which we call gene modules or latent variables (LVs). Created with BioRender.com.

b The integration process in PhenoPLIER uses low-dimensional representations (matrices Z and B) learned from large gene expression datasets (top). We used gene-drug information L from LINCS L1000 and gene-trait associations M from TWAS: PhenomeXcan was used as the discovery cohort and eMERGE as the replication cohort (middle). PhenoPLIER provides three computational components (bottom): 1) an LV-based regression model that associates an LV j (\(\mathbf{Z}_j\)) with a trait i (\(\mathbf{M}_i\)), 2) a clustering framework that learns groups of traits from TWAS associations projected into the LV space (\(\hat{\mathbf{M}}\)), and 3) an LV-based drug-repurposing approach that uses the projections of TWAS (\(\hat{\mathbf{M}}\)) and LINCS L1000 (\(\hat{\mathbf{L}}\)) into the LV space.

c Genes that are part of LV603, termed a neutrophil signature (ref. 44), were expressed in relevant cell types (top), with 53 independent samples in Neutrophils, 59 in Granulocytes, 20 in Whole blood, 56 in PBMC, 8 in mDCs, 29 in Monocytes, and 5 in Epithelial cells (the boxplot shows the 25th, 50th, and 75th percentiles, and the whiskers extend to the minimum and maximum values). LV603 was associated in PhenoPLIER with neutrophil counts and other white blood cell traits (bottom, showing the top 10 traits for LV603 after projecting gene-trait associations in PhenomeXcan).

eQTLs expression quantitative trait loci, MVN multivariate normal distribution, PBMC peripheral blood mononuclear cells, mDCs myeloid dendritic cells.
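
The projection into the LV space mentioned in panel b can be illustrated with a minimal sketch. The caption does not state the projection formula, so the ridge-regularized least-squares form below is an assumption, as are the function name `project_to_lv_space`, the parameter `lam`, and the toy matrix sizes; Z is taken to be a genes × LVs matrix and M a genes × traits matrix.

```python
import numpy as np


def project_to_lv_space(Z, M, lam=0.0):
    """Project gene-level data M (genes x traits) onto the LVs in Z (genes x LVs).

    Assumed form (not taken from the caption): solve
    (Z^T Z + lam * I) M_hat = Z^T M for M_hat (LVs x traits).
    """
    n_lvs = Z.shape[1]
    gram = Z.T @ Z + lam * np.eye(n_lvs)
    return np.linalg.solve(gram, Z.T @ M)


# Toy data: 50 genes, 4 LVs, 3 traits (sizes are illustrative only).
rng = np.random.default_rng(0)
Z = rng.random((50, 4))          # hypothetical gene loadings
M = rng.normal(size=(50, 3))     # hypothetical gene-trait associations

M_hat = project_to_lv_space(Z, M, lam=1.0)
print(M_hat.shape)  # (4, 3)
```

Each column of `M_hat` then describes one trait by its LV scores rather than by individual genes; the same function could be applied to a gene-drug matrix to obtain an LV-level drug representation.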