Analyses of GWAS signal using GRIN identify additional genes contributing to suicidal behavior

Sullivan, Kyle A.; Lane, Matthew; Cashman, Mikaela; Miller, J. Izaak; Pavicic, Mirko; Walker, Angelica M.; Cliff, Ashley; Romero, Jonathon; Qin, Xuejun; Mullins, Niamh; Docherty, Anna; Coon, Hilary; Ruderfer, Douglas M.; Garvin, Michael R.; Pestian, John P.; Ashley-Koch, Allison E.; Beckham, Jean C.; McMahon, Benjamin; Oslin, David W.; Kimbrel, Nathan A.; Jacobson, Daniel A.; Kainer, David

doi:10.1038/s42003-024-06943-7


Article
Open access
Published: 21 October 2024


Communications Biology volume 7, Article number: 1360 (2024)


Abstract

Genome-wide association studies (GWAS) identify genetic variants underlying complex traits but are limited by stringent genome-wide significance thresholds. We present GRIN (Gene set Refinement through Interacting Networks), which increases confidence in the expanded gene set by retaining genes strongly connected by biological networks when GWAS thresholds are relaxed. GRIN was validated on both simulated interrelated gene sets as well as multiple GWAS traits. From multiple GWAS summary statistics of suicide attempt, a complex phenotype, GRIN identified additional genes that replicated across independent cohorts and retained biologically interrelated genes despite a relaxed significance threshold. We present a conceptual model of how these retained genes interact through neurobiological pathways that may influence suicidal behavior, and identify existing drugs associated with these pathways that would not have been identified under traditional GWAS thresholds. We demonstrate GRIN’s utility in boosting GWAS results by increasing the number of true positive genes identified from GWAS results.


Introduction

Genome-wide association studies (GWAS) have become a crucial tool for the discovery of the genetic basis of complex traits. Complex traits are governed by a set of genes, each of which may be influenced by allelic variation affecting gene expression or the resulting product. In a GWAS, one tests the effects of single nucleotide polymorphisms (SNPs) on the trait of interest to isolate genomic regions that can then be linked to putatively relevant genes. Ultimately, the goal of GWAS is to accurately detect this “true positive” set of genes that affect a given trait. Since the true positive gene set is typically unknown, this sets up a struggle between precision (the ratio of actual true positive genes to all genes identified as positive) and recall (the ratio of the true positive genes in a gene set compared to all true positive genes that could possibly be discovered). When performing GWAS it is often precision that takes precedence over recall because downstream experiments to validate true positive genes are labor intensive and expensive. Thus, producing a shorter yet more reliable gene set may be more worthwhile than a longer set containing a higher proportion of false positives to enhance biological discovery.
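The two definitions above can be made concrete with a minimal sketch. The gene labels and the helper name `precision_recall` are illustrative assumptions and do not come from the study.

```python
def precision_recall(predicted, truth):
    """Return (precision, recall) of a predicted gene set against a truth set."""
    predicted, truth = set(predicted), set(truth)
    true_positives = len(predicted & truth)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    return precision, recall


# Hypothetical gene labels: six genes truly affect the trait, four are reported.
truth = {"G1", "G2", "G3", "G4", "G5", "G6"}
predicted = {"G1", "G2", "G3", "X1"}  # X1 is a false positive

precision, recall = precision_recall(predicted, truth)
print(f"precision={precision:.2f}, recall={recall:.2f}")
```

In this toy case the reported set is short and mostly correct (high precision) but misses half of the true genes (low recall), which is the trade-off described above.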

Precision and recall from GWAS results are governed by many factors, only some of which are in control of the researcher. Balance between the two is primarily controlled by the use of multiple-testing correction of p-values, which puts a threshold on which SNPs are considered genome-wide significant and consequently which genes are considered relevant to the trait. Stringent genome-wide significance thresholds (e.g., p < 5e⁻⁸) provide increased confidence in SNPs that make it over the threshold, but can severely limit the recall of relevant genes. By using less stringent significance thresholds (e.g., p < 1e⁻⁵), one can access more heritability (higher recall) at the expense of introducing a higher proportion of unknown false positives (lower precision).

GWAS can identify the genetic basis of complex phenotypes but interpreting the results of these studies is challenging. SNP-level results from GWAS often need gene assignment for understanding downstream functions, but SNP-to-gene assignment is particularly challenging for intergenic SNPs^1,2,3,4, which often reside within large blocks of linkage disequilibrium encompassing many genes. This makes it difficult to accurately determine which specific genes are relevant to the trait, increasing the likelihood of introducing false positives and reducing precision. One simple approach is to map SNPs to nearby genes⁵, which can be error prone because many non-coding SNPs regulate genes at greater distances. Additional methods such as expression quantitative trait loci (eQTL) mapping⁶ and H-MAGMA¹ are designed to improve the SNP-to-gene mapping process by leveraging further biological evidence from RNA-seq or Hi-C data, respectively, but these context-relevant omics data are often difficult to obtain and false positive genes may still remain. Thus, the difficult problem of distinguishing false positives from true positives in GWAS results remains.

Of critical importance, however, is the fact that polygenic traits are typically governed by genes from multiple pathways working in concert with each other in a non-random manner⁷. Given enough information from diverse experimental sources, it should be possible to find functional lines of biological evidence connecting any pair of the true positive, causal GWAS genes, as they are very likely to be more functionally connected with each other relative to other random pairs of genes. Conversely, false positives will likely be random and therefore far less functionally connected to other genes in the set. Previous efforts^8,9,10 have therefore utilized networks in an attempt to boost GWAS signals, albeit without specifying which genes are likely to be true or false positives.

To this end, we present GRIN – Gene set Refinement through Interacting Networks – an approach that uses biological network topology to remove false positive genes from a gene set in order to improve its precision. Starting from a network representation of system-wide gene-to-gene interactions and a user-defined gene set (e.g., after SNP-to-gene assignment), GRIN explores network topology to determine how strongly these genes are interconnected, and compares the connectivity among these genes to the connectivity found among random genes to determine which of the user’s genes are likely to be false positives. We tested GRIN’s ability to separate well-characterized, functionally related “gold standard” genes from random genes using a large multiplex network⁸. We then tested GRIN on several published GWAS of human traits and diseases, showing that GRIN is able to improve the precision of the gene set when measured against a gold standard comprising genes discovered in a much higher powered GWAS of the same trait. GRIN was also compared to state-of-the-art, network-based GWAS boosting methods - NAGA⁹ and GWAB¹⁰.
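As an illustration of how network topology can separate interconnected genes from an unrelated one, the sketch below scores each gene in a set by a random walk with restart seeded on the remaining genes. This is an assumption-laden toy, not GRIN's implementation: the single-layer network, the restart probability of 0.7, the leave-one-out scoring, and the function names are all hypothetical choices made for the example.

```python
def random_walk_with_restart(adj, seeds, restart=0.7, tol=1e-12, max_iter=1000):
    """Visiting probabilities of a walker that jumps back to the seeds with probability `restart`.

    adj maps each node to the set of its neighbours (undirected, unweighted,
    every node assumed to have at least one neighbour).
    """
    p0 = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in adj}
    p = dict(p0)
    for _ in range(max_iter):
        nxt = {n: restart * p0[n] for n in adj}
        for n, neighbours in adj.items():
            share = (1.0 - restart) * p[n] / len(neighbours)
            for m in neighbours:
                nxt[m] += share
        converged = sum(abs(nxt[n] - p[n]) for n in adj) < tol
        p = nxt
        if converged:
            break
    return p


def leave_one_out_scores(adj, gene_set):
    """Score each gene by how easily a walk seeded on the other genes reaches it."""
    return {g: random_walk_with_restart(adj, gene_set - {g})[g] for g in gene_set}


# Toy network: A, B, C, D form a tight module; Z hangs off a long chain.
edges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C"), ("B", "D"),
         ("C", "D"), ("D", "X"), ("X", "Y"), ("Y", "Z")]
adj = {}
for u, v in edges:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)

scores = leave_one_out_scores(adj, {"A", "B", "Z"})
for gene, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(gene, round(score, 4))
```

In this toy network the two module genes score far higher than the distant gene Z, which is the behaviour expected of a false positive under the premise described above.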

We then applied GRIN to GWAS summary statistics for suicide attempt, a complex psychiatric phenotype. Even though its heritability estimates range from 17–55%^11,12, this phenotype has yielded few genome-wide significant variants to date^12,13,14,15. We analyzed independent suicide attempt GWAS results from the Million Veteran Program (MVP¹⁵) and the International Suicide Genetics Consortium (ISGC¹³). As few variants were significant at traditional genome-wide significance, we explored SNPs at a less stringent threshold and used multiple SNP-to-gene assignment methods to elucidate the underlying mechanisms of the heritable components of suicidality. We then aimed to reduce the impact of possible false positive genes by applying GRIN, and identified drugs targeting the retained suicide attempt genes as candidates for future suicide prevention studies.

Results

GRIN workflow

A summary of the GRIN workflow is presented in Fig. 1a. GRIN inputs include a gene set and a previously generated multiplex network. In Stage 1, all experimentally-derived (e.g., GWAS) “seed” genes are ranked based upon biological network connectivity, which is compared to a null rank distribution generated from random gene sets of equivalent size. In Stage 2, a sliding window is used to compare the ordered ranks of experimentally-derived genes to the equivalent ordered ranks within the null distribution using the Mann–Whitney U test, and the p-value of the Mann–Whitney U test is plotted for each window to form a curve. The elbow of this curve indicates the cutoff point at which the rank distributions are equivalent between the seed gene set and the null distribution (indicating low functional interrelatedness), and genes following this cutoff point are filtered out.
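The Stage 2 logic described above can be sketched in a few functions. This is a simplified illustration under stated assumptions, not the GRIN code: the Mann–Whitney U test uses a normal approximation written from scratch, the window width of 5 is arbitrary, the elbow is located with a simple largest-deviation-from-chord heuristic, and the ranks are synthetic.

```python
import math
import random


def mann_whitney_p(x, y):
    """Two-sided Mann-Whitney U p-value (normal approximation, no variance tie correction)."""
    pooled = sorted((v, i) for i, v in enumerate(list(x) + list(y)))
    ranks = [0.0] * len(pooled)
    i = 0
    while i < len(pooled):
        j = i
        while j + 1 < len(pooled) and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        for k in range(i, j + 1):
            ranks[pooled[k][1]] = (i + j) / 2 + 1  # tied values share the average rank
        i = j + 1
    n1, n2 = len(x), len(y)
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u1 - n1 * n2 / 2) / sigma
    return math.erfc(abs(z) / math.sqrt(2))


def window_p_values(seed_ranks, null_runs, width):
    """Compare each window of ordered seed ranks with the same positions in the null runs."""
    curve = []
    for start in range(len(seed_ranks) - width + 1):
        seed_window = seed_ranks[start:start + width]
        null_window = [r for run in null_runs for r in run[start:start + width]]
        curve.append(mann_whitney_p(seed_window, null_window))
    return curve


def elbow_index(curve):
    """Index where the curve deviates most from the chord joining its endpoints."""
    last = len(curve) - 1

    def deviation(i):
        chord = curve[0] + (curve[-1] - curve[0]) * i / last
        return abs(curve[i] - chord)

    return max(range(len(curve)), key=deviation)


rng = random.Random(0)
n_genes, n_universe = 20, 1000
# Null: ordered ranks of 50 random gene sets of the same size.
null_runs = [sorted(rng.sample(range(1, n_universe + 1), n_genes)) for _ in range(50)]
# Hypothetical seed set: 10 tightly connected genes, then 10 that look random.
seed_ranks = list(range(1, 11)) + [round(k * n_universe / (n_genes + 1)) for k in range(11, 21)]

curve = window_p_values(seed_ranks, null_runs, width=5)
cutoff = elbow_index(curve)
print([f"{p:.3g}" for p in curve])
print("elbow at window", cutoff)
```

Early windows, where the seed genes rank far better than random sets, give very small p-values; later windows approach the null. How the elbow maps to a final gene cutoff is left open here, since that detail belongs to the Methods.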

**Fig. 1: Overview of GRIN – Gene set Refinement through Interacting Networks.**

When conceiving GRIN, we identified its potential utility in identifying true positive candidate genes from genome-wide significant loci identified by GWAS (Fig. 1b). One real world GWAS case involves a situation in which one or more genome-wide significant SNPs are in proximity to multiple candidate genes, making the causal gene difficult to identify (Fig. 1b, Loci 1 and 2). This would result in false positive genes being associated with the trait of interest. However, true positive genes that contribute to a phenotype of interest are likely to share in unified biological pathways, which can be represented in gene-gene network connections (Fig. 1b). In cases where there is a higher degree of confidence of linking a genome-wide significant locus to a gene (Fig. 1b, Locus 3), leveraging genes that are highly interconnected within a network may identify which genes implicated by GWAS are more likely to truly contribute to the trait of interest.

GRIN accurately retains true positive genes from simulated, “noisy” gene sets

We first tested GRIN’s ability to distinguish between true positive and true negative genes using three thousand simulated gene sets where the truth is known. Each gene set was constructed from one of 30 biologically-interrelated, “gold standard” gene sets mixed with an equivalent amount of randomly drawn, “noisy” genes (1:1 ratio), and repeated 100 times per gold standard gene set (Fig. 2a, Supplementary Table 1). Stage 1 of GRIN ranked gold standard genes more highly than most random genes in any given test set, achieving median area under the receiver operating characteristic curve (AUROC) and median area under the precision-recall curve (AUPRC) values of 0.950 \(\pm\) 0.059 and 0.896 \(\pm\) 0.084 respectively across the 30 gold standard gene sets (Fig. 2b, c). This indicated highly accurate gene ranking in GRIN Stage 1. GRIN Stage 2 effectively classified true and false positive genes at the cutoff point (median precision 0.810 \(\pm\) 0.134; recall 0.914 \(\pm\) 0.167; specificity 0.880 \(\pm\) 0.197; Fig. 2d, Supplementary Fig. 1). On average, 46.0% of the gene set was discarded as noise (0.460 \(\pm\) 0.143), and less than 9% of true positive gold genes were discarded (median recall 0.914).
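For readers less familiar with these metrics, the following minimal sketch computes AUROC from a ranking and then precision, recall, and specificity at a cutoff. The scores, gene labels, and the cutoff after rank 5 are invented for illustration and are unrelated to the reported results.

```python
def auroc(scores, is_gold):
    """Probability that a random gold gene outscores a random noise gene (ties count half)."""
    gold_scores = [s for s, flag in zip(scores, is_gold) if flag]
    noise_scores = [s for s, flag in zip(scores, is_gold) if not flag]
    wins = sum((a > b) + 0.5 * (a == b) for a in gold_scores for b in noise_scores)
    return wins / (len(gold_scores) * len(noise_scores))


def classification_metrics(retained, gold, universe):
    """Precision, recall and specificity of a retained set within a mixed gene set."""
    tp = len(retained & gold)
    fp = len(retained - gold)
    fn = len(gold - retained)
    tn = len(universe - retained - gold)
    return tp / (tp + fp), tp / (tp + fn), tn / (tn + fp)


# Hypothetical 1:1 test set: four gold genes (g*) mixed with four random genes (r*).
scores = {"g1": 0.9, "g2": 0.8, "g3": 0.7, "r1": 0.6,
          "g4": 0.4, "r2": 0.3, "r3": 0.2, "r4": 0.1}
gold = {g for g in scores if g.startswith("g")}

area = auroc(list(scores.values()), [g in gold for g in scores])
retained = set(sorted(scores, key=scores.get, reverse=True)[:5])  # cutoff after rank 5
precision, recall, specificity = classification_metrics(retained, gold, set(scores))
print(area, precision, recall, specificity)
```

AUROC summarizes the ranking alone (Stage 1), whereas the other three metrics depend on where the cutoff is placed (Stage 2).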

**Fig. 2: GRIN retains biologically interrelated genes.**

Next, we tested GRIN’s ability to refine gene sets containing varying proportions of random genes in order to simulate real world scenarios where the signal-to-noise ratio in a gene set is not known. When gold standard genes outnumbered random genes by 2:1, precision (0.967 \(\pm\) 0.060), recall (0.735 \(\pm\) 0.331), and specificity (0.971 \(\pm\) 0.088) were consistently high compared to expected values from random classification (precision = 0.66, specificity = 0.66). Precision, recall, and specificity values decreased as the ratio of gold genes to random genes in the gene sets decreased (1:1, 1:2, 1:4, 1:10) but were consistently better than random classification (Fig. 2e, Supplementary Figs. 2–3).
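The precision expected from random classification is the fraction of gold genes in the mix, which a short sketch makes explicit for the ratios tested. The function name is an illustrative assumption; at 2:1 the value is 2/3, which the text reports as 0.66.

```python
def random_baseline_precision(gold, noise):
    """Expected precision when genes are retained at random from a gold:noise mix."""
    return gold / (gold + noise)


for gold, noise in [(2, 1), (1, 1), (1, 2), (1, 4), (1, 10)]:
    print(f"{gold}:{noise} -> {random_baseline_precision(gold, noise):.3f}")
```

The baseline falls steeply as noise increases, so a fixed observed precision represents a larger gain over chance at 1:10 than at 2:1.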

Finally, we tested whether GRIN could retain multiple distinct functional groups when mixed with random genes, as a real-world gene set for a complex trait is likely to contain multiple functional groups. When given gene sets with four distinctive biological functions (“Alzheimer’s disease”, “central nervous system myelination”, “dopaminergic synaptic signaling”, and “major depressive disorder”) at 2:1, 1:1, 1:2, 1:4, or 1:10 ratios of gold standard genes to random genes, GRIN generated much better precision and specificity compared to random chance while retaining high values for recall (Fig. 2f).

Evaluating GRIN using real-world GWAS results

We evaluated GRIN’s ability to control for a loss of precision in real-world GWAS results by relaxing significance thresholds from a lower-powered GWAS to identify genes that reach genome-wide significance in a higher-powered GWAS of the same trait. We considered true positive genes as genes assigned from SNPs with p < 5e⁻⁸ from a high-powered GWAS, and then measured the precision and recall before and after GRIN for the corresponding lower-powered GWAS at progressively less stringent significance thresholds.
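The evaluation design above can be sketched as follows, before any boosting is applied. All p-values and gene labels are hypothetical, and `evaluate_thresholds` is an illustrative helper rather than part of any published tool.

```python
def evaluate_thresholds(low_power_p, gold_genes, thresholds):
    """Precision and recall of genes passing each p-value threshold, scored against gold_genes."""
    results = {}
    for t in thresholds:
        called = {gene for gene, p in low_power_p.items() if p < t}
        tp = len(called & gold_genes)
        precision = tp / len(called) if called else float("nan")
        results[t] = (precision, tp / len(gold_genes))
    return results


# Hypothetical best p-value per gene in the lower-powered GWAS.
low_power_p = {"A": 1e-9, "B": 3e-7, "C": 2e-6, "D": 8e-6,
               "E": 4e-5, "F": 9e-5, "G": 5e-4}
# Hypothetical genes reaching p < 5e-8 in the higher-powered GWAS of the same trait.
gold_genes = {"A", "B", "C", "E", "H"}

results = evaluate_thresholds(low_power_p, gold_genes, [5e-8, 1e-5, 1e-3])
for t, (precision, recall) in results.items():
    print(f"p < {t:g}: precision={precision:.2f}, recall={recall:.2f}")
```

In this toy data, relaxing the threshold raises recall and lowers precision. A filtering step applied to the called genes at each threshold would be scored against the gold set in the same way.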

For the vast majority of the combinations of traits and significance thresholds tested, GRIN improved precision compared to no GWAS boosting (78.47% of all combinations) or was equivalent to no GWAS boosting when precision was equal to 1.00 (8.33% of all combinations; Fig. 3a, Supplementary Table 2). The improvement in precision after using GRIN was especially evident for traits like smoking initiation (no boosting: 0.30–0.58; GRIN: 0.40–0.70), schizophrenia first release (SCZ1: no boosting: 0.33–0.84; GRIN: 0.38–1.00), and type 2 diabetes first release (T2D1: no boosting: 0.26–0.89; GRIN: 0.28–1.00; Fig. 3a). In contrast, no GWAS boosting yielded higher precision for only a small share of the combinations of traits and thresholds tested (13.19% of all combinations, Fig. 3a, Supplementary Table 2). As expected, as the significance threshold became less stringent in the lower-powered GWAS, the recall of the higher-powered GWAS results improved without GWAS boosting, at the expense of precision, often drastically (Fig. 3). However, while the difference in recall between no GWAS boosting and GRIN was more evident at less stringent genome-wide significance thresholds, there was a notably smaller difference in recall when comparing the second release of height at p < 5e⁻⁵ (Fig. 3b, Supplementary Table 3; no GWAS boosting: 0.44 recall, 0.951 precision; GRIN: 0.37 recall, 0.954 precision).

**Fig. 3: GRIN increases precision compared to other GWAS boosting methods.**

GRIN boosts precision compared to other GWAS boosting methods

We compared GRIN to two existing network biology-based GWAS boosting methods (NAGA⁹ and GWAB¹⁰), which work by re-ranking the p-values of genes using network topology, similarly to GRIN Stage 1 (see Methods). However, unlike GRIN, neither method classifies whether a ranked gene is a true or false positive GWAS gene, so it is up to the user to make that decision based on an arbitrary rank position. For a fair comparison between methods, we set a genome-wide p-value significance threshold and retained the re-ranked genes above that threshold for NAGA and GWAB, while allowing GRIN’s Stage 2 to determine its own retained gene set given the same initial threshold.

For each trait and threshold tested, we found that GRIN almost always obtained the best precision of the three methods, which reflects its focus on improving precision by discarding potential false positive genes (Fig. 3a, Supplementary Table 2). For multiple traits and thresholds, NAGA and/or GWAB obtained higher recall than GRIN, which was due to GRIN discarding some true positive genes. However, there were numerous scenarios where GRIN maintained similar or even higher levels of recall than NAGA or GWAB despite discarding genes (see coronary artery disease releases 1 and 2 [CAD1, CAD2], SCZ1, SCZ2; Fig. 3b, Supplementary Tables 2–4).

GRIN retains a majority of genes from the union of multiple suicide attempt GWAS results at less stringent significance thresholds

After benchmarking GRIN on gold standard gene sets and other GWAS results, we applied it to suicide attempt GWAS summary statistics from the Million Veteran Program (MVP¹⁵) to understand which genes contribute to this psychiatric pathophysiology. We utilized both conventional MAGMA⁴ and H-MAGMA¹ solely for gene assignment from SNPs at different genome-wide significance thresholds (Fig. 3a). Only five SNPs passed the traditional threshold of genome-wide significance (p < 5e⁻⁸), and these were assigned to three genes (SPATA17, TSHZ2, and ENSG00000227705; Supplementary Tables 5 and 6). In order to explore additional genes contributing to suicide attempt pathophysiology, we examined SNPs at a threshold of p < 1e⁻⁵ and assigned them to 122 genes, since less stringent thresholds resulted in an unwieldy number of genes to interpret and we anticipated precision would still remain high based on real-world GWAS results (Supplementary Tables 5 and 7). We applied GRIN to this gene set in order to limit the number of potential false positive genes introduced at this threshold, resulting in 65 retained genes (57 genes removed; Supplementary Fig. 4, Supplementary Table 8).

We then sought to identify which genes retained by GRIN from MVP were replicated in two independent, civilian suicide attempt GWAS compiled by the International Suicide Genetics Consortium (ISGC¹³). This study contained GWAS summary statistics from: (1) suicide attempt in European-ancestry civilians (SA-EUR) and (2) results from the SA-EUR population conditioned by major depressive disorder (MDD) diagnosis status (SA-MDD). At a threshold of p < 5e⁻⁸, SNPs were assigned to only seven SA-EUR genes and three SA-MDD genes (Supplementary Tables 5–6). At p < 1e⁻⁵, we assigned SNPs to 252 genes from the SA-EUR results and 62 genes from SA-MDD results (Supplementary Tables 5 and 7), and applied GRIN to remove potential false positives. Prior to GRIN, 25 genes were in both SA-EUR and SA-MDD sets (Supplementary Fig. 5). Following GRIN, 11 genes were retained from both sets of summary statistics, 8 genes were removed from both sets, and 6 genes were removed or retained from one set only (Supplementary Fig. 4; Supplementary Tables 9–10). Moreover, the genes retained by GRIN from MVP and ISGC summary statistics were not simply those associated with lowest p-values (Supplementary Fig. 6).

In a process similar to a meta-analysis, we applied GRIN to the union of MVP, SA-EUR, and SA-MDD genes to identify replicated genes across cohorts and identify unified biological mechanisms. Prior to GRIN, at p < 1e⁻⁵ one gene (PDE4B) was shared among all results, three genes were shared by only MVP and SA-EUR, and 24 genes were shared by only SA-EUR and SA-MDD (Fig. 4b). After applying GRIN to the union of all genes, over 50% of the 28 genes shared by multiple summary statistics were retained (17 genes), including PDE4B, 2 out of 3 shared genes between MVP and SA-EUR, and 14 out of 24 shared genes between SA-EUR and SA-MDD (Fig. 4b, Supplementary Table 11). Conversely, 10 genes shared between SA-EUR and SA-MDD and 1 gene shared between MVP and SA-EUR were removed at this threshold (Fig. 4b, Supplementary Table 9).
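The bookkeeping behind such overlap counts can be sketched with plain set operations. Apart from PDE4B, which the text reports as shared among all results, every gene label and the retained set below are placeholders, not data from the study.

```python
from collections import Counter

# Hypothetical gene sets; only PDE4B is taken from the text, other labels are placeholders.
cohorts = {
    "MVP": {"PDE4B", "g1", "g2", "m1"},
    "SA-EUR": {"PDE4B", "g1", "g2", "s1", "s2", "e1"},
    "SA-MDD": {"PDE4B", "s1", "s2", "d1"},
}

union = set().union(*cohorts.values())
membership = Counter(gene for genes in cohorts.values() for gene in genes)
shared = {gene for gene, n in membership.items() if n >= 2}
shared_by_all = {gene for gene, n in membership.items() if n == len(cohorts)}

# Placeholder for the genes a filter such as GRIN would retain from the union.
retained = {"PDE4B", "g1", "s1", "e1"}
fraction = len(shared & retained) / len(shared)
print(sorted(shared_by_all), len(union), len(shared), f"{fraction:.0%}")
```

Counting cohort membership per gene keeps the calculation general if further cohorts are added.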

**Fig. 4: GRIN retains a majority of genes common among distinct suicide attempt summary statistics.**

GRIN enhances functional gene set enrichment from MVP and ISGC suicide attempt summary statistics

To determine if GRIN successfully retained biologically interrelated genes, we performed gene set enrichment analysis of suicide attempt GWAS genes before and after GRIN. Using a statistical significance threshold of FDR-corrected p-value < 0.05, this included gene set enrichments for: Gene Ontology (GO¹⁶) Biological Processes, Cellular Components, and Molecular Functions; Human Phenotypes; biological pathways; transcription factor binding sites; drug targets; and diseases (Methods). When examining only the MVP summary statistics before GRIN, the 122 unfiltered genes were not significantly enriched for any GO terms but were significantly enriched for 203 combined drug and disease terms, including “substance dependence” (DisGeNET C0038580; Supplementary Table 12). After refining the gene set with GRIN, 1449 terms were significantly enriched from just 65 retained genes, including five GO molecular functions (e.g., “adenyl ribonucleotide binding,” Supplementary Table 13). Conversely, the 57 genes removed by GRIN returned zero significantly enriched terms. Thus, filtering MVP genes at a far less stringent GWAS significance threshold with GRIN resulted in a more functionally enriched gene set by improving the signal-to-noise ratio.
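To illustrate what an FDR-corrected enrichment threshold involves, the sketch below pairs a hypergeometric over-representation test with Benjamini–Hochberg adjustment. This is a common stand-in and an assumption on our part: the enrichment tool actually used is specified in the Methods, and the term sizes, overlaps, and 20,000-gene universe here are invented.

```python
from math import comb


def hypergeom_tail(k, universe, annotated, drawn):
    """P(X >= k): chance of at least k annotated genes among `drawn` genes picked at random."""
    upper = min(annotated, drawn)
    hits = sum(comb(annotated, i) * comb(universe - annotated, drawn - i)
               for i in range(k, upper + 1))
    return hits / comb(universe, drawn)


def benjamini_hochberg(pvals):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest p-value down
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical overlaps between a 65-gene set and three annotation terms,
# given as (term size, overlap) in a 20,000-gene universe.
terms = {"term_a": (100, 5), "term_b": (300, 2), "term_c": (50, 0)}
pvals = [hypergeom_tail(k, 20000, size, 65) for size, k in terms.values()]
fdr = benjamini_hochberg(pvals)
significant = [t for t, q in zip(terms, fdr) if q < 0.05]
print(significant)
```

Because the adjustment depends on the number of terms tested, counts of significant terms are only comparable between gene sets when the same term collections are tested for each.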

Next, we compared gene set enrichments of the ISGC summary statistics before and after GRIN to investigate if GRIN functionally refined these gene sets. At p < 1e⁻⁵, 243 terms were significantly enriched from the unfiltered SA-EUR gene set, including 33 transcription factor binding site terms and 114 GO terms (Supplementary Table 14). GRIN retained highly interrelated genes from SA-EUR as demonstrated by 274 significantly enriched terms, including 132 significantly enriched GO terms and 44 transcription factor binding site terms (Supplementary Table 15) compared to only two significantly enriched terms obtained from removed SA-EUR genes (Supplementary Table 16). Similarly, 65 enriched terms were identified from SA-MDD before GRIN, whereas 247 enriched terms were identified in the GRIN retained gene set, including a significant enrichment for “schizophrenia,” (DisGeNET C0036341; Supplementary Tables 17–18). Conversely, only 31 enriched terms were identified in the removed gene set (Supplementary Table 19). This strongly indicated that GRIN-filtered SA-EUR and SA-MDD gene sets contained highly interrelated genes that were relevant to neurobiological pathways.

Next, we assessed GRIN’s ability to improve gene set enrichment analysis using genes common to two or more sets of summary statistics. Prior to GRIN there were 1324 significantly enriched terms from the 28 genes common to multiple data sets (Fig. 5, Supplementary Table 20). Notably, 1443 significantly enriched terms were obtained from the 17 genes commonly retained by GRIN in multiple data sets, including 126 GO terms such as “dopaminergic synapse” (Fig. 5, Supplementary Table 21). Conversely, only 358 terms were significantly enriched from the 11 intersecting genes removed by GRIN, indicating lower interrelatedness among these genes compared to the retained set (Supplementary Table 22). In addition to the 1268 terms that remained significantly enriched, an additional 175 terms were significantly enriched from the filtered gene set, indicating that retained genes constituted a more biologically cohesive set (Fig. 5, Supplementary Tables 20–21). The enrichments from the gene set of GRIN-retained genes far exceeded the median number of significantly enriched terms when retaining random genes of an equivalent gene set size (Supplementary Fig. 7).

**Fig. 5: GRIN enhances gene set enrichment analysis.**

Similarly, when applying GRIN to each set of summary statistics separately, 13 intersecting genes were commonly retained and 9 intersecting genes were commonly removed (Supplementary Fig. 8, Supplementary Table 23). This resulted in more numerous gene set enrichments using intersecting GRIN-retained genes compared to the unfiltered gene set or GRIN-removed genes (Supplementary Fig. 9, Supplementary Tables 24–25).

Retained genes from GRIN identify putative pathophysiological pathways involved in suicide attempt

Using genes retained by GRIN, we identified biological pathways implicated by suicide attempt GWAS (Fig. 6). Only one of the three genes identified from MVP GWAS at p < 5e⁻⁸ (TSHZ2) was in the set of 65 genes retained by GRIN at p < 1e⁻⁵. Multiple genes identified in multiple cohorts were relevant to dopaminergic signaling, including the dopamine D2 receptor subunit (DRD2) and phosphodiesterase 4-beta (PDE4B) as well as a protein kinase A subunit (PRKAR2A) from MVP, which can subsequently modulate cAMP/CREB-mediated transcription of genes important for synaptic plasticity¹⁷. Additionally, SGIP1, which affects presynaptic vesicle release and emotional state^18,19, was retained by GRIN in MVP and SA-EUR summary statistics. Furthermore, genes involved in neurotransmitter release (BRSK1) and glutamatergic synapses (CELSR3²⁰) were identified along with ICE2, which is induced by NMDA receptor activity^21,22. NCAM1, a crucial mediator of synaptic plasticity and memory processes^23,24, was also retained by GRIN. Multiple genes involved in cytoskeletal reorganization were also identified, including CDC42BPB, MAP4, and MARK3. Moreover, TSHZ2, SMARCC1, and ZNF589 were retained by GRIN and have been implicated in neurodevelopmental processes, while RCOR1 is important for neural progenitor differentiation into neuronal and glial subtypes^17,25,26,27,28. Finally, a number of genes involved in global translation processes were identified (DALRD3, DHX30, and EIF5), two of which have been previously implicated in neurodevelopmental disorders arising from missense variants^29,30 (Fig. 6).

**Fig. 6: Neurobiological mechanisms implicated in suicide attempt pathophysiology as determined by GWAS summary statistics and GRIN.**

Drug candidates identified as suicide attempt GWAS gene targets

Finally, we identified drugs that may modulate suicidal behavior based on GWAS-implicated genes retained by GRIN. Multiple drugs target the dopamine D2 receptor subunit, including the FDA-approved drugs clozapine (used to prevent suicidal behavior in schizoaffective individuals³¹) and amisulpride (Fig. 7, Supplementary Table 26). Roflumilast and a number of other molecular compounds also directly affect PDE4B. Furthermore, fostamatinib is known to affect 8 genes implicated in both MVP and SA-EUR suicide attempt GWAS. These drug-gene target links warrant future studies to ensure that they do not present increased risk for suicidality as a side effect, and to evaluate candidates for drug repurposing for suicide prevention.

**Fig. 7: Network association of drugs and drug targets identified by suicide attempt GWAS genes retained by GRIN.**

Discussion

Here, we introduced GRIN, a software tool based on networks of biological relationships that enables GWAS thresholds to be relaxed while reducing the impact of false positives. GWAS are subject to statistical challenges that have historically made it difficult to identify a large proportion of trait-relevant genes. Genes that may contribute to disease have traditionally been identified from SNP associations at a genome-wide significance threshold of p < 5e⁻⁸, but for complex traits controlled by many small-effect loci GWAS often fails to find a comprehensive signal despite the fact that much of the SNP-based heritability lies below this threshold. Relaxing the stringency gives access to more SNPs and genes associated with the trait, at the risk of introducing an increasing proportion of false positives that will confound downstream analyses. GRIN therefore operates as a filter, separating likely true positives from likely false positives according to biological network connectivity.

The first stage of GRIN requires a representation of known relationships between all genes in a network format. Here we used a biological multiplex network which captures a wide variety of relationship types across its 10 layers, as GRIN’s capacity to accurately refine gene sets is contingent on the connectivity represented in the network. A network lacking sufficient biological relationships would result in reduced ability to distinguish between functionally related genes and random noise. We therefore included various experimental data sources and generated a frontal cortex-specific predictive gene expression network using an explainable-AI methodology (iRF-LOOP³²), which can define relationships not present in the literature. Notably, we have tested GRIN on both a conventional laptop and a high-performance computing cluster and it is computationally tractable for gene sets containing hundreds of genes (Supplementary Table 27).

We demonstrated that GRIN works with simulated noisy gene sets similar to what is obtained from a GWAS. GRIN successfully partitioned curated gene sets spiked with random genes into signal and noise subsets, even when given multiple functional groups or a high noise ratio. The results confirm that random walk with restart (RWR) indeed ranks the functional groups of genes highly while random genes mostly receive poor rankings. When applied to real-world data, it is up to GRIN’s second stage to determine the optimal cutoff point that divides functional genes from false positive genes. The strong simulated test results provide confidence that when GRIN is applied to GWAS results, the true positive genes should rise to the top of the rankings as long as they are more functionally related to each other than random genes are, thus providing a retained set that has a higher signal-to-noise ratio than if GRIN were not used at all. However, because precision falls as the signal-to-noise ratio decreases, it is important to consider that when many genes can be assigned from a single GWAS locus, precision may decrease even when applying GRIN.

We also validated GRIN’s performance on real world GWAS results. First, we defined GWAS gold standard genes as genes reaching genome-wide significance in a higher-powered GWAS, and then measured precision and recall when applying GRIN to genes identified from a lower-powered GWAS at less stringent significance thresholds. We chose this definition despite the challenge of defining a “gold standard” for GWAS, as other groups have previously noted³³. GRIN resulted in equivalent or higher precision compared to simply lowering the threshold of statistical significance. This validates that GRIN’s use of network topology homes in on biologically-relevant genes in the retained gene set, improving precision despite more false positives being introduced at less stringent genome-wide significance thresholds. Moreover, while recall was reduced among GRIN-retained genes compared to applying no GWAS boosting, this difference was often quite minimal except at low stringency thresholds (e.g., p < 1e⁻³) with the notable exception of the second release of human height (GRIN recall of 0.37, no boosting recall of 0.44 at p < 5e⁻⁵). We conclude that GRIN may be particularly useful in identifying additional trait-relevant genes from lower-powered GWAS.

GRIN was compared to two other GWAS boosting methods to evaluate its performance: GWAB¹⁰ and NAGA⁹. Both GWAB and NAGA are designed to boost recall of GWAS results (limiting the number of false negative genes) by ranking genes via network topology, but GRIN is specifically designed to boost precision while maintaining or increasing recall. Compared to GWAB and NAGA, GRIN exhibited higher precision for nearly every trait at every significance threshold, with the exception of GWAB applied to type 2 diabetes at low-stringency thresholds. Furthermore, GRIN does not share some of the limitations of GWAB and NAGA. GWAB is particularly limited in two ways: (1) the user has a limited choice of networks to apply to GWAS results; and (2) a list of a priori disease-relevant genes is required as input. Thus, we were not able to generate GWAB results for height due to a lack of gold-standard genes. While NAGA permits users to provide their own networks, only monoplex networks can be used, which have been previously shown to decrease performance compared to multiplex networks³⁴. NAGA also ranks all genes in the network, irrespective of whether genes in these networks were implicated by GWAS, and therefore does not provide an optimal statistical threshold for GWAS results. In this study, we attempted to make a fair comparison between GRIN, GWAB, and NAGA by comparing precision and recall of the top ranked genes by GWAB and NAGA with the equivalent number of genes input to GRIN at each genome-wide significance threshold tested. However, in practice GRIN allows users to boost precision of GWAS results at less stringent genome-wide significance thresholds without having to identify which proportion of top-ranked genes are trait-relevant, and without being constrained to using a predetermined gene set.

While SNP-to-gene assignment is an important aspect of interpreting GWAS results, assigning genes from SNPs is not a step within the GRIN software. When comparing GRIN to other methods, we chose to use a similar method of SNP-to-gene assignment used by NAGA and GWAB to make a fair comparison across all methods, as SNP-to-gene assignment is part of the workflow when using these two methods. However, compared to these alternate methods, users have the flexibility to choose one or more methods of assigning SNPs to genes when using GRIN. As SNPs may be incorrectly assigned to genes, particularly for intergenic SNPs, GRIN may be used to help remove potential false positive genes based on the underlying premise that genes causal to a trait will be more highly interconnected within networks representing known biological relationships. We, therefore, used multiple SNP-to-gene assignment methods when interpreting suicide attempt summary statistics in order to refine the genes most likely to be associated with this trait based on multiple possible genes assigned from SNPs: conventional MAGMA as a means of assigning SNPs to nearby genes, and H-MAGMA with dorsolateral prefrontal cortex Hi-C data to leverage 3D chromatin architecture from this brain region relevant to suicide attempt. This approach allows the user the flexibility of using their preferred method(s) of SNP-to-gene assignment while leveraging GRIN’s ability to reduce false positives identified by GWAS.

While we demonstrated that GRIN achieves high accuracy using gold standard gene sets, GRIN sometimes discarded true positive genes. This indicates that not all genes removed by GRIN are necessarily irrelevant to the trait or disease, and the removed set should be considered but with lower confidence than the retained set. It is also important to consider that false positives and false negatives could be re-classified as new experimental data sources become available. For example, some genes removed by GRIN from the suicide attempt summary statistics currently have few experimental gene-gene network relationships (e.g., the non-coding RNA RP11-839D17.3). However, future experiments may identify their capacity to modulate transcriptional or post-transcriptional processes with pathophysiological implications. Therefore, GRIN output should be considered as guidance rather than a definitive determination of what is a true or false positive. In an attempt to limit a bias against genes with lower evidence, we included a network we constructed using explainable artificial intelligence derived from RNA-seq data from the dorsolateral prefrontal cortex as an additional data-driven source of gene-gene relationships (see Methods).

After applying GRIN to expanded suicide attempt GWAS results at p < 1e⁻⁵, we obtained more gene set enrichments in the retained set compared to the original unfiltered gene set based on the removal of false positives. The fact that the removed gene set was scarcely enriched supports this argument. By separating a gene list into retained and removed subsets, users can identify additional biologically relevant pathways that may be missed by enrichment analyses on the whole set alone due to dilution with noise.

Combined with multiple SNP-to-gene assignment approaches, incorporating SNPs at a less stringent significance threshold and applying GRIN elucidated additional suicide-associated genes and pathways. While certain variants have been previously described (e.g., variants in DRD2, PDE4B, and SPATA17)^12,13,15, the present study characterizes additional genes (CELSR3, PRKAR2A) contributing to dendritic structure and multiple key neurotransmitter pathways associated with suicidality, including the previously associated dopaminergic pathway^35,36. Among these additional genes, missense variants in CDC42BPB³⁷, DALRD3³⁰, DHX30²⁹, SMARCC1²⁶, and ZNF589³⁸ are known to impair behavioral and neurodevelopmental processes. While the genetic variants in the present study did not include these missense or loss-of-function variants, it is possible that the variants implicated in the present suicide attempt summary statistics may alter the transcriptional regulation of these genes. In addition, multiple genes were implicated in cytoskeletal reorganization. CDC42BPB encodes MRCKbeta, a protein kinase that is induced by long-term potentiation in rodent models and mediates dendritic spinogenesis by actin-myosin filament phosphorylation^39,40. MAP4 is a microtubule-associated protein (MAP) and MARK3 has been shown to phosphorylate tau (MAPT), another MAP which accumulates in multiple neurodegenerative disorders^41,42. Moreover, RCOR1 is a subunit of the REST/CoREST complex and has been shown to affect CELSR3 and SMARCC1 transcription in mouse models^20,26, and SMARCC1 has been implicated in autism as a core component of the SWI/SNF complex^27,43. These findings point to the possible pleiotropic nature of these genes being associated with multiple psychiatric disorders.

By lowering the significance threshold and applying GRIN to refine suicide GWAS gene sets, we identified previously characterized drug targets (DRD2, MARK3, and PDE4B) and drug repurposing/side effect candidates. This includes genes that were retained by GRIN but would not have been detected without lowering the genome-wide significance threshold. Notably, the DRD2 antagonist clozapine is the only FDA-approved drug with on-label use to prevent suicidal behavior³¹. Amisulpride is also a DRD2 antagonist that has been shown to exhibit antipsychotic and antidepressant activities⁴⁴. Intriguingly, fostamatinib targets 8 genes implicated by suicide attempt GWAS including MARK3 and PDE5B, a different phosphodiesterase than the PDE4B gene implicated in suicide attempt GWAS⁴⁵. Moreover, it is important to understand whether drugs can present adverse side effects that modulate suicidal behavior. For example, the PDE4B inhibitor roflumilast has a rare adverse side effect of increased suicidality in some individuals⁴⁶. Further studies are warranted to understand how pharmacological manipulation of these GWAS-implicated drug targets affects the propensity for suicidal behaviors in at-risk individuals.

GRIN is a powerful tool for identifying biologically interrelated genes and for identifying true positive variants and associated genes from GWAS. In effect, GRIN facilitates post-GWAS investigation by synthesizing multiple lines of evidence to determine which genes should be investigated further. By applying this tool to multiple GWAS results, we identify new genes involved in suicide pathophysiology that may lead to important clinical insights.

Methods

Multiplex biological network generation

The function of GRIN is contingent on the user supplying networks that represent known gene-gene biological relationships. In order to capture these relationships from diverse types of biological evidence, a multiplex network was assembled from weighted network connections (edges) from a combination of publicly available and newly generated monoplex (single layer) networks. A multiplex network has an advantage over aggregate multilayer networks in that the unique topology of each layer is maintained, resulting in generally higher functional predictive ability³⁴. Multiple component networks from HumanNet v2⁴⁷ were used (co-functional links by co-citation, co-essentiality⁴⁸, co-expression, molecular pathway databases, gene neighborhood, phylogenetic profile associations, and orthologous protein-protein interactions transferred from model organisms [CC, CE, CX, DB, GN, PG, IL]), and a protein-protein interaction (PPI) network was generated by merging the following networks into a single monoplex layer: HumanNet v2 component PPI networks (HT, LC), and high-confidence physical protein-protein interactions from STRING version 11.0⁴⁹ (taxa = 9606, protein.actions.v11.0, mode = binding, min score = 700).

As the dorsolateral prefrontal cortex (dlPFC) is a key brain region involved in processes disrupted in individuals with a history of suicide attempt (e.g., deficits in executive function and impulsivity^50,51), we included multiple dlPFC-specific networks to gain tissue type-specific perspective on gene-gene relationships from this brain region. This included a dlPFC-specific transcription factor-gene network layer from a previously published transcription factor binding site network⁵². A newly generated dlPFC (Brodmann area 9) Predictive Expression Network (PEN) was obtained using the Iterative Random Forest - Leave One Out Prediction (iRF-LOOP) method^32,53 using individual-level RNA-seq expression data from the Genotype-Tissue Expression (GTEx) project⁵⁴. The resulting multiplex network was built using RWRtoolkit⁵⁵ (https://github.com/dkainer/RWRtoolkit), which incorporates command-line scripts and an R library for generating multiplex networks and running the network exploration algorithm random walk with restart (RWR) by building upon the RandomWalkRestartMH R package³⁴. The multiplex network used for all analyses comprises 10 layers, 51,183 unique genes, and 3,419,975 edges using δ = 0.5, where δ is the probability of the random walker remaining in the current network layer or moving to a different layer. The multiplex network used for all analyses is publicly available at https://github.com/sullivanka/GRIN/tree/main/test/suicide_weighted_Multiplex_0.5Delta.RData.

GRIN process

GRIN leverages the hypothesis that false positive genes in a user’s gene set, such as from SNP-to-gene assignment from GWAS, are likely to be functionally random with respect to the rest of the gene set, while true positive genes are likely to share function with other members of the gene set. GRIN is a classifier that uses information captured in biological networks from diverse lines of evidence to determine which genes in a gene set are functionally related to each other (and therefore belong together) and which ones appear to be randomly included and are likely false positives. GRIN achieves this by first scoring every gene in the network (including those in the user’s gene set) according to how topologically accessible it is from each gene in the user’s gene set as determined by the network propagation algorithm Random Walk with Restart (RWR). GRIN then classifies the genes in the user’s gene set as true or false positives based on their RWR rankings. GRIN runs its RWR process for 100 random gene sets of the same size as the user’s gene set to build a null ranking distribution, so that GRIN can learn what false positive gene set ranks should look like under the assumption they are random. The ordered gene rankings for the user’s gene set are compared to the ordered null rankings to find the position where the distribution of rankings in the user’s gene set no longer diverges significantly from the null distribution. On this basis, GRIN partitions the user’s gene set, such as from SNP-to-gene assignment from GWAS, into a Retained gene set and a Removed gene set in a two-stage process.

In Stage 1, every gene in the network is ranked according to how connected it is to the genes in the user-specified gene set (e.g., GWAS-derived genes). This includes ranking the user-specified genes themselves by using leave-one-out cross-validation (LOOCV). RWR provides each gene with a rank that is a proxy for how easily each gene in the network can be reached from the starting set of GWAS genes, including a rank for the GWAS genes themselves. Genes with many paths and interactions to one or more of the GWAS genes rank strongly, while genes that are isolated or distant from the GWAS genes rank poorly. In the current implementation, this RWR-based ranking occurs based on network propagation of probabilities of visiting a given gene in the multiplex network, which is based on a matrix representation of the edge weights between genes in the multiplex network (i.e., the supra-adjacency matrix composed of all intra- and inter-layer connections). Random walks are then simulated many times by propagating the probability of the random walker exploring a given gene beginning from the seed genes, and this process continues until the combined network probabilities no longer change between simulated random walks by a given threshold (1e⁻¹⁰), thereby achieving convergence based upon an asymptotic number of simulated random walks. The advantage of using a propagation algorithm like RWR is that genes that are not direct neighbors of GWAS genes may still rank highly due to indirect paths. Additional parameters can be used to tune RWR to favor certain network layers (τ) or adjust the probability of restart (r) at seed genes. In all analyses in the present study, we used r = 0.7, equivalent τ values for all network layers, and a multiplex network with δ = 0.5 based on previous work that achieved good performance using these parameters³⁴.
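The propagation step described above can be illustrated with a minimal sketch. It is written in Python for illustration only (GRIN itself is an R script built on RWRtoolkit) and assumes a single weighted, undirected layer rather than the multiplex supra-adjacency matrix. The names `random_walk_with_restart`, `rank_non_seeds`, and `TOY_NETWORK` are hypothetical; only the restart probability (r = 0.7) and the convergence threshold (1e⁻¹⁰) are taken from the text.

```python
def random_walk_with_restart(adj, seeds, restart=0.7, tol=1e-10, max_iter=10_000):
    """Stationary visiting probabilities for a walker that restarts at the seeds.

    adj   : dict mapping node -> {neighbour: edge weight} (symmetric, one layer)
    seeds : non-empty set of nodes present in adj
    """
    nodes = sorted(adj)
    # Restart distribution: uniform over the seed genes.
    p0 = {v: (1.0 / len(seeds) if v in seeds else 0.0) for v in nodes}
    strength = {v: sum(adj[v].values()) for v in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        nxt = {v: restart * p0[v] for v in nodes}
        for u in nodes:
            if strength[u] == 0:
                # Isolated node: send its walking mass back to the seeds.
                for v in nodes:
                    nxt[v] += (1 - restart) * p[u] * p0[v]
                continue
            share = (1 - restart) * p[u] / strength[u]
            for v, w in adj[u].items():
                nxt[v] += share * w
        change = sum(abs(nxt[v] - p[v]) for v in nodes)
        p = nxt
        if change < tol:  # probabilities no longer change between iterations
            break
    return p


def rank_non_seeds(probabilities, seeds):
    """Order non-seed genes from most to least reachable from the seeds."""
    candidates = [g for g in probabilities if g not in seeds]
    return sorted(candidates, key=lambda g: (-probabilities[g], g))


# Toy network: a path A-B-C-D plus a disconnected pair E-F.
TOY_NETWORK = {
    "A": {"B": 1.0},
    "B": {"A": 1.0, "C": 1.0},
    "C": {"B": 1.0, "D": 1.0},
    "D": {"C": 1.0},
    "E": {"F": 1.0},
    "F": {"E": 1.0},
}
```

With seed gene A, genes on the path rank in order of their distance from A, while the disconnected pair receives zero probability. This mirrors the statement that indirect neighbours can still rank well whereas isolated genes rank poorly.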

To obtain accurate rankings for each gene in a gene set of size n, we chose to implement random walk with restart leave-one-out cross-validation (RWR-LOOCV) n times, where in each run one gene is left out and the other n-1 genes are used as seed genes (starting points) for the random walker in the multiplex network. Each run of RWR-LOOCV generates a ranking of every non-seed gene in the multiplex, including the left-out gene from the original seed gene set, so that each gene in the user’s gene set obtains n-1 rank values after n runs. Stage 1 then orders the genes in the set from best to worst according to their median rank values. GRIN also needs a representation of what Stage 1 results should look like for purely random gene sets of size n. This empirical null distribution is generated by running RWR-LOOCV for 100 gene sets, each containing n randomly sampled genes from the multiplex. The median rank at each position in the order from 1 to n thus represents the empirical null distribution of ranks for this specific multiplex and gene set size.
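A simplified sketch of the leave-one-out ranking and the empirical null profile is given below. It is illustrative Python, not the GRIN code. Two simplifications are assumptions of this sketch: it records only the rank a gene receives in the run where it is held out, and it uses a one-step neighbour count (`neighbour_score`) as a stand-in for RWR scores. All function names and the toy edge list are hypothetical.

```python
import random
from statistics import median

# Toy undirected edge list (hypothetical genes).
EDGES = {("A", "B"), ("A", "C"), ("B", "C"), ("C", "D"), ("E", "F"), ("G", "H")}
ALL_GENES = sorted({g for edge in EDGES for g in edge})


def neighbour_score(seeds):
    """Stand-in scorer: number of seed genes adjacent to each gene (not RWR)."""
    scores = {}
    for u, v in EDGES:
        if u in seeds:
            scores[v] = scores.get(v, 0) + 1
        if v in seeds:
            scores[u] = scores.get(u, 0) + 1
    return scores


def held_out_ranks(gene_set, all_genes, score_fn):
    """Rank each gene among the non-seed genes when it is the one left out."""
    ranks = {}
    for left_out in gene_set:
        seeds = set(gene_set) - {left_out}
        scores = score_fn(seeds)
        non_seeds = [g for g in all_genes if g not in seeds]
        # Best score first; ties are broken by gene name for determinism.
        ordered = sorted(non_seeds, key=lambda g: (-scores.get(g, 0), g))
        ranks[left_out] = ordered.index(left_out) + 1
    return ranks


def null_rank_profile(n, all_genes, score_fn, n_sets=100, seed=0):
    """Median rank at each ordered position across random gene sets of size n."""
    rng = random.Random(seed)
    profiles = []
    for _ in range(n_sets):
        random_set = rng.sample(all_genes, n)
        ranks = held_out_ranks(random_set, all_genes, score_fn)
        profiles.append(sorted(ranks.values()))
    # zip(*profiles) groups the values found at position 1, position 2, ...
    return [median(position) for position in zip(*profiles)]
```

In the toy network, the tightly connected genes A, B, and C each rank first when held out, whereas the unrelated gene G ranks near the bottom, which is the behaviour expected of a randomly included gene.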

In Stage 2, a cutoff C between 1 and n is determined below which all gene set members are considered the equivalent of random and can be discarded. A two-sided Mann–Whitney U test from the R stats base package (“wilcox.test”) is performed over a sliding window of size winsize = 0.15 × n to see if the RWR-LOOCV ranks for the gene set members come from the same distribution as the null distribution RWR-LOOCV ranks. The expectation is that a gene set window containing functional groups of genes will have a very different ranking distribution to the random genes in the equivalent null window, resulting in very small (significant) p-values. On the other hand, if the window contains genes with little functional relatedness, the ranking distribution will appear to be drawn from the null distribution and the p-value tends towards 1. This test is run for each window sliding by 1, producing a p-value vector of length n-winsize. The cutoff C is chosen by finding an elbow in the p-values using the open source R package “KneeArrower” with the method = “first” parameter set (https://github.com/agentlans/KneeArrower). The output is a Retained gene set and a Removed gene set.
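The sliding-window comparison can be sketched as follows. This illustrative Python version is not the GRIN implementation: it uses a normal approximation with tie and continuity corrections for the two-sided Mann–Whitney U test, whereas R’s `wilcox.test` may compute exact p-values for small samples without ties. The rounding of the window size and the function names are assumptions of this sketch, and the elbow detection performed by KneeArrower is not reproduced.

```python
import math


def mann_whitney_p(x, y):
    """Two-sided Mann-Whitney U p-value via the normal approximation."""
    n1, n2 = len(x), len(y)
    if n1 == 0 or n2 == 0:
        return 1.0
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    rank_sum_x = 0.0
    tie_term = 0.0
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        # Tied values occupy ranks i+1 .. j and share their average rank.
        avg_rank = (i + 1 + j) / 2.0
        t = j - i
        tie_term += t ** 3 - t
        rank_sum_x += avg_rank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    n = n1 + n2
    u = rank_sum_x - n1 * (n1 + 1) / 2.0
    mean_u = n1 * n2 / 2.0
    var_u = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u <= 0:
        return 1.0  # every value tied: no evidence of a difference
    z = max(abs(u - mean_u) - 0.5, 0.0) / math.sqrt(var_u)
    return math.erfc(z / math.sqrt(2.0))


def sliding_window_pvalues(observed, null, fraction=0.15):
    """p-value for each window of ordered observed ranks vs. ordered null ranks.

    observed and null must have the same length n; the result has
    n - winsize entries, one per window start position.
    """
    n = len(observed)
    winsize = max(2, round(fraction * n))  # rounding rule is an assumption
    return [
        mann_whitney_p(observed[i:i + winsize], null[i:i + winsize])
        for i in range(n - winsize)
    ]
```

Windows near the top of the ordering, where observed ranks are much better than the null ranks, give small p-values. Windows further down, where the two rank profiles interleave, give p-values approaching 1.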

Validation of GRIN using well-characterized gene sets

To determine the ability of GRIN to effectively remove noise genes from a gene set, we obtained a variety of well-characterized biologically interrelated gene sets (“gold” sets) and spiked them with random genes drawn from the full multiplex network. Given our application of this method to suicide GWAS summary statistics, we chose 20 gene sets related to diverse brain functions. We included an additional 10 gene sets related to other organ systems (lung and kidney) in order to demonstrate that GRIN can be used in other biological contexts. These thirty “gold standard” gene sets of functionally interrelated genes (see Supplementary Table 1), ranging in size from 10 to 225 genes, were derived from the following sources: Gene Ontology (GO¹⁶); Online Mendelian Inheritance in Man (OMIM⁵⁶); and DisGeNET⁵⁷. Random genes were added to the full list of genes in each gold set to create gene sets with a 1:1 signal-to-noise ratio (i.e., N_gold : N_random). For each of the 30 gold sets we generated 100 test gene sets using varying samples of random genes. GRIN was then used to filter out random genes from each test gene set and the effectiveness of the filter was evaluated using receiver operating characteristic (ROC) and precision/recall (PR) curves measured at every possible cutoff point, C, in each rank-ordered gene set.
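Construction of a spiked test gene set can be sketched as below. This is illustrative Python only; the function name `spike_gene_set`, its parameters, and the fixed random seed are assumptions, and the background list stands in for the genes of the multiplex network.

```python
import random


def spike_gene_set(gold, background, noise_per_gold=1.0, seed=0):
    """Add randomly drawn background genes to a gold gene set.

    noise_per_gold is the number of random genes added per gold gene,
    e.g. 1.0 for a 1:1 gold-to-random ratio or 10.0 for 1:10.
    Returns the combined test set and the set of noise genes.
    """
    rng = random.Random(seed)
    # Noise genes must not already belong to the gold set.
    candidates = sorted(set(background) - set(gold))
    n_noise = round(len(gold) * noise_per_gold)
    # rng.sample raises ValueError if the background is too small.
    noise = rng.sample(candidates, n_noise)
    return list(gold) + noise, set(noise)
```

Repeating the call with 100 different seeds would give 100 test gene sets for one gold set, each with a different sample of random genes.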

For evaluation purposes, “true positive” genes were labeled as genes belonging to a gold gene set that were correctly retained by GRIN; “true negative” genes were randomly added genes that were correctly removed by GRIN; “false positive” genes were randomly added genes that were incorrectly retained by GRIN; and “false negative” genes were gold genes that were incorrectly removed by GRIN. ROC curves (false positive rate vs true positive rate) and PR curves (precision vs recall) were generated, and area under the ROC curve (AUROC) and area under the PR curve (AUPRC) values were calculated for each test gene set. Median AUROC and AUPRC were calculated for each of the 30 gold standard gene sets to indicate whether Stage 1 of GRIN ranked gold genes more highly than random genes in general. After estimating the optimal cutoff C at Stage 2, precision, recall, and specificity (true negatives / (true negatives + false positives)) were calculated for the genes removed and the genes retained by Stage 2. Median precision, recall, and specificity values were calculated across the 100 test gene sets for each of the 30 gold standard gene sets. Values are presented as median +/- interquartile range (IQR).
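The definitions above translate directly into set operations. The following Python sketch is illustrative; the function name `classification_metrics` and the convention of returning 0.0 for an undefined ratio are assumptions.

```python
def classification_metrics(gold, noise, retained):
    """Precision, recall, and specificity of a retained gene set.

    gold     : genes from the gold standard set
    noise    : randomly added genes
    retained : genes kept by the filter (the rest count as removed)
    """
    gold, noise, retained = set(gold), set(noise), set(retained)
    tp = len(gold & retained)   # gold genes correctly retained
    fp = len(noise & retained)  # random genes incorrectly retained
    fn = len(gold - retained)   # gold genes incorrectly removed
    tn = len(noise - retained)  # random genes correctly removed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    return precision, recall, specificity
```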

GRIN was also tested on unequal ratios of gold standard genes and random genes using the dopaminergic synaptic signaling gene set from GO (GO:0001963) and Acute Kidney Failure gene set from DisGeNET (C0022660) – 2:1 gold genes to random genes, 1:2 gold genes to random genes, 1:4 gold genes to random genes, and 1:10 gold genes to random genes. For each ratio of gold standard genes to noise, 100 test sets were generated. Finally, to test whether GRIN could remove random genes from gene sets containing multiple groups belonging to biological processes that were functionally distinct, multiple gold gene sets were combined and random noise also added. Dopaminergic synaptic transmission (GO:0001963; 23 genes), central nervous system myelination (GO:0022010; 20 genes), Alzheimer’s disease (OMIM #: 607822, 104300, 606889, 608907, 602192, 615590; 12 genes), and major depressive disorder (DisGeNET gene-disease association score ≥ 0.5; 14 genes) were mixed with five ratios of random to gold standard genes – 2:1 gold genes:noise, 1:1 gold genes:noise, 1:2 gold genes:noise, 1:4 gold genes:noise, and 1:10 gold genes:noise. This process was repeated to generate 100 gene sets of gold standard and random genes for each ratio examined.

Evaluating GRIN with low- and high-powered GWAS results

We evaluated GRIN’s ability to control for the loss of precision when using a lower powered GWAS with relaxed significance thresholds to detect genes that were genome-wide significant in a higher powered GWAS. To do this, we obtained published GWAS summary statistics from multiple studies of the same trait and defined the highest powered study (the one with largest sample size) as the gold GWAS for that trait. To fairly compare GRIN to GWAB and NAGA, SNP-to-gene mapping was performed in the same way for each GWAS using the method used by GWAB and NAGA. SNPs were therefore assigned to protein-coding genes within a +/− 10 kbp window, with the SNP with the lowest p-value assigned to the nearest gene (or multiple genes in this window if the SNP was intergenic), and the genes identified in the gold GWAS at a stringent threshold of p < 5e⁻⁸ were labeled as true positives for the trait. We then ran GRIN on the gene sets from the lower powered studies at progressively more relaxed thresholds from p < 5e⁻⁸ by half orders of magnitude down to p < 1e⁻³ and measured the precision and recall of the GRIN-retained set of genes at each threshold. Using the higher powered GWAS as gold standard results, we calculated precision and recall using the total number of genes of the lower powered GWAS (No GWAS Boosting) and compared this to the values from GRIN-retained genes at each significance threshold. We did this for 10 human traits or diseases: coronary artery disease (CAD1-3^58,59,60); number of alcohol-containing drinks consumed per week (DrinksPerWeek1-2^61,62); HDL cholesterol (HDL1-3^63,64,65); height (Height1-3^66,67,68); LDL cholesterol (LDL1-3^63,64,65); schizophrenia (SCZ1-3^69,70,71); smoking initiation (SmokingInitiation1-2^61,62); type 2 diabetes (T2D1-2^72,73); total cholesterol (TC1-3^63,64,65); and total triglycerides (TG1-3^63,64,65).
For CAD^58,59, height^66,67, SCZ^69,70, and blood lipid traits (HDL, LDL, TC, TG^63,64), two earlier, lower powered GWAS results were compared to the later, higher powered GWAS^60,65,68,71 (e.g., precision and recall for CAD1 and CAD2 were measured against CAD3). For traits where more than 2000 genes were assigned at a given threshold, we did not proceed to less stringent thresholds, with the exception of the second set of height results (Height2): although 2710 genes were assigned at p < 5e⁻⁸, we continued by half orders of magnitude down to p < 1e⁻⁵.
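As an illustration of window-based SNP-to-gene assignment followed by thresholding, the Python sketch below gives each gene the lowest p-value among SNPs within 10 kbp of its boundaries. It is a simplification and not the NAGA or GWAB code: nearest-gene logic is not modeled, and the gene coordinates, function names, and the threshold ladder (one reading of “half orders of magnitude”) are assumptions of this sketch.

```python
# Assumed reading of "half orders of magnitude" from 5e-8 down to 1e-3.
THRESHOLDS = [5e-8, 1e-7, 5e-7, 1e-6, 5e-6, 1e-5, 5e-5, 1e-4, 5e-4, 1e-3]


def assign_snps_to_genes(snps, genes, window=10_000):
    """Map each gene to the lowest p-value of any SNP within +/- window bp.

    snps  : iterable of (chromosome, position, p_value)
    genes : dict gene name -> (chromosome, start, end)
    A SNP lying within the window of several genes is assigned to all of them.
    """
    best = {}
    for chrom, pos, p_value in snps:
        for name, (g_chrom, start, end) in genes.items():
            if chrom == g_chrom and start - window <= pos <= end + window:
                if p_value < best.get(name, float("inf")):
                    best[name] = p_value
    return best


def genes_at_threshold(best, threshold):
    """Genes whose best SNP passes the given significance threshold."""
    return {gene for gene, p_value in best.items() if p_value < threshold}


# Hypothetical coordinates and summary statistics for demonstration.
TOY_GENES = {
    "GENE1": ("chr1", 100_000, 120_000),
    "GENE2": ("chr1", 125_000, 140_000),
    "GENE3": ("chr2", 50_000, 60_000),
}
TOY_SNPS = [
    ("chr1", 122_000, 3e-9),   # intergenic, within 10 kbp of GENE1 and GENE2
    ("chr1", 200_000, 1e-12),  # too far from any gene
    ("chr2", 55_000, 2e-6),    # inside GENE3
    ("chr1", 105_000, 4e-6),   # inside GENE1, weaker than its best SNP
]
```

Relaxing the threshold from 5e⁻⁸ to 1e⁻⁵ in this toy example adds GENE3 to the gene set, which is the kind of expanded input that GRIN would then filter.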

Comparing GRIN to other GWAS boosting methods

We sought to compare the performance of GRIN to two other GWAS-boosting methods, NAGA and GWAB, using the same approach of considering higher powered GWAS results as a gold set from which to evaluate precision and recall of lower powered GWAS results.

NAGA works by first assigning to each protein-coding gene the p-value of the best GWAS SNP in near proximity (i.e., +/− 10kbp), which produces a ranking of all protein-coding genes. We used NAGA’s SNP-gene assignment to assign SNPs to protein-coding genes within +/− 10kbp for the genome assembly that was used to run the GWAS: hg18/GRCh36 (CAD1⁵⁸, HDL1⁶³, Height1⁶⁶, LDL1⁶³, SCZ1⁶⁹, TC1⁶³, TG1⁶³, T2D1⁷²), hg19/GRCh37 (CAD2⁵⁹, CAD3⁶⁰, DrinksPerWeek1⁶¹, HDL2⁶⁴, HDL3⁶⁵, Height2⁶⁷, LDL2⁶⁴, LDL3⁶⁵, SCZ2⁷⁰, SmokingInitiation1⁶¹, TC2⁶⁴, TC3⁶⁵, TG2⁶⁴, TG3⁶⁵, SCZ3⁷¹, T2D2⁷³), or hg38/GRCh38 (DrinksPerWeek2⁶², Height3⁶⁸, SmokingInitiation2⁶²). NAGA then propagates the p-values over a gene-to-gene functional network using either the random walk with restart algorithm or heat diffusion, which is intended to boost the recall of relevant (true positive) genes for the trait by re-ranking the genes and increasing their rank position. Thus, while the traditional full NAGA output is a full list of ranked genes, it is up to the user to determine a cutoff in the ranked list for further investigation. In order to make a fair comparison between NAGA and GRIN, we used the top n ranked genes from NAGA to calculate precision and recall, where n is the total number of genes that were input to GRIN at a given statistical threshold. All NAGA rankings were performed using a Jupyter Notebook, random walk with restart network propagation, and the NAGA-supplied monoplex network “Original PCNet” (http://www.ndexbio.org/#/network/f93f402c-86d4-11e7-a10d-0ac135e8bacf).
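The top-n comparison described above can be expressed as a short Python sketch. The function name `top_n_precision_recall` and the toy inputs are hypothetical; the ranked list stands in for the output of a ranking method such as NAGA, and the gold set for the genome-wide significant genes of the higher powered GWAS.

```python
def top_n_precision_recall(ranked_genes, n, gold):
    """Precision and recall of the n best-ranked genes against a gold set.

    ranked_genes : genes ordered from best to worst by the ranking method
    n            : number of genes that were input to GRIN at this threshold
    gold         : gold standard genes for the trait
    """
    top = set(ranked_genes[:n])
    gold = set(gold)
    true_positives = len(top & gold)
    precision = true_positives / len(top) if top else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall
```

Fixing n to the size of the GRIN input at each threshold keeps the number of candidate genes equal across methods before precision and recall are compared.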

We ran GWAB (located at https://www.inetbio.org/gwab/gwab_query.php) using SNP-nearest gene assignment within +/− 10kbp similarly to NAGA. GWAB then uses a gene-to-gene functional network (HumanNet v2⁴⁷) to calculate a new score for each gene based on both its own p-value and the p-values of its network neighbors, modulated by the weights of the edges between the gene and those neighbors. This re-ranks the genes, which are evaluated against a user-provided list of true positive genes (e.g., a literature-curated list of disease-relevant genes). The traditional output of GWAB is a list of genes assigned from SNPs at an optimized, less stringent genome-wide significance threshold based upon recall of an a priori list of literature-curated disease-relevant genes. Thus, we ran GWAB for the same traits as GRIN and NAGA where literature-curated genes were available from the DISEASES⁷⁴ database, with the exception of Height1, Height2, and SmokingInitiation1. We ran GWAB at the same genome-wide significance thresholds as GRIN and compared precision and recall of GWAB-boosted genes at each threshold.

Million Veteran Program (MVP) suicide attempt genome wide association study (GWAS) summary statistics

Suicide attempts were identified from United States veterans as described previously¹⁵. Suicide attempts were characterized by using a combination of Veterans Health Administration (VHA) databases from the VA: the Suicide Prevention Application Network (SPAN) database, electronic health record (EHR) information from the VA Corporate Data Warehouse (CDW), and the CDW Mental Health Domain survey. For the MVP diagnosis, suicide attempt was determined by the presence of one or more of the following International Statistical Classification of Diseases and Related Health Problems (ICD)-9 and ICD-10 diagnostic codes in a subject’s EHR: ICD-9: E950-959; ICD-10: T14.91, X60-62, X64, X66-X83, Y87.0, Z91.5. Control patients were obtained from veterans enrolled in MVP without a history of suicide attempt or suicidal ideation as determined by a combination of SPAN survey, Mental Health Domain survey, and ICD diagnostic codes in the CDW database (suicidal ideation codes: ICD-9: V62.84; ICD-10: R45.851). A total of 410,464 controls from various ancestries (African, Asian, European, and Hispanic) were included for genome-wide association along with 14,535 cases of non-fatal suicide attempt and 294 fatal attempts. Genome-wide association analyses were conducted using DNA from whole blood samples from subjects enrolled in MVP using a custom Affymetrix Biobank Array. Quality control and imputation were performed as previously described¹⁴. All subjects provided informed consent and the activities used to generate the GWAS summary statistics were approved by the VA Central Institutional Review Board. All ethical regulations relevant to human research participants were followed.

International Suicide Genetics Consortium (ISGC) suicide attempt GWAS summary statistics

Two sets of suicide attempt summary statistics derived from civilian populations and compiled by the ISGC¹³ were analyzed. SNPs were included from a general population of European ancestry (SA-EUR) comprising cases of suicide attempt and control subjects. Furthermore, additional summary statistics were derived from this general population conditioned on diagnosis status for major depressive disorder (SA-MDD) to generate an additional set of suicide attempt summary statistics. Thus, while the SA-MDD summary statistics are not independent of the SA-EUR summary statistics, as they are composed of the same set of controls and cases of suicide attempt, both SA-EUR and SA-MDD summary statistics were analyzed in order to determine the overlap between these results and results from the MVP cohort. All subjects involved in the ISGC provided informed consent and the activities used to generate the GWAS summary statistics were approved by their local institutional review boards as previously described¹³. All ethical regulations relevant to human research participants were followed.

SNP to gene assignment for suicide attempt summary statistics

SNPs from MVP and ISGC suicide attempt summary statistics were assigned to genes using the union of two separate methods in order to identify multiple possible genes contributing to this phenotype as input to GRIN. H-MAGMA¹ was used in combination with publicly available Hi-C data from adult dorsolateral prefrontal cortex (dlPFC)⁷⁵ to improve intergenic SNP-to-gene assignment based on three-dimensional chromatin structure in this brain region. Adult prefrontal cortex Hi-C data were used as this brain region is known to be involved in executive function and impulsivity processes, which are disrupted in individuals with a history of suicide attempt^50,51. Additionally, conventional MAGMA⁴ was used as an alternate method of SNP-to-gene assignment. Thus, H-MAGMA and conventional MAGMA were applied only as methods of assigning SNPs to genes, using SNPs at given significance thresholds, rather than as gene-based tests on the entire set of summary statistics.

SNPs were assigned to genes from MVP, SA-EUR, or SA-MDD summary statistics at multiple thresholds (p < 5e⁻⁸, p < 1e⁻⁵, p < 1e⁻⁴, p < 1e⁻³, p < 1e⁻², and p < 1e⁻¹; Supplementary Table 2) to determine the number of suicide attempt GWAS genes that could be input to GRIN at each of these thresholds. The union of conventional MAGMA and H-MAGMA-assigned genes (i.e., all genes assigned from either method) from MVP, SA-EUR, or SA-MDD suicide attempt summary statistics was subsequently used as the gene set input to GRIN at a threshold of p < 1e⁻⁵, as this yielded gene set sizes expected to give high precision while still obtaining more recall compared to genes identified at a threshold of p < 5e⁻⁸. The union of genes identified at a threshold of p < 1e⁻⁵ was then filtered into retained and removed gene sets using GRIN (Supplementary Tables 5, 8, and 9).

Gene set enrichment analysis

Gene sets from MVP and ISGC summary statistics were tested for multiple enrichments using ToppFun⁷⁶ from the online ToppGene suite. Gene set enrichments were analyzed using the following enrichment categories: GO: Molecular Function; GO: Biological Process; GO: Cellular Component; Human Phenotype; Pathway (all databases selected); Transcription Factor Binding Site (all databases selected); Drug (all databases selected); Disease (all databases selected). Enrichments were considered significant using a Benjamini-Hochberg false discovery rate (FDR)-adjusted p-value threshold < 0.05.
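For reference, the Benjamini-Hochberg adjustment applied to enrichment p-values can be sketched in a few lines of Python. This is a generic implementation of the standard procedure, not ToppFun’s code, and the function name `benjamini_hochberg` is an assumption.

```python
def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg FDR-adjusted p-values, returned in input order."""
    m = len(pvalues)
    # Indices of the p-values from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, pvalues[index] * m / rank)
        adjusted[index] = running_min
    return adjusted
```

An enrichment term would then be called significant when its adjusted p-value is below 0.05.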

Drug to gene target networks for putative drug repurposing and side effect evaluation

Genes identified as contributing to suicide attempt pathophysiology from MVP and ISGC summary statistics were used to construct drug to gene target networks from information derived from DrugBank⁷⁷. Drug to gene target networks were visualized in Cytoscape⁷⁸ (version 3.8.2, Cytoscape Consortium) to identify drugs known to target genes of interest from MVP and ISGC summary statistics using GRIN-retained genes at p < 1e⁻⁵ (Supplementary Table 20). ISGC GWAS genes were compared to genes from the MVP cohort using Venn diagrams generated from the open source R package Vennerable (https://github.com/js229/Vennerable).

Statistics and reproducibility

Full data points from Fig. 2, Supplementary Figs. 1–3, and Supplementary Figs. 6 and 7 are provided in Supplementary Data 2. All comparisons between the user’s gene set rank distributions and the null distribution were performed using a two-sided Mann–Whitney U test, and the null distribution was built from 100 random gene sets of equivalent size to the user’s gene set to ensure the reproducibility of results.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.

Data availability

GWAS summary statistics from the Million Veteran Program used in this study will be made available on the NIH database of Genotypes and Phenotypes (dbGaP) under accession ID phs001672.v1.p, and summary statistics from the International Suicide Genetics Consortium are available at https://tinyurl.com/ISGC2021. All other data are available from the corresponding authors upon request.

Code availability

GRIN is available as an open-source, command-line R script for public use. The code, installation instructions, and user manual can be found at https://github.com/sullivanka/GRIN and on Zenodo⁷⁹.

References

Sey, N. Y. A. et al. A computational tool (H-MAGMA) for improved prediction of brain-disorder risk genes by incorporating brain chromatin interaction profiles. Nat. Neurosci. 23, 583–593 (2020).
Hall, M. A. et al. Novel EDGE encoding method enhances ability to identify genetic interactions. PLoS Genet. 17, e1009534 (2021).
Petersen, A., Alvarez, C., DeClaire, S. & Tintle, N. L. Assessing methods for assigning SNPs to genes in gene-based tests of association using common variants. PLoS One 8, e62161 (2013).
de Leeuw, C. A., Mooij, J. M., Heskes, T. & Posthuma, D. MAGMA: generalized gene-set analysis of GWAS data. PLoS Comput. Biol. 11, e1004219 (2015).
Quinlan, A. R. & Hall, I. M. BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics 26, 841–842 (2010).
Zhu, Z. et al. Integration of summary data from GWAS and eQTL studies predicts complex trait gene targets. Nat. Genet. 48, 481–487 (2016).
Boyle, E. A., Li, Y. I. & Pritchard, J. K. An expanded view of complex traits: from polygenic to omnigenic. Cell 169, 1177–1186 (2017).
Gosak, M. et al. Network science of biological systems at different scales: a review. Phys. Life Rev. 24, 118–135 (2018).
Carlin, D. E. et al. A fast and flexible framework for network-assisted genomic association. iScience 16, 155–161 (2019).
Shim, J. E. et al. GWAB: a web server for the network-based boosting of human genome-wide association data. Nucleic Acids Res. 45, W154–W161 (2017).
Voracek, M. & Loibl, L. M. Genetics of suicide: a systematic review of twin studies. Wien. Klin. Wochenschr. 119, 463–475 (2007).
Erlangsen, A. et al. Genetics of suicide attempts in individuals with and without mental disorders: a population-based genome-wide association study. Mol. Psychiatry 25, 2410–2421 (2020).
Mullins, N. et al. Dissecting the shared genetic architecture of suicide attempt, psychiatric disorders, and known risk factors. Biol. Psychiatry https://doi.org/10.1016/j.biopsych.2021.05.029 (2021).
Kimbrel, N. A. et al. A genome-wide association study of suicide attempts and suicidal ideation in U.S. military veterans. Psychiatry Res. 269, 64–69 (2018).
Kimbrel, N. A. et al. A genome-wide association study of suicide attempts in the million veterans program identifies evidence of pan-ancestry and ancestry-specific risk loci. Mol. Psychiatry https://doi.org/10.1038/s41380-022-01472-3 (2022).
Harris, M. A. et al. The Gene Ontology (GO) database and informatics resource. Nucleic Acids Res. 32, D258–D261 (2004).
Wang, H., Xu, J., Lazarovici, P., Quirion, R. & Zheng, W. cAMP Response Element-Binding Protein (CREB): a possible signaling molecule link in the pathophysiology of schizophrenia. Front. Mol. Neurosci. 11, 255 (2018).
Dvorakova, M. et al. SGIP1 is involved in regulation of emotionality, mood, and nociception and modulates in vivo signalling of cannabinoid CB1 receptors. Br. J. Pharmacol. 178, 1588–1604 (2021).
Trevaskis, J. et al. Src homology 3-domain growth factor receptor-bound 2-like (endophilin) interacting protein 1, a novel neuronal protein that regulates energy balance. Endocrinology 146, 3757–3764 (2005).
Thakar, S. et al. Evidence for opposing roles of Celsr3 and Vangl2 in glutamatergic synapse formation. Proc. Natl. Acad. Sci. USA 114, E610–E618 (2017).
Sugiura, N., Patel, R. G. & Corriveau, R. A. N-methyl-D-aspartate receptors regulate a group of transiently expressed genes in the developing brain. J. Biol. Chem. 276, 14257–14263 (2001).
Takahashi, H. et al. MED26 regulates the transcription of snRNA genes through the recruitment of little elongation complex. Nat. Commun. 6, 5941 (2015).
Vukojevic, V. et al. Evolutionary conserved role of neural cell adhesion molecule-1 in memory. Transl. Psychiatry 10, 217 (2020).
Walmod, P. S., Kolkova, K., Berezin, V. & Bock, E. Zippers make signals: NCAM-mediated molecular interactions and signal transduction. Neurochem. Res. 29, 2015–2035 (2004).
Monaghan, C. E. et al. REST corepressors RCOR1 and RCOR2 and the repressor INSM1 regulate the proliferation-differentiation balance in the developing brain. Proc. Natl. Acad. Sci. USA 114, E406–E415 (2017).
Abrajano, J. J. et al. Differential deployment of REST and CoREST promotes glial subtype specification and oligodendrocyte lineage maturation. PLoS One 4, e7665 (2009).
Sokpor, G., Xie, Y., Rosenbusch, J. & Tuoc, T. Chromatin Remodeling BAF (SWI/SNF) complexes in neural development and disorders. Front. Mol. Neurosci. 10, 243 (2017).
Caubit, X., Tiveron, M.-C., Cremer, H. & Fasano, L. Expression patterns of the three Teashirt-related genes define specific boundaries in the developing and postnatal mouse forebrain. J. Comp. Neurol. 486, 76–88 (2005).
Lessel, D. et al. De novo missense mutations in DHX30 impair global translation and cause a neurodevelopmental disorder. Am. J. Hum. Genet. 101, 716–724 (2017).
Lentini, J. M., Alsaif, H. S., Faqeih, E., Alkuraya, F. S. & Fu, D. DALRD3 encodes a protein mutated in epileptic encephalopathy that targets arginine tRNAs for 3-methylcytosine modification. Nat. Commun. 11, 2510 (2020).
Meltzer, H. Y. et al. Clozapine treatment for suicidality in schizophrenia: International Suicide Prevention Trial (InterSePT). Arch. Gen. Psychiatry 60, 82–91 (2003).
Cliff, A. et al. A high-performance computing implementation of iterative random forest for the creation of predictive expression networks. Genes 10, 996 (2019).
Baranger, D. A. A. et al. Multi-omics cannot replace sample size in genome-wide association studies. Genes Brain Behav. 22, e12846 (2023).
Valdeolivas, A. et al. Random walk with restart on multiplex and heterogeneous biological networks. Bioinformatics 35, 497–505 (2019).
Duval, F. et al. Hypothalamic-prolactin axis regulation in major depressed patients with suicidal behavior. Psychoneuroendocrinology 151, 106050 (2023).
Oquendo, M. A. et al. Toward a biosignature for suicide. Am. J. Psychiatry 171, 1259–1277 (2014).
Chilton, I. et al. De novo heterozygous missense and loss-of-function variants in CDC42BPB are associated with a neurodevelopmental phenotype. Am. J. Med. Genet. A 182, 962–973 (2020).
Agha, Z. et al. Exome sequencing identifies three novel candidate genes implicated in intellectual disability. PLoS One 9, e112687 (2014).
Wang, X.-X. et al. MRCKβ links Dasm1 to actin rearrangements to promote dendrite development. J. Biol. Chem. 296, 100730 (2021).
Li, L. et al. Protein kinases paralleling late-phase LTP formation in dorsal hippocampus in the rat. Neurochem. Int. 76, 50–58 (2014).
Lund, H. et al. MARK4 and MARK3 associate with early tau phosphorylation in Alzheimer’s disease granulovacuolar degeneration bodies. Acta Neuropathol. Commun. 2, 22 (2014).
Doki, C. et al. Microtubule elongation along actin filaments induced by microtubule-associated protein 4 contributes to the formation of cellular protrusions. J. Biochem. 168, 295–303 (2020).
Neale, B. M. et al. Patterns and rates of exonic de novo mutations in autism spectrum disorders. Nature 485, 242–245 (2012).
Rehni, A. K., Singh, T. G. & Chand, P. Amisulpride-induced seizurogenic effect: a potential role of opioid receptor-linked transduction systems. Basic Clin. Pharmacol. Toxicol. 108, 310–317 (2011).
Rolf, M. G. et al. In vitro pharmacological profiling of R406 identifies molecular targets underlying the clinical effects of fostamatinib. Pharm. Res. Perspect. 3, e00175 (2015).
Pinner, N. A., Hamilton, L. A. & Hughes, A. Roflumilast: a phosphodiesterase-4 inhibitor for the treatment of severe chronic obstructive pulmonary disease. Clin. Ther. 34, 56–66 (2012).
Hwang, S. et al. HumanNet v2: human gene networks for disease research. Nucleic Acids Res. 47, D573–D580 (2019).
Wang, T. et al. Gene essentiality profiling reveals gene networks and synthetic lethal interactions with oncogenic ras. Cell 168, 890–903.e15 (2017).
Szklarczyk, D. et al. STRING v11: protein–protein association networks with increased coverage, supporting functional discovery in genome-wide experimental datasets. Nucleic Acids Res. 47, D607–D613 (2018).
Zhang, H. et al. Aberrant white matter microstructure in depressed patients with suicidality. J. Magn. Reson. Imaging https://doi.org/10.1002/jmri.27927 (2021).
Cao, J. et al. The association between resting state functional connectivity and the trait of impulsivity and suicidal ideation in young depressed patients with suicide attempts. Front. Psychiatry 12, 567976 (2021).
Pearl, J. R. et al. Genome-scale transcriptional regulatory network models of psychiatric and neurodegenerative disorders. Cell Syst. 8, 122–135.e7 (2019).
Basu, S., Kumbier, K., Brown, J. B. & Yu, B. Iterative random forests to discover predictive and stable high-order interactions. Proc. Natl. Acad. Sci. USA 115, 1943–1948 (2018).
GTEx Consortium. The Genotype-Tissue Expression (GTEx) project. Nat. Genet. 45, 580–585 (2013).
Kainer, D., Lane, M., Sullivan, K., Cashman, M. & Miller, J. dkainer/RWRtoolkit (Oak Ridge National Laboratory (ORNL), 2022). https://doi.org/10.11578/DC.20220607.1
Amberger, J. S., Bocchini, C. A., Schiettecatte, F., Scott, A. F. & Hamosh, A. OMIM.org: Online Mendelian Inheritance in Man (OMIM®), an online catalog of human genes and genetic disorders. Nucleic Acids Res. 43, D789–D798 (2015).
Piñero, J. et al. DisGeNET: a comprehensive platform integrating information on human disease-associated genes and variants. Nucleic Acids Res. 45, D833–D839 (2017).
Schunkert, H. et al. Large-scale association analysis identifies 13 new susceptibility loci for coronary artery disease. Nat. Genet. 43, 333–338 (2011).
Nikpay, M. et al. A comprehensive 1,000 Genomes-based genome-wide association meta-analysis of coronary artery disease. Nat. Genet. 47, 1121–1130 (2015).
Klarin, D. et al. Genetic analysis in UK Biobank links insulin resistance and transendothelial migration pathways to coronary artery disease. Nat. Genet. 49, 1392–1397 (2017).
Liu, M. et al. Association studies of up to 1.2 million individuals yield new insights into the genetic etiology of tobacco and alcohol use. Nat. Genet. 51, 237–244 (2019).
Saunders, G. R. B. et al. Genetic diversity fuels gene discovery for tobacco and alcohol use. Nature 612, 720–724 (2022).
Teslovich, T. M. et al. Biological, clinical and population relevance of 95 loci for blood lipids. Nature 466, 707–713 (2010).
Willer, C. J. et al. Discovery and refinement of loci associated with lipid levels. Nat. Genet. 45, 1274–1283 (2013).
Graham, S. E. et al. The power of genetic diversity in genome-wide association studies of lipids. Nature 600, 675–679 (2021).
Lango Allen, H. et al. Hundreds of variants clustered in genomic loci and biological pathways affect human height. Nature 467, 832–838 (2010).
Wood, A. R. et al. Defining the role of common variation in the genomic and biological architecture of adult human height. Nat. Genet. 46, 1173–1186 (2014).
Yengo, L. et al. A saturated map of common genetic variants associated with human height. Nature 610, 704–712 (2022).
Ripke, S. et al. Genome-wide association study identifies five new schizophrenia loci. Nat. Genet. 43, 969–978 (2011).
Ripke, S. et al. Biological insights from 108 schizophrenia-associated genetic loci. Nature 511, 421–427 (2014).
Trubetskoy, V. et al. Mapping genomic loci implicates genes and synaptic biology in schizophrenia. Nature 604, 502–508 (2022).
Morris, A. P. et al. Large-scale association analysis provides insights into the genetic architecture and pathophysiology of type 2 diabetes. Nat. Genet. 44, 981–990 (2012).
Mahajan, A. et al. Multi-ancestry genetic study of type 2 diabetes highlights the power of diverse populations for discovery and translation. Nat. Genet. 54, 560–572 (2022).
Pletscher-Frankild, S., Pallejà, A., Tsafou, K., Binder, J. X. & Jensen, L. J. DISEASES: text mining and data integration of disease-gene associations. Methods 74, 83–89 (2015).
Wang, D. et al. Comprehensive functional genomic resource and integrative model for the human brain. Science 362, eaat8464 (2018).
Chen, J., Bardes, E. E., Aronow, B. J. & Jegga, A. G. ToppGene Suite for gene list enrichment analysis and candidate gene prioritization. Nucleic Acids Res. 37, W305–W311 (2009).
Wishart, D. S. et al. DrugBank: a knowledgebase for drugs, drug actions and drug targets. Nucleic Acids Res. 36, D901–D906 (2008).
Shannon, P. et al. Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res. 13, 2498–2504 (2003).
Sullivan, K. et al. GRIN - Geneset Refinement Using Interacting Networks. (Zenodo). https://doi.org/10.5281/ZENODO.13684721 (2024).


Acknowledgements

This research used resources of the Oak Ridge Leadership Computing Facility at the Oak Ridge National Laboratory, which is supported by the Office of Science of the U.S. Department of Energy under Contract No. DE-AC05-00OR22725. This work was sponsored by MVP CHAMPION (DAJ), NIH grants DA041913 (DAJ), DA051908 (DAJ), MH116269 (DMR), MH121455 (DMR), Brain & Behavior Research Foundation (NARSAD Young Investigator Award No. 29551 [NM]), Department of Veterans Affairs (VA) Clinical Science Research and Development (CSR&D) grants lK6BX003777 (JCB, NAK, DWO), the VA Million Veteran Program (MVP), and the Australian Research Council Center of Excellence for Plant Success in Nature & Agriculture (project number CE200100015) (DK). This publication does not represent the views of the VA or the United States Government. We also thank and acknowledge MVP (Office of Research and Development, Veterans Health Administration), the MVP Suicide Exemplar Workgroup, and the ISGC for their contributions to this manuscript. A complete listing of contributors from the MVP, MVP Suicide Exemplar Workgroup, and ISGC is provided in the Supplemental Information. This manuscript has been authored by UT-Battelle, LLC under Contract No. DE-AC05-00OR22725 with the U.S. Department of Energy. The United States Government retains and the publisher, by accepting the article for publication, acknowledges that the United States Government retains a non-exclusive, paid-up, irrevocable, world-wide license to publish or reproduce the published form of this manuscript, or allow others to do so, for United States Government purposes. The Department of Energy will provide public access to these results of federally sponsored research in accordance with the DOE Public Access Plan (http://energy.gov/downloads/doe-public-access-plan).

Author information

A full list of members and their affiliations appears in the Supplementary Information.

Authors and Affiliations

Computational and Predictive Biology, Oak Ridge National Laboratory, Oak Ridge, TN, USA
Kyle A. Sullivan, Mikaela Cashman, J. Izaak Miller, Mirko Pavicic, Michael R. Garvin, John P. Pestian, Daniel A. Jacobson & David Kainer
The Bredesen Center for Interdisciplinary Research and Graduate Education, University of Tennessee Knoxville, Knoxville, TN, USA
Matthew Lane, Angelica M. Walker, Ashley Cliff & Jonathon Romero
Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
Mikaela Cashman
Durham Veterans Affairs Health Care System, Durham, NC, USA
Xuejun Qin, Jean C. Beckham & Nathan A. Kimbrel
Duke University School of Medicine, Duke University, Durham, NC, USA
Xuejun Qin, Allison E. Ashley-Koch & Nathan A. Kimbrel
Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York City, NY, USA
Niamh Mullins
Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York City, NY, USA
Niamh Mullins
Department of Psychiatry, University of Utah School of Medicine, Salt Lake City, UT, USA
Anna Docherty & Hilary Coon
Huntsman Mental Health Institute, University of Utah School of Medicine, Salt Lake City, UT, USA
Hilary Coon
Division of Genetic Medicine, Vanderbilt University Medical Center, Nashville, TN, USA
Douglas M. Ruderfer
Vanderbilt Genetics Institute, Vanderbilt University Medical Center, Nashville, TN, USA
Douglas M. Ruderfer
Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, USA
Douglas M. Ruderfer
Department of Psychiatry and Behavioral Sciences, Vanderbilt University Medical Center, Nashville, TN, USA
Douglas M. Ruderfer
Cincinnati Children’s Hospital Medical Center, University of Cincinnati, Cincinnati, OH, USA
John P. Pestian
Department of Psychiatry and Behavioral Sciences, Duke University Medical Center, Durham, NC, USA
Allison E. Ashley-Koch & Jean C. Beckham
VISN 6 Mid-Atlantic Mental Illness Research, Durham Veterans Affairs Health Care System, Durham, NC, USA
Jean C. Beckham & Nathan A. Kimbrel
Theoretical Biology and Biophysics, Los Alamos National Laboratory, Los Alamos, NM, USA
Benjamin McMahon
VISN 4 Mental Illness Research, Education, and Clinical Center, Center of Excellence, Corporal Michael J. Crescenz VA Medical Center, Philadelphia, PA, USA
David W. Oslin
Department of Psychiatry, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
David W. Oslin
VA Health Services Research and Development Center of Innovation to Accelerate Discovery and Practice Transformation, Durham, NC, USA
Nathan A. Kimbrel
Centre of Excellence for Plant Success in Nature and Agriculture, University of Queensland, Brisbane, QLD, Australia
David Kainer

Authors

Kyle A. Sullivan, Matthew Lane, Mikaela Cashman, J. Izaak Miller, Mirko Pavicic, Angelica M. Walker, Ashley Cliff, Jonathon Romero, Xuejun Qin, Niamh Mullins, Anna Docherty, Hilary Coon, Douglas M. Ruderfer, Michael R. Garvin, John P. Pestian, Allison E. Ashley-Koch, Jean C. Beckham, Benjamin McMahon, David W. Oslin, Nathan A. Kimbrel, Daniel A. Jacobson & David Kainer

Consortia

International Suicide Genetics Consortium

Xuejun Qin, Niamh Mullins, Anna Docherty, Hilary Coon, Douglas M. Ruderfer, Allison E. Ashley-Koch, Jean C. Beckham, Benjamin McMahon, David W. Oslin & Nathan A. Kimbrel

VA Million Veteran Program

Jean C. Beckham

MVP Suicide Exemplar Workgroup

Kyle A. Sullivan, Matthew Lane, Mikaela Cashman, J. Izaak Miller, Mirko Pavicic, Angelica M. Walker, Ashley Cliff, Jonathon Romero, Xuejun Qin, Michael R. Garvin, John P. Pestian, Allison E. Ashley-Koch, Jean C. Beckham, Benjamin McMahon, David W. Oslin, Nathan A. Kimbrel, Daniel A. Jacobson & David Kainer

Contributions

Conceptualization and methodology: K.A.S., D.K., and D.A.J.; Software: K.A.S., M.L., M.C., J.I.M., A.C., J.R., and D.K.; Formal analysis: K.A.S. and D.K.; Investigation: K.A.S., D.K., X.Q., N.M., and D.A.J.; Resources: D.R., X.Q., and D.A.J.; Writing - Original Draft: K.A.S. and D.K.; Writing - Review & Editing: K.A.S., M.L., M.P., A.M.W., N.M., A.D., H.C., D.M.R., M.R.G., A.E.A.K., J.C.B., B.M., D.W.O., N.A.K., D.K., and D.A.J.; Visualization: K.A.S. and D.K.; Supervision: N.A.K., D.A.J., and D.K.; Project administration: A.D., H.C., D.M.R., J.P.P., A.E.A.K., J.C.B., B.M., D.W.O., N.A.K., D.A.J.; Funding acquisition: D.M.R., D.K., and D.A.J.

Corresponding authors

Correspondence to Nathan A. Kimbrel, Daniel A. Jacobson or David Kainer.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Communications Biology thanks Niamh Ryan and the other, anonymous, reviewers for their contribution to the peer review of this work. Primary Handling Editors: Aylin Bircan and Benjamin Bessieres.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Information

Description of Additional Supplementary File

Supplementary Data 1

Supplementary Data 2

Reporting Summary

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.


About this article

Cite this article

Sullivan, K.A., Lane, M., Cashman, M. et al. Analyses of GWAS signal using GRIN identify additional genes contributing to suicidal behavior. Commun Biol 7, 1360 (2024). https://doi.org/10.1038/s42003-024-06943-7


Received: 08 March 2024
Accepted: 23 September 2024
Published: 21 October 2024
Version of record: 21 October 2024
DOI: https://doi.org/10.1038/s42003-024-06943-7