Abstract
Opioid dependence (OD) is epidemic in the United States and it is associated with a variety of adverse health effects. Its estimated heritability is ~50% in twin studies, and recent genome-wide association studies have identified more than a dozen common risk variants. However, there are no published studies of rare OD risk variants. In this study, we analyzed whole-exome sequencing data from the Yale-Penn cohort, comprising 2100 participants of European ancestry (EUR; 1321 OD cases) and 1790 of African ancestry (AFR; 864 cases). A novel low-frequency variant (rs746301110) in the RUVBL2 gene was identified in EUR (p = 6.59 × 10−10). Suggestive associations (p < 1 × 10−5; not passing the Bonferroni correction) were observed in TMCO3 in EUR, in NEIL2 and CFAP44 in AFR, and in FAM210B in the cross-ancestry meta-analysis. Gene-based collapsing tests identified SLC22A10, TMCO3, FAM90A1, DHX58, CHRND, GLDN, PLAT, H1-4, COL3A1, GPHB5 and QPCTL as top genes (p < 1 × 10−4) with most associations attributable to rare variants and driven by the burden of predicted loss-of-function and missense variants. This study begins to fill the gap in our understanding of the genetic architecture of OD, providing insights into the contribution of rare coding variants and potential targets for future functional studies and drug development.
Introduction
Opioid dependence (OD, DSM-IV [1]; similarly, opioid use disorder (OUD), DSM-5 [2]) is a chronic relapsing disorder that causes clinically significant distress or impairment [3]. Symptoms of OD include an elevated desire to use opioids, tolerance, loss of control in intake, and a characteristic withdrawal syndrome that follows abrupt abstinence from opioids. In 2017, OD affected about 40.5 million people globally, with >100,000 opioid overdose deaths annually [4, 5]. The highest prevalence of OD has been observed in the United States, where the use of opioids has increased in recent decades [6].
Studies of the mechanisms underlying OD include those of the neurocircuitry in the brain [3, 7] and molecular pathways involving opioid receptors [8]. Twin studies showed a 54% heritability of OD [9, 10]. Thus, identifying genetic factors that contribute to OD risk could advance efforts to prevent, identify, and treat the disorder. To date, genome-wide association studies (GWAS) of OD [11,12,13,14,15,16,17,18] have identified more than a dozen common variants significantly associated with OD in genes encoding potassium ion channels, glutamate receptors, and opioid receptors, improving our understanding of the genetic architecture of the disorder [19, 20].
The largest GWAS of OD/OUD to date (including the Yale-Penn cohort) was conducted in European and African ancestry subjects [18] (EUR and AFR), identifying independent risk loci including the OPRM1 and FURIN genes – both identified in other genetic studies [14, 16, 17]. By incorporating GWAS data of problematic alcohol use and cannabis use disorder using a multi-trait analysis of GWAS (a method that boosts power for gene discovery for a set of similar traits), 19 independent loci were identified in total, reflecting a shared genetic architecture among substance use disorders [10, 21]. Another study used data from the Million Veteran Program (MVP) and identified 14 loci associated with OUD [16]. Convergent findings from these two studies include risk variants in or near genes OPRM1, FURIN, RABEPK, NCAM1, and others.
However, large gaps remain in our knowledge of the genetics of OD. The estimated single-nucleotide polymorphism (SNP)-based heritability (h2) of OD ranges from 6.0% in the Partners Biobank [15] to 12.8% in MVP-dominated studies [14, 16,17,18]. When using a stringent definition of cases in MVP, requiring at least one inpatient or two outpatient ICD-9/10 OUD diagnostic codes, the h2 was 19.8% in AFR participants and 15.3% in EUR participants [16]. These estimates are all lower than estimates from twin studies, reflective of heritability that is unaccounted for in GWAS, which is usual for complex traits. This discrepancy in heritability might be due to incomplete genome coverage, the limited power of current GWAS, and the exclusion of various contributing factors in GWAS. Recently, whole-exome sequencing (WES) and whole-genome sequencing data have been used to augment SNP-array data for many diseases and traits [22,23,24]. Both these methods outperform SNP-array-based studies in identifying rare variants and accounting for missing heritability [23, 25, 26]. However, there is little information in the literature concerning rare coding variants in OD [27]. Thus, a large-scale WES study is needed to provide adequate statistical power to detect rare OD variants. Here, we report findings from the largest multi-ancestry WES study in OD to date, which was conducted in the Yale-Penn cohort [11, 28] in a total of 3890 participants (2185 cases).
Materials and methods
Ethics
The Yale-Penn study was approved by Yale Human Research Protection Program, the University of Pennsylvania Institutional Review Board, University of Connecticut Human Subjects Protection Program, Medical University of South Carolina Institutional Review Board for Human Research and the McLean Hospital Institutional Review Board. All participants provided written informed consent. All methods were performed in accordance with the relevant guidelines and regulations.
Yale-Penn cohort
The Yale-Penn study of the genetics of drug and alcohol dependence enrolled participants from 2000 to 2020. As described previously [28], 4530 samples from Yale-Penn were sequenced in four batches. Batch 1 whole-exome sequencing (WES) data were generated on the Illumina HiSeq 2000 platform with the NimbleGen SeqCap exome capture V2 kit. Batch 2–4 WES data were obtained using the NovaSeq sequencing system and the xGen Exome Research Panel v1. Reads were aligned to the human reference genome build 38 (hg38) and variants were called using the BWA-GATK pipeline [29, 30]. Variants in low-complexity regions, with missingness rates > 0.05, or that failed Hardy-Weinberg equilibrium expectations (p < 10−6) were filtered out.
Samples were excluded if they had mean sequencing depth < 20 (n = 4), mean genotype quality score < 55 (n = 8), total missingness rate > 10% (n = 46), or if any of the following metrics fell outside the per-batch mean ± 3 SD range: transition/transversion ratio (n = 10), number of called variants (n = 14), number of singletons (n = 69), heterozygous/homozygous ratio (n = 31), and insertion/deletion ratio (n = 14). After the sequencing quality controls, samples with inconsistencies between self-reported and genetic sex (n = 55) and duplicates or inferred sample swaps (n = 110) were removed before analysis (n = 203 removed in total). For the remaining samples, principal component analysis (PCA) [31] was used to assign the genetic ancestry of each sample using 1000 Genomes phase 3 data [32] as reference, resulting in 2102 EUR individuals and 1790 AFR individuals (samples that did not cluster with the EUR or AFR groups were removed). Within each ancestry group, we performed a second round of PCA using common variants to derive the first 10 principal components (PCs) for downstream analyses.
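The per-batch mean ± 3 SD rule described above can be illustrated with a minimal Python sketch. This is not the pipeline used in the study; the record layout (dictionaries with `id`, `batch`, and a metric key) and the function name `flag_outliers` are assumptions made for illustration, and the sample SD is used because the text does not specify which estimator was applied.

```python
from collections import defaultdict
from statistics import mean, stdev


def flag_outliers(samples, metric, n_sd=3.0):
    """Return the IDs of samples whose `metric` falls outside the
    per-batch mean +/- n_sd * SD range.

    `samples` is a list of dicts with hypothetical keys "id", "batch",
    and the metric name (for example "ti_tv").
    """
    by_batch = defaultdict(list)
    for sample in samples:
        by_batch[sample["batch"]].append(sample)

    flagged = set()
    for batch_samples in by_batch.values():
        values = [s[metric] for s in batch_samples]
        if len(values) < 2:
            continue  # the sample SD is undefined for a single observation
        mu, sd = mean(values), stdev(values)
        for s in batch_samples:
            if abs(s[metric] - mu) > n_sd * sd:
                flagged.add(s["id"])
    return flagged
```

Because the mean and SD are computed within each batch, a value that is typical for one capture kit or sequencer is not penalized for differing from another batch.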
The diagnosis of OD (DSM-IV opioid dependence) was obtained using the Semi-Structured Assessment for Drug Dependence and Alcoholism (SSADDA) [33]. Controls were defined as individuals with self-reported exposure to opioids (those answering ‘yes’ to ‘Have you ever used any of the following opiate/drugs?’). We included 1321 EUR cases and 779 controls (two individuals were removed due to not meeting the DSM-IV diagnosis of OD) and 864 AFR cases and 926 controls in the downstream analyses.
Variant annotation
Variants that passed the quality control steps were annotated using ANNOVAR [34], which annotates functions of genetic variants utilizing up-to-date information. Variants predicted as frameshift mutations, stop-gain, stop-loss, or splicing site alterations were categorized as loss-of-function (LoF) variants. Missense variants with predicted REVEL [35] score > 0.5 were considered deleterious missense (Dmis) variants. Other missense (Omis) variants were those with a REVEL score ≤ 0.5.
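The categorization rules above can be written as a small Python sketch. The consequence labels below are simplified placeholders rather than ANNOVAR's exact output strings, and the handling of missense variants without a REVEL score is an assumption, since the text does not say how such variants were treated.

```python
# Simplified, hypothetical consequence labels (not ANNOVAR's exact strings).
LOF_CONSEQUENCES = {"frameshift", "stopgain", "stoploss", "splicing"}


def classify_variant(consequence, revel=None, threshold=0.5):
    """Assign a variant to LoF, Dmis, Omis, synonymous, or other.

    LoF: frameshift, stop-gain, stop-loss, or splice-site alteration.
    Dmis: missense with REVEL > threshold; Omis: missense with REVEL <= threshold.
    """
    if consequence in LOF_CONSEQUENCES:
        return "LoF"
    if consequence == "missense":
        if revel is None:
            # Assumption: unscored missense variants are kept in their own bin.
            return "missense_unscored"
        return "Dmis" if revel > threshold else "Omis"
    if consequence == "synonymous":
        return "synonymous"
    return "other"
```

Note that a REVEL score of exactly 0.5 falls in the Omis category, matching the "> 0.5" and "≤ 0.5" boundaries stated above.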
We also annotated the functions of the top-associated variants and genes using a series of bioinformatics tools, including those incorporating recent deep-learning methods. The tools include Combined Annotation-Dependent Depletion (CADD) score [36] for variant deleteriousness (higher values indicate more deleterious cases, a commonly used cutoff for which is 20), SpliceAI [37] to predict the splice-altering consequence (scores range from 0–1), AlphaMissense [38] to predict the pathogenicity of missense variants (scores range from 0–1), and Sei to predict regulatory functions based on chromatin features across diverse cell types [39].
Single-variant association analysis
Single-variant association analyses were performed on variants with minor allele counts (MAC) ≥ 5 using SAIGE-GENE+ [40], correcting for age, sex, sequencing batch, and 10 principal components (PCs). Sex-stratified analyses in males and females were also performed, correcting for all covariates other than sex. Analyses were performed in EUR and AFR separately, followed by a cross-ancestry meta-analysis using METAL [41] with a sample-size-weighted approach. Bonferroni correction was used to define the genome-wide significance of the associations (p < 0.05/253,281 = 1.97 × 10−7). Rare variant PCs derived using PCA on variants with 5 ≤ MAC < 40 in each ancestry were added as additional covariates for sensitivity analyses [42].
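As a worked illustration of the two calculations named above, the following Python sketch computes a Bonferroni threshold and a sample-size-weighted combination of per-ancestry results. It is a sketch of the general sample-size-weighted Z-score approach (signed Z-scores weighted by the square root of the sample size), not METAL itself; the function names are hypothetical, and using the total sample size as the weight is an assumption.

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold for n_tests independent tests."""
    return alpha / n_tests


def signed_z(p_value, beta):
    """Convert a two-sided p-value to a Z-score carrying the sign of beta."""
    z = -_STD_NORMAL.inv_cdf(p_value / 2.0)  # positive magnitude
    return z if beta >= 0 else -z


def sample_size_weighted_meta(studies):
    """Combine studies given as (p_value, beta, n) tuples.

    Each study contributes its signed Z-score with weight sqrt(n);
    returns the combined (z, two-sided p).
    """
    weighted_sum = 0.0
    weight_sq_sum = 0.0
    for p_value, beta, n in studies:
        w = sqrt(n)
        weighted_sum += w * signed_z(p_value, beta)
        weight_sq_sum += w * w
    z = weighted_sum / sqrt(weight_sq_sum)
    return z, 2.0 * _STD_NORMAL.cdf(-abs(z))
```

For example, `bonferroni_threshold(0.05, 253281)` reproduces the 1.97 × 10−7 threshold stated above, and two studies of equal size with opposite effect directions and equal p-values cancel to a combined Z of zero.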
Gene-based tests
Rare variants may act in aggregate and therefore be detected by gene-based analysis. To identify genes associated with OD, we performed gene-based tests using SAIGE-GENE+ which incorporates multiple minor allele frequency (MAF) cutoffs and functional annotations to improve power [40]. For each gene, coding variants were aggregated by different annotations (LoF only, LoF+missense, LoF+missense+synonymous, synonymous) and by different MAF cutoffs (MAF ≤ 0.01%, MAF ≤ 0.1%, MAF ≤ 1% and all). The covariates used were the same as those in single-variant analyses. EUR and AFR were analyzed separately, followed by a cross-ancestry meta-analysis [41]. The Bonferroni-corrected significance threshold for the exome-wide, gene-based analysis was set at p = 2.90 × 10−6 (17,252 genes).
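The grid of annotation groups and MAF cutoffs described above can be sketched in Python as follows. This only enumerates which variants would enter each mask for a single gene; it does not reproduce the SAIGE-GENE+ test statistics. The variant record layout and the function name `build_masks` are assumptions, and "missense" here stands for all missense variants regardless of REVEL score.

```python
from itertools import product

# Annotation groups as listed in the text; keys are the mask labels.
ANNOTATION_GROUPS = {
    "LoF": {"LoF"},
    "LoF+missense": {"LoF", "missense"},
    "LoF+missense+synonymous": {"LoF", "missense", "synonymous"},
    "synonymous": {"synonymous"},
}

# MAF cutoffs as fractions (0.01%, 0.1%, 1%); None means no cutoff ("all").
MAF_CUTOFFS = [0.0001, 0.001, 0.01, None]


def build_masks(variants):
    """Map each (annotation group, MAF cutoff) pair to the variant IDs it includes.

    `variants` is a list of dicts with hypothetical keys "id",
    "annotation", and "maf" for one gene.
    """
    masks = {}
    for (group, allowed), cutoff in product(ANNOTATION_GROUPS.items(), MAF_CUTOFFS):
        masks[(group, cutoff)] = [
            v["id"]
            for v in variants
            if v["annotation"] in allowed and (cutoff is None or v["maf"] <= cutoff)
        ]
    return masks


# Exome-wide gene-based threshold: 0.05 / 17,252 genes.
GENE_BASED_THRESHOLD = 0.05 / 17252
```

The constant `GENE_BASED_THRESHOLD` evaluates to approximately 2.90 × 10−6, matching the threshold stated above.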
Differential gene expression for candidate genes
For genes identified through both single-variant association analyses and gene-based tests, we examined the gene expression changes in brain tissues and cell types related to OUD. Bulk RNA-seq data from a previous study of four postmortem brain tissues from the Veterans Affairs Brain Bank [43], including dorsolateral prefrontal cortex (dlPFC), medial orbitofrontal cortex (mOFC), dorsal anterior cingulate cortex (dACC), and subgenual prefrontal cortex (sgPFC), were re-analyzed in 14 OUD cases and 88 controls. We also examined cell-type-specific gene expression changes using single-nuclei transcriptomics in human striatum from a previous study, which includes 6 OUD cases and 6 controls [44].
Gene set enrichment analyses
Enrichment analyses of Gene Ontology (GO) terms [45] and Reactome pathways [46] were performed in g:Profiler [47] with nominally significant (p < 0.05) genes derived from the cross-ancestry, gene-based meta-analysis. For network visualization of the gene set enrichments, we downloaded the gene set lists of GO terms and Reactome pathways from g:Profiler and used the terms with adjusted p-value < 0.01 as input for the Enrichment Map [48] app implemented in Cytoscape [49]. Enriched terms were clustered using a Jaccard similarity coefficient cutoff of 0.6, with singleton clusters discarded. Within-ancestry gene-set enrichment analyses were performed using selected genes (p < 0.05) from the gene-based association results of EUR and AFR, respectively. The input genes were ranked based on their respective beta values from the gene-based association tests in each ancestry. The top five terms from GO biological processes and KEGG pathways were visualized using the clusterProfiler [50] package in R.
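The Jaccard similarity coefficient used to link enriched terms is the size of the intersection of two gene sets divided by the size of their union. A minimal Python sketch follows; the function names are hypothetical, and treating a pair as connected when the coefficient is greater than or equal to the cutoff (rather than strictly greater) is an assumption about how the cutoff is applied.

```python
def jaccard(genes_a, genes_b):
    """Jaccard similarity: |A intersect B| / |A union B| for two gene sets."""
    a, b = set(genes_a), set(genes_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def similar_term_pairs(term_genes, cutoff=0.6):
    """Return pairs of terms whose gene sets have Jaccard similarity >= cutoff.

    `term_genes` maps a term name to an iterable of gene symbols.
    """
    terms = sorted(term_genes)
    return [
        (t1, t2)
        for i, t1 in enumerate(terms)
        for t2 in terms[i + 1:]
        if jaccard(term_genes[t1], term_genes[t2]) >= cutoff
    ]
```

Terms that end up with no connected partner under this rule correspond to the singleton clusters that were discarded.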
Results
Single-variant association analysis
Variants were called in 4530 WES samples using the BWA-GATK pipeline [29, 30]. After quality control, there were 1321 OD cases and 779 exposed controls in EUR (1,189,765 variants) and 864 cases and 926 exposed controls in AFR (1,189,609 variants), totaling 3890 individuals (Table 1). In EUR, 121,786 variants with MAC ≥ 5 were included in single-variant association analysis using a logistic mixed model implemented in SAIGE-GENE+ [40]; in AFR, 211,176 variants were included. In total, 253,281 variants were analyzed in the cross-ancestry meta-analysis, and Bonferroni corrections were used to determine the exome-wide significant associations.
In EUR, we identified a novel rare LoF variant in the RUVBL2 gene associated with OD (rs746301110, beta = −2.60, SE = 0.43, p = 6.59 × 10−10, Table 2, Supplementary Table 1). To confirm the sequencing quality of this variant, we extracted the quality metrics from the raw sequencing data. This indicated that the average depth (DP) was 56 and the average genotype quality (GQ) was 89, which both pass the variant QC thresholds (DP ≥ 10; GQ ≥ 20) [28].
In EUR, a suggestive association (p < 1 × 10−5; not passing the Bonferroni correction) was observed for rs765832505 (p = 8.22 × 10−6) in the TMCO3 gene (Table 2, Supplementary Table 1). In AFR, suggestive associations include rs804269 (p = 4.15 × 10−6) in NEIL2 and rs16860800 (p = 5.45 × 10−6) in CFAP44 (Table 2, Supplementary Table 2). The cross-ancestry meta-analysis combining both EUR and AFR samples identified additional suggestive associations, which include two variants in the FAM210B gene, rs6099114 (p = 5.55 × 10−6) and rs6099115 (p = 5.75 × 10−6), which are in high linkage disequilibrium (LD) in both populations (r2 > 0.99) (Fig. 1, Table 2, Supplementary Table 3). Sex-stratified single-variant association analyses (Supplementary Tables 4–9) identified suggestive variants, including a rare missense variant rs34270544 in females (p = 3.54 × 10−5) located in RHOD, one of the Rho GTPase proteins (Supplementary Table 9). The Rho GTPase family has been reported to have potential therapeutic functions in substance use disorders [51, 52]. Furthermore, rs34270544 has also been associated with obesity-related metabolic dysfunctions [53]. Sensitivity analyses including rare variant PCs as additional covariates showed negligible changes in the statistical significance of the top associations, with the p-value of rs746301110 changing from 6.59 × 10−10 to 7.23 × 10−10. In addition, after correcting for polysubstance use disorder status (any of alcohol dependence, cocaine dependence, tobacco dependence, and cannabis dependence), we observed no substantial changes in p-values for the top associations (e.g., the p-value of rs746301110 changed from 6.59 × 10−10 to 9.00 × 10−10).
Gene-based analysis
In EUR, while there were no exome-wide significant findings after Bonferroni correction, the gene-based analysis identified suggestive genes (p < 1 × 10−4). These included SLC22A10 (p = 1.55 × 10−5; Table 3, Supplementary Table 10) with the cumulative burden resulting from the category of rare synonymous variants (MAF ≤ 0.001), TMCO3 (p = 2.79 × 10−5) with the cumulative variant burden from all LoF+missense variants (MAF ≤ 0.5), and FAM90A1 (p = 6.06 × 10−5) evaluated by the aggregated effects of rare (MAF ≤ 0.001) LoF+missense+synonymous variants. In AFR, we identified genes DHX58 (p = 5.20 × 10−5, LoF+missense, MAF ≤ 0.001), CHRND (p = 5.99 × 10−5, synonymous, MAF ≤ 0.01), GLDN from the burden of LoF+missense+synonymous variants (p = 6.77 × 10−5 for MAF ≤ 0.01; p = 7.00 × 10−5 for MAF ≤ 0.5), PLAT (p = 7.00 × 10−5, LoF+missense, MAF ≤ 0.01), and H1-4 (p = 8.59 × 10−5, synonymous, MAF ≤ 0.001) (Table 3, Supplementary Table 11). In the cross-ancestry meta-analysis, CHRND was the most significant association (p = 4.39 × 10−6, synonymous, MAF ≤ 0.01), almost reaching the Bonferroni-corrected significance threshold. The other top genes include QPCTL (p = 4.25 × 10−5), COL3A1 (p = 7.95 × 10−5), and GPHB5 (p = 9.50 × 10−5) (Fig. 2, Table 3, Supplementary Table 12).
Fig. 2: Manhattan plot for meta-analyses of gene-based tests across the combinations of four variant categories and three MAF cutoffs. Burdens arising from different variant groups are represented by shape, with different MAF cutoffs separated by color. The red dashed line indicates the p-value threshold of 1 × 10−5, and the green dashed line indicates a p-value of 1 × 10−4. Genes passing the threshold of 1 × 10−4 are labeled.
We examined gene expression changes of the identified genes using published datasets (Supplementary Table 13). Bulk RNA-seq data from postmortem brain tissues revealed differential expression of CHRND in the dorsal anterior cingulate cortex (dACC) between OUD cases and controls (p = 0.042). In addition, single-nucleus RNA-seq data from the human striatum in OUD identified five genes (FAM210B, PLAT, DHX58, FAM90A1, and SLC22A10) with cell-type-specific expression changes. For example, FAM90A1 showed differential expression in astrocytes (p = 0.029), SLC22A10 in interneurons (p = 0.003), and FAM210B in a subtype of DRD1+ neurons (marker EPHA4+, p = 0.023).
GO and pathway enrichment analyses of the nominally significant genes (p < 0.05) from the meta-analysis of gene-based tests identified main clusters related to metabolic regulation and tissue development. Clusters were defined based on the major biological themes that represent the similarities between GO terms/pathways. Within the largest cluster, annotated as “metabolic regulation”, the enriched terms included those related to stimulus response and cell communication, processes in which the roles of opioid signals have been well characterized [54,55,56] (Fig. 3A, Supplementary Table 14). Within-ancestry gene-set enrichment analyses demonstrated nominally significant GO terms in alcohol metabolic process, and KEGG pathways related to vasopressin-regulated water reabsorption and hexose stimulus (Fig. 3B, C, Supplementary Tables 15, 16).
Fig. 3: A Integrative enrichment network of the genes derived from meta-gene-based analyses (p < 0.05) across GO biological processes, molecular function, cellular component and Reactome database. B, C Top five enriched terms from gene-set enrichment analysis (GSEA) for selected genes (p < 0.05) from gene-based associations across GO biological processes (left) and KEGG pathways (right) for European ancestry (B) and African ancestry (C).
Discussion
OD has enormous adverse health and economic consequences worldwide [4, 5]. Understanding the molecular mechanisms of OD could help in prevention efforts and in treating OD by developing novel drugs or repurposing existing ones. OD is moderately heritable [9]. As with other complex psychiatric disorders, many genes contribute to the etiology of OD, each with a small effect size [20]. Before the application of GWAS in this area, the most studied candidate genes included opioid receptor genes [8] (OPRM1 encodes µ-opioid receptor, OPRD1 encodes δ-opioid receptor), DRD2 (dopamine receptor D2), and BDNF (brain-derived neurotrophic factor; reviewed in ref. [19]). A functional coding variant rs1799971 (Asn40Asp) in OPRM1 has been studied extensively to understand its function in relation to OD and was confirmed in a moderately powerful GWAS [14]. Since then, larger studies confirmed the association with the OPRM1 locus repeatedly and discovered more than a dozen risk loci associated with OD [16,17,18]. In this study, OPRM1*rs1799971 is nominally significant in EUR (p = 1.72 × 10−2) but not in AFR (p = 0.90), consistent with previous observations [14] and reflecting the lower number of subjects in the present study compared to those where significant associations were identified.
WES and whole-genome sequencing have increasingly been used in human genetic studies, as rare variants or variants not genotyped in GWAS can explain part of the genetic architecture of complex diseases. Only a few WES studies of substance use or use disorders have been conducted [57, 58], including ours [28]. However, there are no prior published WES studies of OD; this multi-ancestry study, comprising 3890 participants from the Yale-Penn cohort, is the first.
We identified an exome-wide significant (and also genome-wide significant) rare variant in the RUVBL2 gene (rs746301110, p = 6.59 × 10−10) in EUR participants. This variant is a low-frequency INDEL (CATAGA/C, MAF < 0.01) in EUR and is common (MAF > 0.01) in most non-European populations (data from gnomAD v4.1.0 [59]), suggesting a potential ancestry-specific effect of this variant on OD. Further analyses in larger cohorts from multiple ancestries are needed to differentiate random variations from true ancestry-specific associations. Rs746301110 has potential functional consequences evidenced by its CADD score of 27, which predicts a high likelihood of deleteriousness. In silico prediction using SpliceAI predicts a splice acceptor variant with a probability of 1 (Table 2). The RUVBL2 (RuvB Like AAA ATPase 2) gene encodes a DNA helicase essential for homologous recombination and DNA double-strand break repair; thus, it plays an important role in transcriptional regulation [60] and cancers [61, 62]. GWAS have identified associations between variants in RUVBL2 and biometric traits including multiple blood protein levels [63]. Further research is warranted to ascertain the biological mechanism that may connect genetic variation at this gene with OD.
Other findings with suggestive evidence (p < 1 × 10−5) were identified in the genes NEIL2, CFAP44, FAM210B, and TMCO3. NEIL2 (Nei like DNA glycosylase 2) encodes a DNA glycosylase and also plays a role in DNA break repair. Variants within this gene were associated with biometric traits [64] and lipid level interactions with alcohol consumption [65]. Rs16860800 in CFAP44 showed a nominally significant association with OUD in the MVP AFR samples [16]. Two variants in high LD in the FAM210B gene, which encodes a mitochondrial membrane protein, were identified in the cross-ancestry meta-analysis: one is intronic (rs6099114) and the other is a missense coding variant (rs6099115). AlphaMissense predicts rs6099115 to be likely benign. Among the regulatory elements assessed by Sei, CTCF binding exhibits the highest predicted regulatory impact for both rs6099114 and rs6099115. Specifically, rs6099114 is predicted to disrupt CTCF binding with strong evidence, reflected by an absolute sequence class score difference of 4.01 between the reference and alternative alleles. Rs6099115 is also predicted to affect CTCF binding, with an absolute score difference of 1.39 (Table 2). A missense variant in the TMCO3 (transmembrane and coiled-coil domains 3) gene was also identified and is predicted to be likely benign by AlphaMissense. Interestingly, variants in these genes were associated with several biometric and anthropometric traits recorded in the GWAS Catalog [66].
The most significantly associated gene from the gene-based tests is CHRND (Cholinergic Receptor Nicotinic Delta Subunit, p = 4.39 × 10−6) in the cross-ancestry analysis, which was mainly driven by the results in AFR (p = 5.99 × 10−5). This gene encodes the delta subunit of the nicotinic acetylcholine receptor, which mediates synaptic transmission at the neuromuscular junction. Defects in this gene are a cause of multiple muscle-related disorders [67]. A burden of synonymous rare variants (MAF ≤ 0.001) in SLC22A10 was observed in European samples (p = 1.55 × 10−5). This gene was nominally associated with OUD in a prior GWAS through gene-based analysis (p = 0.028 in EUR, p = 0.017 in AFR, and p = 0.003 in cross-ancestry meta-analysis) [18]. Variants in SLC22A10 have been reported to be associated with blood and urine biomarkers in the UK Biobank [64]. Convergent evidence was obtained in European samples from both the single-variant and gene-based associations in the TMCO3 gene, with the burden of LoF+missense variants (p = 2.79 × 10−5).
This study has limitations, the most significant being its relatively small sample size compared to that required for well-powered genomic discovery of complex traits. Increasing the study sample by incorporating biobank-level data has proven to be a useful way to identify novel risk variants associated with SUDs [16, 18, 68]. However, there is limited sequencing data from individuals with OD in publicly available cohorts. Thus, this study of <4000 subjects lacks ideal power, but it is the first study of its kind. Incorporating data from large-scale biobanks, such as All of Us [69], could be a critical next step in advancing the genetic study of opioid dependence. Another limitation is that WES only covers the coding regions of the genome, and we therefore cannot uncover the part of the genetic architecture of OD that exists in non-coding regions. Larger studies are required to identify more associations and to confirm the ones we present here. The Yale-Penn cohort has a high prevalence of multiple substance use disorders. Although the associations of the top variants remained largely unchanged after adjusting for polysubstance use disorder status (see Results), the results observed here may conceivably reflect a broader construct of generalized genetic liability to substance use disorders.
Our study includes roughly equal numbers of EUR and AFR subjects, avoiding a bias that is common in genetics studies. Focusing on the coding regions is essential to understanding the variant effect on proteins, as these are among the best targets for drug development, and we successfully identified significant risk variants and genes.
Data availability
The genetic summary statistics are available at https://medicine.yale.edu/lab/gelernter/stats/.
Code availability
All software used in this study is publicly available. EIGENSOFT, https://data.broadinstitute.org/alkesgroup/EIGENSOFT/; PLINK, https://www.cog-genomics.org/plink/; SAIGE-GENE+, https://github.com/weizhouUMICH/SAIGE; METAL, https://genome.sph.umich.edu/wiki/METAL_Documentation; GATK, https://gatk.broadinstitute.org/; ANNOVAR, https://annovar.openbioinformatics.org; CADD, https://cadd.gs.washington.edu/; SpliceAI, https://github.com/Illumina/SpliceAI; AlphaMissense, https://github.com/google-deepmind/alphamissense; Sei, https://github.com/FunctionLab/sei-framework.
References
American Psychiatric Association. Diagnostic and statistical manual of mental disorders, 4th ed. (DSM-IV). American Psychiatric Association, Washington, DC, 1994.
American Psychiatric Association. Diagnostic and statistical manual of mental disorders: DSM-5. American Psychiatric Association, Arlington, VA, 2013.
Strang J, Volkow ND, Degenhardt L, Hickman M, Johnson K, Koob GF, et al. Opioid use disorder. Nat Rev Dis Prim. 2020;6:3.
GBD 2017 Disease and Injury Incidence and Prevalence Collaborators. Global, regional, and national incidence, prevalence, and years lived with disability for 354 diseases and injuries for 195 countries and territories, 1990–2017: a systematic analysis for the Global Burden of Disease Study 2017. Lancet. 2018;392:1789–858.
Degenhardt L, Grebely J, Stone J, Hickman M, Vickerman P, Marshall BDL, et al. Global patterns of opioid use and dependence: harms to populations, interventions, and future action. Lancet. 2019;394:1560–79.
Jalal H, Buchanich JM, Roberts MS, Balmert LC, Zhang K, Burke DS. Changing dynamics of the drug overdose epidemic in the United States from 1979 through 2016. Science. 2018;361:eaau1184.
Koob GF, Volkow ND. Neurocircuitry of addiction. Neuropsychopharmacology. 2010;35:217–38.
Darcq E, Kieffer BL. Opioid receptors: drivers to addiction? Nat Rev Neurosci. 2018;19:499–514.
Goldman D, Oroszi G, Ducci F. The genetics of addictions: uncovering the genes. Nat Rev Genet. 2005;6:521–32.
Tsuang MT, Lyons MJ, Meyer JM, Doyle T, Eisen SA, Goldberg J, et al. Co-occurrence of abuse of different drugs in men: the role of drug-specific and shared vulnerabilities. Arch Gen Psychiatry. 1998;55:967–72.
Gelernter J, Kranzler HR, Sherva R, Koesterer R, Almasy L, Zhao H, et al. Genome-wide association study of opioid dependence: multiple associations mapped to calcium and potassium pathways. Biol Psychiatry. 2014;76:66–74.
Nelson EC, Agrawal A, Heath AC, Bogdan R, Sherva R, Zhang B, et al. Evidence of CNIH3 involvement in opioid dependence. Mol Psychiatry. 2016;21:608–14.
Cheng Z, Zhou H, Sherva R, Farrer LA, Kranzler HR, Gelernter J. Genome-wide association study identifies a regulatory variant of RGMA associated with opioid dependence in European Americans. Biol Psychiatry. 2018;84:762–70.
Zhou H, Rentsch CT, Cheng Z, Kember RL, Nunez YZ, Sherva RM, et al. Association of OPRM1 functional coding variant with opioid use disorder: a genome-wide association study. JAMA Psychiatry. 2020;77:1072–80.
Song W, Kossowsky J, Torous J, Chen CY, Huang H, Mukamal KJ, et al. Genome-wide association analysis of opioid use disorder: a novel approach using clinical data. Drug Alcohol Depend. 2020;217:108276.
Kember RL, Vickers-Smith R, Xu H, Toikumo S, Niarchou M, Zhou H, et al. Cross-ancestry meta-analysis of opioid use disorder uncovers novel loci with predominant effects in brain regions associated with addiction. Nat Neurosci. 2022;25:1279–87.
Gaddis N, Mathur R, Marks J, Zhou L, Quach B, Waldrop A, et al. Multi-trait genome-wide association study of opioid addiction: OPRM1 and beyond. Sci Rep. 2022;12:16873.
Deak JD, Zhou H, Galimberti M, Levey DF, Wendt FR, Sanchez-Roige S, et al. Genome-wide association study in individuals of European and African ancestry and multi-trait analysis of opioid use disorder identifies 19 independent genome-wide significant risk loci. Mol Psychiatry. 2022;27:3970–9.
Crist RC, Reiner BC, Berrettini WH. A review of opioid addiction genetics. Curr Opin Psychol. 2019;27:31–35.
Gelernter J, Polimanti R. Genetics of substance use disorders in the era of big data. Nat Rev Genet. 2021;22:712–29.
Hatoum AS, Johnson EC, Colbert SMC, Polimanti R, Zhou H, Walters RK, et al. The addiction risk factor: A unitary genetic vulnerability characterizes substance use disorders and their associations with common correlates. Neuropsychopharmacology. 2022;47:1739–45.
Palmer DS, Howrigan DP, Chapman SB, Adolfsson R, Bass N, Blackwood D, et al. Exome sequencing in bipolar disorder identifies AKAP11 as a risk gene shared with schizophrenia. Nat Genet. 2022;54:541–7.
Backman JD, Li AH, Marcketta A, Sun D, Mbatchou J, Kessler MD, et al. Exome sequencing and analysis of 454,787 UK Biobank participants. Nature. 2021;599:628–34.
Singh T, Poterba T, Curtis D, Akil H, Al Eissa M, Barchas JD, et al. Rare coding variants in ten genes confer substantial risk for schizophrenia. Nature. 2022;604:509–16.
Wainschtein P, Jain D, Zheng Z, TOPMed Anthropometry Working Group, NHLBI Trans-Omics for Precision Medicine (TOPMed) Consortium, Cupples LA, et al. Assessing the contribution of rare variants to complex trait heritability from whole-genome sequence data. Nat Genet. 2022;54:263–73.
Jang SK, Evans L, Fialkowski A, Arnett DK, Ashley-Koch AE, Barnes KC, et al. Rare genetic variants explain missing heritability in smoking. Nat Hum Behav. 2022;6:1577–86.
Xie P, Kranzler HR, Krystal JH, Farrer LA, Zhao H, Gelernter J. Deep resequencing of 17 glutamate system genes identifies rare variants in DISC1 and GRIN2B affecting risk of opioid dependence. Addict Biol. 2014;19:955–64.
Wang L, Kranzler HR, Gelernter J, Zhou H. Investigating the contribution of coding variants in alcohol use disorder using whole-exome sequencing across ancestries. Biol Psychiatry. 2025;98:46–55.
Li H, Durbin R. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics. 2009;25:1754–60.
Van der Auwera GA, Carneiro MO, Hartl C, Poplin R, Del Angel G, Levy-Moonshine A, et al. From FastQ data to high confidence variant calls: the Genome analysis toolkit best practices pipeline. Curr Protoc Bioinforma. 2013;43:11.10.1–11.10.33.
Galinsky KJ, Bhatia G, Loh PR, Georgiev S, Mukherjee S, Patterson NJ, et al. Fast principal-component analysis reveals convergent evolution of ADH1B in Europe and East Asia. Am J Hum Genet. 2016;98:456–72.
1000 Genomes Project Consortium, Auton A, Brooks LD, Durbin RM, Garrison EP, Kang HM, et al. A global reference for human genetic variation. Nature. 2015;526:68–74.
Pierucci-Lagha A, Gelernter J, Feinn R, Cubells JF, Pearson D, Pollastri A, et al. Diagnostic reliability of the Semi-structured Assessment for Drug Dependence and Alcoholism (SSADDA). Drug Alcohol Depend. 2005;80:303–12.
Wang K, Li M, Hakonarson H. ANNOVAR: functional annotation of genetic variants from high-throughput sequencing data. Nucleic Acids Res. 2010;38:e164.
Ioannidis NM, Rothstein JH, Pejaver V, Middha S, McDonnell SK, Baheti S, et al. REVEL: an ensemble method for predicting the pathogenicity of rare missense variants. Am J Hum Genet. 2016;99:877–85.
Schubach M, Maass T, Nazaretyan L, Roner S, Kircher M. CADD v1.7: using protein language models, regulatory CNNs and other nucleotide-level scores to improve genome-wide variant predictions. Nucleic Acids Res. 2024;52:D1143–D1154.
Jaganathan K, Kyriazopoulou Panagiotopoulou S, McRae JF, Darbandi SF, Knowles D, Li YI, et al. Predicting splicing from primary sequence with deep learning. Cell. 2019;176:535–48.e24.
Cheng J, Novati G, Pan J, Bycroft C, Zemgulyte A, Applebaum T, et al. Accurate proteome-wide missense variant effect prediction with AlphaMissense. Science. 2023;381:eadg7492.
Chen KM, Wong AK, Troyanskaya OG, Zhou J. A sequence-based global map of regulatory activity for deciphering human genetics. Nat Genet. 2022;54:940–9.
Zhou W, Bi W, Zhao Z, Dey KK, Jagadeesh KA, Karczewski KJ, et al. SAIGE-GENE+ improves the efficiency and accuracy of set-based rare variant association tests. Nat Genet. 2022;54:1466–9.
Willer CJ, Li Y, Abecasis GR. METAL: fast and efficient meta-analysis of genomewide association scans. Bioinformatics. 2010;26:2190–1.
Mathieson I, McVean G. Differential confounding of rare and common variants in spatially structured populations. Nat Genet. 2012;44:243–6.
Girgenti MJ, Wang J, Ji D, Cruz DA, Traumatic Stress Brain Research Group, Stein MB, et al. Transcriptomic organization of the human brain in post-traumatic stress disorder. Nat Neurosci. 2021;24:24–33.
Phan BN, Ray MH, Xue X, Fu C, Fenster RJ, Kohut SJ, et al. Single nuclei transcriptomics in human and non-human primate striatum in opioid use disorder. Nat Commun. 2024;15:878.
Ashburner M, Ball CA, Blake JA, Botstein D, Butler H, Cherry JM, et al. Gene ontology: tool for the unification of biology. The Gene Ontology Consortium. Nat Genet. 2000;25:25–29.
Fabregat A, Jupe S, Matthews L, Sidiropoulos K, Gillespie M, Garapati P, et al. The reactome pathway knowledgebase. Nucleic Acids Res. 2018;46:D649–D655.
Raudvere U, Kolberg L, Kuzmin I, Arak T, Adler P, Peterson H, et al. g:Profiler: a web server for functional enrichment analysis and conversions of gene lists (2019 update). Nucleic Acids Res. 2019;47:W191–W198.
Merico D, Isserlin R, Stueker O, Emili A, Bader GD. Enrichment map: a network-based method for gene-set enrichment visualization and interpretation. PLoS One. 2010;5:e13984.
Cline MS, Smoot M, Cerami E, Kuchinsky A, Landys N, Workman C, et al. Integration of biological networks and gene expression data using Cytoscape. Nat Protoc. 2007;2:2366–82.
Yu G, Wang LG, Han Y, He QY. clusterProfiler: an R package for comparing biological themes among gene clusters. OMICS. 2012;16:284–7.
Chardin P. GTPase regulation: getting aRnd Rock and Rho inhibition. Curr Biol. 2003;13:R702–704.
Ru Q, Wang Y, Zhou E, Chen L, Wu Y. The potential therapeutic roles of Rho GTPases in substance dependence. Front Mol Neurosci. 2023;16:1125277.
Tabur S, Oztuzcu S, Oguz E, Korkmaz H, Eroglu S, Ozkaya M, et al. Association of Rho/Rho-kinase gene polymorphisms and expressions with obesity-related metabolic syndrome. Eur Rev Med Pharmacol Sci. 2015;19:1680–8.
Vilardaga JP, Nikolaev VO, Lorenz K, Ferrandon S, Zhuang Z, Lohse MJ. Conformational cross-talk between alpha2A-adrenergic and mu-opioid receptors controls cell signaling. Nat Chem Biol. 2008;4:126–31.
Stefano GB, Scharrer B, Smith EM, Hughes TK Jr, Magazine HI, Bilfinger TV, et al. Opioid and opiate immunoregulatory processes. Crit Rev Immunol. 2017;37:213–48.
Koob GF. Neurobiology of opioid addiction: opponent process, hyperkatifeia, and negative reinforcement. Biol Psychiatry. 2020;87:44–53.
Kang J, Deng YT, Wu BS, Liu WS, Li ZY, Xiang S, et al. Whole exome sequencing analysis identifies genes for alcohol consumption. Nat Commun. 2024;15:5777.
Rajagopal VM, Watanabe K, Mbatchou J, Ayer A, Quon P, Sharma D, et al. Rare coding variants in CHRNB2 reduce the likelihood of smoking. Nat Genet. 2023;55:1138–48.
Chen S, Francioli LC, Goodrich JK, Collins RL, Kanai M, Wang Q, et al. A genomic mutational constraint map using variation in 76,156 human genomes. Nature. 2024;625:92–100.
Gallant P. Control of transcription by Pontin and Reptin. Trends Cell Biol. 2007;17:187–92.
Huber O, Menard L, Haurie V, Nicou A, Taras D, Rosenbaum J. Pontin and reptin, two related ATPases with multiple roles in cancer. Cancer Res. 2008;68:6873–6.
Wang H, Li B, Zuo L, Wang B, Yan Y, Tian K, et al. The transcriptional coactivator RUVBL2 regulates Pol II clustering with diverse transcription factors. Nat Commun. 2022;13:5703.
Pietzner M, Wheeler E, Carrasco-Zanini J, Cortes A, Koprulu M, Worheide MA, et al. Mapping the proteo-genomic convergence of human diseases. Science. 2021;374:eabj1541.
Sinnott-Armstrong N, Tanigawa Y, Amar D, Mars N, Benner C, Aguirre M, et al. Genetics of 35 blood and urine biomarkers in the UK Biobank. Nat Genet. 2021;53:185–94.
de Vries PS, Brown MR, Bentley AR, Sung YJ, Winkler TW, Ntalla I, et al. Multiancestry genome-wide association study of lipid levels incorporating gene-alcohol interactions. Am J Epidemiol. 2019;188:1033–54.
Sollis E, Mosaku A, Abid A, Buniello A, Cerezo M, Gil L, et al. The NHGRI-EBI GWAS catalog: knowledgebase and deposition resource. Nucleic Acids Res. 2023;51:D977–D985.
Bonanno C, Rodolico C, Topf A, Foti FM, Liu WW, Beeson D, et al. Severe congenital myasthenic syndrome associated with novel biallelic mutation of the CHRND gene. Neuromuscul Disord. 2020;30:336–9.
Zhou H, Kember RL, Deak JD, Xu H, Toikumo S, Yuan K, et al. Multi-ancestry study of the genetics of problematic alcohol use in over 1 million individuals. Nat Med. 2023;29:3184–92.
All of Us Research Program Genomics Investigators. Genomic data in the All of Us Research Program. Nature. 2024;627:340–6.
Acknowledgements
We thank the participants in the Yale-Penn cohorts. The authors are supported by grants from the National Institutes of Health (NIH) (R01 DA12890, R01 AA026364, R01 AA11330, R01 DA037974, P30 DA046345, R21 DA063120, P50 AA012870, U54 AA027989, RM1 HG011558, R01 AA030056, and DP1 DA058737) and the U.S. Department of Veterans Affairs (1I01 CX001849 and I01 BX004820).
Author information
Contributions
LW, JG, and HZ conceived and designed the study. JG, HRK, and YZN obtained the samples. HZ and LW called the variants. LW performed the analyses and wrote the manuscript. JLMO and KJB designed the gene expression analyses. JJMM, MRH, and ZM conducted additional analyses. All authors contributed to and approved the final manuscript.
Ethics declarations
Competing interests
J.G. is paid for his editorial work on the journal Complex Psychiatry. H.R.K. is a member of advisory boards for Altimmune, Dicerna Pharmaceuticals, Sophrosyne Pharmaceuticals, Enthion Pharmaceuticals, and Clearmind Medicine; a consultant to Altimmune and Sobrera Pharmaceuticals; the recipient of research funding and medication supplies for an investigator-initiated study from Alkermes; and a member of the American Society of Clinical Psychopharmacology’s Alcohol Clinical Trials Initiative, which was supported in the past 3 years by Alkermes, Dicerna, Ethypharm, Lundbeck, Mitsubishi, Otsuka, and Pear Therapeutics.
Additional information
Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Wang, L., Nuñez, Y.Z., Martínez-Magaña, J.J. et al. Whole-exome sequencing study of opioid dependence offers novel insights into the contributions of exome variants. Transl Psychiatry 15, 380 (2025). https://doi.org/10.1038/s41398-025-03578-y