Introduction

Depression is a leading cause of disability1, yet it is characterized by multiple etiologies, including genetic effects. Twin and family-based studies have estimated its heritability at 30%–40%2. However, genome-wide association studies (GWAS), which focus on common single nucleotide polymorphisms (SNPs), have attributed only 5%–10% of the variance in liability to depression to such variants3,4. The missing heritability in depression may well be elucidated by further exploring shared environmental factors and the interplay of genes and environments.

Childhood maltreatment (CM) has been consistently identified as a major contributing factor in the development of depression. Recent studies suggested an amplified genetic contribution to major depressive disorder (MDD) in the presence of trauma5, with 7%–20% of MDD risk variance accounted for by significant genome-by-trauma interaction6. Previous gene-by-environment (G-by-E or G×E) studies tested the interplay between candidate genes and CM in the occurrence of depression from the perspective of known biological pathways7,8. Genome-wide by environment interaction studies (GWEIS) have yielded fewer findings than traditional GWAS because they require larger, balanced samples, more comprehensive environmental assessments, and greater computational resources. Moreover, the genetic variants identified in previous G-by-E studies were rarely replicated9,10. As a result, the genetic variants that contribute to the risk of depression via gene-by-CM interactions remain unclear. Recent genetic approaches enable unbiased exploration across the whole genome for variants involved in G-by-E interplay.

In this work, we investigate the interplay of genes and CM in the occurrence of MDD by applying an accurate method named SPAGE11 (SaddlePoint Approximation implementation of G-by-E analysis) to data from the UK Biobank. We identify a genome-wide significant genomic signal interacting with CM and uncover the underlying methylation mechanism. Because brain structure is a reliable endophenotype of psychiatric disorders, the brain structures affected by G-by-E interactions are examined and compared in adults from the UK Biobank and children from the Adolescent Brain Cognitive Development Study® (ABCD). Drawing on perspectives from brain development and the critical periods theory of CM exposure12, we further investigate the potential brain imaging mechanism linking G-by-E interactions to MDD in the ABCD child cohort using a prospective design. Finally, we conduct a preliminary exploration of G-by-E effects on antidepressant treatment efficacy within Chinese Han cohorts. The study framework is presented in Fig. 1.

Fig. 1: Study frame.

MDD = major depressive disorder, HC = healthy control, G-by-E or G × E = gene(G)-by-environment(E), ABCD = the Adolescent Brain Cognitive Development Study, ROI = region of interest.

Results

Genome-wide approach revealed IDO2 moderating CM’s impact on MDD

In the discovery sample, a total of 26,664 MDD cases and 85,715 controls of European ancestry were included in the SPAGE analysis. The mean age was 56.2 ± 7.67 years and 44.20% were male. A significantly higher rate of CM experience was found in MDD cases than in controls (45.56% vs. 29.29%, P < 0.01). Details are provided in Supplementary Data 1.

The SPAGE analysis using cumulative CM experiences identified one genome-wide significant G-by-E genomic signal (see Fig. 2a). After clumping, the index SNP was rs7846217 (A1 = C, A2 = T, MAF = 0.13, P = 2.04e–08), which physically maps to the IDO2 gene. Suggestive variants from the SPAGE analysis can be found in Supplementary Data 2. SPAGE models using CM severity, CM experience (yes/no), physical abuse, emotional abuse, sexual abuse, physical neglect and emotional neglect did not identify any common genome-wide significant G-by-E variant (see Supplementary Fig. 1). We therefore focused on the main results from cumulative CM experiences. Taking rs7846217 as an example to depict the G-by-E effects, the risk of developing MDD associated with cumulative CM experiences increased gradually across rs7846217 CC, CT and TT carriers (ORCC = 1.25 [1.13, 1.38]; ORCT = 1.34 [1.31, 1.38]; ORTT = 1.45 [1.43, 1.47]; see Fig. 2b). A similar G-by-E pattern for rs7846217 was also found in the CM subscales (see Fig. 2b) and in the sex subgroups (Supplementary Data 3). In terms of the genetic effect, compared with the CT/CC genotypes, the rs7846217 TT genotype showed a significantly lower risk for MDD among individuals not exposed to CM (OR = 0.92 [0.87, 0.97], P = 3.13e–03) but a significantly higher risk for MDD among individuals exposed to CM (OR = 1.08 [1.03, 1.12], P = 5.49e–04, see Fig. 2c). Sensitivity analyses showed consistent results (see Supplementary Fig. 2 and Supplementary Data 4).
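The genotype-stratified odds ratios reported above can be contrasted with a simple post hoc calculation. The following is a minimal sketch, not the SPAGE procedure: it recovers the standard error of each log OR from its reported 95% confidence interval and applies a two-sided z-test to the difference between two strata, assuming the strata are independent. The function names are illustrative, and the rounding of the published intervals limits the precision of the result.

```python
import math


def log_or_and_se(or_value, ci_low, ci_high, z_crit=1.96):
    """Recover log(OR) and its standard error from a reported 95% CI."""
    log_or = math.log(or_value)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    return log_or, se


def compare_strata(stratum_a, stratum_b):
    """Two-sided z-test for the difference between two independent log ORs.

    Each stratum is a tuple (OR, CI lower bound, CI upper bound).
    Returns the ratio of ORs (b over a), the z statistic and the p-value.
    """
    log_a, se_a = log_or_and_se(*stratum_a)
    log_b, se_b = log_or_and_se(*stratum_b)
    diff = log_b - log_a
    se_diff = math.sqrt(se_a ** 2 + se_b ** 2)
    z = diff / se_diff
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal p-value
    return math.exp(diff), z, p


# Cumulative-CM odds ratios for rs7846217 CC and TT carriers, as reported in the text.
or_cc = (1.25, 1.13, 1.38)
or_tt = (1.45, 1.43, 1.47)

ratio, z, p = compare_strata(or_cc, or_tt)
print(f"ratio of ORs (TT vs CC) = {ratio:.2f}, z = {z:.2f}, p = {p:.3g}")
```

This contrast only illustrates the direction of the interaction; the genome-wide P value in the text comes from the SPAGE model, which is not reproduced here.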

Fig. 2: Genome-wide by environment interaction results of cumulative childhood maltreatment experiences in UK Biobank.

We conducted a genome-wide by environment interaction analysis with cumulative childhood maltreatment (CM) experiences to search for genetic variants associated with major depressive disorder (MDD) using the SaddlePoint Approximation implementation of Gene-by-Environment (G-by-E) analysis (SPAGE) approach in UK Biobank (Ncases = 26,664, Ncontrols = 85,715). a Shown are G-by-E interaction P values versus chromosomal position. We identified one locus at genome-wide significance (two-tailed genome-wide corrected threshold at P < 5e-08; red dashed horizontal line). The arrow indicates the index variant IDO2 rs7846217. The orange dashed line indicates the genome-wide suggestive significance threshold, P < 1e-05. Source data are provided through Figshare (https://doi.org/10.6084/m9.figshare.27124452). b Shown are odds ratios (ORs) of different types of CM on risk for MDD using logistic regression, stratified by IDO2 rs7846217 genotypes (three genotypes: CC, CT, and TT). The center of each dot represents the OR, with error bars indicating the 95% confidence intervals (CIs). For CM (yes/no, indicating binary CM experience), the ORs reflect the risk for MDD in individuals exposed to CM relative to those unexposed. For cumulative CM (Cum-CM), physical abuse (PhyAbu), emotional abuse (EmoAbu), sexual abuse (SexAbu), physical neglect (PhyNeg), and emotional neglect (EmoNeg), the ORs represent the odds of MDD per 1 standard deviation (SD) increase in the respective CM severity scale. Two-tailed P values for the G-by-E interaction effects are shown below the x-axis. Source data are provided as a Source Data file. c Shown are odds ratios (ORs) of the IDO2 rs7846217 TT genotype relative to the TC/CC genotypes on risk for MDD, stratified by CM exposure. The center of each dot represents the OR, with error bars indicating the 95% CIs. Two-tailed P values are shown below the x-axis.

IDO2 hypomethylation in subjects exposed to CM directly linked to MDD

SNP rs7846217 is located in an intron of IDO2 and scored 1f (eQTL + TF binding + any motif) in RegulomeDB13. HaploReg14 indicated the presence of multiple transcription factor binding sites within this genomic region, which overlapped with DNase I hypersensitive sites (DHSs), promoter histone marks and enhancer histone marks. The eQTLGen15 and QTLbase16 databases showed that this variant is an expression quantitative trait locus (eQTL) for IDO1 and IDO2 in blood (P = 2.36e-55 and 6.14e-21, respectively), and a methylation quantitative trait locus (meQTL) in blood and brain prefrontal cortex (cg12565636; blood: P = 9.550e-62, brain: P = 2.690e-28). In the G-by-E context, IDO1/IDO2 expression or methylation levels may capture the biological alterations underlying the interaction with CM.

Colocalization between disease GWAS signals and expression/methylation signals has been successfully used to provide biological evidence for causal associations between traits. Because we focused on causal associations specific to CM exposure, we used MDD summary data obtained from individuals exposed to CM in the discovery sample to conduct the colocalization analysis. Blood and brain eQTL and meQTL data from public databases were tested. We detected 11 significant colocalization pairs between blood meQTL signals of IDO2 and MDD using eCAVIAR at the 1% colocalization posterior probability (CLPP) level (causal variants: rs2340953 and rs28631334, see Fig. 3a). In particular, the CLPP value supported rs2340953 as a colocalizing variant between cg25981315 and MDD (CLPP = 16.8%, see Fig. 3b). The SMR analysis also identified a shared causal variant, rs11777027, between the brain meQTL signal of IDO2 and MDD (SMR: βsmr = -0.08, SEsmr = 0.029, Psmr = 4.67e-03, PHEIDI = 0.83), with decreased brain methylation at cg12565636 associated with increased risk for MDD. The three variants were in LD with the index SNP rs7846217 (Fig. 3c). The LD r2 values were 0.94, 0.96 and 0.76, and the G-by-E P values were 1.44e-07, 1.59e-07 and 4.99e-07 for rs2340953, rs11777027 and rs28631334, respectively.
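To make the CLPP threshold concrete, the sketch below shows how a colocalization posterior probability is commonly formed in the eCAVIAR framework: for each variant, the posterior probability of being causal in the GWAS is multiplied by the posterior probability of being causal in the QTL study, and variants whose product exceeds the chosen level (1% here) are reported. The posterior values and variant labels are hypothetical placeholders, not results from this study, and the fine-mapping step that would produce such posteriors is not shown.

```python
def clpp(gwas_posterior, qtl_posterior):
    """Per-variant CLPP: P(causal in GWAS) * P(causal in QTL study).

    Both arguments map variant IDs to posterior probabilities obtained
    from separate fine-mapping runs; only shared variants are scored.
    """
    shared = sorted(set(gwas_posterior) & set(qtl_posterior))
    return {snp: gwas_posterior[snp] * qtl_posterior[snp] for snp in shared}


def colocalized_variants(clpp_by_snp, threshold=0.01):
    """Return the variants whose CLPP exceeds the threshold (1% by default)."""
    return [snp for snp, value in clpp_by_snp.items() if value > threshold]


# Hypothetical posteriors for three variants at one locus (illustration only).
gwas = {"snp_a": 0.42, "snp_b": 0.05, "snp_c": 0.30}
meqtl = {"snp_a": 0.40, "snp_b": 0.10, "snp_c": 0.02}

scores = clpp(gwas, meqtl)
print(colocalized_variants(scores))
```

In this toy input only `snp_a` passes, because a variant needs appreciable posterior support in both data sets for the product to clear the threshold.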

Fig. 3: Colocalization of genomic signal for depression with methylation levels in individuals exposed to childhood maltreatment.

a Shown are colocalization results for blood/brain methylation quantitative trait loci (meQTL) effects and major depressive disorder (MDD) genomic summary data obtained from individuals exposed to childhood maltreatment (Ncases = 17,225, Ncontrols = 45,372). Blue circles indicate meQTL effects, and red triangles indicate genetic effects on MDD; three different shades of blue/red represent the 3 colocalized variants rs11777027, rs2340953 and rs28631334, respectively. The center of each dot represents the beta, with error bars indicating the 95% confidence intervals (CIs). Source data are provided as a Source Data file. b Shown is a scatter plot for the colocalization pair of blood meQTL cg25981315 and the MDD genetic signal rs2340953 within the CM exposure group, with the highest colocalization posterior probability of 16.8% from eCAVIAR. c Shown is a locus zoom plot around the index variant rs7846217 generated with LocusZoom.org. The three colocalized variants rs2340953, rs11777027 and rs28631334 are labeled, showing linkage disequilibrium (LD) with rs7846217. The LD r2 values are 0.94, 0.96 and 0.76, respectively, with 1000 Genomes as reference. d Shown is the protein-protein interaction (PPI) network for IDO2 queried from STRING. The tryptophan-kynurenine catabolic process is the top-ranked pathway in the PPI network and is labeled using red dots.

PPI network and gene enrichment highlighted TRP-KYN metabolism

IDO2 was subjected to functional pathway analyses performed with STRING (Fig. 3d). The protein-protein interaction (PPI) network was significantly enriched (P = 1.0e-16). The tryptophan-kynurenine (TRP-KYN) metabolic process was the top pathway in the GO biological process (Pfdr = 1.70e-10), local network cluster (Pfdr = 3.80e-05) and KEGG pathway (Pfdr = 9.13e-42) categories. The TRP-KYN pathway, with IDO2 at its core, regulates tryptophan breakdown and kynurenine metabolite production. Atypical depressive disorder was one of the top enriched diseases for these genes (Pfdr = 4.95e-2). GO enrichment analysis using clumped variants with P < 1e-05 from SPAGE identified significant enrichment in the tryptophan, kynurenine and indolalkylamine metabolic processes, in which IDO1 and IDO2 are involved (q-value = 0.014 for all). Additionally, pathways such as regulation of hippo signaling (genes: WWC1/SHANK2), maintenance of synapse structure (genes: SHANK2/ERC1), neuron recognition (genes: NTM/ARHGAP35), and serine/threonine protein kinase complex (genes: CDK14/GTF2H5/ERC1) were also significantly enriched (q-value < 0.05, see Supplementary Data 5).

IDO2-by-CM influenced the supramarginal gyrus in children and adults

A total of 20,901 adults in UK Biobank who completed the CM measure, structural magnetic resonance imaging (sMRI) and genotyping were included to test which brain structures are affected by the identified G-by-E effect (IDO2-by-CM). The mean age of this subsample was 55.29 ± 7.43 years and 46.88% were male. We mainly focused on the identified interaction between the index SNP rs7846217 and cumulative CM experience. Of the sMRI indicators, mean thickness of the supramarginal gyrus in the left hemisphere (G-pariet-inf-Supramar: GPIS-LCT) passed the Bonferroni-corrected significance level (P = 1.34e-04 < 3.38e-04, see Fig. 4a). Consistent with the results for the MDD phenotype, among individuals carrying the rs7846217 TT risk genotype, cumulative CM experience was associated with significantly decreased cortical thickness of GPIS-LCT (β = -0.02, SE = 0.007, P = 0.007), whereas positive associations between cumulative CM experience and cortical thickness of GPIS-LCT were found among rs7846217 TC and CC carriers (β = 0.03, SE = 0.013, P = 0.037; β = 0.11, SE = 0.047, P = 0.018, respectively, see Fig. 4d).

Fig. 4: Brain structures affected by interaction effects between IDO2 and childhood maltreatment.

a, b Shown are brain structural magnetic resonance imaging (sMRI) indicators affected by the gene-by-environment (G-by-E) interaction of IDO2 rs7846217 and cumulative childhood maltreatment (CM) experience in adults using UK Biobank (a, N = 20,901) and in children using the ABCD cohort (b, N = 5,335). Each dot represents an individual brain region, and different colors represent cortical area (CA), cortical thickness (CT), cortical volume (CV) and sulcal depth (SD), respectively. Linear regression models adjusted for age, sex and the top ten genomic principal components (PCs) were performed, and the -logP values for G-by-E are illustrated here. The red dashed line indicates the Bonferroni-corrected P threshold for morphometric measures, with P = 3.38e-04 (calculated as 0.05/148) for (a) and P = 3.31e-04 (calculated as 0.05/151) for (b). The sMRI indicators that passed the threshold are labeled. GPIS-LCT = mean thickness of G-pariet-inf-Supramar in the left hemisphere, GPIA-RCV = cortical volume of G-pariet-inf-Angular in the right hemisphere. Source data are provided as a Source Data file. c Shown is a diagram representing the cross-validation. The significant indicator identified in adults was validated in children, but not vice versa. d–f Shown are effects of cumulative CM experience on GPIS-LCT in adults (d, N = 20,901) and children (e, N = 5,335), and on GPIA-RCV in children (f, N = 5,335), stratified by rs7846217 genotypes. Linear regression models adjusted for age, sex and the top ten PCs were performed. The center of each dot represents the beta coefficient, with error bars indicating the 95% confidence intervals (CIs). P values across subgroups of rs7846217 genotypes are labeled, and two-tailed P < 0.05 was considered significant. Source data are provided as a Source Data file.

Childhood is a sensitive period during which CM exerts its effects. Thus, we tested whether the IDO2-by-CM effects on brain structure have already emerged in childhood. A total of 5,335 children of European ancestry were included in the analysis. The mean age of the sample was 9.90 ± 0.62 years and 52.76% were male. A whole-brain scan was first performed to test whether IDO2-by-CM affects brain structure in children. Cortical volume of the right hemisphere angular gyrus (G-pariet-inf-Angular: GPIA-RCV) passed the Bonferroni-corrected significance level (P = 1.02e-04 < 3.31e-04, see Fig. 4b).
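The two Bonferroni-corrected thresholds used for the imaging scans follow directly from the number of morphometric measures tested in each cohort (148 in adults and 151 in children, as stated in the legend of Fig. 4). A short sketch of the arithmetic, with an illustrative helper name, is given here.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance level after Bonferroni correction."""
    if n_tests <= 0:
        raise ValueError("n_tests must be a positive integer")
    return alpha / n_tests


adult_threshold = bonferroni_threshold(0.05, 148)  # UK Biobank adults
child_threshold = bonferroni_threshold(0.05, 151)  # ABCD children

print(f"adults:   {adult_threshold:.2e}")   # 3.38e-04
print(f"children: {child_threshold:.2e}")   # 3.31e-04

# Reported G-by-E P values compared against the cohort-specific thresholds.
print(1.34e-04 < adult_threshold)  # GPIS-LCT in adults: True
print(1.02e-04 < child_threshold)  # GPIA-RCV in children: True
```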

We then cross-validated the top regions of interest (ROIs) in the two cohorts (see Figs. 4c–f). The top ROI identified in adults, GPIS-LCT, attained a suggestive significance level (P = 9.72e-04 < 0.001) in children, although the genetic effect was in the opposite direction. However, GPIA-RCV, which was identified in children, failed to be validated in adults (P = 0.94). Taken together, the findings support that the interaction between rs7846217 and cumulative CM experience affects the inferior parietal lobule, which is constituted by the angular and supramarginal gyri and has been associated with cognitive functions.

IDO2-by-CM predicted new onset of MDD in children

Findings from brain imaging supported that the G-by-E effects have already emerged in childhood. We then tested whether the IDO2-by-CM effect can predict new onset of MDD at the 2-year follow-up in healthy children using a prospective design. A total of 4,855 children not diagnosed with MDD at baseline were identified, of whom 669 (13.78%) had experienced any type of CM. The total incidence rate of child-reported MDD was 7.27% at the 2-year follow-up. Because causal variants are more likely to have biological function, both the index SNP rs7846217 and the three causal variants were tested. Children who had experienced any type of CM at baseline showed a significantly higher incidence rate of MDD at the 2-year follow-up (12.56% vs. 6.37%, P = 1.64e-08). After adjusting for age, sex and the top ten genetic principal components (PCs), significant interaction effects were observed for the IDO2 causal variants rs2340953 and rs28631334 (P = 0.034 and 0.044, respectively, see Fig. 5a). The IDO2-by-CM patterns were consistent with those observed in adults. No significant result was found in analyses using parent-reported MDD (Supplementary Data 6).

Fig. 5: Interaction effects of IDO2 and childhood maltreatment on children’s depression and antidepressant efficacy.

Source data are provided as a Source Data file. a Shown are odds ratios (ORs) of childhood maltreatment (CM) experience (yes vs. no) on risk for new onset of child-reported major depressive disorder (MDD) at the 2-year follow-up using logistic regression, stratified by IDO2 genotypes. Children not diagnosed with MDD at baseline were included in the analysis (N = 4,855). The center of each dot represents the OR, with error bars indicating the 95% confidence intervals (CIs). Two-tailed P < 0.05 for the gene-by-environment (G-by-E) interaction is considered significant. b, c Shown are betas of CM on brain structural magnetic resonance imaging (sMRI) at the 2-year follow-up, partitioned by rs28631334 and rs2340953. Linear regressions with G-by-E for sMRI at the 2-year follow-up were performed. sMRI indicators used standardized values. The five sMRI indicators that passed the multiple-testing correction threshold (P < 3.31e-04) are shown. The center of each dot represents the beta, with error bars indicating the 95% CIs. P-LCA: cortical area for left hemisphere cortical precuneus, OP-LCV: cortical volume for left hemisphere cortical occipital pole, MOTSLS-RSD: sulcal depth for right hemisphere cortical medial occipito-temporal sulcus and lingual sulcus, MACGS-LCT: cortical thickness for left hemisphere cortical middle-anterior part of the cingulate gyrus and sulcus, CS-RSD: sulcal depth for right hemisphere cortical central sulcus. d CM increases risk for MDD at the 2-year follow-up by affecting MACGS-LCT; this path was moderated by rs28631334 genotype. e Shown is a box plot of the Hamilton's Depression Scale (HAMD) change rate at midterm treatment relative to baseline (N = 142) across groups defined by CM experience and rs2340953 (TCCC/CM: n = 5, TCCC/noCM: n = 20, TT/CM: n = 30, TT/noCM: n = 87; t = 1.99, P = 0.048). The center line represents the median in each group, the black diamond represents the mean in each group, whiskers extend from the box to the minimum and maximum data points within 1.5 times the interquartile range (IQR), and box limits indicate the 25th and 75th percentiles of the data, with the box representing the IQR.

Impaired mACC development mediated IDO2-by-CM effects on child’s MDD

From the perspective of brain development, we hypothesized that changes in brain structure mediate the link between the IDO2-by-CM effect and new onset of MDD at the 2-year follow-up. We used the sample of children not diagnosed with MDD at baseline to perform the mediation analysis, and focused on the validated causal variants rs2340953 and rs28631334. Of the brain sMRI indicators, the development of 5 brain regions at the 2-year follow-up was affected by the IDO2-by-CM effects at the Bonferroni-corrected significance level (P < 3.31e-04) (see Figs. 5b–c). A moderated mediation model was further applied to assess changes in the 5 ROIs from baseline to the 2-year follow-up, aiming to determine whether the association between IDO2-by-CM interactions and ROIs directly contributes to MDD risk. A significant interaction effect was found with rs28631334 as the moderator and the middle-anterior cingulate cortex (mACC) as the mediator. Specifically, children with the GG genotype who had experienced CM at baseline showed a stunted increase in cortical thickness of the left hemisphere middle-anterior part of the cingulate gyrus and sulcus (interaction effect: β = 0.27, SE = 0.11, P = 0.019), and an increased risk for child-reported MDD at the 2-year follow-up (sMRI→MDD: β = −0.21, SE = 0.07, P = 0.002; see Fig. 5d).

IDO2-by-CM moderated antidepressant treatment efficacy

IDO2 genetic polymorphism was previously found to be associated with antidepressant treatment efficacy17. The four priority variants exhibited similar allele distributions between European and East Asian ancestries (see Supplementary Fig. 3). The relationship between the IDO2-by-CM effect and treatment efficacy was preliminarily explored in antidepressant cohorts of Chinese Han ancestry. The mean age of the sample was 37.06 ± 14.89 years, and 76.76% (109/142) were female. After antidepressant treatment, 101 (71.13%) participants showed a decrease of more than 50% in Hamilton's Depression Scale (HAMD) score relative to baseline. CM significantly decreased antidepressant treatment efficacy as measured by the HAMD change rate (β = −0.66, SE = 0.19, P = 6.11e-04). Because of differences between genotyping chips, only IDO2 rs2340953 (MAF = 0.09) could be extracted and tested in the antidepressant treatment cohort. A significant interaction effect between rs2340953 and CM experience was found on the midterm HAMD reduction rate (P = 0.048). In detail, for rs2340953 TT carriers, CM experience significantly reduced the HAMD reduction rate (β = −11.75, SE = 4.05, P = 0.005), while no significant effect of CM on the HAMD reduction rate was found among non-TT carriers (β = 3.12, SE = 4.25, P = 0.477, see Fig. 5e).

Discussion

Using a large-sample MDD case-control study, we identified a crucial role of IDO2 in the interaction with cumulative CM experience in the development of MDD through a genome-wide interaction association analysis, and shed light on methylation alterations in the tryptophan-kynurenine pathway. The interaction effect emerged in childhood and showed a lifetime influence on cortical thickness of the left hemisphere supramarginal gyrus. By following up the brain imaging changes in the two years after CM experience, we observed a stunted increase in cortical thickness of the left middle-anterior part of the cingulate gyrus and sulcus in individuals with the IDO2 risk genotype. As the G-by-E effect was also preliminarily validated in determining antidepressant treatment efficacy, our findings offer potential for the precise treatment of MDD subtypes with CM.

IDO1 and IDO2 are two heme enzymes that catalyze the first and rate-limiting step in the TRP-KYN metabolic process. As a catalyst, the enzyme IDO metabolizes tryptophan along the kynurenine pathway during neuroregulatory metabolism and is markedly activated by proinflammatory cytokines18. In normal organisms, the main metabolite of tryptophan is 5-hydroxytryptamine (5-HT), a neurotransmitter that is strongly related to depression. However, under inflammatory conditions, the IDO-mediated TRP-KYN pathway would lead to the production of a neurotoxic metabolite (kynurenic acid), causing neurotoxicity and a deficiency of 5-HT in the brain19,20.

Abnormal changes in the TRP-KYN pathway have received widespread attention in relation to the pathophysiology and treatment of depression21,22. A study in the STAR*D cohort reported an association between IDO2 single nucleotide polymorphisms and the treatment efficacy of citalopram17. The present study addresses an unexplored aspect: the interaction between IDO2 and CM. Previous studies reported that CM activates the hypothalamic-pituitary-adrenal (HPA) axis, which mediates the endocrine, behavioral, immune, and autonomic effects of stress. Within the TRP-KYN pathway, the key rate-limiting enzymes (IDO1, IDO2 and TDO) are induced by inflammatory cytokines and cortisol23, which are also important pathways through which CM acts. Accumulating evidence indicates that changes in kynurenine metabolism affect brain development and synaptic plasticity24,25,26. Our findings extend this knowledge by suggesting that CM-related methylation alterations in the TRP-KYN pathway shift brain development toward vulnerability to depression. However, because gene expression differs between whole blood and brain, we obtained discordant effects and cannot draw a simple conclusion. More studies of IDO1 and IDO2 enzyme expression and activity in different cell types are needed to understand the pathophysiological mechanisms of MDD.

Another highlight of our study is the finding of structural brain abnormalities associated with the interaction between CM and IDO2 in children and adults. The Critical Periods Hypothesis of early adversity posits that adverse life events during periods of heightened neuroplasticity, often referred to as critical periods, can elevate the risk of various neural, behavioral, and psychological deficits12. Our findings support that the effects of CM occurring in this highly sensitive period can be lifelong. The ROIs that we identified as affected by interactions between CM and IDO2, the angular and supramarginal gyri, together constitute the inferior parietal lobule, which has been associated with attention, memory, and language, all functions that have been reported to be abnormal in MDD. In individuals with bipolar disorder and major depression, an aberrant pattern of activation in the left supramarginal gyrus during rest has been highlighted, suggesting its role in emotion processing and regulation27,28. Previous studies also found that clusters comprising the angular gyrus and supramarginal gyrus play a role in associations between a high level of childhood trauma and stronger differentiation in brain responses to negative compared with positive faces29. Although the effect direction differed between children and adults, this is plausible given that the developmental trajectory of children's brains may be reshaped by CM30,31. As for the middle-anterior cingulate cortex, it is a key indicator of the brain development stage associated with maltreatment30, and it is also the region most frequently associated with an imbalance in the KYN pathway32. Overall, we identified particular brain regions that play roles in interactions between CM and IDO2, and we encourage future studies focused on functional connectivity in these regions.

In general, our study identified interaction effects between IDO2 and CM in relation to MDD by using an unbiased GWEIS method. We substantiated these interactions with robust evidence spanning methylation colocalization, brain structure abnormalities at different ages, and responses to antidepressant treatment. Notably, the brain structural abnormalities associated with the interaction between CM and IDO2 span the entire life course and call for early intervention. In an era where genetic testing is becoming increasingly common, we foresee that the interaction patterns of CM and IDO2 can be used for precision prevention and intervention strategies in children's mental health, as well as for personalized diagnosis and treatment of MDD.

Several limitations need to be mentioned. First, childhood depression and adult depression exhibit differences in phenotypes, which may also reflect differences in disease mechanisms. Thus, we cannot determine the true source of the differences between the children cohort and the adult cohort. Second, there is a potential for recall bias related to CM in the discovery sample. In addition, the assessment tools for CM differed between the two cohorts, and we cannot overlook the role of parental involvement in reporting CM in the ABCD cohort. It is also worth noting that children in the ABCD cohort may have experienced additional maltreatment during the follow-up phase, which was not included in our analysis. Third, similar to previous studies, the identification of MDD cases in the ABCD cohort differed when using parent reports versus child reports, potentially introducing unknown biases. Fourth, the identified interaction effects should be replicated in additional ancestry groups to ensure broader generalizability. Fifth, our study used processed brain imaging data rather than raw MRI data to uncover potential brain abnormalities, which provides only limited evidence. Functional MRI data would be necessary to find abnormalities of brain networks and biological circuits under the interaction effects. Lastly, multi-omics data in different cell types would be required to identify changes in the kynurenine pathway caused by CM and MDD and to unveil the underlying mechanisms of the interaction effects.

Methods

Ethics

UKB received approval from the National Information Governance Board for Health and Social Care and the National Health Service North West Center for Research Ethics Committee. In the ABCD study, written informed consent from caregivers and verbal assent from children were obtained. All procedures received approval from the centralized institutional review board (IRB) of the University of California, San Diego, and each study site obtained approval from its local IRB33. The antidepressant treatment studies were approved by the Ethical Committee of the Sixth Hospital, Peking University. Study protocols were also approved by the institutional ethics review boards at each site. Written informed consent was obtained from the participants, as described in previous publications34,35. All studies comply with the 1964 Declaration of Helsinki and its later amendments.

Discovery cohort

UK Biobank is a population-based cohort study in which individuals aged 40-69 years were enrolled in 2007-2010 across the UK. A total of 343,964 unrelated white British participants remained in UK Biobank. The present study included 157,363 participants who completed the online Mental Health Questionnaire (MHQ)36. MDD was assessed by the self-reported MHQ, which was developed according to the DSM-IV criteria37. Two core symptoms and other symptoms were used together to assess MDD (see Supplementary Table 1). CM was retrospectively measured using 5 items from the Childhood Trauma Screener (CTS)38, in which each item indicates one type of maltreatment (physical abuse, emotional abuse, sexual abuse, physical neglect, and emotional neglect). Three CM phenotypes were derived and analyzed separately: CM severity (range from 0 to 20), represented by the sum score of the CTS; CM experience (coded as 0/1), defined by having experienced any type of CM or not; and cumulative CM experiences (range from 0 to 5), represented by the number of CM types. Details of the CM definitions can be found in Supplementary Table 2.
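As an illustration of how the three CM phenotypes relate to one another, the sketch below derives severity, cumulative count and binary experience from five CTS item scores. It assumes each item is scored 0 to 4 with higher values indicating more maltreatment (inferred from the 0 to 20 severity range), and it uses a hypothetical per-type exposure cutoff; the actual definitions are those given in Supplementary Table 2, which is not reproduced here.

```python
CM_TYPES = (
    "physical_abuse",
    "emotional_abuse",
    "sexual_abuse",
    "physical_neglect",
    "emotional_neglect",
)


def derive_cm_phenotypes(item_scores, exposure_cutoffs):
    """Return (severity, cumulative, experienced) for one participant.

    item_scores: CTS item score per maltreatment type (assumed 0-4 each).
    exposure_cutoffs: minimum item score counted as exposure, per type
        (hypothetical; the study's definitions are in Supplementary Table 2).
    """
    for cm_type in CM_TYPES:
        if not 0 <= item_scores[cm_type] <= 4:
            raise ValueError(f"{cm_type} score outside the assumed 0-4 range")
    severity = sum(item_scores[t] for t in CM_TYPES)  # sum score, 0 to 20
    cumulative = sum(
        item_scores[t] >= exposure_cutoffs[t] for t in CM_TYPES
    )  # number of CM types, 0 to 5
    experienced = int(cumulative > 0)  # any type of CM, coded 0/1
    return severity, cumulative, experienced


# Hypothetical cutoffs: any non-zero item response counts as exposure.
HYPOTHETICAL_CUTOFFS = {cm_type: 1 for cm_type in CM_TYPES}

example = {
    "physical_abuse": 0,
    "emotional_abuse": 2,
    "sexual_abuse": 0,
    "physical_neglect": 1,
    "emotional_neglect": 0,
}
print(derive_cm_phenotypes(example, HYPOTHETICAL_CUTOFFS))  # (3, 2, 1)
```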

Children validation cohort

The ABCD is a large long-term study of brain development that has recruited 11,880 children aged 9-10. A total of 5,335 unrelated children of European ancestry were included in the analysis. Depressive symptoms in children were measured using a computerized version of the Kiddie Schedule for Affective Disorders and Schizophrenia for DSM-5 (KSADS-5)39,40. MDD was diagnosed according to the DSM-5 criteria41. Because of the large difference between parent-reported and child-reported depressive status41, the MDD cases identified by children and by parents were analyzed separately. The five types of CM were measured separately at baseline in ABCD. As suggested by Warrier et al.42 and Hoffman et al.43, we used questions from the KSADS-5, the Children's Report of Parental Behavior Inventory (CRPBI)44 and the Parental Monitoring scale45 to assess the five types of CM separately. To keep the CM phenotypes consistent between the ABCD and UK Biobank cohorts, the binary phenotype of CM experience was used. Details of the definitions of the CM phenotypes are shown in Supplementary Table 3.

Antidepressant treatment cohort

MDD patients of Han Chinese ancestry who received acute-phase antidepressant treatment and follow-ups in two clinical trials were used as treatment validation cohorts. Antidepressant treatment cohort 1 (ref. 34) was an 8-week single-drug paroxetine treatment cohort of patients with MDD, generalized anxiety disorder, or panic disorder from the Precision Medicine to Enhance Depression and Anxiety Outcome (PMEDA) consortium. Patients were followed up at 4 and 8 weeks. Antidepressant treatment cohort 2 (ref. 35) was a 12-week randomized controlled trial that aimed to assess the efficacy and safety of individualized gene-guided treatment of MDD. The gene-guided antidepressants included agomelatine, citalopram, duloxetine, escitalopram, fluoxetine, fluvoxamine, paroxetine and sertraline. Patients were followed up at 6 and 12 weeks. Both cohorts included adult patients of Han Chinese ancestry diagnosed with MDD according to DSM-5, and both used the 17-item HAMD scale to assess depression severity at baseline and follow-ups, yielding a relatively homogeneous sample. More details can be found on the Chinese Clinical Trial Registry website (No. ChiCTR2000029671 and No. ChiCTR2000038462) and in Supplementary Methods. To assess the identified G-by-E effects on treatment efficacy, we included participants who had completed the CM measures and had full data on antidepressant treatment efficacy. Treatment outcomes at midterm-treatment (4/6 weeks) and after-treatment (8/12 weeks) were included in the analysis. A total of 142 participants were finally included. HAMD change rates, calculated as the HAMD difference between baseline and follow-up / HAMD at baseline × 100, were used to assess antidepressant treatment efficacy, with a higher HAMD change rate indicating better treatment efficacy. CM was retrospectively assessed using the 28-item Childhood Trauma Questionnaire (CTQ)46. Cutoff scores for physical abuse, emotional abuse, sexual abuse, physical neglect and emotional neglect were set at 10, 13, 8, 10 and 15, respectively. CM experience was defined as having experienced any type of CM or not.
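
As a worked illustration of the two derived variables in this cohort, the sketch below computes a HAMD change rate and a binary CM experience from CTQ subscale scores. It is a minimal sketch rather than the study's code. Two points are assumptions: the difference is taken as baseline minus follow-up (so that a higher rate means greater improvement, as stated above), and a subscale score equal to its cutoff is counted as exposure. The function names are hypothetical.

```python
# CTQ subscale cutoffs as listed in the text. Treating a score equal to the
# cutoff as exposure (>=) is an assumption made for this sketch.
CTQ_CUTOFFS = {
    "physical_abuse": 10,
    "emotional_abuse": 13,
    "sexual_abuse": 8,
    "physical_neglect": 10,
    "emotional_neglect": 15,
}


def hamd_change_rate(baseline, follow_up):
    """Percentage reduction in HAMD score from baseline to follow-up.

    Assumes difference = baseline - follow_up, so larger values indicate
    greater symptom reduction.
    """
    if baseline <= 0:
        raise ValueError("baseline HAMD score must be positive")
    return (baseline - follow_up) / baseline * 100.0


def cm_experience(ctq_scores):
    """Return 1 if any CTQ subscale reaches its cutoff, otherwise 0."""
    return int(any(ctq_scores[name] >= cut for name, cut in CTQ_CUTOFFS.items()))
```

Under these assumptions, a patient whose HAMD score falls from 24 at baseline to 12 at follow-up has a change rate of 50%.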

Genotype quality control

Genotyping was performed by the UK Biobank team, and ~28 million markers with minor allele count (MAC) ≥ 20 and imputation info score > 0.3 were obtained. Quality control of the UK Biobank genotype data was performed by the data provider. Common variants with minor allele frequency (MAF) ≥ 0.01 were included in the analysis. In the ABCD cohort, the imputed genotype data were obtained from the NDA (Annual Data Release 3.0). Following imputation, only polymorphic sites with imputation quality R2 ≥ 0.6 and MAF ≥ 0.01 were retained. In the antidepressant treatment cohort, Illumina ASA bead chips were used for genotyping. SNPs with > 5% missingness and samples with more than 10% missingness or ambiguous sex information were removed. Details of the quality control (QC) can be found in Supplementary Methods. The candidate variant was extracted from the quality-controlled data.
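
The filters above can be expressed as a short Python sketch. It only illustrates the stated thresholds and is not the pipeline used in the study (see Supplementary Methods for that). The data layout, the function names and the choice to compute both missingness rates on the unfiltered genotype matrix are assumptions, and the ambiguous-sex filter is omitted.

```python
def keep_imputed_variant(maf, r2, min_maf=0.01, min_r2=0.6):
    """ABCD-style filter: keep a variant if imputation R2 and MAF pass."""
    return r2 >= min_r2 and maf >= min_maf


def qc_by_missingness(genotypes, snp_max_missing=0.05, sample_max_missing=0.10):
    """Missingness filter for array genotypes.

    genotypes: dict mapping sample ID to a list of genotype calls in a fixed
               SNP order, with None marking a missing call (hypothetical layout).
    Returns (kept_sample_ids, kept_snp_indices). Both missingness rates are
    computed on the unfiltered matrix, which is an assumption of this sketch.
    """
    samples = list(genotypes)
    n_snps = len(genotypes[samples[0]])

    # Keep samples whose missing-call rate does not exceed the sample limit.
    kept_samples = [
        s for s in samples
        if sum(call is None for call in genotypes[s]) / n_snps <= sample_max_missing
    ]
    # Keep SNPs whose missing-call rate does not exceed the SNP limit.
    kept_snps = [
        j for j in range(n_snps)
        if sum(genotypes[s][j] is None for s in samples) / len(samples) <= snp_max_missing
    ]
    return kept_samples, kept_snps
```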

SPAGE analysis

Genome-wide single-variant G-by-E analyses in the discovery sample were first conducted using the SPAGE method, with CM as the environmental factor and MDD as the outcome phenotype. SPAGE is well calibrated for controlling type I error rates even under unbalanced case-control ratios11 (see Supplementary Methods; GitHub: https://github.com/WenjianBI/SPAGE). The three CM phenotypes (CM severity, CM experience and cumulative CM experiences) were analyzed separately. Age, sex, and the top ten genetic principal components (PCs) were included as covariates in the SPAGE model. The genome-wide significance level was set at 5e-08. To account for effects specific to different CM types, SPAGE analyses of the five CM subscales were also performed.

Several sensitivity analyses were conducted to assess the robustness of the SPAGE results to different confounders. To control for genetic covariance between CM and MDD, the genetic influence on CM was removed, and a supplementary CM variable was generated for SPAGE analysis (details in Supplementary Methods). Additionally, a sensitivity analysis was adjusted for the Townsend Deprivation Index (TDI) to account for socioeconomic factors. Following Matthew C. Keller’s recommendations47, we further adjusted for covariate-by-CM and covariate-by-gene interaction terms to validate the G-by-E effect.

Colocalization analysis with transcriptome, methylome and MDD

We applied two colocalization tools, eQTL and GWAS CAusal Variants Identification in Associated Regions (eCAVIAR)48 and summary data-based Mendelian randomization (SMR)/heterogeneity in dependent instruments (HEIDI)49, to test for colocalization of the MDD trait with cis-eQTL and meQTL signals in the associated gene region. We used MDD summary data obtained from individuals in UK Biobank who were exposed to CM. The eQTL and meQTL summary data in whole blood and brain were obtained from public databases (i.e., GTEx v7 (ref. 50) and v8 (ref. 51), Qi52, McRae53 and Hannon54,55) and are listed in Supplementary Table 4. A 200 kb window centered on the index SNP identified by SPAGE was tested. eCAVIAR was performed using the ezQTL tool56. We used a cutoff of 0.01 (1%) for the colocalization posterior probability (CLPP) to conclude that the causal variants are shared. For SMR/HEIDI, we applied thresholds of P-SMR < 0.05 and P-HEIDI > 0.05 to identify significant colocalizations of MDD GWAS signals with eQTL and meQTL signals.
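
The decision rules in this paragraph can be summarized in a minimal Python sketch. It does not run eCAVIAR or SMR. It only encodes the stated thresholds applied to their outputs. The function names are hypothetical, and reading the CLPP cutoff as "strictly greater than 0.01" is an assumption.

```python
def cis_window(index_pos, width=200_000):
    """Return (start, end) of a window of `width` bp centered on the index SNP."""
    half = width // 2
    # Clamp the start at position 1 for SNPs near the chromosome start.
    return max(1, index_pos - half), index_pos + half


def ecaviar_shared(clpp, cutoff=0.01):
    """True if the CLPP suggests a shared causal variant.

    Assumption: a CLPP strictly above the cutoff counts as shared.
    """
    return clpp > cutoff


def smr_heidi_colocalized(p_smr, p_heidi, alpha_smr=0.05, alpha_heidi=0.05):
    """True if the SMR test is significant and the HEIDI test is not.

    A non-significant HEIDI P value means no evidence of heterogeneity, which
    is consistent with a single shared causal signal.
    """
    return p_smr < alpha_smr and p_heidi > alpha_heidi
```

For an index SNP at position 1,000,000, `cis_window` returns the interval from 900,000 to 1,100,000.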

Function annotation, protein interaction and gene enrichment

GTEx51, eQTLGen15, QTLbase16, RegulomeDB13 and HaploReg14 were used to integrate multiple quantitative trait loci and annotate the genomic region. STRING57 was used to construct the protein-protein interaction (PPI) network of the identified top genes. GO enrichment analysis was conducted using the R package clusterProfiler58. P < 0.05 and q value < 0.05 were set as the significance thresholds.

Brain imaging analysis of G-by-E effects

Structural magnetic resonance imaging (sMRI) measures of the brain in UK Biobank were tested to identify potential structural brain abnormalities caused by the identified G-by-E effect. A total of 444 sMRI morphometric measures of cortical thickness, cortical surface area and cortical volume, based on the Destrieux cortical atlas, were used. Linear regression models including the identified G-by-E term were fitted, adjusting for age, sex and the top ten PCs. Bonferroni-corrected P < 0.05 was set as the significance level. Cortical thickness and cortical surface area reflect different mechanisms in cortical development59 and are likely influenced by different genetic factors60. Thus, multiple-testing correction was applied separately to each type of morphometric measure, and P < 3.38e-04 (0.05/148) was considered significant for each.

Brain sMRI indicators for the ABCD child sample were accessed through the ABCD-BIDS Community Collection61. Only data that passed the Data Analysis Imaging Center (DAIC) quality control were included62. The sMRI morphometric measures (cortical thickness, cortical surface area, cortical volume and sulcal depth) were based on the same atlas as in UK Biobank. P < 3.31e-04 (0.05/151) was set as the significance level and P < 1e-03 as the suggestive level.
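
The two Bonferroni thresholds quoted for the imaging analyses follow directly from dividing 0.05 by the number of tests per morphometry type. The short Python sketch below reproduces that arithmetic. The function name is hypothetical, and splitting the 444 UK Biobank measures evenly across the three morphometry types is an assumption that matches the stated divisor of 148.

```python
def bonferroni_threshold(n_tests, alpha=0.05):
    """Per-test significance threshold under Bonferroni correction."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


# UK Biobank: 444 measures over three morphometry types, assumed to be
# split evenly, which gives 148 tests per type.
UKB_TESTS_PER_TYPE = 444 // 3
# ABCD: 151 tests per morphometry type, as stated in the text.
ABCD_TESTS_PER_TYPE = 151

ukb_threshold = bonferroni_threshold(UKB_TESTS_PER_TYPE)    # about 3.38e-04
abcd_threshold = bonferroni_threshold(ABCD_TESTS_PER_TYPE)  # about 3.31e-04
```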

G-by-E effects in childhood development

A prospective design using ABCD was used to validate the G-by-E effects in children. Healthy children who were not identified as having MDD at baseline were retained for analysis, and incident MDD was defined as the detection of MDD at the 2-year follow-up. Logistic regression models including the G-by-E term were fitted using the R glm function, adjusting for age, sex and the top ten PCs. CM experience rather than CM severity was used in the analysis to reduce potential bias arising from differences in phenotype definition relative to UK Biobank. P < 0.05 was set as the significance level.
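
To make the form of the interaction model concrete, the following Python sketch fits a logistic regression with SNP, CM and SNP-by-CM terms by Newton-Raphson. The study itself used the R glm function. This is only an illustrative stand-in written with the standard library, with hypothetical function names, no standard errors or P values, and covariates passed as optional extra columns.

```python
import math


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def gxe_design(g, e, covariates=()):
    """Design row: intercept, SNP dosage, CM exposure, SNP x CM, covariates."""
    return [1.0, float(g), float(e), float(g) * float(e), *covariates]


def fit_logistic(rows, y, n_iter=25):
    """Maximum-likelihood logistic regression via Newton-Raphson.

    rows: list of design rows (intercept included); y: list of 0/1 outcomes.
    Returns the coefficient list; index 3 is the G-by-E interaction term
    when rows are built with gxe_design.
    """
    p = len(rows[0])
    beta = [0.0] * p
    for _ in range(n_iter):
        grad = [0.0] * p
        info = [[0.0] * p for _ in range(p)]
        for x, yi in zip(rows, y):
            eta = sum(b * v for b, v in zip(beta, x))
            mu = 1.0 / (1.0 + math.exp(-eta))
            w = mu * (1.0 - mu)
            for j in range(p):
                grad[j] += (yi - mu) * x[j]       # score vector X'(y - mu)
                for k in range(p):
                    info[j][k] += w * x[j] * x[k]  # information matrix X'WX
        step = solve(info, grad)
        beta = [b + s for b, s in zip(beta, step)]
    return beta
```

With a binary SNP indicator and binary CM experience and no covariates, the fitted interaction coefficient equals the difference between the log odds ratio for the SNP in CM-exposed children and that in unexposed children, which gives a simple way to sanity-check the fit.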

The prospective design helps to clarify causal effects. We tested for potential structural brain abnormalities at the 2-year follow-up caused by the G-by-E effect at baseline. Linear regression models including the G-by-E term were fitted for sMRI measures at the 2-year follow-up, adjusting for age, sex, the top ten PCs and the corresponding sMRI measure at baseline. P < 3.31e-04 was set as the significance level. Afterwards, moderated mediation models were fitted to explore the potential mechanism linking G-by-E to MDD. The change in brain structure from baseline to follow-up was included as the mediator, the onset of MDD as the dependent variable, and the interaction of SNP and CM as the independent variable. The same covariates were adjusted for. Continuous variables were centered in the moderated mediation models. Hypothesis testing focused on the interaction effect on sMRI in the first regression and the effect of sMRI on MDD onset in the second regression; P < 0.05 for both was taken to indicate a significant moderated mediation model.

G-by-E effects on antidepressant treatment efficacy

HAMD change rates at midterm-treatment and after-treatment were analyzed using linear regression models fitted with the R glm function. Age, sex, first episode status (yes/no), first drug status (yes/no) and center were included as covariates. P < 0.05 was set as the significance level.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.