Evaluating the precision of EBF1 SNP x stress interaction association: sex, race, and age differences in a big harmonized data set of 28,026 participants

Singh, Abanish; Babyak, Michael A.; Sims, Mario; Musani, Solomon K.; Brummett, Beverly H.; Jiang, Rong; Kraus, William E.; Shah, Svati H.; Siegler, Ilene C.; Hauser, Elizabeth R.; Williams, Redford B.

doi:10.1038/s41398-020-01028-5


Article
Open access
Published: 20 October 2020


Abanish Singh ORCID: orcid.org/0000-0002-0968-0466^1,2,
Michael A. Babyak^1,2,
Mario Sims³,
Solomon K. Musani³,
Beverly H. Brummett^1,2,
Rong Jiang ORCID: orcid.org/0000-0003-0757-9310^1,2,
William E. Kraus ORCID: orcid.org/0000-0003-1930-9684^4,5,
Svati H. Shah^4,5,
Ilene C. Siegler^1,2,
Elizabeth R. Hauser^4,6 &
…
Redford B. Williams^1,2

Translational Psychiatry volume 10, Article number: 351 (2020)


Abstract

In prior work, we identified a novel gene-by-stress association of EBF1’s common variation (SNP rs4704963) with obesity (i.e., hip, waist) in Whites, which was further strengthened through multiple replications using our synthetic stress measure. We now extend this prior work in a precision medicine framework to find the risk group using harmonized data from 28,026 participants by evaluating the following: (a) EBF1 SNPxSTRESS interaction in Blacks; (b) 3-way interaction of EBF1 SNPxSTRESS with sex, race, and age; and (c) a race and sex-specific path linking EBF1 and stress to obesity to fasting glucose to the development of cardiometabolic disease risk. Our findings provided additional confirmation that genetic variation in EBF1 may contribute to stress-induced human obesity, including in Blacks (P = 0.022) that mainly resulted from race-specific stress due to “racism/discrimination” (P = 0.036) and “not meeting basic needs” (P = 0.053). The EBF1 gene-by-stress interaction differed significantly (P = 1.01e−03) depending on the sex of participants in Whites. Race and age also showed tentative associations (Ps = 0.103, 0.093, respectively) with this interaction. There was a significant and substantially larger path linking EBF1 and stress to obesity to fasting glucose to type 2 diabetes for the EBF1 minor allele group (coefficient = 0.28, P = 0.009, 95% CI = 0.07-0.49) compared with the same path for the EBF1 major allele homozygotes in White females and also a similar pattern of the path in Black females. Underscoring the race-specific key life-stress indicators (e.g., racism/discrimination) and also the utility of our synthetic stress, we identified the potential risk group of EBF1 and stress-induced human obesity and cardiometabolic disease.


Introduction

Understanding the precise role of genetic, demographic, and environmental variations on the expression of complex biological mechanisms contributing to the development and course of major medical disorders is critical for next-generation medicine, often referred to as precision medicine^1,2. Its framework defines the human diseases at greater resolution by focusing on a particular target risk group or subpopulation³ based on several factors, such as individual’s genetic and molecular makeup; complex physiological aspects of race, sex, and age; lifestyle and environmental factors; and their interactions. Although the progress in precision medicine research has been slow to develop⁴, there are several examples of success in this emerging field⁵, which support the contention that precision medicine approaches have the potential to help with the safety and effective delivery of health solutions to the target risk groups⁶. Two among the many challenges in achieving clinical utility of the precision medicine framework are accomplishing data pooling (i.e., data harmonization) in order to reconcile the evidence from multiple ongoing investigations⁷ and developing robust estimates of the interactions among an individual’s genetic makeup and complex physiological aspects of race, sex, and age. Our past and current work have focused on both of these challenges. In prior work, we have accomplished the integration of heterogeneous data sets collected from multiple sources, harmonizing inconsistencies among measurement protocols, units, and coding⁸. In the current work we focus on developing generalizable robust estimates within specific strata of the interactions using this large harmonized data set.

Our central working hypothesis holds that chronic psychosocial stress modifies the association between genetic variants and cardiovascular disease (CVD) risk at key phenotypic nodes (e.g., obesity) along the disease pathways⁹. In our previous study¹⁰, we identified a novel CVD-risk gene, EBF1, using White samples from the Multi-Ethnic Study of Atherosclerosis (MESA) cohort, wherein, in the presence of chronic psychosocial stress, a common variation (SNP rs4704963) influenced individual differences in central obesity, with hip circumference as the primary phenotype. We further identified a statistically significant path linking stress and EBF1 genotype to obesity to fasting glucose to CVD risk¹⁰, confirming our central hypothesis⁹. We also observed similar gene-by-stress associations with waist circumference as an obesity trait¹⁰. Our efforts to replicate this finding¹⁰ in other data sets were challenged by the absence of an explicit measure of psychosocial stress. In response, we created a synthetic measure of psychosocial stress in such data sets¹¹. Our synthetic stress algorithm uses proxy items in the domains of the formal, self-rated measure of chronic psychosocial stress from the MESA, which is based on information about financial, marital, work, own health, and health of a spouse or someone close¹². We achieved replication of the EBF1 SNPxSTRESS association with obesity, but again only in White samples. One possible reason for not originally finding (at the genome-wide level) or replicating (at P ≤ 0.05) the association in Black samples may be the use of the MESA-like five-component measure of chronic psychosocial stress, which may not be sufficient to capture stress in everyday life among Black participants. This demographic subgroup-specific (i.e., only in Whites) gene-by-stress association of EBF1 made it an ideal candidate for the precision medicine analytic framework.
In the current study, we used the MESA data set for the initial observations on race and sex-stratified analysis in order to formulate several testable hypotheses (as described in the “Materials and methods” section) that were related to the variability we observed in the results of fitting a gene-by-stress interaction. Our hypotheses focused on the following problems: (a) failure to observe a significant EBF1 gene-by-stress association in Black samples, (b) observing a significant association only in a specific subgroup of samples (i.e., White females), and (c) extending our understanding of the clinical implications of race and sex-specific gene-by-stress associations linking to a path from EBF1 and stress to obesity to fasting blood glucose to the development of cardiometabolic disease risk (e.g., type 2 diabetes mellitus). We used harmonized data sets from 28,026 participants derived from ten studies⁸, including the Jackson Heart Study¹³, for testing these hypotheses. In this report, we present the utility of our synthetic stress measure in harmonized data sets; additional efforts to show, including in Black samples, that common variation in EBF1 may contribute to inter-individual differences in human obesity in the presence of stress; a systematic evaluation of sex, race, and age interactions with the EBF1 gene-by-stress association to identify the precise risk group; and an evaluation of its clinical implication, i.e., a path linking EBF1 and stress to obesity to fasting blood glucose to the development of type 2 diabetes mellitus in the risk group.

Materials and methods

Study data sources

We used a harmonized data set that was derived from ten studies⁸, including six large cohort public-access studies and four smaller Duke studies. The public-access data sets were from the Jackson Heart Study (JHS)¹³; The Women’s Health Initiative (WHI) Study¹⁴; The Coronary Artery Risk Development in Young Adults Study (CARDIA)¹⁵; Atherosclerosis Risk in Communities Study (ARIC)¹⁶; Framingham Offspring Cohort¹⁷; and Multi-Ethnic Study of Atherosclerosis (MESA)¹⁸. These public-access data sets were obtained from the dbGaP/database of Genotypes and Phenotypes/National Center for Biotechnology Information, National Library of Medicine (NCBI/NLM)/ https://www.ncbi.nlm.nih.gov/gap¹⁹ through authorized/controlled data access under the standard user agreement. The Duke data sets were the Duke Family Heart Study (DFHS)²⁰; Duke Caregiver Study (DCS)²¹; and two cohorts for Studies of a Targeted Risk Reduction Intervention through Defined Exercise (STRRIDE), i.e., STRRIDE—Aerobic Training/Resistance Training (AT/RT)²², and STRRIDE pre-diabetes (PD)²³ studies. These Duke data sets were obtained from the studies conducted at Duke University Medical Center (DUMC). The accession of these data sets was approved by the Duke Institutional Review Board (IRB). A more detailed description of these study populations is provided in the Supplementary Materials.

Data harmonization

The harmonization of data from the above-listed study cohorts of varying size and demography is described elsewhere in Singh et al.⁸, where we harmonized data sets for a measure of chronic psychosocial stress, candidate SNPs (e.g., EBF1 rs4704963), and CVD-risk variables, including adiposity and hyperglycemia. Our focus for the current work was mainly on non-related samples of White and Black ancestries, thus, we used 28,026 White and Black participants (Table 1).

Table 1 Race and sex-stratified summary of study variables from harmonized data set comprising 28,026 non-related samples that were derived from ten studies.


Study variables

Genetic variant

For the proposed analysis, we used harmonized EBF1 genetic variation data, i.e., SNP rs4704963 if available, or SNP rs17056278 (LD R² = 1 with rs4704963)⁸. The SNP genotyping was done in dbGaP SHARe data sets MESA, CARDIA, WHI, ARIC, and JHS using the Affymetrix Genome-Wide Human SNP Array 6.0 and in the Framingham Cohort using the Affymetrix Mapping250K (Nsp and Sty) Arrays and Mapping50K (Hind240 and Xba240) Arrays. The SNP genotyping was done in the Duke data sets DFHS and DCS using ABI 7900 Taqman system (Applied Biosystems) platform and in STRRIDE using Taqman (Life Technologies) and the QuantiFast Multiplex PCR + ROX kit (Qiagen) platforms. The population ancestry principal components were created for the studies that included SNP array data sets. The SNP genotypes were subjected to standard quality control metrics¹⁰. The EBF1 genotypes were available for all samples of the harmonized data set that were included in this study (Table 1).

Synthetic stress measure

Out of the ten harmonized studies, only two studies, the MESA and JHS, used a self-rated stress measure. The MESA stress measure was based on the following five domains: financial strains, relationship or marital problems, difficulties with job or ability to work, serious health problems of a spouse or someone close, and one’s own serious health problems¹². However, the JHS stress summary measure was based on eight components, i.e., stress due to job, relationships, neighborhood, care-giving, legal problems, medical problems, racism/discrimination, and (not) meeting basic needs²⁴. The three additional components in JHS were stress due to neighborhood, legal problems, and racism/discrimination. In addition, the financial strains component was evaluated differently, i.e., by using a more specific question on (not) meeting basic needs. For the remaining studies that did not have a self-rated stress measure, we constructed a synthetic stress measure as reported in our data harmonization efforts^8,11 employing our algorithm that uses proxy indicators of the domains used in the MESA chronic burden measure¹². Briefly, our algorithm¹¹ searched for proxy indicators of each stress domain, scored each proxy item as 1 = stressful, 0 = not stressful. The item scores were then summed to obtain a single score, which varied in range across the studies due to the non-availability of all items in each study. We transformed the scores to z-scores (mean of zero and a standard deviation of one) within each study in order to harmonize the differently scaled measures. The list of proxy items for each study is provided in Supplementary Materials. More details of the construction of synthetic stress are provided elsewhere in our data harmonization efforts^8,11.
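The scoring and within-study standardization described above can be illustrated with a minimal sketch. It is not the authors' algorithm: the function names, the handling of missing items (skipped), and the use of the population standard deviation are assumptions made only for illustration.

```python
from statistics import mean, pstdev


def synthetic_stress_raw(proxy_items):
    """Sum binary proxy items (1 = stressful, 0 = not stressful).

    Items that are unavailable (None) are skipped; this handling of
    missing items is an assumption, not taken from the paper.
    """
    return sum(v for v in proxy_items.values() if v is not None)


def zscore_within_study(records):
    """Standardize raw scores within each study.

    records: list of (study, raw_score) pairs.
    Returns z-scores in the same order as the input. The population SD
    is used here (assumption); a study with zero variance maps to 0.0.
    """
    by_study = {}
    for study, raw in records:
        by_study.setdefault(study, []).append(raw)
    stats = {s: (mean(v), pstdev(v)) for s, v in by_study.items()}
    z = []
    for study, raw in records:
        m, sd = stats[study]
        z.append((raw - m) / sd if sd > 0 else 0.0)
    return z


# Hypothetical participants from two studies with different item counts.
raw_a = synthetic_stress_raw({"financial": 1, "marital": 0, "work": None})
records = [("A", 0), ("A", 2), ("B", 1), ("B", 3), ("B", 5)]
z_scores = zscore_within_study(records)
```

Standardizing within each study, rather than across the pooled sample, keeps a study with fewer available proxy items from appearing systematically less stressed.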

In addition, we created MESA-alike and MESA-unlike partial stress summary scores in the JHS using its available individual stress components. For the MESA-alike score, we used a total of four out of the eight self-rated stress components, i.e., job, relationship, caring for, and medical problem. For the MESA-unlike score, we used the four components that were either additional to the MESA-alike components or evaluated differently, i.e., “racism/discrimination”, “living in neighborhood”, “legal problems”, and “not meeting basic needs”.
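As a rough sketch of how the two partial scores relate to the full eight-component summary, the snippet below simply sums the four components in each group. The dictionary keys and the assumption that each component is a plain numeric score are hypothetical; the JHS item scaling is not given here.

```python
# Hypothetical component keys; the actual JHS variable names are not given in the text.
MESA_ALIKE = ("job", "relationship", "caregiving", "medical")
MESA_UNLIKE = ("racism_discrimination", "neighborhood", "legal", "basic_needs")


def partial_scores(components):
    """Return MESA-alike, MESA-unlike, and full summary scores.

    components: dict mapping each of the eight component keys to a
    numeric score (assumed already on a common scale).
    """
    alike = sum(components[k] for k in MESA_ALIKE)
    unlike = sum(components[k] for k in MESA_UNLIKE)
    return {"mesa_alike": alike, "mesa_unlike": unlike, "full": alike + unlike}


example = partial_scores({
    "job": 1, "relationship": 0, "caregiving": 2, "medical": 1,
    "racism_discrimination": 3, "neighborhood": 0, "legal": 0, "basic_needs": 2,
})
```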

Primary phenotypes and independent variables

In order to evaluate the EBF1 gene-by-stress interaction in consistency with our discovery analysis¹⁰, we used the obesity trait hip circumference as the primary phenotype for the initial phase of analysis (i.e., observation/ hypotheses building, as described below in the Analysis section) in MESA data set. Hip circumference was not present in JHS, therefore, we used waist circumference as the primary phenotype for the final phase of analysis (i.e., hypotheses testing) in the JHS and harmonized studies’ data sets. We also used fasting blood glucose and type 2 diabetes mellitus (DM) status for evaluating the structural equation path model linking EBF1 SNP and stress to obesity to fasting glucose to type 2 diabetes mellitus, a cardiometabolic disease risk¹⁰. Other study variables that were used as key independent variables in the analyses were demographic variables (age, sex) and population ancestry principal components (if the data were available for analysis). Table 1 shows a summary of these study variables.

Analysis

The analysis proceeded in three phases—observation, hypotheses building, and hypotheses testing to evaluate race, sex, and age differences in the EBF1 gene-by-stress association in the context of a precision medicine framework. We used the MESA data set for the observation and hypotheses building phases and JHS and combined (harmonized) data sets derived from the ten studies for the hypotheses testing phase.

Observation phase

Race and sex-stratified analysis in MESA

We performed race and sex-stratified linear regression on the primary phenotype hip circumference in the MESA data set for the EBF1 SNP rs4704963 under the additive genetic model, consistent with an original discovery analysis¹⁰. The interaction model included age, SNP, STRESS, and SNPxSTRESS along with population ancestry correction. We also evaluated the SNP main-effect using a conventional SNP-only additive model (i.e., without STRESS and SNP×STRESS terms). The ordinal stress variable was treated as a linear variable. The gene-by-stress interaction was tested by the SNPxSTRESS product term in the model and interaction was considered significant at the threshold P-value ≤ 0.05 for the single SNP analysis. We also plotted the race and sex-stratified distribution of the mean of hip circumference in MESA data set against each ordinal value of stress for the two genotype groups of EBF1 SNP, i.e., major allele homozygotes (TT) and minor allele heterozygotes and homozygotes (CT/CC).
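The genotype coding and the grouped means used for the plots can be sketched as follows. This is an illustrative sketch only: genotypes are assumed to be two-letter strings, C is taken as the minor allele (as implied by the TT versus CT/CC grouping), and the function names are hypothetical.

```python
from collections import defaultdict


def additive_code(genotype, minor="C"):
    """Additive genetic model: number of minor alleles (0, 1, or 2)."""
    return genotype.count(minor)


def genotype_group(genotype, minor="C"):
    """Collapse genotypes into major allele homozygotes vs. minor allele carriers."""
    return "TT" if additive_code(genotype, minor) == 0 else "CT/CC"


def mean_hip_by_stress(rows):
    """Mean hip circumference for each (genotype group, ordinal stress value).

    rows: iterable of (genotype, stress_level, hip_cm) tuples.
    """
    cells = defaultdict(lambda: [0.0, 0])
    for genotype, stress, hip in rows:
        cell = cells[(genotype_group(genotype), stress)]
        cell[0] += hip
        cell[1] += 1
    return {key: total / n for key, (total, n) in cells.items()}


# Hypothetical data, for illustration only.
rows = [("TT", 0, 100.0), ("TT", 0, 102.0), ("CT", 0, 101.0),
        ("CC", 1, 110.0), ("TC", 1, 108.0)]
means = mean_hip_by_stress(rows)
```

In a stratified analysis, the same function would be applied separately to the rows of each race and sex stratum.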

Hypotheses building phase

We formulated the following testable hypotheses based on the outcomes of the above described observational phase of analysis:

(1a) that a five-component stress measure is not adequate to capture stress information in Black populations and thus one or more additional stress component(s) (specifically, “racism/discrimination”) or an existing component evaluated differently (especially, the financial strains component evaluated using the question on “not meeting basic needs”) may contribute to the EBF1 gene-by-stress interaction. We used the JHS data set to test this hypothesis.

(1b) that excluding the above additional or differently evaluated items from the stress score (i.e., making the stress score equivalent to a MESA-like self-rated stress measure) may not result in a significant EBF1 gene-by-stress interaction in the JHS data set.

(2a) that the significant EBF1 gene-by-stress interaction was observed only in females and thus we may observe a 3-way SNPxSTRESSxSEX interaction.

(2b) that this interaction was significant in White MESA samples and thus we may observe a 3-way SNPxSTRESSxRACE interaction.

(2c) that the significant EBF1 gene-by-stress interaction was initially observed in a relatively old population and thus there may be a 3-way SNPxSTRESSxAGE interaction.

(3) that there will be a sex and race-related difference in the implication of the obesity-related EBF1 gene-by-stress interaction for other cardiometabolic risk factors and clinical outcomes, such as fasting glucose and type 2 diabetes mellitus status; i.e., there may be sex and race-related differences in the significance of the path from stress to obesity to fasting blood glucose to type 2 diabetes mellitus status.

Hypotheses testing phase

(1) We tested the EBF1 gene-by-stress interaction in the JHS data set (Black samples) using its eight components stress summary score. We also tested EBF1 gene-by-stress interaction in JHS Black samples using the MESA-alike and MESA-unlike partial stress score. In addition, we tested the EBF1 gene-by-stress interaction using the individual items that were included in MESA-unlike partial stress score in the JHS data set, i.e., “racism/discrimination”, “living in neighborhood”, “legal problems”, and “not meeting basic needs”. The full model was specified as:

WAIST CM = SNP + AGE + SEX + STRESS + SNPxSTRESS + ancestry PCAs.

We also performed a sex-stratified analysis using the model.
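To make the role of the SNPxSTRESS product term concrete, the sketch below fits the interaction model by ordinary least squares on synthetic, noise-free data. It is a standard-library illustration, not the software used in the study: the ancestry principal components are omitted, all data and coefficient values are made up, and the helper names are hypothetical.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def ols(X, y):
    """Ordinary least squares via the normal equations (X'X) beta = X'y."""
    k = len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(X, y)) for i in range(k)]
    return solve(xtx, xty)


def design_row(snp, age, sex, stress):
    """Columns: intercept, SNP, AGE, SEX, STRESS, SNPxSTRESS (ancestry PCs omitted)."""
    return [1.0, snp, age, sex, stress, snp * stress]


# Synthetic, noise-free data generated from made-up coefficients.
true_beta = [90.0, 0.5, 0.1, 2.0, 1.0, 1.5]
X, y = [], []
for snp in (0, 1, 2):                  # additive minor allele count
    for age in (50.0, 60.0, 70.0):
        for sex in (0, 1):
            for stress in (-1.0, 0.0, 1.0):   # standardized stress
                row = design_row(snp, age, sex, stress)
                X.append(row)
                y.append(sum(b * v for b, v in zip(true_beta, row)))

beta = ols(X, y)  # beta[5] is the SNPxSTRESS interaction coefficient
```

The interaction test in the text corresponds to asking whether the last coefficient differs from zero; a real analysis would also need standard errors, which this sketch does not compute.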

(2) Mega-analysis: Using the harmonized data set of combined samples composed from the ten studies⁸, we fit the 3-way interaction terms SNPxSTRESSxSEX, SNPxSTRESSxRACE, and SNPxSTRESSxAGE in separate regression models. Each model included all subordinate terms of the 3-way interaction term—for example, a model to test SNPxSTRESSxSEX also included SNPxSTRESS, STRESSxSEX, SNPxSEX, SNP, STRESS, and SEX terms. The multiple sourcing of data in combined samples in mega-analysis was taken into account using dummy study variables. For details on the use of dummy variables, see Singh et al.⁸. We checked for departure from normality for the primary outcome (dependent) variable and did not perform a transformation. The full models were specified as:

1: WAIST CM = SNP + STRESS + AGE + SEX + RACE + SNP × STRESS + SNP × SEX + STRESS × SEX + SNP × STRESS × SEX + Study Dummy Variables.

2: WAIST CM = SNP + STRESS + AGE + SEX + RACE + SNP × STRESS + SNP × RACE + STRESS × RACE + SNP × STRESS × RACE + Study Dummy Variables.

3: WAIST CM = SNP + STRESS + AGE + SEX + RACE + SNP × STRESS + SNP × AGE + STRESS × AGE + SNP × STRESS × AGE + Study Dummy Variables.
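The three model specifications share one structure: main effects, all 2-way products subordinate to the 3-way term, the 3-way product itself, and study dummy variables. A minimal sketch of building one design row is given below. The reference-study convention (first study in the list has no dummy) and all names are assumptions; the dummy-variable details actually used are described in Singh et al.⁸.

```python
def three_way_row(snp, stress, age, sex, race, study, studies, moderator="sex"):
    """Build one design-matrix row as a dict of named terms.

    moderator selects the third variable of the 3-way term ("sex", "race", or "age").
    studies is the list of study labels; the first is treated as the reference
    category (assumption), so it receives no dummy column.
    """
    mod_value = {"sex": sex, "race": race, "age": age}[moderator]
    m = moderator.upper()
    row = {
        "intercept": 1.0,
        "SNP": snp, "STRESS": stress, "AGE": age, "SEX": sex, "RACE": race,
        "SNPxSTRESS": snp * stress,
        f"SNPx{m}": snp * mod_value,
        f"STRESSx{m}": stress * mod_value,
        f"SNPxSTRESSx{m}": snp * stress * mod_value,
    }
    for s in studies[1:]:
        row[f"study_{s}"] = 1.0 if study == s else 0.0
    return row


# Hypothetical participant: one minor allele, stress z-score 0.5, female coded 1.
row = three_way_row(snp=1, stress=0.5, age=60, sex=1, race=0,
                    study="JHS", studies=["MESA", "JHS", "WHI"])
```

Passing `moderator="race"` or `moderator="age"` yields the terms of models 2 and 3 with the same function.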

We evaluated these interactions in each stratum, i.e., sex interaction in White, Black, and two races combined; race interaction in male, female, and two sexes-combined; and age interaction in White, Black, and two races combined samples. The visualization of 3-way interactions was performed using an R function based on the generalized Johnson–Neyman (J–N) technique²⁵. This was followed by evaluating the association of the 2-way interaction term SNPxSTRESS and the model variables-adjusted partial correlation of SNP and stress separately with waist circumference in each stratum of the data set, i.e., for male, female, and two sexes-combined in White and Black samples.

(3) Structural equation path analysis: We modeled the path linking stress and EBF1 genotypes to obesity (waist circumference) to fasting glucose to type 2 diabetes mellitus as a cardiometabolic clinical risk factor using structural equations path models¹⁰. We used generalized structural equation modeling (GSEM) as implemented in STATA 15.1 (StataCorp LLC.) to evaluate race, sex, and EBF1 genotype-stratified possible causal paths from stress to waist circumference to fasting glucose to type 2 diabetes mellitus status (0/1). Path analysis uses a series of simultaneous equations to estimate possible mediating paths. The magnitude of a mediating effect is then calculated by taking the product of all path coefficients along a given proposed path. We modeled waist circumference and fasting glucose using linear regressions and dichotomous type 2 diabetes mellitus status using logistic regression, i.e., logit link in Bernoulli family as implemented in GSEM, and all analyses were adjusted for age. We implemented the multiple sourcing of data sets (i.e., combining data from ten studies) in the analyses by using study variable-based clustering of variance–covariance estimates and clustered robust standard errors. Our model was expressed as three simultaneous equations: (1) diabetes mellitus = b1*fasting glucose + b2*waist + b3*stress + b4*age; (2) fasting glucose=b5*waist + b6*stress + b7*age; and (3) waist =b8*stress + b9*age. In each equation, the term on the left is equivalent to the dependent variable in a regression-type model, while the terms on the right are the predictor variables. The coefficients b1 through b9 represent the regression slopes of each association. The three path model equations also imply three mediated effects from stress to type 2 DM: stress ⇒ waist ⇒ DM; stress ⇒ glucose ⇒ DM; and stress ⇒ waist ⇒ glucose ⇒ DM. We estimated simultaneous models for the TT and CT/CC groups in each category based on race- and sex-stratification. 
Further technical details of the structural equation modeling are shown in the Supplementary Materials.
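The product-of-coefficients rule for the three mediated effects can be written out directly from equations (1) to (3). The sketch below uses made-up coefficient values, not estimates from this study, and the function name is hypothetical. Because equation (1) uses a logit link, b1 and b2 (and therefore every product ending in DM) are on the log-odds scale.

```python
def indirect_effects(b):
    """Mediated effects of stress on type 2 DM implied by equations (1)-(3).

    b: dict of path coefficients keyed "b1".."b9" as in the text:
      (1) DM      = b1*glucose + b2*waist + b3*stress + b4*age
      (2) glucose = b5*waist   + b6*stress + b7*age
      (3) waist   = b8*stress  + b9*age
    """
    return {
        "stress->waist->DM": b["b8"] * b["b2"],
        "stress->glucose->DM": b["b6"] * b["b1"],
        "stress->waist->glucose->DM": b["b8"] * b["b5"] * b["b1"],
    }


# Hypothetical coefficients, for illustration only.
coefs = {"b1": 0.05, "b2": 0.02, "b5": 0.4, "b6": 0.3, "b8": 2.0}
effects = indirect_effects(coefs)
```

Fitting the same equations separately in the TT and CT/CC groups and comparing the resulting products mirrors the genotype-stratified comparison described above.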

Results

Observation phase

Table 2 shows the SNP main effect and interaction P-values, betas, SNP minor allele frequencies (MAF), and sample sizes of race and sex-stratified EBF1 GxE interactions with hip circumference in the MESA data set. Figure 1 shows the direction of race and sex-stratified interaction associations, which is consistent with the direction of association that we observed in our initial finding¹⁰ and its replications¹¹, i.e., mean hip circumference increased with the increase in chronic psychosocial stress score for the minor allele group. The sex-stratified analysis showed that all observed significant EBF1 gene-by-stress interactions were contributed by female participants only; the interaction was significant only in Whites, and the average age of study samples was about 62 years (Table 2 and Fig. 1). Based on these results, our initial observation was that, using a five-component stress measure, we might identify the EBF1 gene-by-stress interaction only in relatively old White females. Not observing a significant interaction in Blacks in the MESA study samples might be due to our inability to capture adequate stress information through a MESA-like five-component stress measure.

Table 2 The observations of race and sex-stratified EBF1 GxE (i.e., gene-by-stress or SNPxSTRESS) interactions with hip circumference in the MESA data set.


**Fig. 1: Direction of GxE association.**

Hypothesis test 1

Table 3 shows the P-values of SNPxSTRESS term in JHS male, female, and both sexes combined for the self-rated eight-component stress summary score, a partial summary score using MESA-like items (i.e., job, relationship, caring for, and medical problem), and for the additional stress components (i.e., “racism/discrimination”, “living in neighborhood”, and “legal problems”) or an existing component but evaluated differently (i.e., not meeting basic needs). As hypothesized, the EBF1 gene-by-stress interaction association in JHS females was significant for full stress summary score (P = 0.022), for the additional stress component “racism/discrimination” (P = 0.036), and moderately significant for differently evaluated financial strains component “not meeting basic needs” (P = 0.053). However, the association was not significant for the partial summary score using MESA-like components, i.e., making it equivalent to MESA-like self-rated or our synthetic stress measure, clearly indicating why we would not observe EBF1 gene-by-stress interaction using a MESA-like self-rated or our synthetic stress measures in Black populations in other data sets.

Table 3 EBF1 gene-by-stress (GxE) association with waist circumference in JHS.


Hypothesis test 2

Table 4 shows the P-values for the associations of 3-way interaction terms with waist circumference using harmonized data sets, i.e., combined samples from ten studies in a mega-analysis. As hypothesized, the P-value (1.01e−03) for the association of the 3-way interaction term SNPxSTRESSxSEX with waist circumference in White samples was statistically significant. However, the P-values for the other 3-way interactions, i.e., SNPxSTRESSxRACE and SNPxSTRESSxAGE, were not statistically significant. We observed tentative associations for the 3-way interaction terms SNPxSTRESSxRACE and SNPxSTRESSxAGE in all-female samples (P = 0.103) and all White samples (P = 0.093), respectively. Figure 2 displays the 3-way interactions from Table 4 with Johnson–Neyman confidence bands. Each plot displays the estimated slope of the EBF1 genotype term predicting waist circumference when the standardized stress score is low (mean − 1 SD), medium (mean), and high (mean + 1 SD), given sex, race, or age as the third moderator. The plots display distinctive estimated slopes of the EBF1 genotype term, particularly for the 3-way interactions that showed statistically significant or tentative associations (Fig. 2a, b, f, h). These plots suggest that although the 3-way interaction tests with race and age did not achieve conventional levels of statistical significance, there may be some value in further pursuing these observed differences in additional samples. The P-values for the associations of the 2-way interaction term SNPxSTRESS are shown in Supplementary Table S1. The SNP rs4704963 itself was not correlated with waist circumference (partial correlation coefficients = −0.0048 to 0.0149) in any stratum of the data set (Supplementary Table S1).
In addition, stress was also not correlated with waist circumference for male participants in both Black and White samples (coefficients = 0.0231, 0.0509, respectively), but, as expected, it was moderately correlated for female participants in both ancestries (coefficients = 0.182, 0.137, respectively; Supplementary Table S1).
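The slopes plotted at low, medium, and high stress follow from the fitted 3-way model: the slope of waist circumference on the EBF1 genotype term is SNP + SNPxSTRESS·stress + SNPxMOD·moderator + SNPxSTRESSxMOD·stress·moderator. The sketch below computes these point estimates only, using made-up coefficients and hypothetical names; the Johnson–Neyman confidence bands additionally require the coefficient covariance matrix, which is not shown here.

```python
def snp_slope(coef, stress, moderator):
    """Conditional slope of the outcome on the SNP term at given stress and moderator values."""
    return (coef["SNP"]
            + coef["SNPxSTRESS"] * stress
            + coef["SNPxMOD"] * moderator
            + coef["SNPxSTRESSxMOD"] * stress * moderator)


def simple_slopes(coef, stress_mean, stress_sd, moderator):
    """SNP slopes at low (mean - 1 SD), medium (mean), and high (mean + 1 SD) stress."""
    levels = {
        "low": stress_mean - stress_sd,
        "medium": stress_mean,
        "high": stress_mean + stress_sd,
    }
    return {name: snp_slope(coef, s, moderator) for name, s in levels.items()}


# Hypothetical coefficients; stress is a z-score (mean 0, SD 1).
coef = {"SNP": 0.2, "SNPxSTRESS": 0.5, "SNPxMOD": 0.1, "SNPxSTRESSxMOD": 1.0}
slopes_mod1 = simple_slopes(coef, 0.0, 1.0, moderator=1)  # e.g., female coded 1
slopes_mod0 = simple_slopes(coef, 0.0, 1.0, moderator=0)  # e.g., male coded 0
```

A nonzero 3-way coefficient shows up as a difference between the two groups in how steeply the SNP slope changes from low to high stress.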

Table 4 3-way interactions: association of the EBF1 SNPxSTRESSxSEX, SNPxSTRESSxRACE, and SNPxSTRESSxAGE interaction terms with waist circumference in harmonized data sets.


**Fig. 2: Johnson-Neyman interval plots for 3-way interaction.**

Hypothesis test 3

Table 5 shows the unstandardized coefficients and P-values of the paths stress ⇒ waist, waist ⇒ glucose, glucose ⇒ DM, and the indirect path stress ⇒ waist ⇒ glucose ⇒ DM for the race, sex, and EBF1 SNP genotype-stratified analysis. Path coefficients can be interpreted as the expected change in the dependent (endogenous) variable for each one-unit change in the independent (exogenous) variable, similar to regression slope coefficients. In general, we observed larger and significant paths for female carriers of the EBF1 minor allele (CT/CC genotypes) compared with the major allele homozygotes (TT). The biggest difference between major and minor allele groups was clearly in the stress ⇒ waist path, all in the expected direction and consistent with our original finding¹⁰. For example, a one-point increase in stress in the White female CT/CC group was associated with a 3.95 cm increase in waist circumference, compared with a 1.57 cm increase in the White female TT group, a 0.68 cm increase in the White male CT/CC group, and a 0.33 cm increase in the White male TT group. Figure 3 shows a graphical representation of the complete path analysis in the different race, sex, and rs4704963 genotype groups. The indirect path stress ⇒ waist ⇒ glucose ⇒ DM was statistically significant and substantially larger in the White female CT/CC group (coefficient = 0.28, P = 0.009, 95% CI = 0.07–0.49) compared with the same path in the White female TT group (coefficient = 0.09, P = 0.03, 95% CI = 0.01–0.18). We also observed a similar pattern, i.e., larger path coefficients, for the Black female CT/CC group; however, the indirect path stress ⇒ waist ⇒ glucose ⇒ DM was not statistically significant (P = 0.089) for these samples (Table 5 and Fig. 3).

Table 5 Coefficients and P-values of paths in structural equation models (SEMs) stratified by race, sex, and EBF1 rs4704963 genotypes, i.e., homozygote major allele (TT) and minor allele heterozygotes and homozygotes (CT/CC).


**Fig. 3: Structural equation model (SEM) path diagrams.**

Discussion

In our prior work¹⁰, we identified an EBF1 gene-by-stress interaction associated with cardiometabolic risk factors (e.g., central obesity) in MESA White and Framingham Offspring data sets. Subsequently, we replicated this gene-by-stress interaction in three additional data sets using a synthetic stress measure that we created employing our algorithm based on the proxy indicators of MESA-like stress items¹¹. Later, we performed data harmonization for chronic psychosocial stress, EBF1 SNP rs4704963, and CVD-risk variables, including adiposity and hyperglycemia, in the ten studies that we used in the current work⁸. The data harmonization involved the construction of a synthetic stress measure¹¹ in the eight of the ten studies that did not have a self-rated formal stress measure, while two studies (MESA, JHS) had a formal self-rated stress measure⁸. As presented in our previous work⁸, the broad domains of psychosocial stress that we have used were consistent with those used by others with similar stress measures, including measures explicitly designed to assess stress^12,24, and were apparently sufficient to capture life stress, even when not all domains were present in the synthetic measure¹¹.

In the present study, we replicated the EBF1 gene-by-stress interaction in JHS Black samples (P = 0.022), in addition to the previously observed significant associations in White samples^10,11. The replication of the EBF1 gene-by-stress interaction in Black females in JHS was apparently due to the additional components of “racism/discrimination” and “not meeting basic needs” (Ps = 0.036, 0.053, respectively), which were unique to the JHS stress assessment. This may be due to the added relevance of these additional stress components in Black populations. However, the JHS observations may be study-specific and may not generalize until more work elucidates these interactions in Black samples. We did not have all the additional JHS indicators in our synthetic stress measure; therefore, it was not possible for us to test hypothesis 1 in the harmonized data set⁸, combining all samples from multiple studies.

The formal testing of 3-way interactions of SNP and stress with sex, race, and age on waist circumference resulted in a statistically significant association only for the SNPxSTRESSxSEX term (P = 1.01e−03) in White samples. The P-values for the 3-way interaction terms SNPxSTRESSxRACE and SNPxSTRESSxAGE did not reach the conventional level of statistical significance in any of the stratified categories. Detecting statistically significant interactions is known to be challenging and depends on several factors, including sample size, the distribution of the interaction component variables (e.g., minor allele or genotype frequency or, in the case of continuous variables, the thinness of the tails), discovery power, phenotypic information content, and related differential biological mechanism(s) in specific sample groups. Information about these factors, for both observed and unobserved associations, may elucidate the precision of an interaction. In the case of the EBF1 gene-by-stress interaction, the minor allele frequency in White samples was more than three times that in Black samples. Replication in JHS also revealed that the population specificity of stress measures might make a difference, despite the small sample size. Any observation of a 3-way interaction clearly depended on how the association of the SNPxSTRESS term differed across the third variable, i.e., sex, race, or age. While sex and race have well-defined categories (i.e., male/female, White/Black) that reveal which population group(s) contribute to the SNPxSTRESS term association (e.g., White females, Supplementary Table S1), dichotomizing a continuous variable is generally not recommended²⁶. We, therefore, used age as a continuous variable in our analyses (Tables 2–5) and observed only a tentative 3-way SNPxSTRESSxAGE interaction association in White samples (P = 0.093). More studies are needed to evaluate how the gene-by-stress interaction varies with age and how this influences the development of cardiometabolic risks.
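To make concrete how a 3-way term depends on the SNPxSTRESS association differing across the third variable, the following is a minimal sketch on simulated data, not the model fitted in this study. It assumes a binary carrier indicator (CT/CC vs. the major allele homozygotes), a binary sex indicator, standardized stress and waist values, and no covariates; all numeric values and function names are hypothetical. Under those assumptions, the SNPxSTRESSxSEX coefficient of a model containing all lower-order terms equals the difference between the female and male SNPxSTRESS contrasts of within-group stress slopes.

```python
import random


def slope(xs, ys):
    """Ordinary least-squares slope of ys on xs (single predictor)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


def simulate(n=40000, carrier_freq=0.10, three_way=0.5, seed=0):
    """Simulate hypothetical data in which the stress slope is steeper
    only for female minor-allele carriers (all values are made up)."""
    rng = random.Random(seed)
    # One (stress, waist) pair of lists per (carrier, female) cell.
    cells = {(g, s): ([], []) for g in (0, 1) for s in (0, 1)}
    for _ in range(n):
        carrier = 1 if rng.random() < carrier_freq else 0
        female = 1 if rng.random() < 0.5 else 0
        stress = rng.gauss(0.0, 1.0)
        waist = (0.1 * stress
                 + three_way * carrier * female * stress
                 + rng.gauss(0.0, 1.0))
        xs, ys = cells[(carrier, female)]
        xs.append(stress)
        ys.append(waist)
    return cells


def three_way_contrast(cells):
    """Difference between the female and male SNPxSTRESS contrasts."""
    b = {key: slope(xs, ys) for key, (xs, ys) in cells.items()}
    snp_by_stress_female = b[(1, 1)] - b[(0, 1)]
    snp_by_stress_male = b[(1, 0)] - b[(0, 0)]
    return snp_by_stress_female - snp_by_stress_male


cells = simulate()
estimate = three_way_contrast(cells)
print(f"SNPxSTRESSxSEX contrast (simulated truth 0.5): {estimate:.3f}")
```

Lowering `carrier_freq` in this sketch shrinks the carrier cells and makes the contrast noisier, which illustrates why a low minor allele frequency can make such an interaction harder to detect.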

Structural equation path analysis for possible mediated causal paths from stress to obesity to fasting blood glucose to type 2 diabetes mellitus, a cardiometabolic disease risk factor, revealed that the direct path stress ⇒ waist was substantially larger and significant in the White and Black female CT/CC groups compared with the other stratified groups. Also, the indirect path stress ⇒ waist ⇒ glucose ⇒ DM was largest and statistically significant in the White female CT/CC group. Our inability to observe a similar statistically significant indirect path in the Black female CT/CC group might be due in part to the lack of a population-specific stress measure component (e.g., racism or discrimination) in the harmonized data set and/or the low minor allele frequency of the EBF1 SNP in Black samples.
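As an illustration of what an indirect path coefficient summarizes, the sketch below uses the common product-of-coefficients approach on a simulated chain stress ⇒ waist ⇒ glucose ⇒ outcome. It is not the structural equation model used in this study. It assumes that every variable is continuous (the final variable is a hypothetical continuous liability rather than a binary diabetes diagnosis), that there are no direct effects that skip a step, and that the path values a, b, and c are made up.

```python
import random


def slope(xs, ys):
    """Ordinary least-squares slope of ys on xs (single predictor)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


def simulate_chain(n=20000, a=0.4, b=0.5, c=0.6, seed=1):
    """Simulate a purely sequential chain with hypothetical path values."""
    rng = random.Random(seed)
    stress, waist, glucose, liability = [], [], [], []
    for _ in range(n):
        s = rng.gauss(0.0, 1.0)
        w = a * s + rng.gauss(0.0, 1.0)   # stress => waist
        g = b * w + rng.gauss(0.0, 1.0)   # waist => glucose
        d = c * g + rng.gauss(0.0, 1.0)   # glucose => outcome liability
        stress.append(s)
        waist.append(w)
        glucose.append(g)
        liability.append(d)
    return stress, waist, glucose, liability


stress, waist, glucose, liability = simulate_chain()

# Because the simulated chain has no skipped paths, each step can be
# estimated with a single-predictor regression in this sketch.
a_hat = slope(stress, waist)
b_hat = slope(waist, glucose)
c_hat = slope(glucose, liability)

# The indirect effect is the product of the coefficients along the path.
indirect = a_hat * b_hat * c_hat
print(f"indirect effect (simulated truth 0.12): {indirect:.3f}")
```

Because the indirect effect is a product, a small coefficient at any single step shrinks the whole path, which is one reason an indirect path can fail to reach significance in a group where an individual step is estimated imprecisely.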

The use of the harmonized data set from all studies helped us develop robust and stable estimates of the interactions among an individual’s genetic makeup for EBF1 SNP rs4704963 and complex physiological aspects of sex, race, and age. The EBF1xSTRESS term P-value (4.68E−06) for White samples from the large multi-study harmonized data set (Supplementary Table S1) was not as strong as the P-value of the same term (7.14E−09) for the much smaller MESA data set in our discovery GWAS analysis¹⁰. This indicates that an analysis using a larger sample may yield a more stable and generalizable association, though not necessarily a stronger association in terms of P-value.
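The distinction between a more precise estimate and a smaller P-value can be sketched with a simple Wald test under a normal approximation. The effect sizes and standard errors below are hypothetical and are not the estimates behind the P-values reported above. They only show that a larger study can have a narrower confidence interval and a larger P-value at the same time when its estimated effect is smaller.

```python
import math


def two_sided_p(beta, se):
    """Two-sided P-value for a Wald z-test under a normal approximation."""
    z = abs(beta / se)
    return math.erfc(z / math.sqrt(2.0))


def ci95(beta, se):
    """Approximate 95% confidence interval for the coefficient."""
    half_width = 1.959964 * se
    return beta - half_width, beta + half_width


# Hypothetical numbers: a small discovery study with a large estimate and
# a large harmonized study with a smaller but more precise estimate.
small_study = {"beta": 0.30, "se": 0.050}
large_study = {"beta": 0.10, "se": 0.020}

p_small = two_sided_p(**small_study)
p_large = two_sided_p(**large_study)
lo_s, hi_s = ci95(**small_study)
lo_l, hi_l = ci95(**large_study)

print(f"small study: P = {p_small:.2e}, CI width = {hi_s - lo_s:.3f}")
print(f"large study: P = {p_large:.2e}, CI width = {hi_l - lo_l:.3f}")
```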

In conclusion, the use of synthetic stress measures in the harmonized data set has shown that when a self-rated chronic psychosocial stress score was not obtained at the time of initial sample collection, available data can still be used to compute a synthetic stress score retrospectively. Our work provides additional confirmation, including in Black samples, that common variation in EBF1 may contribute to inter-individual differences in human obesity in the presence of stress, and that the gene-by-stress interaction differs depending on the sex of participants in both White and Black samples. The observed associations appeared to be present only among female participants. There was also preliminary evidence that at least some of the associations may vary across the lifespan, but more work is needed for confirmation. A MESA-like 5-component stress measure does not appear to capture the key life-stress indicator in the Black population, for which a measure of discrimination appears necessary. We observed a substantially larger and significant direct path stress ⇒ waist in White and Black female EBF1 minor allele carriers compared with the other stratified groups, and the largest and significant indirect path stress ⇒ waist ⇒ glucose ⇒ DM in the White female EBF1 minor allele group. Our work may provide a foundation for a precision medicine framework related to EBF1 gene-by-stress interactions, which in turn may lead to therapeutic intervention focused on a precise risk group, i.e., only females, mostly Whites, and possibly older individuals.

Data availability

The public-access data sets used in this study can be obtained from the database of Genotypes and Phenotypes (dbGaP) of the National Center for Biotechnology Information, National Library of Medicine (NCBI/NLM; https://www.ncbi.nlm.nih.gov/gap), through authorized access approved by the NIH Data Access Committee. The Duke data sets are available for collaborative use upon reasonable request to the authors, subject to the permission of the respective Study Committee and the Duke IRB.

References

1. Luft, F. C. Personalizing precision medicine. J. Am. Soc. Hypertens. 9, 415–416 (2015).
2. Peer, D. Precision medicine – Delivering the goods? Cancer Lett. 352, 2–3 (2014).
3. Ashley, E. A. Towards precision medicine. Nat. Rev. Genet. 17, 507 (2016).
4. Joyner, M. J. Precision medicine, cardiovascular disease and hunting elephants. Prog. Cardiovasc. Dis. 58, 651–660 (2016).
5. Hansen, J. & Iyengar, R. Computation as the mechanistic bridge between precision medicine and systems therapeutics. Clin. Pharmacol. Therap. 93, 117–128 (2013).
6. Heckman-Stoddard, B. M. & Smith, J. J. Precision medicine clinical trials: defining new treatment strategies. Semin. Oncol. Nurs. 30, 109–116 (2014).
7. Beckmann, J. S. & Lew, D. Reconciling evidence-based medicine and precision medicine in the era of big data: challenges and opportunities. Genome Med. 8, 134 (2016).
8. Singh, A. et al. Developing a synthetic psychosocial stress measure and harmonizing CVD-risk data: a way forward to GxE meta- and mega-analyses. BMC Res. Notes 11, 504 (2018).
9. Williams, R. B. Psychosocial and biobehavioral factors and their interplay in coronary heart disease. Annu. Rev. Clin. Psychol. 4, 349–365 (2008).
10. Singh, A. et al. Gene by stress genome-wide interaction analysis and path analysis identify EBF1 as a cardiovascular and metabolic risk gene. Eur. J. Hum. Genet. 23, 854–862 (2015).
11. Singh, A. et al. Computing a synthetic chronic psychosocial stress measurement in multiple datasets and its application in the replication of G × E interactions of the EBF1 gene. Genet. Epidemiol. 39, 489–497 (2015).
12. Shivpuri, S., Gallo, L. C., Crouse, J. R. & Allison, M. A. The association between chronic stress type and C-reactive protein in the multi-ethnic study of atherosclerosis (MESA): does gender make a difference? J. Behav. Med. 35, 74–85 (2012).
13. Sempos, C. T., Bild, D. E. & Manolio, T. A. Overview of the Jackson Heart Study: a study of cardiovascular diseases in African American men and women. Am. J. Med. Sci. 317, 142–146 (1999).
14. The WHI Study Group. Design of the women’s health initiative clinical trial and observational study. Controlled Clin. Trials 19, 61–109 (1998).
15. Friedman, G. D. et al. CARDIA: study design, recruitment, and some characteristics of the examined subjects. J. Clin. Epidemiol. 41, 1105–1116 (1988).
16. The ARIC Investigators. The Atherosclerosis Risk In Communities (ARIC) Study: design and objectives. Am. J. Epidemiol. 129, 687–702 (1989).
17. Feinleib, M., Kannel, W. B., Garrison, R. J., McNamara, P. M. & Castelli, W. P. The Framingham offspring study. Design and preliminary data. Prev. Med. 4, 518–525 (1975).
18. Bild, D. E. et al. Multi-ethnic study of atherosclerosis: objectives and design. Am. J. Epidemiol. 156, 871–881 (2002).
19. Mailman, M. D. et al. The NCBI dbGaP database of genotypes and phenotypes. Nat. Genet. 39, 1181 (2007).
20. Brummett, B. H. et al. Associations of depressive symptoms, trait hostility, and gender with C-reactive protein and interleukin-6 response following emotion recall. Psychosom. Med. 72, 333–339 (2010).
21. Siegler, I. C., Brummett, B. H., Williams, R. B., Haney, T. L. & Dilworth-Anderson, P. Caregiving, residence, race, and depressive symptoms. Aging Ment. Health 14, 771–778 (2010).
22. Slentz, C. A. et al. Effects of aerobic vs. resistance training on visceral and liver fat stores, liver enzymes, and insulin resistance by HOMA in overweight adults from STRRIDE AT/RT. Am. J. Physiol. Endocrinol. Metab. 301, E1033 (2011).
23. Slentz, C. A. et al. Effects of exercise training alone vs a combined exercise and nutritional lifestyle intervention on glucose homeostasis in prediabetic individuals: a randomised controlled trial. Diabetologia 59, 2088–2098 (2016).
24. Johnson, D. A. et al. The contribution of psychosocial stressors to sleep among African Americans in the Jackson Heart Study. Sleep 39, 1411–1419 (2016).
25. Bauer, D. J. & Curran, P. J. Probing interactions in fixed and multilevel regression: inferential and graphical techniques. Multivar. Behav. Res. 40, 373–400 (2005).
26. MacCallum, R. C., Zhang, S., Preacher, K. J. & Rucker, D. D. On the practice of dichotomization of quantitative variables. Psychol. Methods 7, 19–40 (2002).

Acknowledgements

This work was supported by NIH/NHLBI grant P01HL036587 (Williams). The public-access data sets (MESA, Framingham, CARDIA, ARIC, WHI, and JHS) were obtained from the database of Genotypes and Phenotypes (dbGaP) of the National Center for Biotechnology Information, National Library of Medicine (NCBI/NLM; https://www.ncbi.nlm.nih.gov/gap) (Mailman et al. 2007) through authorized/controlled data access under the standard user agreement. The Duke data sets (DFHS, CAREGIVER, and STRRIDE) were obtained from Duke studies. We thank the investigators, staff, and participants of the dbGaP and Duke studies for their valuable contributions. Study-specific acknowledgments can be found in the Supplementary Materials.

Author information

Authors and Affiliations

Behavioral Medicine Research Center, Duke University School of Medicine, Durham, NC, USA
Abanish Singh, Michael A. Babyak, Beverly H. Brummett, Rong Jiang, Ilene C. Siegler & Redford B. Williams
Department of Psychiatry and Behavioral Sciences, Duke University School of Medicine, Durham, NC, USA
Abanish Singh, Michael A. Babyak, Beverly H. Brummett, Rong Jiang, Ilene C. Siegler & Redford B. Williams
Department of Medicine of the University of Mississippi Medical Center, Jackson, MS, USA
Mario Sims & Solomon K. Musani
Duke Molecular Physiology Institute, Duke University School of Medicine, Durham, NC, USA
William E. Kraus, Svati H. Shah & Elizabeth R. Hauser
Department of Medicine, Duke University School of Medicine, Durham, NC, USA
William E. Kraus & Svati H. Shah
Department of Biostatistics and Bioinformatics, Duke University School of Medicine, Durham, NC, USA
Elizabeth R. Hauser

Authors

Abanish Singh
Michael A. Babyak
Mario Sims
Solomon K. Musani
Beverly H. Brummett
Rong Jiang
William E. Kraus
Svati H. Shah
Ilene C. Siegler
Elizabeth R. Hauser
Redford B. Williams

Corresponding author

Correspondence to Abanish Singh.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Consent to participate

Our study does not report any individual participant’s data or health-related outcome. It involves a secondary analysis of data from human samples that were previously collected and anonymized by other studies (see subsection Study Data Sources) in accordance with the participants’ consent. This secondary analysis was approved by the Duke Institutional Review Board (IRB) under protocol number Pro00070669.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplemental Material

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Singh, A., Babyak, M.A., Sims, M. et al. Evaluating the precision of EBF1 SNP x stress interaction association: sex, race, and age differences in a big harmonized data set of 28,026 participants. Transl Psychiatry 10, 351 (2020). https://doi.org/10.1038/s41398-020-01028-5

Received: 01 May 2020
Revised: 10 September 2020
Accepted: 22 September 2020
Published: 20 October 2020
Version of record: 20 October 2020
DOI: https://doi.org/10.1038/s41398-020-01028-5
