Main

Obesity is a major risk factor for a variety of cardiometabolic disease outcomes and is the consequence of intricate interactions between genes and environment1,2,3,4. Genome-wide association studies (GWAS) identified more than 1,000 genetic loci associated with obesity risk5 and pointed to the central nervous system (CNS) as a key player in body weight regulation5,6. Despite these insights into the overarching biology, our understanding of the mechanisms that control body weight is still limited. This could be, in part, because GWASs have so far analyzed one adiposity trait at a time, most often body mass index (BMI). Such single-trait GWASs ignore the vast heterogeneity among individuals with obesity in, for example, etiology, life course trajectory and cardiometabolic comorbidities.

Therefore, while single-trait GWASs have identified numerous loci5, they likely represent only a subset of the mechanisms underlying obesity. Multi-trait GWASs, on the other hand, hold the potential to reveal more layers of the underlying biology. For example, we and others have performed genome-wide searches for loci that uncouple excess adiposity from cardiometabolic risk and identified 87 loci for which the adiposity-increasing allele associates with lower cardiometabolic risk7,8,9,10,11,12,13,14. These uncoupling loci have implicated peripheral mechanisms, such as fat distribution, adipocyte function and differentiation, and inflammation, but not the CNS15.

Here, we build upon our previous work11 and leverage individual-level data from 452,768 European participants in the UK Biobank to perform a comprehensive multi-trait genome-wide screen. We aimed to identify new genetic loci that uncouple adiposity from cardiometabolic comorbidities by analyzing three adiposity and eight cardiometabolic traits, including lipid, glycemic and blood pressure traits. We identified 205 genetic loci, harboring 266 lead variants, where the adiposity-increasing allele is associated with a lower risk of at least one cardiometabolic trait. Follow-up analyses identified genetic subtypes of obesity with distinct pathways, cardiometabolic risk and serum protein profiles.

Results

Genome-wide screen identifies 266 adiposity-increasing alleles with protective effects on cardiometabolic health

We performed a genome-wide screen in up to 452,768 individuals of European ancestry from the UK Biobank to identify adiposity-increasing loci that have protective effects on cardiometabolic health. We analyzed three adiposity traits (BMI, body fat percentage (BFP) and waist-to-hip ratio (WHR)) and eight cardiometabolic traits (total cholesterol (TC), LDL cholesterol (LDL-C), HDL cholesterol (HDL-C), triglycerides (TGs), random glucose and HbA1c levels, systolic blood pressure (SBP) and diastolic blood pressure (DBP)) (Methods, Fig. 1 and Supplementary Table 1). We first created 24 ‘bi-traits’, which combine an adiposity and a cardiometabolic trait into a new phenotype, obtained by subtracting standardized values of one of the eight cardiometabolic traits from standardized values of one of the three adiposity traits. High values for a bi-trait represent high adiposity but low levels of the cardiometabolic trait and vice versa (Methods and Extended Data Fig. 1). We next performed GWAS for each of the 24 bi-traits and each of the 11 single traits from which the bi-traits were derived. Variants associated with bi-traits at genome-wide significance (P < 5 × 10−10), and for which the associations with both corresponding single traits reached nominal significance (P < 10−4) and that also colocalized (when the same underlying signal drives the associations with both single traits as well as with the bi-trait (Methods)) were considered uncoupling variants. As such, we identified 266 unique lead variants located in 205 loci (more than 1 Mb apart) across the 24 bi-traits (Methods and Supplementary Tables 24). Of the 205 uncoupling loci we identified, 139 have not previously been reported in the context of cardiometabolic uncoupling, whereas the remaining 66 replicate the majority of the 87 previously reported loci7,8,9,10,11,12,13,14,15. The remaining 21 did not reach significance in our study due to a more stringent significance threshold and differences in study design, traits and disease outcomes studied. Altogether, our analyses identified more than double the number of previously reported loci, in part driven by a larger sample size, but also by the use of individual-level data that allowed us to design new, continuous uncoupling phenotypes, as opposed to summary statistics used in previous studies7,8,9,10,11,14 (Supplementary Tables 2 and 5).

Fig. 1: Study overview.
figure 1

Overall steps and traits analyzed in the study. Bi-traits are obtained by subtracting standardized values of a cardiometabolic trait from an adiposity trait.

A genetic risk score (GRSuncoupling) associated with higher adiposity, but lower cardiometabolic traits

To assess the aggregate effect of the 266 uncoupling lead variants (variants that were associated with increased adiposity and lower cardiometabolic effects or vice versa), we created a genetic risk score (GRS) consisting of the 266 lead variants, GRSuncoupling (Methods and Supplementary Table 6). To compare GRSuncoupling with a proxy for adiposity that captures overall body fat without factoring in the cardiometabolic component, we created another GRS consisting of 647 variants that were significantly (P < 5 × 10−10) associated with BFP, GRSBFP (Methods). We tested the association of both GRSs with 16 adiposity and eight cardiometabolic traits (Methods, Fig. 2a and Supplementary Table 7). Both GRSs were associated with higher values of most adiposity traits; with effect sizes for GRSBFP tending to be generally larger than those for GRSuncoupling; however, compared to the GRSBFP, the GRSuncoupling was associated with a more favorable fat distribution. Specifically, a higher GRSuncoupling was associated with a lower (β < 0) MRI-derived visceral adipose tissue (VAT) to abdominal subcutaneous adipose tissue volume (ASAT) ratio, trunk fat:gluteofemoral fat (GFAT) ratio and liver fat fraction, whereas a higher GRSBFP was associated with a higher WHR, trunk fat:GFAT ratio and liver fat fraction, but not with VAT:ASAT ratio. Consistently, the GRSuncoupling had a larger effect on gynoid than on android fat percentage (android FP), whereas the opposite was seen for the GRSBFP.

Fig. 2: Associations of genetic risk scores with anthropometric and cardiometabolic traits in the UK Biobank.
figure 2

a, Estimated per ten-allele change effect sizes of GRS–trait associations in UK Biobank European ancestry population for GRSuncoupling (in magenta) and GRSBFP (in blue). b,c, Estimated per ten-allele change effect sizes of GRS–trait associations in UK Biobank European ancestry population for each cluster-specific GRS (GRS 1–8, in red) and GRSBFP (in gray). Dashed circles indicate β = 0, indicating no association between each GRS and the trait. Points outside the circle represent positive GRS–trait associations, whereas those inside represent negative associations. The effect and reference alleles of GRS2, a cluster associated with lower WHR and higher blood pressure, were flipped to reflect a profile of higher adiposity and facilitate comparison with the other clusters.

For cardiometabolic traits, a higher GRSuncoupling was associated with a healthier profile; that is lower levels of LDL-C, TC, TG, HbA1c, glucose, SBP and DBP and higher HDL-C. A higher GRSBFP, on the other hand, was associated with higher TG, HbA1c, glucose, SBP and DBP and lower HDL-C, but no effect on LDL-C or TC (Fig. 2a and Supplementary Table 7). This association signature of the GRSuncoupling was significantly different from that of the GRSBFP (P < 0.0001) (Fig. 2a and Supplementary Table 7).

In men and women separately, the association of GRSuncoupling with adiposity traits tended to differ for the fat distribution traits. Specifically, the association with a more favorable fat distribution (WHR, VAT:ASAT, trunk fat:GFAT ratio) was more pronounced in women than in men (Extended Data Fig. 2). This was mostly due to a smaller effect on abdominal fat accumulation (waist circumference and android FP) in women compared to men, whereas the effect on peripheral fat accumulation (hip circumference and gynoid FP) was not substantially different between sexes. Despite these differences in fat distribution, associations with cardiometabolic traits were similar in men and women. No sex-specific effects for either anthropometric or cardiometabolic traits were observed for the GRSBFP (Extended Data Fig. 2).

GRSuncoupling associates with protective effect on cardiometabolic outcomes but with increased risk of weight-bearing diseases

To better understand the clinical impact of genetic predisposition to adiposity and cardiometabolic comorbidities, we performed a phoneme-wide association analysis (PheWAS) between each of the two GRSs (GRSuncoupling, GRSBFP) and 10,965 disease outcomes in the UK Biobank (n = 373,747) (Methods, Supplementary Table 8 and Fig. 3). Both scores were associated with increased risk of obesity. While GRSuncoupling is associated with a healthier cardiometabolic profile, it is also associated with an increased risk for other diseases, often weight-bearing diseases, to the same extent as GRSBFP. Specifically, a higher GRSuncoupling was associated with a lower risk of conditions related to lipoprotein metabolism (OR = 0.92, P = 1.4 × 10−89), essential primary hypertension (OR = 0.96, P = 1.7 × 10−27), noninsulin-dependent diabetes (OR = 0.94, P = 5.6 × 10−21), ischemic heart disease (OR = 0.96, P = 7.4 × 10−11), angina (OR = 0.96, P = 1.6 × 10−8) and acute myocardial infarction (OR = 0.96, P = 3.6 × 10−6) (Fig. 3), whereas a higher GRSBFP was associated with increased risk of these conditions; however, a higher GRSuncoupling was associated with an increased risk of noncardiometabolic, weight-related conditions, including cellulitis (OR = 1.05, P = 1.14 × 10−11), gonarthrosis (OR = 1.06, P = 9.9 × 10−25) and varicose veins (OR = 1.08, P = 7.9 × 10−26), similar to GRSBFP (Fig. 3 and Supplementary Table 8).

Fig. 3: Association of genetic risk scores with disease outcomes in the UK Biobank.
figure 3

Phenome-wide association results of disease outcomes and GRSBFP (in blue) and GRSuncoupling (in magenta) performed using the PHEnome Scan ANalysis Tool (PHESANT) in 373,747 European participants. Data are presented as OR ± CI. ORs represent effect size estimates per ten risk-allele increments. Case–control sample sizes for each outcome are presented in Supplementary Table 8.

The 266 uncoupling lead variants group into 8 clusters with distinct cardiometabolic signatures

We sought to identify genetically defined subgroups among the 266 uncoupling variants based on similarity of association with adiposity and cardiometabolic traits using NAvMix clustering analyses (Methods). As such, we identified eight clusters with distinct association signatures (Fig. 4). Specifically, variants in three clusters (4,7 and 8) were associated with increased adiposity and protective effects across multiple trait groups, whereas lead variants in the other five clusters were associated with increased adiposity with protective effects on only one of the cardiometabolic trait groups.

Fig. 4: Heatmap of the association of the lead variants with adiposity and cardiometabolic traits.
figure 4

Clustering of the 266 uncoupling lead variants using the NAvMix algorithm identified eight clusters. The color coding represents β values ranging from negative (blue) to positive (red). Associations are expressed by the BFP-increasing allele as the effect allele to enable comparison across traits.

We then calculated cluster-specific GRSs, GRS1–GRS8, which aggregate the effects of lead variants in each cluster and tested their association with anthropometric and cardiometabolic traits (Methods and Supplementary Table 6). These cluster-specific GRSs displayed a distinct association signature (Methods, Fig. 2b,c and Supplementary Table 7). For example, GRS4, GRS7 and GRS8 are associated with two or more cardiometabolic trait groups. Specifically, GRS4 is associated with higher overall adiposity (higher BFP, BMI, fat-free mass index (FFMI)) and favorable lipid and glycemic profiles (lower LDL-C, TC, TG, higher HDL-C and lower HbA1c), possibly mediated through a more favorable fat distribution (lower WHR, VAT:ASAT and android FP). GRS7 and GRS8 were mainly associated with lower TG and higher HDL-C, lower HbA1c and blood pressure (Fig. 2b,c and Supplementary Table 7). In addition to the stronger effects on HDL-C and TG, GRS7 differs from GRS8 in its association with adiposity. GRS7 has a strong effect on body fat distribution (for example lower WHR, VAT:ASAT and higher gynoid and lower android FP). On the other hand, GRS8 is associated with higher overall adiposity (BFP, BMI, gynoid and android FP, ASAT and SAT), lower FFMI, without an obvious effect on fat distribution (Fig. 2b and Supplementary Table 7). GRS3 and GRS5 are similarly associated with higher overall adiposity, with GRS3 being associated with lower blood pressure (SBP and DBP) and GRS5 with lower glycemic traits (glucose and HbA1c). GRS1 is associated with greater overall body size (BMI, FFMI and BFP), and with lower TG and higher HDL-C levels, but also with higher levels of LDL-C, TC and HbA1c. Of all cluster-specific GRSs, GRS6 has the strongest effect on overall body size. On the cardiometabolic side, GRS6’s profile is opposite to that of GRS1; as GRS6 is associated with lower LDL-C and TC levels, but also with higher TG and lower HDL-C levels, and high blood pressure and glycemic traits (Figs. 2b,c and 4). Finally, GRS2 is the only cluster where the adiposity effect is driven mainly by an association with WHR. A higher GRS2 is associated with higher gynoid fat accumulation (WHR, VAT:ASAT ratio and trunk fat:GFAT ratio), but lower blood pressure. In terms of disease outcomes, we observed cluster-specific associations consistent with the characteristics of each cluster (Supplementary Table 8).

These findings highlight the potential for GRSs to identify subgroups among individuals with obesity. The GRSuncoupling quantifies people’s risk of obesity without cardiometabolic comorbidities, whereas the GRSBFP quantifies people’s risk of obesity with cardiometabolic comorbidities. The cluster-specific GRSs provide further granularity to this subgroup identification.

GRSuncoupling association signature validated in an independent cohort

We validated the association profiles of GRSuncoupling, the eight cluster-specific GRSs (GRS1–8) and GRSBFP with adiposity and cardiometabolic traits in the Atherosclerosis Risk in Communities (ARIC) study, a population-based cohort (n = 15,792 individuals) (Methods and Supplementary Table 9). Similar to the UK Biobank, in ARIC, GRSuncoupling was associated with a healthier cardiometabolic profile with lower levels (β < 0) for glucose, TC, LDL-C, TG, SBP and DBP and higher HDL compared to GRSBFP. Both GRSs were associated with high adiposity including BMI, waist and hip circumference, and WHR, albeit GRSBFP generally had higher effect sizes (Extended Data Fig. 3a and Supplementary Table 10). Similarly, the association signature of the cluster-specific GRSs with adiposity and cardiometabolic traits corresponded to that of the UK Biobank (Extended Data Fig. 3b,c and Supplementary Table 10).

GRSuncoupling is associated with lower T2D and CHD incidence

The implementation of GRSs in other cohorts is straightforward and allows for the assessment of individuals’ predisposition to comorbidities. We tested associations of GRSuncoupling and GRSBFP with incident coronary heart disease (CHD) and T2D in ARIC and the Mount Sinai BioMe Biobank, an electronic medical record-linked biobank (n = 50,000). A higher GRSuncoupling was associated with a significantly lower risk of both incident CHD (hazard ratio (HR) 0.95; 95% CI 0.92–0.98, per ten-allele change in GRS) and T2D (HR 0.96, 95% CI 0.92–0.99) (Supplementary Table 11), whereas a higher GRSBFP was associated with higher risk of developing T2D (HR 1.04, 95% CI 1.01–1.07), but not CHD (HR 1.01, 95% CI 0.99–1.04).

To evaluate whether lifestyle factors influence the effects of GRSuncoupling and GRSBFP on cardiometabolic disease risk, we stratified ARIC participants by physical activity level (Methods). Within the physically active group, the protective effect of GRSuncoupling on T2D risk was enhanced (HR 0.88, 95% CI 0.80–0.97) (Supplementary Table 12), whereas the adverse association of GRSBFP was attenuated (HR 0.99, 95% CI 0.92–1.05). These findings suggest that physical activity may modulate the effects of GRSuncoupling and GRSBFP on T2D susceptibility. We did not observe the same effect for CHD.

GRSuncoupling already predisposes to higher adiposity but a more favorable cardiometabolic profile early in life

Among 3,457 Danish children and adolescents from the HOLBAEK study (Methods and Supplementary Table 13), the GRSBFP and GRSuncoupling were associated with higher overall adiposity (Extended Data Fig. 4a). Within the population-based cohort subset of 1,811 participants, both GRSs showed associations with BMI, to the same extent as those observed in the UK Biobank (for example, GRSuncoupling: 0.09 in UK Biobank and 0.08 in HOLBAEK) (Supplementary Table 14). Participants with a higher GRSBFP were more likely to present with a dysglycemic profile (higher HOMA-IR, insulin, C-peptide), whereas the GRSuncoupling associated with a neutral glycemic profile and lower alkaline phosphatase (ALP) levels (Extended Data Fig. 4b). Moreover, having a higher GRSuncoupling associated with a lower risk of dyslipidemia (OR 0.89, 95% CI 0.82–0.97) (Extended Data Fig. 4c).

Uncoupling loci and overall adiposity loci have distinct tissue and pathway enrichment profiles

We next performed enrichment analyses for the 266 uncoupling lead variants, using data-driven expression prioritized integration for complex traits (DEPICT) (Methods), to identify the tissues and gene sets in which potential candidate genes may be acting and compared these with the results for the 647 BFP variants. Genes located in BFP-associated loci were mainly enriched in the CNS (P = 0.002), consistent with previous observations for BMI-associated loci6. In contrast, genes located in the uncoupling loci were not enriched in the CNS (P = 0.74) and were mostly enriched in adipose tissue (P = 7 × 10−7), and in cardiovascular (P = 1.4 × 10−5), digestive (P = 7.8 × 10−4), endocrine (P = 2.5 × 10−4) and musculoskeletal systems (P = 2.5 × 10−5) (Fig. 5 and Supplementary Table 15).

Fig. 5: Physiological systems, tissue and cell-type enrichment analyses for uncoupling loci and general body fat percentage loci.
figure 5

DEPICT results were based on summary statistics generated from the association analyses of the 266 lead single-nucleotide polymorphisms (SNPs) with BFP. Tissues enriched after correction for multiple testing at significance threshold (FDR < 0.05) are highlighted in bright blue. The dashed line represents the nominal significance level (P < 0.05). For full results, see Supplementary Table 15.

In gene set enrichment analyses, we replicate previous findings for uncoupling loci, implicating insulin signaling, glucose homeostasis, lipid metabolism, immune and inflammatory response, and pathways related to adipose tissue biology (Supplementary Table 16). In addition, we identify gene sets not previously implicated, including those related to vascular development, skeletal muscle development, liver development, circadian rhythm, sex differentiation, among others, thereby implicating new biological processes.

On the other hand, genes in BFP loci were enriched for neurodevelopment and neuron differentiation including pathways related to brain development and regulatory mechanisms of the nervous system, consistent with previous literature6 (Extended Data Fig. 5 and Supplementary Table 16).

Cluster-specific gene set enrichment analyses implicated biological pathways that are consistent with their cardiometabolic health profile. For example, triglyceride lipase activity was highlighted for cluster 1, cardiovascular-related gene sets for cluster 2, muscle-related gene sets for cluster 4, transcriptional regulation of white adipose tissue regulation for cluster 7 (Extended Data Fig. 6 and Supplementary Tables 16 and 17).

The uncoupling GRSs are defined by distinct plasma proteomic profiles

To further characterize the biological signature of the GRSs, we assessed their association with 2,920 Olink-derived plasma proteins in the UK Biobank (Methods and Supplementary Tables 18 and 19). In line with the finding that 80% of these proteins are associated with measured BMI in this population16, both the GRSBFP and GRSuncoupling were associated (false discovery rate (FDR) < 0.005) with a substantial number of protein levels: 915 (31.3%) and 337 (11.5%), respectively, of which 208 proteins were associated with both GRSs. Of these 208 overlapping proteins, 176 (85%) showed directional consistency in effect estimates, likely reflecting primarily adiposity-driven effects (Extended Data Fig. 7 and Supplementary Table 18). Notable examples include proteins shown to associate most strongly with higher BMI16, such as leptin (LEP), fatty acid binding protein 4 (FABP4) and pro-adrenomedullin (ADM). On the other hand, 32 proteins showed directionally opposing effects, potentially highlighting health-driven effects (Fig. 6). These include proteins involved in lipid transport (for example LDLR, APOA1 and APOF), hormonal status (for example IGFBP1, IGFBP2, SHBG and FGF21) or thermogenesis (for example LDLR, GHR, SHBG, CKB and LAMP2). A total of 129 proteins were associated with GRSuncoupling, but not with GRSBFP (Extended Data Fig. 8 and Supplementary Table 18), including neuropeptides (AGRP, NPY and BDNF), hormones (GCG and ADIPOQ), lipoprotein lipase (LPL) and myostatin (MSTN).

Fig. 6: Plasma proteins with directionally opposing associations with body fat percentage and uncoupling genetic risk scores in the UK Biobank.
figure 6

Estimated per ten-allele change effect sizes of GRS–protein associations in UK Biobank European ancestry population for GRSuncoupling (in magenta) and GRSBFP (in blue), for rank-based inverse-normal transformed Olink-derived plasma protein concentrations (n = 32).

Gene prioritization identifies genes implicated in various pathways

To identify putative causal genes within the 205 uncoupling loci, we used 14 bioinformatic and functional genomics tools. We prioritized the gene(s) most likely to be causal in each locus and ranked them based on the number of genomic tools that provided support for the given gene. Of the 1,623 candidate genes, 82 were considered high-scoring genes (Methods and Supplementary Table 20). These include genes previously described to be associated with opposite effects on obesity and cardiometabolic traits, such as PPARG7,17,18,19, FAM13A9,11, PEPD11,19,20,21 and IRS1 (refs. 11,19,22,23). The two highest scoring genes, prioritized by 11 tools, were PCSK1 and SMG6 (Supplementary Table 20). While PSCK1’s role in obesity is well established24,25, there is no obvious functional role for either PCSK1 or SMG6 in uncoupling of obesity from cardiometabolic health.

Other high-scoring genes have been implicated in adipose tissue expandability (PPARG26,27, IRS1 (refs. 28,29,30), RSPO3 (ref. 31), FAM13A32, CTSS33,34, TIMP4 (refs. 35,36), PEPD19,20,21, JMJD1C37, CSK38,39, HLX40, MED19 (ref. 41), SENP2 (ref. 42), MLXIPL43,44, ARNT45,46, PIK3R1 (ref. 47) and PNPLA2 (ref. 48)), insulin secretion and beta-cell function (PIK3R149, GPRC5B50, MEF2D51, FBN1 (ref. 52), LDB1 (ref. 53), SENP2 (ref. 54), MAPT55, PBX1 (ref. 56), beiging of white adipose tissue and brown adipose tissue function (CSK39, SLC22A3 (ref. 57), SENP2 (ref. 58), MED19 (ref. 41), LDB1 (ref. 59), HLX40 and CRHR1 (ref. 60)) and inflammation and fibrosis (PEPD20,21, BCN2 (ref. 61), MST1 (refs. 62,63,64), GPRC5B65, MAFF66, CTSS34,67, NPEPPS68, CSK38 and FBN1 (ref. 52,69)) (Supplementary Table 21). Many of these genes and pathways have been described before in the context of uncoupling11, but we also identify genes involved in pathways that have not previously been implicated, such as hepatic control of glucose homeostasis (ARNT70, CTSS34,71, YWHAB72, FBN1 (ref. 73), LDB1 (ref. 74) and SENP2 (ref. 75)), hepatic lipid accumulation (JMJD1C76, NPEPPS77 and MLXIPL44,78) and skeletal muscle growth and function (PPP3R1 (ref. 79), CTSS80,81, CXXC5 (ref. 82), NPEPPS83, SENP2 (ref. 84) and FBN1 (ref. 69)) (Supplementary Table 21).

Prioritized genes in loci that belong to the same cluster tend to share related pathways. For example, prioritized genes in cluster 7 are implicated in regional adipose expandability, including FAM13A32 and RSPO3 (ref. 31), consistent with lower WHR and improved lipid profile, being the defining characteristics for cluster 7 (Fig. 3 and Supplementary Table 21). Another example is for cluster 3, characterized by lower SBP and DBP, which contains several genes implicated in beiging of white adipose tissue and brown adipose tissue function, including CSK39, HLX40, LDB1 (ref. 59), MED19 (ref. 41) and SENP2 (ref. 58) (Fig. 4 and Supplementary Table 21). Brown adipose tissue has been previously linked to lower odds of cardiometabolic diseases, including hypertension85. The overall protective cluster 8 has a more diverse biological basis with multiple genes implicated in adipose expandability, including PPARG26,86, IRS1 (refs. 23,28,29,30,87), TIMP4 (ref. 36), CTSS34, ARNT45,46 and PIK3R1 (ref. 47). TIMP4 is also implicated in nutrient uptake35 and ARNT and CTSS in hepatic glucose control70,71. Therefore, our prioritized genes are implicated in known and/or novel pathways that plausibly contribute to the uncoupling of adiposity from cardiometabolic risk.

Discussion

Obesity is a highly heterogeneous disease that cannot be captured by one single adiposity trait. Here, we performed a multi-trait gene-discovery analysis to account for heterogeneity in cardiometabolic comorbidities. We designed continuous uncoupling phenotypes that range from high adiposity with a healthy cardiometabolic profile to low adiposity with an unhealthy cardiometabolic profile. GWASs of these new phenotypes identified 266 independent variants across 205 genomic loci where the adiposity-increasing allele is also associated with a lower cardiometabolic trait. Furthermore, the 266 variants cluster into eight groups, each representing a genetic subtype with a distinct cardiometabolic risk profile, pointing to specific underlying pathways.

The genetic uncoupling score that aggregates the uncoupling effects of the 266 variants (GRSuncoupling) was associated with a healthier cardiometabolic profile, distinctly different from that of the genetic adiposity score (GRSBFP). The protective effects of GRSuncoupling may be partially mediated through an association with a more favorable fat distribution characterized by a lower WHR and lower MRI-derived VAT/ASAT and trunk fat/GFAT, in particular among women, compared to GRSBFP. These findings corroborate previous observations with greater power, including the sex-specific observations9,10,12,88,89. With such distinct cardiometabolic risk profiles, these genetic scores may facilitate early risk stratification of individuals with obesity allowing for a timely and personalized prevention. This genetic risk stratification was already apparent in childhood and adolescence. Moreover, the uncoupling score was also associated with lower risk of prevalent and incident T2D and CHD in adulthood; however, while the cardiometabolic risk is reduced among individuals with a high uncoupling score, risk for diseases, such as cellulitis, arthrosis, sleep disorders and phlebitis among others, is comparable to those with a high adiposity score, consistent with a previous report90. These observations underscore that the weight-bearing impact of a high body weight on overall health remains, even when cardiometabolic risk is lower.

In previous studies, we and others have identified three clusters of uncoupling loci11,14. Here, with many new uncoupling loci, we replicate the previously reported clusters with greater delineation and identify several new clusters. For example, in previous studies, one of the uncoupling clusters was linked to favorable fat distribution. We identified two such clusters (4 and 7) that, however, differ in the strength of associations with each of the phenotypes. Most new clusters are characterized by the fact that the adiposity-increasing loci are associated with specific cardiometabolic traits, for example, healthier glycemic profile (cluster 5), lower blood pressure (cluster 3) and more granularity (clusters 1, 6, 7 and 8 with four distinct lipid profiles). Our findings extend current knowledge by underscoring that there is substantial heterogeneity among uncoupling loci. The clusters of loci represent distinct genetic subtypes that suggest a range of mechanisms underlying the uncoupling of obesity from its cardiometabolic comorbidities.

We identified shared and distinct proteomic association signatures for GRSuncoupling versus GRSBFP. For the majority (85%) of the 208 proteins associated with both genetic scores, the direction of the association was consistent across both scores. This suggests that, for these proteins, levels are driven by adiposity. For example, levels of leptin and adipsin/complement factor D, two adipokines known to be elevated in individuals with obesity irrespective of their cardiometabolic health status91, increased with the increase in both uncoupling and adiposity scores. A subset of proteins (15%) showed directionally opposite associations between the two scores, capturing health-driven effects. For example, higher plasma levels of IGFBP1 and IGFBP2 were associated with a higher uncoupling score (higher adiposity and improved cardiometabolic health), but with a lower adiposity score, representing lower adiposity and improved cardiometabolic health, corroborating previous reports demonstrating that lower levels of IGFBP1 and IGFBP2 are associated with hypertriglyceridemia and insulin resistance92,93,94. Also, lower LDLR levels and higher SHBG levels were associated with a higher uncoupling score, consistent with a metabolically healthy state as observed by others95,96,97,98,99,100.

Higher levels of circulating LDLR may indicate increased hepatic LDLR shedding and reduced lipoprotein clearance101. GRSBFP was associated with higher levels of circulating LDLR, which has been linked to elevated plasma triglycerides and LDL-C95,101—factors contributing to cardiometabolic complications such as inflammation, atherosclerosis and myocardial infarction96,97. In contrast, GRSuncoupling and lipid-protective clusters 1, 7 and 8 were associated with lower plasma LDLR, consistent with a healthier metabolic profile. Reduced circulating LDLR in these subtypes may reflect greater hepatic LDLR availability and more efficient lipid clearance, suggesting enhanced hepatic lipid clearance as a potential mechanism underlying the cardiometabolic benefits observed in these subtypes.

Several proteins were exclusively associated with a higher uncoupling score, such as ADIPOQ and LPL. ADIPOQ has been shown to be higher in metabolically healthy individuals, promoting insulin sensitivity and having cardioprotective and anti-inflammatory effects91,102,103, whereas LPL plays a role in triglyceride clearance and lipid distribution, potentially contributing to a metabolically healthy state104,105. Myostatin levels decreased with an increasing uncoupling score. Myostatin is considered a drug target for sarcopenia and muscle mass preservation in combination with weight-loss drugs106,107, implicating skeletal muscle mass and function in metabolic health91.

In contrast to a role of the CNS in overall obesity, tissue enrichment analysis for genes in uncoupling loci pointed to adipose tissue, cardiovascular, digestive, endocrine and musculoskeletal systems, implicating pathways previously reported for uncoupling (insulin signaling, glucose homeostasis, lipid metabolism, immune and inflammatory response and adipose tissue biology)7,11,15, but also revealing new ones (for example vascular development, skeletal muscle development, liver development, kidney development, cartilage development, circadian rhythm, sex differentiation and response to hypoxia).

To pinpoint candidate causal genes within each uncoupling locus, we established a bioinformatics and functional genomics gene prioritization pipeline. The highest scoring genes provide further support for biological processes such as adipose tissue expandability, fat distribution and brown adipose tissue function. Other newly identified genes highlight emerging mechanisms, such as inflammation and fibrosis, hepatic glucose control and lipid accumulation, and muscle function. For example, liver specific ablation of Jmjd1c, a gene prioritized for rs10761785, which is associated with higher BMI, WHR and lower LDL-C and TC levels, decreases lipogenesis and protects from insulin resistance despite obesity in mice76. Knockout of Arnt, a gene prioritized for rs10888393 and associated with increased BFP, high HDL-C and low HbA1c, may have a tissue-specific role affecting adiposity and cardiometabolic health. Fat-specific Arnt knockout mice are leaner and protected against diet-induced glucose intolerance and obesity, whereas hepatocyte-specific Arnt knockout mice have increased fasting glucose and impaired glucose tolerance45,46,70.

Our study has several limitations. First, it was conducted exclusively in individuals of European ancestry, so the generalizability of our findings to other populations remains to be determined. Second, our proteomic profiling analysis was constrained by the set of proteins assayed and may not capture all relevant proteins driving adiposity and/or health effects. Third, gene prioritization relied on existing annotations and previous knowledge, which may bias against novel genes with currently unknown functions. Nonetheless, we employed a comprehensive strategy that integrates predictions from multiple bioinformatic and functional genomic tools to help mitigate these limitations.

Taken together, by designing continuous uncoupling traits, we substantially increased statistical power for discovery, resulting in a more than twofold increase in the number of uncoupling loci identified. Gene prioritization and pathway and protein analyses underscore the importance of a range of peripheral pathways in uncoupling. We provide further support for adipose tissue expandability, insulin secretion and beta-cell function, beiging of white adipose tissue, inflammation and fibrosis, and also highlight mechanisms not previously implicated in uncoupling, such as hepatic lipid accumulation, hepatic control of glucose homeostasis and skeletal muscle growth and function. Individuals with a high genetic uncoupling score display a protective cardiometabolic risk profile despite having a higher risk of obesity. Their profile is distinct from that of individuals with a high overall adiposity score who have an increased cardiometabolic risk. Notably, we show that this risk stratification is already evident in childhood and adolescence. The overall genetic uncoupling score and its eight derived sub-scores advance the genetic subtyping of obesity by delineating distinct cardiometabolic risk signatures. The GRSs corresponding to the eight genetic subtypes can be readily implemented in other populations. These genetic subtypes may form the basis of subtype-stratified treatment, prevention and prognosis and may ultimately contribute to precision medicine in obesity.

Methods

Ethical approval

UK Biobank data access was approved by the UK Biobank through project application number 1251. UK Biobank has obtained approval from a committee and researchers do not need separate approval. The HOLBAEK study was approved by the ethics committee of region Zealand, Denmark (SJ-104) and by the Danish Data Protection Agency (REG-043-2013). For ARIC, all relevant ethical guidelines have been followed and any necessary Institutional Review Board (IRB) and/or ethics committee approvals have been obtained.

Study populations

UK Biobank

The UK Biobank is a prospective cohort study with extensive genetic and phenotypic data, collected in approximately 500,000 individuals, aged between 40–69 years. Participants were enrolled from April 2007 to July 2010 at one of 21 assessment centers across the UK. Baseline information, physical measures and biological samples were collected according to standardized procedures108,109,110. Questionnaires were used to collect health and lifestyle data. Study design, protocols, sample handling and quality control have been described in detail elsewhere108,109,110.

Ancestry

We restricted analyses to individuals of European ancestry, defined by using k-means clustering111. In brief, we calculated principal components and their loadings for 488,377 genotyped participants based on the intersection of ~121,000 quality-controlled variants with the 1000 Genomes Project reference panel (phase 3 v.5). We projected the 1000 Genomes reference panel dataset on the principal-component analysisA (PCA) loadings from the UK Biobank. We then applied k-means clustering to the UK Biobank PCA and the projected 1000 Genomes reference panel dataset, prespecifying four clusters. Individuals that clustered with the EUR 1000 Genomes cluster were assigned as European ancestry.

Phenotypes

We analyzed 11 single traits; three adiposity traits: BMI, BFP and WHR; and eight cardiometabolic traits: TC, LDL-C, HDL-C, TGs, glucose, HbA1c levels, SBP and DBP. All phenotypic data used for analyses were collected at the baseline visit. BMI was calculated as weight (kg) divided by height squared (m2). WHR was created by dividing the waist circumference by the hip circumference. Individuals with waist and hip measurements <50 cm and >150 cm were removed. For individuals on lipid-lowering medication, LDL-C was adjusted by dividing the LDL-C value by 0.7 and TC by dividing by 0.8 (refs. 112,113). TC and TG were log transformed. For the GWAS analysis of glucose and HbA1c, individuals receiving insulin therapy, (n = 4,697) and those with glucose > 15 mmol l−1 or HbA1c > 100 were excluded (n = 804). SBP and DBP were created by calculating the average of two measurements at the baseline visit and adjusted for medication use by adding 15 mm Hg on the SBP value and 10 mm Hg on the DBP value114,115. Upon exploring the data, we identified a subgroup of women recruited in one center, Sheffield, who deviated from the rest of the data. As a result, we excluded 121 women recruited at Sheffield and that had BFP > 55% and BMI > 40 kg m−2. Additionally, women who were pregnant at the time of recruitment were excluded. Finally, 452,768 participants (207,204 men and 245,564 women) were included in the analyses of lipid traits and 448,071 (204,377 men and 243,694 women) for glycemic traits.

Genotypes

Participants were genotyped on two arrays. The majority (n= ~450,000) were genotyped using the UK Biobank Axiom Array and the remaining participants (n= ~50,000) were genotyped using the UK BiLEVE Array, which has >95% of the variants in common with the UK Biobank Axiom Array. Quality control, performed by the UK Biobank team, included testing for batch, plate and array effects, Hardy–Weinberg equilibrium and discordance across control replicates. Samples of poor quality, with high missingness rate and heterozygosity, were removed110. Missing single-nucleotide polymorphisms (SNPs) were imputed using the UK10K reference panel by the UK Biobank team.

Atherosclerosis Risk in Communities study

The ARIC study is a prospective cohort study of 15,792 individuals, including 11,478 white individuals and 4,314 African American individuals, from four US communities (Forsyth County, NC; Jackson, MS; suburbs of Minneapolis, MN; and Washington County, MD). Participants aged between 45–64 years at baseline and recruited between 1987–1989, received extensive examinations, including medical, social and demographic data. A detailed description of the ARIC study design is published elsewhere116. Adiposity and cardiometabolic traits considered in this study were measured at the baseline visit.

Incident CHD was ascertained through a combination of death certificate reviews, hospital records and annual participant follow-ups to identify hospitalizations and deaths occurring during the previous year117. Incident CHD cases were defined as definite fatal CHD, definite or probable myocardial infarction (MI), silent MI between examinations as determined by ECG, or coronary revascularization. T2D cases were identified at baseline and during follow-up visits using glucose measurements, self-reported physician diagnosis of T2D or use of diabetes medication. T2D was defined in accordance with World Health Organization guidelines as a fasting serum glucose ≥7.0 mmol l−1, a non-fasting serum glucose ≥11.0 mmol l−1 (when fasting samples were unavailable) or the use of blood glucose-lowering medications. Individuals with T2D at baseline were removed from this analysis.

Blood was drawn for DNA extraction at the baseline exam. Genotyping within ARIC was performed on the Affymetrix 6.0 DNA microarray (Affymetrix) and genotype data that passed quality control filters were imputed into the 1000 Genomes phase 3 reference data using IMPUTE v.2.3.2 (refs. 118,119). The ARIC study was approved by the IRBs at each site and written informed consent was obtained from all study participants.

BioMe Biobank

BioMe is an ongoing electronic medical record-linked biobank with more than 60,000 patients enrolled through the Mount Sinai Health System in New York. BioMe is a multiethnic biobank comprising individuals of African, Hispanic, European, Asian and other ancestries. Genotyping data on the Global Screening Array (GSA-24v1-0_A1) is available for 32,595 individuals. The data were cleaned for duplicate samples, discordant sex, heterozygosity rate that exceeded 6 × s.d. from the population mean, call rate <95% at the site and individual level and deviation from the Hardy–Weinberg equilibrium. After quality control (QC), 31,705 individuals and 604,869 variants were retained. Imputation of the GSA array was performed using impute2 (ref. 120) using the 1000 Genomes reference panel. In the BioMe Biobank, CHD cases were identified using ICD-9 and ICD-10 codes, procedure codes for bypass surgery or percutaneous transluminal coronary angioplasty or documentation of abnormal cardiac catheterization. T2D cases were defined using the eMERGE phenotyping algorithm121. Baseline was defined as first outpatient visit after 1 January 2011, with at least 1 year enrolled in the Mount Sinai Health system. Individuals with CHD or T2D cases occurred before or within 1 year of enrollment were classified as prevalence cases and excluded from analyses. The BioMe Biobank received ethics approval from the IRB of the Mount Sinai School of Medicine.

The HOLBAEK study

The HOLBAEK study consists of two Danish cohorts: a cohort from the Children’s Obesity Clinic of the Holbaek Hospital, comprising children and adolescents with a BMI at or above the 90th percentile (BMI s.d. score (SDS) ≥ 1.28, overweight or obesity) based on Danish reference standards122; and a population-based cohort recruited from schools in 11 municipalities across Region Zealand123. In the obesity clinic cohort, anthropometrics were measured at clinical examinations, whereas the population-based group was assessed in a mobile laboratory by medical professionals. Details on the cohort and phenotypic data are published elsewhere124. We considered 23 traits for our cross-sectional analyses (5 binary and 18 continuous). BMI SDS was derived using the least mean squares method, referenced against Danish reference standard122. Waist-to-height ratio (WtHR) SDS and WHR SDS were calculated based on age- and sex-specific reference values from NHANES125. Obesity was defined as having a BMI SDS ≥ 2.33 (99th percentile and above126). Hyperglycemia, insulin resistance, dyslipidemia and hypertension were defined according to published guidelines127,128,129,130. HOMA-IR was calculated as (insulin mU l−1 × glucose mM)/22.5. Exclusion criteria for the current analyses included individuals younger than 5 years or older than 19 years, those taking medications for obesity or diabetes, participants meeting T2D criteria based on fasting plasma glucose levels ≥7.0 mmol l−1 or HbA1c ≥ 48 mmol mol−1, and individuals without genotyping data. The final number of participants of European ancestry analyzed was 1,646 for the obesity clinic cohort (45% boys, median age 11.7 years (Q1 9.6 years–Q3 14.0 years)) and 1,811 for the population-based cohort (43% boys, median age 11.6 years (Q1 8.9 years–Q3 14.5 years))

Genotype data

Genotyping in the HOLBAEK study was conducted using Illumina Infinium HumanCoreExome-12 v.1.0 or HumanCoreExome-24 v.1.1 BeadChips, analyzed on the Illumina HiScan system. Genotype calling was performed using the Genotyping Module (v.1.9.4) within GenomeStudio software (v.2011.1; Illumina). Phasing was done with EAGLE2 (v.2.0.5), and imputation was carried out using PBWT to the Haplotype Reference Consortium (HRC1.1) via the SANGER imputation server. As HRC1.1 does not include insertions and deletions and does not fully overlap with imputed UK Biobank genotype data, up to 20% of GRSBFP and GRSuncoupling variants were not available. We therefore identified high linkage disequilibrium (LD) (R2 > 0.8) proxies, based on LD information from 20,000 unrelated (KING < 0.0884), randomly sampled UK Biobank participants. These participants self-identified as ‘White British’ and clustered with this group in PCA. Genotype QC for this reference panel included filtering on SNP missingness <5% and INFO > 0.3, and we excluded participants with sex chromosome anomalies, sex discrepancies, heterozygosity outliers and genotype call rate outliers. Proxies were identified for 123 of 142 missing variants for GRSBFP and 56 of 58 missing variants for GRSuncoupling, with a median R2UKB of 0.99. The final GRSs therefore included 628 and 264 variants, respectively, with a median INFOHOLBAEK of 0.98. The GRSs were scaled to per ten-allele change.

Statistical analysis

Using linear and logistic regression, we assessed the association between both GRSs and the 23 outcome traits, adjusting for age, sex, four PCs and genetic batch (n = 3). The continuous traits were log transformed (except for BMI, WHR, WHtR, SBP and DBP SDS) and then standardized to unit variance and s.d. of one. All analyses were stratified by cohort (obesity clinic/population) and estimates pooled using inverse-variance weighting. In addition, all analyses were further stratified by sex (Supplementary Table 11). P values were adjusted using Benjamini–Hochberg correction across the 23 traits.

GWAS analyses

Our GWAS analyses aimed to identify variants that uncouple adiposity from its comorbidities. As such, we used 11 single traits: three adiposity traits: BMI, BFP and WHR; and eight cardiometabolic traits: TC, LDL,-C HDL-C, TG, glucose, HbA1c, SBP and DBP. First, we derived residuals for each of the single traits, for men and women separately, using linear regression analyses adjusting for age, age2, genotyping array and sequencing center. Next, the distributions of residuals of the 11 single traits were inverse normalized. The derived s.d. scores have a mean of 0 and a s.d. of 1. We then created pairwise composite traits with one of the anthropometric traits and one of the cardiometabolic traits by subtracting the s.d. scores of the cardiometabolic trait (TC, LDL-C, HDL-C, TG, glucose, HbA1c, SBP and DBP) from those of the adiposity traits (BMI, BFP and WHR), resulting in 24 bi-traits. For example, a BMI–TC bi-trait is created as follows: BMIsdcores – TCsdcores (Extended Data Fig. 1). For any given individual, a positive score for the BMI–TC bi-trait indicates that the person has a relatively higher BMI compared to their TC levels, whereas a negative score means that they have a relatively lower BMI compared to their TC levels. As HDL-C correlates positively with cardiometabolic health, HDL-C needs to be treated differently when creating the bi-traits. We have addressed this by flipping the sign of HDL z-scores before creating the bi-traits. This adjustment ensures that subtracting HDL-C from an adiposity trait results in a bi-trait where a positive or negative score retains the same meaning as for the other bi-traits. In other words, BMI–HDL, is in fact BMI–(–HDL). Therefore, a positive score indicates high BMI and high HDL, which corresponds to high adiposity and a protective cardiometabolic effect. Conversely, a negative score indicates low adiposity and high cardiometabolic risk (low HDL).

For BMI as the adiposity measure, we have eight bi-traits: BMI–TC, BMI–LDL-C, BMI–HDL-C, BMI–TG, BMI–glucose, BMI–HbA1c, BMI–SBP, BMI-DBP. Similarly, for BFP: BFP–TC, BFP–LDL-C, BFP–HDL-C, BFP–TG, BFP–glucose, BFP–HbA1c, BFP–SBP, BFP–DBP, and for WHR: WHR–TC, WHR–LDL-C, WHR–HDL-C, WHR–TG, WHR–glucose, WHR–HbA1c, WHR–SBP and WHR–DBP. As such, we performed 70 mixed model GWAS tests analyzing 24 bi-traits and 11 single traits within men and women separately (35 traits × 2) using BOLT-LMM v.2.4.1131 adjusting for the first ten principal components.

Variants with MAF < 0.1% were removed from the analyses. For imputed variants, an INFO score threshold = 0.3 was used. For each trait, we used METAL to meta-analyze the GWAS results of men and women using fixed effects132. Using LD score regression v.1.0.1 (LDSC), we observed mild inflation, with LDSC intercepts ranging between 1.09 and 1.18 and LDSC ratios indicating that 10–13% of the inflation observed can be ascribed to causes other than a polygenic signal. Therefore, we applied genomic control by adjusting the s.e. for the LDSC intercept. Specifically, for each trait, we used the corrected s.e.: s.e.corrected = s.e.GWAS × sqrt (LDSC intercept) of the sex-specific GWAS as the s.e. column in the inverse-variance weighted meta-analysis of men and women. The new corrected LDSC intercepts ranged from 1.03 to 1.08. LDSC ratio indicated that no more than 5.7% of the inflation observed can be ascribed to causes other than a polygenic signal.

Conditional analyses

To identify additional independent signals in associated loci for the 24 bi-traits, we used GCTA v.1.94.4 (ref. 133). We performed approximate joint and conditional SNP association analyses in each locus, which takes into account LD between SNPs. For each locus, we defined a 2-Mb region encompassing 1 Mb on both sides of the lead SNP. Lead SNPs (P < 5 × 10−10) identified in known long-range high-LD regions were treated as a single large locus in the GCTA analysis134. We used unrelated European ancestry participants from the UK Biobank as the reference sample to acquire conditional P values for association. Conditional independent variants that reached P < 5 × 10−10 were considered as index SNPs. We additionally restricted to SNPs that were genome-wide significant (P < 5 × 10−10) in the original summary statistics.

Identification of genome-wide significant loci

Our genome-wide significance threshold of P < 5 × 10−10 accounted for the analyses of 24 bi-traits, across women and men. To identify genome-wide associated loci and their respective lead SNPs, we proceeded as follows. We started with the independent variants that resulted from the conditional analyses. To define a locus associated with increased adiposity and protective effects on cardiometabolic traits, we retained only variants for which the single-trait GWASs for both traits reached marginal significance, defined as P < 10−4, and for which the association was opposite to the established phenotypic correlation. For example, for BMI and LDL-C, we selected variants for which the BMI–LDL-C bi-trait reached genome-wide significance, and subsequently extracted the variants for which the single-trait associations with BMI and with LDL-C reached P < 10−4 and their direction of association was opposite of the established phenotypical positive association. As such, we identified 1,103 association signals across the 24 bi-traits. Next, we applied a Bayesian divisive clustering algorithm, HyPrColoc135 to determine, for each of the 1,103 association signals, whether the associations across the bi-traits and both single traits colocalize. In the above example, we would only keep loci of which the associations colocalize across the BMI–LDL-C bi-trait, BMI and LDL-C. HyPrColoc is a deterministic Bayesian algorithm that, for a given genomic region, identifies clusters of traits which colocalize at distinct causal variants135. The algorithm also allows for sample overlap for the tested traits and corrects for it. As such, we provided the following input files to HyPrColoc upon testing the colocalization for each of the 1,103 associations: (1) a file with the β values of all the variants to be tested and the traits in consideration; (2) another file with their s.e. values; (3) a 3 × 3 matrix denoting the phenotypic correlation between the bi-trait and its corresponding pair of single traits estimated from the UK Biobank data; (4) an LD matrix for all variants within 1-Mb region around the lead SNPs; and (5) a 3 × 3 matrix with all values equal to 1. We used a 1-Mb region around the input variant (0.5 Mb on each side). Therefore, of 1,103 association signals, we found that 602 association signals corresponding to 425 unique SNPs colocalized across the bi-traits and their corresponding single traits. We subsequently inspected LocusZoom plots for potential overlap between independent loci across the traits and for missed colocalized association signals because HyPrColoc assumes only one ‘causal’ variant per region. This manual inspection led to either narrowing or widening the genomic region of 32 colocalized association signals and performing HyPrColoc again to identify other colocalized variants that were missed.

If more than one variant in the same region was retained for different traits, we chose the lead variant based on the following criteria. If the variants were in high LD (r2 > 0.9), we randomly picked one of them. Otherwise, if more than one variant was significantly associated with related traits, such as BMI–glucose, BFP–glucose and BMI–HbA1c, then we chose the variant that had lower P values for a larger number of traits. Last, if two variants in the same region were each associated with a bi-trait that represents different categories of cardiometabolic traits, we kept both variants even though they were in the same locus. For example, if one variant was associated with BMI–SBP and another with BMI–HbA1c, then we kept both, as the two may be contributing to the uncoupling of adiposity and comorbidities via different mechanisms. In all cases, each variant is represented only once regardless of how many traits the variant is associated with. In total, we retained 266 unique variants in 205 genomic loci. For 152 (57%) of the 266 variants, the effect allele was the minor allele.

Cluster analysis

We used the Noise-Augmented von Mises-Fisher Mixture model (NAvMix) algorithm to cluster the 266 lead variants based on their association with the single traits136. In brief, noise-augmented directional clustering clusters variants based on their proportional associations with different traits. The algorithm outputs a probability of membership for each datapoint (variant) to belong to each cluster. Each datapoint is then assigned to the cluster for which it has the highest probability. The procedure is repeated for a varying number of clusters and the final number of clusters is chosen based on the Bayesian Information Criterion. NAvMix outputs a noise cluster that includes data points (variants) that do not belong to any cluster and are thus considered outliers. We used the effect size (β) of the association of each of the 266 lead variants with each of the 11 single traits as input. We used BFP as a reference, assuming a positive direction of effect to facilitate comparison across traits. The associations with other traits were expressed using the BFP-increasing allele as the effect allele and the BFP-decreasing allele as the alternate allele.

Genetic risk scores and their association with anthropometric and cardiometabolic traits, and phenome-wide association study

We constructed GRSs for all 266 identified variants combined (GRSuncoupling) and for each of the eight clusters separately (GRS1–GRS8) in 373,747 unrelated individuals of European ancestry from the UK Biobank. We also generated a GRS for BFP (GRSBFP) based on 647 lead variants (where 353 (55%) variants had the effect allele as the minor allele) that reached P < 5 × 10−9 (clumped for r2 < 0.1 in a 1-Mb region, MHC region removed) in our single-trait GWAS for BFP in the UK Biobank. All of the ten GRSs were weighted by the effect size estimated for BFP from the GWAS that we performed in the current study. The GRSs were rescaled to per ten-allele change137 to enable comparison across GRSs that consist of different number of SNPs. Cluster 2 was identified by the clustering analysis to be associated with lower WHR. For all analyses that followed the clustering analyses, the effect and reference alleles of genetic variants that were used to construct GRS2 were flipped to reflect a profile of higher adiposity and facilitate comparison with the other clusters. To support implementation of our approach in other cohorts, we provide the code for constructing the GRSs (Supplementary Code) and the variant-specific effect sizes (β values) used in the GRSs (Supplementary Table 6).

To test the association of GRSs with anthropometric and cardiometabolic traits, we performed linear regression analyses for 24 traits, including SBP, DBP, HDL-C, TG, LDL-C, TC, glucose and HbA1c, and the following anthropometric traits: height, WHR, hip circumference, waist circumference, BFP, FFMI (computed as whole body fat-free mass divided by height squared), BMI, gynoid fat percentage, android fat percentage, MRI-measured VAT, ASAT, VAT:ASAT ratio, liver proton density fat fraction (liver fat), trunk fat (total trunk fat volume), GFAT volume and trunk fat:GFAT ratio. GFAT was calculated using VAT, ASAT and the total adipose tissue between the bottom of the thigh muscles to the top of vertebrae T9 (TAT) volume using the following formula138:

GFAT = TAT (between top of T9 and bottom of thigh muscles) – VAT – ASAT. Each of the traits was adjusted for age, sex, genotype array and study site and the first ten principal components in a linear regression model. The resulting residuals were transformed to approximate normality using inverse-normal rank scores before the association testing.

In addition, we performed phenome-wide association (PheWAS) analyses in unrelated individuals of European ancestry from the UK Biobank using the PHEnome Scan ANalysis Tool (PHESANT)139. Analyses were performed using a linear or logistic regression for continuous and binary outcomes, respectively, using the following covariates: age at enrollment, sex, genotyping array and the first ten genetic principal components.

Statistical analysis in ARIC and BioMe

We generated ten GRSs: GRSuncoupling, GRS1–GRS8 and GRSBFP in 9,240 unrelated European ancestry participants of the ARIC study. For the continuous traits, we created rank-based inverse-normal transformed traits adjusting for age, sex and the first ten PCs. We analyzed adiposity and cardiometabolic traits, including hip circumference, waist circumference, WHR, BMI, SBP, DBP, HDL-C, TG, LDL-C, TC and glucose in ARIC. We performed linear regression analyses to test the association of each GRS with the continuous traits. For analyses of incident T2D and CHD, we generated ten GRSs: GRSuncoupling and GRSBFP in 23,208 unrelated individuals of European (n = 8,985), Hispanic (n = 7,984) and African (n = 6,239) ancestry from the BioMe Biobank. We tested the association of each GRS with incident T2D and CHD in BioMe and ARIC using a Cox proportional hazard model after adjusting for age, sex, ancestry (for BioMe) and ten principal components. BioMe and ARIC association results were meta-analyzed using inverse-variance weighted meta-analysis. To further examine how lifestyle factors influence the association between GRSs and cardiometabolic disease risk in ARIC, we assessed physical activity—quantified as total metabolic equivalent (MET) hours per week of moderate-to-vigorous leisure-time physical activity. In the Cox models, we further stratified participants into physically active (top 30% of MET levels) and inactive (bottom 70%) groups to assess whether physical activity level impacted these associations.

Plasma proteomic characterization of the GRSs

We assessed the association of GRSuncoupling, GRS1–GRS8 and GRSBFP with Olink-derived plasma protein measurements in the UK Biobank. This analysis included 30,271 unrelated individuals of European ancestry from the random baseline sample selected by the UK Biobank Pharma Proteomics Project16, specifically those included in Olink batches 1–6. After excluding three proteins with >25% missing data (PCOLCE (62%), NPM1 (73%) and GLIPR1 (>99%)), we included 2,920 proteins with a median missingness of 2.8%. Measurements below the limit-of-detection were included, in line with Olink’s recommendations (https://olink.com/faq/how-is-the-limit-of-detection-lod-estimated-and-handled/). Using linear regression, we assessed the association between the ten GRSs (scaled to a per ten-allele change) and protein measurements (rank-based inverse-normal transformed). Covariates included age at measurement, age squared, sex, UK Biobank assessment center, ten PCs, genotyping array, Olink batch, fasting time at measurement (hours) and time between measurement and processing of the sample by Olink (years). Benjamini–Hochberg adjustment was applied to control the FDR at 0.005 (0.05/10 × GRS) across the 29,200 protein–GRS associations. To distinguish between adiposity- and health-driven associations, we grouped proteins that showed evidence of association with both GRSBFP and one or more of the adiposity-uncoupled GRSs, based on their directional concordance. Moreover, we assessed which proteins uniquely associated with any of the adiposity-uncoupling GRSs, but not GRSBFP.

Gene prioritization analysis

To identify the likely causal gene(s) within each of the identified loci, we used a combination of up to 14 bioinformatics and functional genomics methods. In addition to the annotated ‘nearest gene’, we used a ‘coding proxy’ approach, five bioinformatics tools, and leveraged in vitro adipogenic differentiation dataset with seven measures. For each gene, we summed the number of methods for which it was prioritized. Genes with a score ≥7 were prioritized. The methods used for the gene prioritization score are described below.

Bioinformatics tools

Nearest gene

We used the nearest gene as predicted by Ensembl Variant Effect Predictor (VEP)140.

Coding proxy

For a given lead SNP, we considered variants in high LD (r2 > 0.8) within a 1-Mb window. If one or more of those variants was a coding variant, then the gene(s) in which those coding variants lie were prioritized. We annotated the variants in VEP. A coding variant was defined as any variant with the following annotations: synonymous_variant, missense_variant, inframe_insertion, inframe_deletion, stop_gained, frameshift_variant, splice_donor_variant and splice_acceptor_variant.

ABC-max

We used FUMA141 to select all SNPs in high LD (r2 > 0.8) using as reference panel UK Biobank release 2b 10K Europeans. We intersected the total of 10,095 SNPs (lead SNPs and proxies in high LD) with enhancers and target genes predicted by the Activity-by-Contact (ABC) model142 in the following tissues: adipose, adrenal gland, astrocytes, pancreas, cardiac muscle cell, coronary artery, smooth muscle cell of coronary artery, heart ventricle, hepatocyte, liver, spleen, skeletal muscle myoblast and thyroid gland. Intersection of SNPs and enhancers was carried out using the function intersect from BEDtools v.2.29.2143.

Polygenic Priority Score

We used Polygenic Priority Score v.0.1 (ref. 144), a similarity-based algorithm that uses a broad range of omics data, including scRNA-seq. We used a reference panel of 10,000 randomly selected subjects from the UK Biobank and retrieved the gene with the highest Polygenic Priority Score in each associated loci.

Data-driven expression prioritized integration for complex traits

We used summary statistics generated from the association analyses of the 266 lead SNPs with BFP as input for DEPICT with default parameters145. DEPICT is an integrative tool that uses transcriptome expression (microarray data), pathways and protein–protein interactions to prioritize the most likely causal genes and highlight enriched tissues and pathways.

fastENLOC

We colocalized associated loci with expression quantitative trait loci (eQTLs) from GTEx v.8 (ref. 146) using fastENLOC147, prioritizing genes affected by eQTLs colocalizing at regional colocalization probability (RCP) > 0.1. We colocalized lead SNPs and variants in high LD (r2 > 0.8) with eQTLs in adipose tissue (subcutaneous and visceral), pituitary, brain cortex, brain hypothalamus, brain hippocampus, brain amygdala, adrenal gland, thyroid gland, liver, kidney, pancreas, skeletal muscle, salivary gland and heart (atrial appendage and left ventricle).

CS2G depot-specific gene prioritization

We intersected the 266 associated variants and their proxies (r2 > 0.8) with depot-specific chromatin accessible signals in subcutaneous and visceral human primary adipose-derived mesenchymal stem cells (AMSCs). We then performed combined S2G strategy CS2G148 to predict the effector genes. We reported the effector genes with a cutoff of 0.05 on the 1000 Genome and UK Biobank scores. The detailed protocol for AMSC proliferation, induction and differentiation is outlined in Laber et al.149. Overall, AMSCs were obtained from subcutaneous and VAT from patients undergoing a range of abdominal laparoscopic surgeries149 and isolated as previously described150. For a subset of donors, the purity of AMCSs was assessed as previously described151. Each participant gave written informed consent before inclusion and the study was approved by the ethics committee of the Technical University of Munich (study no. 5716/13). Cells were introduced (counted as differentiation day 0) and put into differentiation for 14 days until fully differentiated. Samples are collected at differentiation day 14. Donor genotyping, SNP QC as well as the genotype imputation were performed as previously described152. Nuclei and library preparation for AMSC ATAC-seq were performed as previously described152. ATAC peaks were called by MACS3 (v.3.0.0). After peak calling, narrow peaks from all the samples (n = 15 subcutaneous AMSCs and n = 14 visceral AMSCs) were first combined, then the overlapped intervals were merged into a single interval using BEDtools (v.2.30.0) (function BEDtools merge -I)143.

Overlap with eQTL data from GTeX

For the association and scoring of SNPs with genes we made use of eQTL data from GTEx (v.8) for subcutaneous and VAT as well as enhancer capture HiC data (‘GSE140782_ECHiC.txt.gz’ https://doi.org/10.1038/s41588-020-0709-z), DNase-seq based chromatin accessibility (‘GSE113253_DNase_processed_data.tar.gz’ https://doi.org/10.1038/s41588-019-0359-1) and gene expression data (‘GSE113253_GeneExpr_BM.txt.gz’ https://doi.org/10.1038/s41588-019-0359-1) of hBM-MSC-TERT4 cells subjected to adipocyte differentiation in vitro.

Specifically, the lead SNP or a proxy SNP, which was determined using the R package LDlinkR (https://doi.org/10.3389/fgene.2020.00157) with a threshold of R2 > 0.8, had to fulfill at least one of the four criteria: (1) overlapping with an eQTL in VAT; (2) overlapping with an eQTL in subcutaneous adipose tissue; (3) overlapping with a genomic region that is linked by the enhancer capture data to a promoter region; or (4) overlapping with a DNase1 hypersensitive region. If only the latter case was true, the closest transcription start site was chosen to be the candidate gene. Overlap was determined using the R package GenomicRanges.

For each unique combination of lead SNP and putative candidate gene, we set up a score based on an overlap of the proxy SNPs with an eQTL, overlap of the proxy SNPs with a DNase1 hypersensitive site and its change in accessibility during adipocyte differentiation, overlap of the proxy SNPs with an enhancer region contacting the promoter of the candidate gene and the expression of the candidate gene and its significant change during adipocyte differentiation.

Tissue and gene set enrichment

Tissue and gene set enrichment were performed by DEPICT, using summary statistics generated from the analysis on BFP as input with default parameters145. Both were performed on all uncoupling loci and on cluster-specific loci. Only gene sets with at least ten genes were included. We restricted our analyses to the Gene Ontology, KEGG and REACTOME pathway terms. We used FDR < 0.05 as a threshold for significance when considering all the 266 variants. For the cluster-specific analysis, as the clusters have a small number of variants and therefore less power, we used P < 0.05 as a threshold from the ‘Nominal P value’ output from DEPICT regardless of the FDR value.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.