Main

Clonal haematopoiesis (CH), the clonal expansion of haematopoietic stem cells (HSC) and their progeny driven by somatic driver mutations, is associated with increased risk of progression to acute myeloid leukaemia (AML) and other myeloid neoplasms3,4,5,6. More than half of all CH cases are driven by mutations in DNMT3A, the gene for de novo DNA methyltransferase 3A3,4,7,8. DNMT3A mutations occur throughout the gene but most commonly affect codon R882 in the protein’s methyltransferase domain. These alter DNMT3A enzymatic activity and seem to have a dominant-negative effect and/or lead to defective homodimerization9,10, resulting in DNA hypomethylation. Studies of murine models showed that Dnmt3a mutations impart a differentiation block and increased self-renewal on HSC11,12,13, drive HSC clonal expansion in the presence of inflammation14,15, promote AML development and impart resistance to anthracyclines16,17. DNMT3A-mutant CH is associated with an increased risk of progression to AML, an association that is driven primarily by DNMT3A-R882 hotspot mutations1,18,19. Proposed pharmacological interventions for DNMT3A-mutant AML include hypomethylating agents20, induction of apoptosis and/or necroptosis21, or targeting genes and/or pathways activated in DNMT3A-R882-mutant cells, such as DOT1L22 and mTOR23.

Recent advances in our ability to predict the risk of progression to AML and related cancers1,2 have spurred an interest in therapeutic approaches to avert/delay progression of CH to these cancers. This approach could be applied to DNMT3A-R882-mutant CH, given its significant risk of progression to AML. Previous studies reported azacytidine20 and oridonin21 as potential strategies for targeting DNMT3A-R882 CH. However, their effectiveness has not been formally tested, and there is an unmet need for safe and well-tolerated treatments to prevent progression of DNMT3A-R882-mutant CH.

Here, we demonstrate that DNMT3A-R882-mutant cells rely on oxidative phosphorylation (OXPHOS) and that their clonal proliferative advantage is vulnerable to inhibition by pharmacological agents targeting mitochondrial genes shown by a genome-wide CRISPR screen in mouse Dnmt3a-R882H hematopoietic stem and progenitor cells (HSPCs). We validate these vulnerabilities in our preclinical model of CH using several modulators of mitochondrial metabolism, including CTPI2, IACS-010759 and metformin. Importantly, analysis of 412,234 UK Biobank (UKB) participants showed that individuals taking metformin (n = 11,190) exhibited markedly lower prevalence of DNMT3A-R882-mutant CH (odds ratio (OR) = 0.49 (95% confidence interval (Cl), 0.32–0.74), P = 0.00081). Notably, non-R882 DNMT3A-CH and CH driven by other driver genes did not exhibit this relationship, emphasizing the striking specificity of metformin for DNMT3A-R882 CH. Given the established safety record of metformin, this work proposes its investigation in clinical studies of prevention of DNMT3A-R882 CH progression to AML.

Dnmt3a-R882H enhances HSPC self-renewal

To study the effect of DNMT3A-R882 hotspot mutations on haematopoiesis and leukaemogenesis, we developed a conditional mouse model by floxing the native murine Dnmt3a exon 23 (containing codon R878, equivalent to human R882) and inserting human R882H-mutant exon 23 immediately downstream (Dnmt3aflox-R882H; Fig. 1a). Dnmt3aflox-R882H mice were crossed into the Mx1-Cre background, enabling efficient excision of native exon 23 and thereafter expression of Dnmt3a-R882H, following polyinosinic-polycytidylic acid (pIpC) administration (hereafter Dnmt3aR882H/+; Fig. 1b and Extended Data Fig. 1a,b). Dnmt3aR882H/+ showed no major haematopoietic phenotypes compared to control mice (Dnmt3a+/+or WT), including equal numbers of long-term HSC (LT-HSC) at six weeks post-pIpC (Extended Data Fig. 1c–i), mirroring previous reports16. With age, Dnmt3aR882H/+ mice developed modest increases in the proportions of peripheral blood (PB) myeloid and T cells (Extended Data Fig. 1j) and demonstrated reduced frequencies of megakaryocyte-erythroid progenitors (MEPs) in bone marrow (BM) (Extended Data Fig. 1k). Frequencies of BM LT-HSC, short-term HSC (ST-HSC) and other progenitor compartments and PB parameters, including blood leukocyte counts (WBC), haemoglobin (HGB) and platelet counts (PLT) were unchanged (Extended Data Fig. 1k–s). However, Dnmt3aR882H/+ BM cells displayed a striking self-renewal phenotype in vitro (Fig. 1c) and enhanced BM competitive repopulation in vivo upon transplantation into lethally irradiated mice, in line with other reports17,20. A competitive advantage was observed in both PB (Fig. 1d,e and Extended Data Fig. 2a,b) and BM progenitor compartments, including LT-HSC (Fig. 1f–g and Extended Data Fig. 2c).

Fig. 1: Dnmt3aR882H/+ HSPC shows self-renewal phenotype, enhanced BM repopulation and progression to MPN/AML.
figure 1

a, Structure of the Dnmt3aR882H conditional allele. b, RNA sequencing reads from Dnmt3a+/+ and Dnmt3aR882H/+ HSPCs were aligned to mouse exons and human exon 23 with DNMT3A-R882H mutation. Sanger sequencing was performed on cDNA of Dnmt3aR882H/+ amplified with primers detecting human exon 23. Similar data were observed for n = 3 mice per genotype. c, Serial replating of BM-derived colonies from Dnmt3a+/+ (n = 7 mice) and Dnmt3aR882H/+ (n = 8 mice). d, Schematic representation of BM competitive transplant strategy of CD45.2-WT and CD45.2-Dnmt3aR882H/+ transplanted together with CD45.1 competitor. e, Proportion of Dnmt3a+/+ (n = 5 mice) and Dnmt3aR882H/+ (n = 4 mice) cells in PB post-transplant. f,g, Fluorescence-activated cell sorting plots of LT-HSC (Lin−ve, Sca1+, c-Kit+, Cd48Cd150+) and proportion of Dnmt3a+/+ (n = 5 mice) and Dnmt3aR882H/+ (n = 4 mice) transplanted cells in LT-HSC (f) with quantification (g). h, Kaplan–Meier survival curves for Dnmt3aR882H/+ (n = 35) and control (n = 31) mice; P by log-rank (Mantel–Cox) test. i, Histopathological diagnoses of moribund mice. Bars depict cancer/normal diagnosis per genotype/total mice with available histology data (n = 44 mice). j, Characteristic histology from one mouse with MPN/AML. Myeloid cell/AML infiltration in the liver (Li) and blasts in the setting of myeloid hyperplasia and a hypercellular marrow are shown. Similar data were observed for 13 Dnmt3aR882H/+ mice. w, week. Scale bars, 300 µm for inset (Li), 50 µm for others. In ce, g and i, the mean ± s.d. is shown; P by two-sided t-test for comparisons between Dnmt3aR882H/+ and WT. Schematics in a,d were created using BioRender (https://biorender.com).

Source data

On ageing to a humane endpoint, Dnmt3aR882H/+ mice demonstrated decreased survival versus WT (median 487 versus 618 days; P < 0.0029; Fig. 1h), in line with other reports23. At necropsy, 40% of Dnmt3aR882H/+ mice had splenomegaly (Extended Data Fig. 2d); however, mean WBC was not elevated (Extended Data Fig. 2e). Histological examination of BM, spleen, liver, heart and lungs showed a higher incidence of myeloproliferative neoplasm (MPN)/AML in Dnmt3aR882H/+ mice (37%, 13 of 35 Fig. 1i,j), reflecting the higher risk of progression to AML of DNMT3A-R882 compared to other DNMT3A mutations1. Exome sequencing of six Dnmt3aR882H/+ MPN/AMLs showed several somatic mutations in each MPN/AML, including in genes known to co-occur with DNMT3A mutations in human AML, such as Tet1, Setd2 and Cux1 (Supplementary Table 1).

Genetic vulnerabilities of Dnmt3a R882H/+ HSPCs

To understand the genetic vulnerabilities of Dnmt3aR882H/+ HSPC, we exploited the fact that these cells display increased serial replating in cytokine-enriched media, a surrogate for enhanced self-renewal potential (Fig. 1c). To interrogate the basis of this, we crossed Dnmt3aflox-R882H/+, Mx1-Cre and Rosa26-Cas9 (hereafter Cas9)24 mice and induced mutation with pIpC. We isolated Dnmt3aR882H/+; Cas9; Mx1-Cre BM HSPCs and cultured these in cytokine-rich methylcellulose and then liquid media. HSPCs were transduced with our exome-wide CRISPR library24 and selected by flow-sorting for blue fluorescent protein (BFP). Cells were then cultured and collected at several timepoints (Fig. 2a). Sequencing for guide RNA (gRNA) content in genomic DNA of surviving cells on day 30 showed 640 HSPC-essential (dropout) genes at false discovery rate ≤ 20% (Fig. 2b and Supplementary Table 2). We compared these with publicly available CRISPR dropout genes from human cancer cell lines. Strikingly, DNMT3A-mutant cell lines OCI-AML2 and OCI-AML3 were the top two enriched cell lines of 325 tested (Extended Data Fig. 2f), demonstrating cross-species validation and genotype specificity.

Fig. 2: Whole-genome CRISPR screen identifies Dnmt3aR882H/+ drug target candidates.
figure 2

a, Schematic representation of CRISPR-screen strategy. b, Result of the whole-genome CRISPR screen in Dnmt3aR882H/+ HSPC. Selected highly significantly depleted genes are listed. False discovery rate (FDR) was calculated using the MAGeCK statistical package. c, KEGG pathway analysis of 640 depleted genes using Enrichr, reporting P values adjusted (adj.) for several comparisons. d, Venn diagram of Dnmt3aR882H/+ dropouts and HPC-7 dropouts25. e,f, Classification of 201 dropout genes present only in Dnmt3aR882H/+ cells into potential ‘druggable’ gene categories (e) and drug–gene interaction categories, as defined by DGIdb and a literature search (f). Three categories are depicted in e. All categories in f can be found in Supplementary Table 4, and drug–gene categories are shown in Supplementary Table 5. Schematic in a was created using BioRender (https://biorender.com).

Source data

Pathway analyses of depleted genes showed enrichment in pathways including ribosome, spliceosome, proteasome, cell cycle and OXPHOS (Fig. 2c). To remove pan-essential genes, we compared our Dnmt3aR882H/+ dropouts with a similar screen performed on non-transformed haematopoietic precursor cell line 7 (HPC-7)25, which retains many characteristics of primary murine HSPCs26, and found 201 genes specifically depleted in Dnmt3aR882H/+; Cas9; Mx1-Cre HSPCs (Fig. 2d and Supplementary Table 3). Among these 201 genes, 35 belonged to the ‘druggable genome’ category. These included kinases, chromatin regulators and several genes regulating cellular metabolism, including Slc25a1, Slc6a9, Ldlr and Ffar3. In addition, among 34 genes with enzymatic functions, 14 have known roles in metabolism, including OXPHOS and the tricarboxylic acid (TCA) cycle: including Ndufb11, Cs, Pdhb, Hk3 and Coq2 (Fig. 2e). Many genes in these categories have available inhibitors (Fig. 2f and Supplementary Tables 4 and 5) and represent attractive targets for pharmacological interventions. Because Dnmt3aR882H/+ vulnerabilities also correlated with human DNMT3A-mutant AML cell line dropouts, we asked whether expression of any Dnmt3aR882H/+-specific vulnerability genes correlated with AML survival. We identified 17 genes, including SLC25A1, LDLR, HK3, TGFB1, ICMT and PSMA4, where higher expression correlated with significantly poorer patient survival (Supplementary Fig. 1).

Metabolic vulnerabilities of Dnmt3a R882H/+ HSPCs

To validate our screen, we used gRNAs against selected ‘druggable’ candidates and observed that in most instances, gene knockout specifically reduced growth of Dnmt3aR882H/+ more than WT HSPCs (Fig. 3a,b). As genes implicated in cellular metabolism showed specificity towards Dnmt3aR882H/+ HSPCs, we focused on them as potential therapeutic targets in DNMT3A-R882 CH. We first noted that genes involved in sequential pathways of pyruvate conversion to Acetyl-CoA (Pdhb), citrate synthesis (Cs) and citrate/malate transport (Slc25a1) were Dnmt3aR882H/+ vulnerabilities important for the TCA cycle. Other mitochondrial targets included electron transfer chain (ETC) components Ndufb11 and Coq2. We focused on the two most significantly depleted Dnmt3aR882H/+-specific candidates, Slc25a1 and Ndufb11 (Supplementary Table 3). To address the consequences of their depletion in vivo, we transduced HSPCs from WT, Cas9 and Dnmt3aR882H/+, Cas9 mice with lentiviral vectors expressing gRNA for Slc25a1, Ndufb11 and Empty gRNA control. Transduced cells were transplanted into irradiated recipients alongside competitor cells. Four weeks after transplantation, we observed significantly higher reconstitution of Dnmt3aR882H/+ over WT HSCPs transduced with control gRNA. Knockout of Slc25a1 and Ndufb11 strongly supressed the transplantation ability of Dnmt3aR882H/+. Of note, we also observed a negative effect of Slc25a1 and Ndufb11 knockout in WT cells, albeit to a lesser extent (Extended Data Fig. 3a–c). With prevention studies in mind, we next tested available pharmacological inhibitors of these two targets: Slc25a1 is directly inhibited by CTPI227,28, whereas Ndufb11 can be inhibited by metformin, a complex I ETC inhibitor that also inhibits mitochondrial glycerophosphate dehydrogenase and IACS-010759, a more specific complex I inhibitor29. CTPI2 and metformin significantly reduced colony formation of both preleukaemic and leukaemic Dnmt3aR882H/+; Cas9 HSPCs (Extended Data Fig. 3d,e). Furthermore, we observed that metformin significantly suppressed the self-renewal potential (plating 3) of Dnmt3aR882H/+ cells (Extended Data Fig. 3f, g), whereas CTPI2 completely abolished enhanced self-renewal of Dnmt3aR882H/+ (platings 2, 3) even at lower concentrations (Extended Data Fig. 3h). We also observed that NDUFB11 and COQ2 showed very high dependency scores in human DNMT3A-mutant AML cell lines among the 37 myeloid cell lines with available data (Extended Data Fig. 4a), and SLC25A1 was also depleted in all three DNMT3A-mutant lines, albeit more modestly. Furthermore, another two genes involved in citrate synthesis, PDHB and CS, showed high DNMT3A-mutant dependency among myeloid cell lines (Extended Data Fig. 4b).

Fig. 3: Enhanced mitochondrial respiration in transplanted Dnmt3aR882H/+.
figure 3

a,b, Proliferation of Dnmt3aR882H/+ (a) and WT (b) HSPCs after editing of the indicated gene. The BFP-positive fraction was compared with the non-transduced population and normalized to day 4 and Empty vector for each gRNA. For a and b, the mean ± s.d. is shown; n = 3 biological replicates for Kdm1a gRNA and n = 4 biological replicates for the remining gRNAs in Dnmt3aR882H/+. n = 4 biological replicates for Mlst8, Jak1, Brd2 gRNAs and n = 6 biological replicates for the remining gRNAs in WT. Asterisk indicates significant depletion in Dnmt3aR882H/+ versus WT HSPCs calculated for each day. P by two-sided t-test. c, Schematic representation of the experimental setup for mitochondrial respiration analysis of WT and Dnmt3aR882H/+ HSPCs extracted eight weeks post-transplantation performed with the Seahorse analyser. d, Example of OCR in transplanted WT and Dnmt3aR882H/+ HSPCs, measured using a Seahorse extracellular flux analyser; mean ± s.e.m. n = 18 for Dnmt3aR882H/+, n = 15 for WT representing three independent biological replicates, for each Dnmt3aR882H/+ performed in six replicates, for WT performed in four, five and six replicates respectively. R + A indicates rotenone and antimycin A. e, Basal respiration, maximal respiration, ATP production and spare respiration capacity were calculated; mean ± s.d. P by two-sided t-test. n = 3 biological replicates per genotype. Similar results were obtained when Dnmt3aR882H/+ versus WT cells were transplanted separately into recipients (Extended Data Fig. 4g–i). *Higher depletion in Dnmt3aR882H/+ versus WT. d, day. Schematic in c was created using BioRender (https://biorender.com). Seahorse picture in c adapted with permission from K. Gozdecki.

Source data

Reliance of Dnmt3a R882H/+ HSPCs on OXPHOS

To determine the effect of Dnmt3a-R882H on cellular metabolism, we isolated primary HSPCs from Dnmt3aR882H/+ and WT mice and compared the use of glycolysis and OXPHOS for cellular bioenergetics using metabolic flux analysis. First, in homoeostatic conditions, we observed no differences in mitochondrial respiration between Dnmt3aR882H/+ and WT HSPCs (Extended Data Fig. 4c,d). This finding is in consonance with the observation that our Dnmt3a-R882H model demonstrates relatively little alteration of steady-state haematopoiesis. However, in response to stressful stimuli, such as stem cell transplantation, inflammation or in vitro culture, Dnmt3aR882H/+ cells display a distinct clonal advantage reminiscent of that seen in individuals with CH. To understand whether this stress-related clonal advantage correlates with altered metabolism, we co-transplanted Dnmt3aR882/+ and WT BM cells into congenic recipients and isolated Dnmt3aR882H/+ and WT HSPCs from the same recipient animals eight weeks after transplantation. This showed enhanced basal mitochondrial respiration and increased ATP production in Dnmt3aR882H/+ versus WT cells (Fig. 3c–e). Concomitantly, glycolysis was unchanged, and glycolytic capacity was reduced in Dnmt3aR882H/+ (Extended Data Fig. 4e,f), indicating a switch in metabolism and greater dependence of mutant cells on mitochondrial respiration. Similarly, WT and Dnmt3aR882H/+ HSPCs transplanted independently into recipient mice also showed enhanced OXPHOS (Extended Data Fig. 4g–i). We then asked whether CTPI2 and metformin could reverse this aberrant respiration, observing that both drugs significantly reduced mitochondrial respiration in Dnmt3aR882H/+ HSPCs (Extended Data Fig. 5a–d).

Slc25a1 block curtails Dnmt3a R882H/+ HSC expansion

To investigate the effect of Slc25a1 inhibition in vivo, we treated Dnmt3aR882H/+ and WT mice with CTPI2 three times per week for two weeks (Extended Data Fig. 6a). CTPI2 treatment had no effect on the frequencies of LT-HSC, total BM cell or progenitor compartment frequencies, including Lin−ve, LSK and multipotent progenitors (MPP) cells (Extended Data Fig. 6b–g). Interestingly, CTPI2 did decrease the frequencies of myeloid cells in Dnmt3aR882H/+ BM but not in WT mice (Extended Data Fig. 6h). Because Dnmt3aR882H/+ HSPCs show enhanced contribution to haematopoiesis after transplantation, we aimed to determine the effect of previous CTPI2 treatment on LT-HSC potential to repopulate irradiated recipient mice. We collected BM from CTPI2/vehicle-treated Dnmt3aR882H/+ and WT mice, mixed them with CD45.1 BM competitor cells in a 1:2 proportion and transplanted them into irradiated recipients (Fig. 4a). Transplant of vehicle-treated Dnmt3aR882H/+ cells showed an enhanced PB contribution in comparison to vehicle-treated WT donors, as expected (Fig. 4b). CTPI2 treatment significantly inhibited the competitive advantage of Dnmt3aR882H/+ cells in PB. Importantly, CTPI2 treatment had no effect on the reconstitution of WT cells (Fig. 4b). Looking at individual haematopoietic lineages, vehicle-treated Dnmt3aR882H/+ cells showed a slightly higher contribution to the B cell lineage, which was reversed by CTPI2 at weeks 8 and 16 (Extended Data Fig. 6i). Conversely, CTPI2 did not affect proportions of T cells and myeloid cells in recipients of either WT or Dnmt3aR882H/+ cells (Extended Data Fig. 6j,k). Blood morphology and indices, including WBC, HGB and red blood cells, were not affected by CTPI2 (Extended Data Fig. 6l–n). PB platelet counts were slightly increased in vehicle-treated Dnmt3aR882H/+ versus WT at weeks 12 and 16. CTPI2 reversed the increased platelet counts in Dnmt3aR882H/+ recipients to the level of WT (Extended Data Fig. 6o). Strikingly, analysis of the BM haematopoietic compartment of recipients transplanted with cells from vehicle-treated mice showed significantly higher frequencies of Dnmt3aR882H/+ LT-HSC versus WT. By comparison, CTPI2 treatment before transplantation significantly reduced the frequencies of Dnmt3aR882H/+ LT-HSC to WT levels (Fig. 4c,d). Importantly, CTPI2 treatment had no effect on frequencies of WT LT-HSC (Fig. 4c,d), total (CD45.1 and CD45.2 combined) LT-HSC, MPPs, LSK or BM cellularity (Extended Data Fig. 6p–t). These data indicate that Slc25a1 is a specific therapeutic target in Dnmt3aR882H/+ CH, and its pharmacological targeting prevents expansion of mutant LT-HSC without affecting WT LT-HSC.

Fig. 4: CTPI2 and complex I inhibitors revert clonal advantage of Dnmt3aR882H/+ LT-HSC.
figure 4

a, Scheme of experimental approach. b, The proportion of CD45.2 cells in PB, normalized to the proportion of injected CD45.2 cells. The mean ± s.e.m. is shown; for Dnmt3aR882H/+ CTPI2, n = 6, for remining groups, n = 5 mice; P by two-sided t-test between Dnmt3aR882H/+ vehicle and Dnmt3aR882H/+ CTPI2 is shown. c, The proportion of transplanted cells in LT-HSC by flow cytometry at week 16. One sample per group is shown; similar results were observed for n  =  5. d, Frequencies of CD45.2-LT-HSC in BM at week 16, normalized to the proportion of injected LT-HSC. The mean ± s.e.m. is shown; for Dnmt3aR882H/+ CTPI2, n = 6 mice, for remining groups, n = 5 mice. e, Schematic summary of the experimental approach. HSPCs isolated 12 weeks post-transplant were plated in semisolid media with 100 nM IACS-010759/vehicle for seven days. f, Quantified colonies; the mean ± s.d. is shown; n = 3 mice per group. g, Schema of experimental approach for IACS-010759/vehicle treatment. h, Frequencies of CD45.2 LT-HSC in mouse BM at endpoint. The mean ± s.e.m. is shown; n = 5 mice for both vehicle groups, and n = 6 for both IACS-010759 groups. i, Schematic representation of metformin treatment model. Dnmt3aR882H/+ were mixed with WT BM cells in 1:2 proportion and transplanted into lethally irradiated recipient mice. Metformin (125 mg kg−1) or vehicle treatment started from week 5 for six weeks. j, Frequencies of transplanted Dnmt3aR882H/+ and WT LT-HSC in BM; the mean ± s.d. is shown; n = 3 mice for vehicle group, and n = 6 mice per metformin group. P in d, f, h, j by one-way ANOVA with Tukey correction. Schematics in a,e,g,i were created using BioRender (https://biorender.com).

Source data

Complex I block curtails Dnmt3a R882H/+ HSC growth

To target Ndufb11 in vivo, we focused on IACS-010759, a clinical-grade, selective small-molecule inhibitor of complex I (ref. 29). We first tested IACS-010759 on colony formation in semisolid media. We isolated HSPCs from transplanted WT mice and Dnmt3aR882H/+ mice. The latter, unlike primary Dnmt3aR882H/+ HSPCs, displayed increased colony formation from first plating. IACS-010759 treatment significantly reduced colony formation in Dnmt3aR882H/+, sparing WT HSPCs (Fig. 4e,f). To investigate IACS-010759’s effect in vivo in the transplant setting, we treated primary Dnmt3aR882H/+ and WT mice with IACS-010759/vehicle for two weeks. BM cells from treated mice were then mixed with competitor cells and transplanted into recipients (Fig. 4g). IACS-010759 had no effect on proportions of total CD45.2-Dnmt3aR882H/+ or WT cells, nor on B, T and myeloid cell reconstitution in the PB (Extended Data Fig. 7a–d). Blood morphology and parameters, including HGB, HCT and PLT, were unchanged in recipients of IACS-010759-treated cells, whereas WBC counts were decreased at weeks 4 and 12, but normalized by week 16 (Extended Data Fig. 7e–h). Analysis of vehicle-treated chimeric BM at 16 weeks showed significantly increased frequencies of LT-HSC from Dnmt3aR882H/+ versus WT cells. Importantly, IACS-010759 treatment significantly decreased the frequency of Dnmt3aR882H/+ but not WT LT-HSC in BM (Fig. 4h). Total (CD45.1 + CD45.2) frequencies of LT-HSC and total BM live cells were unaffected (Extended Data Fig. 7i,j). Because two recent IACS-010759 phase I trials reported unexpected side effects that would preclude use of IACS-010759 to treat CH, at least at the doses used in these trials30, we tested metformin, a safe, well-tolerated and commonly used complex I inhibitor, which also targets further genes/pathways identified as Dnmt3a-R882H-specific vulnerabilities in our screen, including mTOR31, STAT532, TGF-β133, GLS34 and Cyclin D complex34. Given the established safety of metformin, we opted to prolong treatment duration. BM cells from CD45.2-Dnmt3aR882H/+ mice and CD45.1-WT mice were mixed in a 1:2 ratio and transplanted into irradiated mice. Five weeks post-transplantation, mice were treated with metformin for six weeks (Fig. 4i). PB analysis before and after treatment showed no altered proportion of Dnmt3aR882H/+ or WT cells (Extended Data Fig. 7k). However, BM analysis six weeks post-treatment showed a significant increase in the frequencies of Dnmt3aR882H/+ LT-HSC in comparison to WT HSPCs in the vehicle-treated controls, which was significantly reversed by metformin treatment (Fig. 4j). Total frequencies of LT-HSC were also significantly decreased by metformin, reflecting its effect on Dnmt3aR882H/+ (Extended Data Fig. 7l). Metformin also decreased LSK frequencies in Dnmt3aR882H/+ but not WT cells (Extended Data Fig. 7m,n); however, the total frequencies of BM Dnmt3aR882H/+ and WT cells were not affected (Extended Data Fig. 7o). Secondary transplantation of BM from metformin-treated chimeric mice showed a significant decrease in the proportion of Dnmt3aR882H/+ cells in PB compared to vehicle (Extended Data Fig. 7p,q).

Slc25a1 and complex I loss increase DNA methylation

SLC25A1 loss in humans is associated with d, l-2hydroxyglutaric aciduria caused by the accumulation of d and l enantiomers of 2-hydroxyglutarate (2-HG)35,36,37. Using targeted metabolomics, we confirmed increased levels of 2-HG in both Dnmt3aR882H/+ and WT HSPCs on Slc25a1-knockout (Extended Data Fig. 8a,b). Enantiomer analysis showed enrichment in both d- and l-2HG, as reported for humans with SLC25A1 loss (Extended Data Fig. 8c,d). Raised 2-HG levels can affect the activity of TET dioxygenases38 and affect DNA hydroxymethylation (5hmC). We confirmed a global reduction in DNA methylation (5-methylcytosine, 5mC) (Extended Data Fig. 8e,f) and also observed significantly reduced total 5-hydroxymethylcytosine (5hmC) levels in Dnmt3aR882H/+ versus WT (Extended Data Fig. 8e,g), reflecting the globally reduced levels of its precursor, 5mC. Interestingly, on Slc25a1 knockout, we observed a significant increase in DNA methylation in WT (Extended Data Fig. 8h) and, to a greater extent, Dnmt3aR882H/+ cells (Extended Data Fig. 8i). The level of 5hmC on Slc25a1 knockout compared to Empty control was unchanged (Extended Data Fig. 8j,k), indicating that the observed increase in DNA methylation is not dependent on TET inhibition by 2-HG. We also tested whether metformin affects DNA methylation in the Dnmt3aR882H/+ context. For this, we transplanted Dnmt3aR882H/+ BM cells into lethally irradiated recipients and treated them for four weeks with metformin. Interestingly, we observed increased 5mC levels in metformin-treated mice versus vehicle controls (Extended Data Fig. 9a–c). 2-HG levels were not affected by metformin (Extended Data Fig. 9d), indicating that the increase in DNA methylation was not due to TET inhibition. Therefore, in addition to their significant effect on OXPHOS, both Slc25a1-knockout and metformin increased 5mC in Dnmt3aR882H/+ HSPCs.

DNMT3A-R882-mutant CH in metformin users

Metformin is the most commonly prescribed drug for Type 2 diabetes (T2D), with millions taking this drug worldwide39. Encouraged by its ability to curtail Dnmt3aR882H/+ LT-HSC expansion, we interrogated the interaction between metformin and CH prevalence among 412,234 UKB participants. Using logistic regression with age, sex, smoking status and the first four genetic principal components as covariates, we found that the risk of DNMT3A-CH was substantially reduced among the 11,190 individuals taking metformin, compared to those not taking metformin, at the time of recruitment/blood sampling (OR = 0.86 (95% CI, 0.77–0.96), P = 0.0095) (Fig. 5a,b and Supplementary Table 6). Notably, this association was driven by DNMT3A-R882-mutant CH (OR = 0.49 (95% Cl, 0.32–0.74), P = 0.00081), with DNMT3A non-R882-CH showing a non-significant reduction trend (OR = 0.91 (95% CI, 0.81–1.03), P = 0.13). We also stratified non-R882 variants of known functionality40 into high (n = 950) and low (n = 1,550) functionality categories. Again, we found no association between metformin exposure and prevalence of either high- or low-functionality variants (P = 0.38 and P = 0.9, respectively). In addition, there was no association between metformin use and prevalence of other common forms of CH (Fig. 5b). As many individuals on metformin were also taking other antidiabetic medications, most commonly sulphonylureas and insulin, we next restricted the analysis to those taking only metformin and found that they still had a markedly decreased risk of DNMT3A-R882-mutant CH (OR = 0.35 (95% Cl, 0.18–0.71), P = 0.0034; Fig. 5c and Supplementary Table 6). The effect of metformin was greater for smaller clones (variant allele frequency (VAF) ≤ 0.1), but the same trend was observed for large clones (VAF > 0.1) (Extended Data Fig. 10a,b and Supplementary Table 6). We next wanted to ascertain the prevalence of DNMT3A-R882-mutant CH and other CH subtypes among diagnosed diabetics who had not been on metformin at any time before recruitment, but we could not confidently identify these in the UKB, particularly as 90% of individuals diagnosed with T2D in the UK at that time were treated initially with metformin39. Instead, we searched for diabetics who were undiagnosed and untreated at the time of recruitment, using two different approaches: (1) individuals with high levels of glycated HGB (HbA1c) at recruitment41 and (2) those who started metformin at a later time point (‘premetformin’). We observed that neither group (Fig. 5d,e and Supplementary Table 7) had altered rates of DNMT3A-R882-mutant CH (OR = 0.90 (95% Cl, 0.34–2.40), P = 0.83 and OR = 0.82 (95% Cl, 0.45–1.49), P = 0.52, respectively), indicating that T2D per se does not significantly alter DNMT3A-R882-mutant CH risk. Also, individuals taking only sulphonylureas or only insulin did not display an altered prevalence of DNMT3A-R882-mutant CH (OR = 0.83 (95% Cl, 0.21–3.32), P = 0.79 and OR = 1.05 (95% Cl, 0.47–2.34), P = 0.91, respectively; Extended Data Fig. 10c,d and Supplementary Table 6). Furthermore, using Mendelian randomization (MR) instruments, we examined the effect of HbA1c42,43, T2D44, body mass index (BMI) and waist-to-hip ratio adjusted for BMI45 on prevalence of DNMT3A-R882 CH. Reassuringly, we found no significant associations (Extended Data Fig. 10e–h and Supplementary Table 8), providing further and independent genetic support for the premise that the association between metformin use and reduced DNMT3A-R882 CH is causal. Taken together, these findings show that metformin use is associated with significantly reduced risk of DNMT3A-R882-mutant CH and propose this well-tolerated drug as a potential intervention to prevent or retard its clonal expansion and, in so doing, reduce the likelihood of progression to AML.

Fig. 5: Metformin curtails DNMT3A-R882 CH in humans.
figure 5

a, Schematic representation of the UKB analysis. b, Association between metformin and CH risk. In total, 11,190 individuals taking metformin or metformin in combination with other antidiabetic medications and 401,044 individuals not on metformin were analysed. c, Association between metformin and CH risk. Only individuals taking metformin were included in the analysis. In total, 5,644 individuals on metformin only and 398,712 individuals not on any form of antidiabetic medications were analysed. d, Association between undiagnosed/untreated diabetes and overall CH or gene-specific CH risk. HbA1c > 7% (equivalent of 53 mmol mol−1) was used to identify individuals with undiagnosed/untreated diabetes. Diabetic individuals and individuals taking metformin or other antidiabetic medications at recruitment were excluded. In total, 374,873 individuals with HbA1c data available were included (1,195 individuals with HbA1c > 7% and 373,642 with HbA1c ≤ 7%). e, Association between post-recruitment metformin intake and CH risk. Diabetic individuals and individuals taking metformin or other antidiabetic medications at recruitment were excluded. In total, 3,568 individuals who started on metformin at some time after recruitment (post-recruitment metformin) and 389,153 controls were included for analysis. In be, measures of centre represent the ORs, and the error bars represent the lower and upper bound of the 95% CI of the ORs. ORs and two-sided unadjusted P values were derived from logistic regression model with all CH or gene-specific CH as outcome, and with age, sex, smoking and the first four genetic principal components as covariates. Significant P values (<0.05) are indicated with full blue circles. f, Experimental strategy for testing the effect of metformin on DNMT3A-R882 clonal growth (versus WT). g,h, Sanger sequencing (g) and next-generation sequencing (h) of DNA collected from bulk colonies. i, Comparison of the proportion of WT versus DNMT3A-R882 colonies in metformin- versus vehicle-treated cells (P by two-sided Chi-square test). Schematics in a,d,e,f were created using BioRender (https://biorender.com).

Source data

Metformin restrains human DNMT3A-R882 CH

To directly validate our findings experimentally in human primary cells, we plated human blood mononuclear cells from a DNMT3A-R882 CH carrier (VAF 12%) for colony formation in semisolid media in the presence of metformin or vehicle control for 13 days (Fig. 5f). Both Sanger (Fig. 5g) and next-generation (Fig. 5h) sequencing of pooled plate DNA showed significantly decreased DNMT3A-R882 VAF. Furthermore, genotyping of 96 individual colonies per condition confirmed a significant decrease in DNMT3A-R882 versus WT colonies with metformin treatment (Fig. 5i). We also tested the effect of CTPI2 treatment on human primary DNMT3A-R882 CH in the same manner and found that it reduced the proportion of DNMT3A-R882 (versus WT) myeloid (colony-forming unit–granulocyte/macrophage, CFU-G/M), but not erythroid progenitor colonies (burst/colony-forming unit-erythroid, B/CFU-E; Supplementary Fig. 2).

Discussion

Most cases of myeloid neoplasm arise through evolution of CH clones over many years or decades46,47, raising the prospect that timely interventions may prevent malignant progression48, mirroring a paradigm that is well-established in solid cancers49,50. However, unlike solid cancers, the option of physically removing premalignant lesions is not available for the precursors of myeloid neoplasms. Instead, putative therapeutic approaches will need to rely on systemic interventions that curtail or abort malignant progression.

In this manuscript, we focused on DNMT3A-R882 mutations, the single most common AML-initiating CH mutation1. To understand and target the mechanisms driving clonal expansion, we developed a conditional Dnmt3aR882H mouse model that recapitulates the enhanced HSPC self-renewal and delayed/low-penetrance progression to myeloid neoplasm seen with these mutations in humans. Using a genome-wide CRISPR dropout screen, we identified genetic vulnerabilities, many involved in mitochondrial metabolism, including several genes (for example, SLC25A1, LDLR and HK3) whose expression correlates with worse AML patient survival. Using metabolic flux analysis, we confirmed the increased reliance on OXPHOS of Dnmt3aR882H. Normal HSC require respiration to maintain quiescence51 but favour anaerobic glycolysis for energy production52,53,54, whereas leukaemia stem cells displayed a greater reliance on OXPHOS and mitochondrial function55,56,57. Our studies propose that Dnmt3aR882H/+ HSPC partially mirror a leukaemia stem cell pattern in their increased reliance on OXPHOS. The mechanisms through which DNMT3A-R882 mutations may drive a shift towards OXPHOS are inadequately understood and warrant future investigation.

With our focus being the development of therapeutic approaches to avert AML development, we used extant therapeutics to target selected vulnerabilities—namely, the citrate transporter SLC25A1 (with CTPI2) and ETC complex I component NDUFB11 (with metformin or IACS-010759)—and found that these treatments reversed the selective growth advantage of Dnmt3aR882H over WT LT-HSC in transplanted recipients. Significantly, analysis of UKB data demonstrated a markedly lower prevalence of DNMT3A-R882-CH among individuals taking metformin, but not untreated diabetics. Interestingly, we observed an increased risk of TET2-CH in diabetics who later started metformin, corroborating the recently reported increased risk of TET2-CH, but not DNMT3A-CH, in people with high BMI58. Also, using MR, we show that genetic instruments for potential confounders such as HbA1c, diabetes, BMI and waist-to-hip ratio do not affect DNMT3A-R882-CH prevalence, providing further independent support for a causal relationship between metformin and reduced DNMT3A-R882-CH. These findings are compelling and propose metformin as a potential non-toxic intervention to retard or stall DNMT3A-R882-CH expansion and in turn avert/delay progression to AML.

The recognition of aberrant cellular energetics as a hallmark of cancer59 has led to therapeutic targeting of metabolic pathways. In AML, targeting mitochondrial functions including mitochondrial protein synthesis60, mitochondrial DNA replication61 and NADH dehydrogenase29,62 has been proposed as a therapeutic strategy. In fact, both IACS-010759 and metformin have been tested in AML. IACS-010759 has shown impressive efficacy in preclinical models29; however, its use as a single agent at relatively high doses was hampered by dose-limiting neurotoxicity in early clinical studies30. Metformin can promote apoptosis, induce cell cycle arrest and inhibit cell proliferation in AML63,64, and its use has been associated with a lower risk of all-cause mortality in cancer patients with coexisting diabetes65. Mechanistically, metformin targets complex I66,67 as well as activating AMPK, leading to downstream effects on cell cycle arrest, autophagy and mTOR inhibition31,68. Interestingly, increased mTOR signalling has been observed in Dnmt3aR882H/+23 mice; and of note, the mTOR component Mlst8 was a vulnerability in our screen. Growth of Dnmt3aR882H/+ cells has also been associated with increased inflammation, in particular TNF signalling14. Metformin reduces the levels of cytokines including TNF, IL6 and IL-1β69,70, which may also hamper the competitive advantage for Dnmt3aR882H/+ HSPCs. We also observed increased 5mC in metformin-treated Dnmt3aR882H/+, in line with previous reports of an increase in global DNA methylation in WT cells linked to changes in S-adenosylhomocysteine and S-adenosylmethionine71. Further studies are required to determine whether this mechanism is also relevant to its effect on DNMT3A-R882 CH.

Intriguingly, we also found that Slc25a1-ko was associated with raised 5mC as well as moderately increased l- and d-2HG levels. However, we did not find any changes in 5hmC in Slc25a1-ko indicating that TET inhibition was not the main mediator of increased 5mC. It should also be noted that TET2 loss-of-function mutations can co-occur with DNMT3A mutations in both CH1,3,4,72 and myeloid malignancy73,74, making it unlikely that a simple reciprocal relationship exists between them. This is also supported by the co-occurrence of DNMT3A and IDH1/2 mutations in an AML subset75. With regards to other possible mediators of the effect of Slc25a1 loss, it is noteworthy that a recent study showed increased AMPK in Slc25a1−/− cells36, indicating that AMPK activity may also play a role.

Collectively, our findings propose that targeting OXPHOS by existing therapeutics may represent a therapeutic approach for delaying/averting progression of DNMT3A-R882-CH to AML. However, before clinical studies are implemented, two critical considerations need to be made: first, can one assume that curtailing or reversing clonal expansion of DNMT3A-R882-mutant CH will reduce progression to AML? This premise has not been tested, but several lines of evidence support it, including observations that both growth rate46 and clonal size1,2 are key determinants of AML progression risk. In fact, among UKB participants with DNMT3A-R882-mutant CH, those who developed AML had significantly larger CH clones than those who did not1. Also, progression from DNMT3A-R882-CH to AML happens through acquisition of further mutations, particularly in the NPM1 gene73. Such mutations are more likely to be acquired in larger clones76. Therefore, individuals with larger DNMT3A clones, in whom AML risk is significantly higher1,2, could be prioritized for interventions to avert progression. On the other hand, given the strong effect of metformin on smaller clones, consideration could also be given to offering it earlier. Although these observations do not prove the premise, they offer sufficient support to justify a clinical study. Second, in contrast to treating a rapidly fatal disease like AML, preventive treatment administered to someone with a relatively benign state like CH dictates that it should be non-toxic and lack potential long-term side effects. Metformin has been an established treatment for T2D for decades and is currently taken by millions of people worldwide with an acceptable and predictable side-effect profile overall. This, along with our mechanistic, genetic and preclinical data, calls for the initiation of clinical trials of metformin use specifically in individuals with DNMT3A-R882-mutant CH1,47.

Methods

Mouse models

The in vivo experiments were performed under project licences PPL 80/2564 and PP3797858 issued by the United Kingdom Home Office, in accordance with the Animal Scientific Procedures Act 1986. Murine ethical compliance was approved by the University of Cambridge Animal Welfare and Ethical Review Body. Dnmt3afloxR882H/+ was constructed by flanking native mouse exon 23 of Dnmt3a with loxP sites and introducing human exon 23 containing the DNMT3A-R882 mutation (in C57/BL6 embryonic stem cells). A PGK-Puro cassette flanked with Rox sites was inserted after the human exon 23. The Dnmt3afloxR882H/+ mouse model was crossed with Mx1-Cre mice77. Cre expression was induced by intraperitoneal injection of five- to six-week-old mice with pIpC (20 mg kg1, Sigma, P1530): five doses over a period of 10 days. Cas9-expressing mice were reported previously24. CD45.1 mice: B6.SJL-Ptprca Pepcb/BoyJ mice (Jackson Laboratory, 002014) were used a competitor in transplantation studies. Wild type mice: C57BL/6 J (CD45.2, Jackson Laboratory #000664) were then crossed with CD45.1 to generate CD45.1/CD45.2 recipient mice that were used as transplant recipients between 8 and 18 weeks of age. Mice were housed in specific pathogen-free conditions. All cages were on a 12:12-h light to dark cycle (lights on, 07:30) in a temperature-controlled and humidity-controlled room. Room temperature was maintained at 72 ± 2 °F (22.2 ± 1.1 °C), and room humidity was maintained at 30–70%.

Sample sizes were chosen on the basis of power calculations of expected differences and previous experience with these types of experiments. Sample sizes of n = 3 mice per genotype/cell culture were chosen for most experiments with the exception of flow cytometric analysis and mouse experiments where sample size was increased to account for variation between individuals or for the need to carry out experiments at more than one time point. Mice were allocated to the study groups by genotype. Both sexes were used. For transplantation experiments, animals of the same sex and similar age range were randomly assigned to study groups. Dnmt3aR882H/+ and control mice were 12–16 weeks in most experiments, except aging experiments and characterization of blood compartments at later stages (one-year-old mice), as indicated in the main text. Recipient mice for transplantation experiments were aged 8–16 weeks.

Although the investigators were not blinded to the genotype of the animals, the animal technicians who provided the animal care, supervision and identification of sick animals at humane endpoint (and therefore making the decision to kill sick animals) were blinded. For transplant experiments, flow cytometry data collection and analyses were performed blind (by assigning a number to each animal ID). For in vitro experiments, analyses were performed in batches of animals from both test and control groups using numbers instead of genotypes at the time of data acquisition. For statistical analyses, samples were grouped into test versus control.

Genotyping

Mouse ear snips were lysed with DirectPCR Lysis Reagent (Viagen, 401-E) according to the manufacturer’s instructions. Lysed DNA was used for genotyping each allele with the primers listed in Supplementary Table 9. Polymerase chain reaction (PCR) was performed with REDTaq ReadyMix PCR Reaction Mix (Sigma, R2523) with the following conditions: initial denaturation at 95 °C for 1 min, followed by 35 cycles of 95 °C for 15 s, annealing at 57 °C for 15 s and elongation at 72 °C for 15 s. Final elongation was performed at 72 °C for 10 min. PCR products were visualized on 2.5% agarose gel.

Histological analysis of mouse tissue

The tissues were fixed in 10% formaldehyde and were subsequently paraffin embedded. Bones were decalcified with 0.38 M EDTA pH 7. Tissue sections (4 µm) were stained with hematoxylin and eosin (Thermo Fisher Scientific). Histology assessment was performed with the Bethesda criteria for mouse haematological tumours78,79. The histology slide scanner used for visualization was a Leica Aperio Slide Scanner AT20, running AperioServiceManager.

Blood-count analysis

Blood-count measurement was performed on a VetabC analyser (Horiba ABX).

Mouse exome sequencing

DNA for exome sequencing was extracted with a QiaAmp Mini kit (Qiagen, 56304) and submitted to Novogen for the whole-exome sequencing in-house protocol. Briefly, the genomic DNA was randomly sheared into short fragments with a size of 180–280 bp by sonication. The obtained fragments were end repaired, A-tailed and further ligated with Illumina adaptors. The fragments with adaptors were PCR amplified, size selected and purified. The captured libraries were enriched by PCR amplification. The library was created using the SureSelect Mouse All Exon kit (Agilent, G7550B, G7500B) and checked with Qubit and real-time PCR for quantification and a bioanalyzer for size distribution detection. Quantified libraries were pooled and sequenced on the Illumina XPlus platform with a PE150 strategy producing an expected data output of 12 G raw data per sample, according to the effective library concentration and data amount required.

Whole-exome sequencing data analysis

Whole-exome sequencing reads of AML and control samples were mapped to the mouse genome assembly GRCm39 using BWA v.0.7.18 (ref. 80 (q-bio.GN)) under default parameters, and duplicated reads were flagged using Samtools v.1.9 (ref. 81). Somatic mutations were called by GATK Mutect2 v.4.5 (ref. 82) using AML samples paired with the control sample. Mutation calls were annotated with Ensembl VEP v.112 (ref. 83), and only mutations in exon regions were retained.

Targeted amplicon sequencing

DNA from bulk colonies was extracted with a DNeasy blood and tissue kit (Qiagen, 69504). DNA was amplified in a 25 μl reaction using HiFi HotStart ReadyMix (Kapa 07958927001) and primers targeting the DNMT3A-R882 region—DNMT3A_Ex23_t1_fp: 5′-ACACTCTTTCCCTACACGACGCTCTTCCGATCTCTCTCTGCCTTTTCTCCmCC-3′ and DNMT3A_Ex23_t1_rp: 5′-TCGGCATTCCTGCTGAACCGCTCTTCCGATCTTGTTTAACTTTGTGTCGCTAmCC-3′ at a final concentration of about 4 nM (adjusted for individual primer pairs to attain similar coverage between positions)—and placed in a thermocycler under the following conditions: 95 °C for 3 min, 6 cycles of (98 °C for 20 s, 65 °C for 60 s, 60 °C for 60 s, 55 °C for 60 s, 50 °C for 60 s, 70 °C for 60 s). Following this first round of PCR, samples were kept on ice (to reduce non-specific amplification), and 1 μl of 10 μM i5/i7 index primers were added to each reaction, mixed and placed in a thermocycle under the following conditions: 19 cycles of (98 °C for 20 s, 62 °C for 15 s, 72 °C for 30 s), 72 °C for 60 s. Equal volumes of up to 24 samples (amplified with unique i5/i7 index primer combinations) were pooled, and a 0.55–0.75× double-sided solid-phase reversible immobilization bead cleanup was performed. In most cases, a second solid-phase reversible immobilization bead cleanup (0.75× left-sided) was necessary to reduce contamination of the library (300–400 bp) with adaptors (180–200 bp), as the latter can interfere with sequencing. Libraries were quantified using a Bioanalyzer 2100 (Agilent) and sequenced at 150 bp PE on MiSeq Nano.

Targeted sequencing data analysis

Targeted sequencing reads were mapped to the human genome assembly GRCh38 using BWA v.0.7.18 (ref. 80 (q-bio.GN)) under default parameters. Samtools mpileup81, under the parameter -ABQ0, was used to detect mutant reads at the DNMT3A-R882 locus.

Isolation of mouse BM and hematopoietic progenitors

Femur, tibia and hip bones were collected, crushed with a pestle and mortar and lysed with erythrocyte lysis buffer (0.85% NH4Cl; Sigma, A9434). Cells were incubated in the erythrocyte lysis buffer for 5 min at room temperature, spun down, resuspended in PBS containing 2% FBS and filtered through a 40-µm cell strainer (Falcon, 352340). The HSPC compartment was isolated using magnetic-bead selection using a Direct Lineage Cell Depletion Kit (Miltenyi Biotec, 130-110-470) according to the manufacturer’s instructions.

Serial replating assays

For replating assays of mouse cells, 50,000 BM cells were plated in two wells of six-well plates of methylcellulose (Stem Cell Technologies, M3434). The colonies were counted seven days later, and a further 30,000–50,000 cells were reseeded and recounted after one week until no colonies were observed. In drug treatment experiments, the semisolid media were supplemented with the selected drug or vehicle and replenished with each plating. Human primary DNMT3A-R882 MNCs PB were plated at ~300,000 per well of a six-well plate of methylcellulose (Stem Cell Technologies, H4034) together with indicated concentrations of metformin/CTPI2 or vehicle control and counted at day 13.

Human samples

Ethical approval for the human DNMT3A-R882 CH sample used in this study was granted by the East of England (Cambridge East) Research Ethics Committee (REC reference: 24/EE/0116). Informed consent was provided by the participant.

Colony genotyping

Colonies were collected into DirectPCR Lysis Reagent (Viagen, 401-E) and processed according to the manufacturer’s instructions. Lysed DNA was used for genotyping of DNMT3A with the following primers: F: 5′-CTGAGTGCCGGGTTGTTTAT-3′, R: 5′-GGAAGGGAGCTTGGTTTTGT-3′. PCR was performed with REDTaq ReadyMix PCR Reaction Mix (Sigma, R2523) with the following conditions: initial denaturation at 95 °C for 1 min, followed by 35 cycles of 95 °C for 15 s, annealing at 57 °C for 15 s and elongation at 72 °C for 15 s. Final elongation was performed at 72 °C for 10 min. The presence of DNMT3A-R882C generates further restriction sites for the AluI enzyme, and thus PCR products were subsequently digested with the restriction enzyme AluI (New England Biolabs, R0137S) for 1 h at 37 °C and visualized on 2.5% agarose gel.

Lentiviral-vector production and transduction

Lentiviruses were produced in 293-FT (Invitrogen, R70007) cells with pMD2.G (Addgene, 12259) psPAX2 (Addgene, 12260) and the mouse v.2 whole-genome gRNA library24 in a 1.5:0.5:1 ratio. Plasmid transfection was performed with Lipofectamine LTX (Invitrogen, 15338100) according to the manufacturer’s instruction. Viral supernatants were concentrated by centrifugation at 6,000 rcf, 16 h, at 4 °C. The cells were transduced by spinoculation (60 min, 800g, 32 °C) in culture medium supplemented with polybrene (4 µg ml−1; Sigma, TR-1003-G) and further incubated overnight at 37 °C. The media was fully changed the following day.

gRNA cloning and competitive proliferation assay

gRNAs were cloned into a BbsI-digested pKLV2-U6gRNA(BbsI)PGKpuro2ABFP backbone84. Sequences of gRNAs used in the study are provided in Supplementary Table 10. For the competitive proliferation assay, the gRNA lentiviral vector was transduced into primary Dnmt3aR882H/+, Cas9 or WT, Cas9 HSPCs with 50% of transduction efficiency, which was verified on day 4 post-transduction. Cells were then cultured and passaged three times per week. At each passage, the BFP versus non-BFP cell proportion was compared by flow cytometry.

Bioenergetics analysis with Seahorse extracellular flux analyser

Cellular oxygen consumption was assessed using the Seahorse XF96 analyser according to the manufacturer’s instructions. Briefly, 100,000–125,000 HSPCs were plated into a well of CellTak-coated XF96 cell culture microplates in XF DMEM medium (Agilent, 103575-100). For the Mito-stress test assay, the media was supplemented with 10 mM glucose (Sigma, G8644), 1 mM pyruvate (Agilent, 103578-100) and 2 mM glutamine (Agilent, 103579-100,). For the glycol-stress test assay, the media was supplement with 1 mM pyruvate and 2 mM glutamine. Cells were cultured in a non-CO2 incubator at 37 °C for 1 h. The oxygen consumption rate (OCR) and extracellular acidification rate were measured to assess mitochondrial and glycolytic activity. For the Mito-stress cell assay, OCR was measured in basal conditions, and the cells were then treated sequentially with 1 µM oligomycin (Sigma, 495455) and 2 µM carbonyl cyanide p-(trifluoromethoxy) phenylhydrazone (Sigma, C2920) followed by the addition of a final solution containing 0.5 μM rotenone (Sigma, R8875) and 0.5 μM antimycin A (Sigma, A8674). Glucose-stimulated respiration was measured by the sequential addition of 10 mM glucose, 2 µM oligomycin and 50 mM 2-Deoxy-d-glucose (Sigma, D8375). Drugs provided in manufactures’ kits were also used in the study, including the Seahorse XF Cell Mito Stress Test Kit (Agilent, 103015-100) and Seahorse XF Glycolysis Stress Test Kit (Agilent, 103017-100). Results were analysed with Wave software (Agilent Technologies) and normalized to cell concentration. For metformin/CTPI2 treatment experiments, WT and Dnmt3aR882H/+ HSPCs were plated in X-vivo20 media with cytokines and pretreated with selected inhibitors for 2 h. The drug was washed off, and cells were plated into CellTak-coated XF96 cell culture microplates in XF DMEM medium and processed for OCR analysis.

Genome-wide CRISPR screen, sequencing and data analysis

HSPCs were extracted from two primary Dnmt3R882H/+ mice and used for the screen. HSPCs were plated into M3434 for seven days and then cultured in vitro for another 10 days in X-VIVO 20 media (BE04-448Q, Lonza Bioscience) supplemented with 5% serum (Gibco, 10438-026), 10 ng ml−1 IL3 (Peprotech, AF-213-13-1000), 10 ng ml−1 IL6 (Peprotech, AF-216-16-1000), 50 ng ml ml−1 SCF (Peprotech, AF-250-03-1000) and 1% penicillin–streptomycin–glutamine (Gibco, 12090216). On day 18, we transduced with the genome-wide library, choosing this time to reflect the clonal replating advantage observed with these cells in semisolid media culture (replating 3). For the screen, we used our murine v.2 lentiviral gRNA library containing 90,230 gRNAs against 18,424 genes, as described previously24,84. Then, 3 × 107 cells were transduced with a predetermined volume of the mouse v.2 genome-wide gRNA lentiviral supernatant that gave rise to a 30% transduction efficiency measured by BFP expression. Two days after transduction, BFP expression cells were flow-sorted and cultured for a total of 30 days. On day 6 post-transduction, 20% of the cells were collected for DNA extraction, and the remaining cells were replated in fresh X-vivo20 media with supplements. On day 10, 40% of the cells were collected for DNA extraction and the remaining cells replated. From day 15 to day 30, 50% of the cells were collected for DNA analysis and 50% replated in fresh media. Genomic DNA extraction and Illumina sequencing of gRNAs were described previously24. CRISPR sequencing was performed with the HiSeq2500 Illumina system: 19-bp single-end sequencing was performed with the custom sequencing primer 5′-TCTTCCGATCTCTTGTGGAAAGGACGAAACACCG-3′. Enrichment and depletion of guides and genes were analysed using the MAGeCK statistical package85 by comparing read counts from each cell line with counts from the plasmid as the initial population. gRNA counts of CRISPR screen data are provided in Supplementary Data 1. CRISPP dropout data for human cancer cell lines were obtained from DepMap86,87. Gene set enrichment analyses were performed with the Enrichr online software88. Drug–gene interactions were investigated using the Drug Gene Interaction database (DGIdb)89.

Flow cytometry

The flow cytometry staining and gating strategy for progenitor and differentiated blood panels shown in Fig. 1c,d and Extended Data Fig. 1c–l, were reported previously90 and are supplied in Supplementary Fig. 3. Briefly, BM cells, post erythrocyte lysis, were blocked with antimouse Cd16/32 (mouse BD FC block, BD Pharmigen, 553142, 1:500 dilution) and 10% mouse serum (Sigma, M5905) for LSK and common lymphoid progenitor (CLP) staining or 10% mouse serum alone for LK staining. LSK, MPP, lymphoid primed multipotent progenitor and LT/ST-HSC flow cytometry staining was performed with antibodies against CD4-PE-Cy5 (BioLegend, 100514, 1:800), Cd5-PE-Cy5 (BioLegend, 100610, 1:600), Cd8a-PE-Cy5 (BioLegend, 100710, 1:800), Cd11b-PE-Cy5 (BioLegend, 101210, 1:400), B220-PE-Cy5 (BioLegend, 103210, 1:200), Ter119-PE-Cy5 (BioLegend, 116210, 1:300), Gr-1-PE-Cy5 (BioLegend, 108410, 1:400), Sca1-PB (BioLegend, 122520, 1:100), c-Kit-APC-Cy7 (eBioscience, 47-1171, 1:200), Cd48-APC (BioLegend, 103411, 1:150), Cd150-PE-Cy7 (BioLegend, 115913, 1:100), Cd34-FITC (BD Pharmigen, 553733, 1:50) and Flt3-PE (eBioscience, 12-1351, 1:50). Granulocyte-monocyte progenitor, MEP and common myeloid progenitor staining was performed with the following biotin-conjugated lineage markers: Mac1, Gr1, Cd3, B220, Ter119 (BD Pharmigen, 559971, 1:300 each), Il7Ra (BioLegend, 121103, 1:300) and streptavidin-PE-Cy7 (BioLegend, 405206, 1:300) alongside Cd34-FITC (BD Pharmigen, 553733, 1:50), Cd16/32-PE (BD Pharmigen, 553145, 1:50), c-Kit-APC (BioLegend, 105812, 1:100) and Sca1-PB (BioLegend, 122520, 1:100). For the detection of the CLP population, cells were stained for Flt3-PE (eBioscience, 12-1351, 1:50), IL7Ra-FITC (BioLegend, 135008, 1:50) and the lineage biotin-conjugated markers Mac1, Gr1, Cd3, B220, Ter119 (BD Pharmigen, 559971) and NK (LSBio, LS-C62548, 1:300) and streptavidin-PE-Cy7 (BioLegend, 405206, 1:300) alongside c-Kit-APC (BioLegend, 105812, 1:100) and Sca1-PB (BioLegend, 122520, 1:100). Differentiated BM, spleen and PB cells were stained with CD11b-APC-Cy7 (BD Pharmigen, 557657, 1:300), B220-PE-Cy5 (BioLegend, 103210, 1:200), Gr1-BV450 (BD Pharmigen, 560603, 1:600) and CD3e-PE (eBioscience, 12-0031-82, 1:300). LT-HSC were defined as Linc-Kit+Sca1+Flt3Cd48Cd150+CD34. ST-HSC were defined as Linc-Kit+Sca1+Flt3Cd48Cd150+CD34+. MPPs were defined as Linc-Kit+Sca1+Flt3+. Lymphoid primed multipotent progenitors were defined as Linc-Kit+Sca1+Flt3high. CLPs were defined as LinFlt3hiIl7Ra+c-KitloSca1lo. Granulocyte-monocyte progenitors were defined as LinIl7Rac-Kit+Sca1Cd34+Cd16/32+. Common myeloid progenitors were defined as LinIl7Rac-Kit+Sca1Cd34+Cd16/32. MEPs were defined as LinIl7Rac-Kit+Sca1Cd34Cd16/32.

For analysis of progenitor compartment in transplant experiments, freshly isolated BM cells post erythrocyte lysis were washed and stained with the following antibodies: biotin-conjugated antibody for B220, Ter119, Mac1, Cd3, Gr1 (BD Biosciences, 559971, 1:300 each), detected with Streptavidin-BV510 (BioLegend, 405233, 1:900), as well as Sca1-PB (BioLegend, 122520, 1:100), c-Kit-AF780 (Thermo Fisher Scientific, 47-1171-82, 1:100), Cd48-BV605 (BioLegend, 103441, 1:200), Cd150-BV650 (BioLegend, 115932, 1:100), Cd135-PE (BD Biosciences, 553842, 1:100), Cd34-FITC (Thermo Fisher Scientific, 11-0431-82, 1:50), Cd16/32 PerCP-Cy5.5 (BioLegend, 101324, 1:100), CD45.1-APC (BioLegend, 110714, 1:100), CD45.2-PE-Cy7 (Thermo Fisher Scientific, 25-0454-82, 1:100) and 7AAD (BioLegend, 420404, 1:100). LT-HSC were defined as 7AADLinc-Kit+Sca1+Cd48Cd150+, ST-HSC as 7AADLinc-Kit+Sca1+Cd48Cd150, MPP2: ST-HSC as 7AADLinc-Kit+Sca1+Cd48+Cd150+, MPP2: MPP3 as 7AADLinc-Kit+Sca1+Cd48+Cd150 and LSK as 7AADLinc-Kit+Sca1+. The gating strategy is shown in Supplementary Fig. 4a. CD45.1 and CD45.2 markers were used to distinguish injected cells in each of the gated populations. For differentiated blood panel staining in PB, BM and spleen post-transplantation, cells were stained with the following antibodies: CD45.1-BV605 (BioLegend, 110738, 1:200), CD45.2-FITC (BioLegend, 109806, 1:200), Cd4-PE-Cy5 (BioLegend, 100514, 1:400), Cd8-PE-Cy5 (BioLegend, 100710, 1:400), B220-PE-Cy5 (BioLegend, 103210, 1:400), Gr1-AF700 (BioLegend, 108422, 1:400), Mac1-AF700 (BioLegend, 101222, 1:400) and B220-AF700 (BioLegend, 103232, 1:400). B cells were defined as B220-PE-Cy5+B220-AF700+, T cells as Cd4- Cd8-PE-Cy5+AF700 and myeloid cells as Mac1-Gr1-AF700+ PE-Cy5. The gating strategy is shown in Supplementary Fig. 4b. Flow cytometry analysis was performed with an LSRFortessa instrument (BD) and BD FACSDiva Software, and data were analysed in FlowJo software. Figures 1h and 4c show fluorescence-activated cell sorting (FACS) pseudocolour smooth plots, and Supplementary Fig. 4 shows pseudocolour plots.

Transplantation

BM was isolated from sex/age-matched CD45.2-Dnmt3aR882H/+ or CD45.2-WT post-pIpC and CD45.1-WT, and erythrocytes were lysed with 0.85% ammonium chloride. For competitive transplantation, CD45.2 BM was mixed with CD45.1 BM in a 1:2 proportion and 1 × 106 cells injected intravenously into lethally irradiated (2 × 5.5 Gy) CD45.1/2 recipient mice. For non-competitive transplant, 1 × 106 Dnmt3aR882H/+ or WT BM cells were injected intravenously into lethally irradiated CD45.1/2 recipient mice.

Drug treatment in vivo and in vitro

CTPI2 (2-(4-Chloro-3-nitro-benzenesulfonylamino)-benzoic acid; sc-339832) was administered at 40 mg kg1 by means of intraperitoneal injection three times per week for two weeks. CTPI2 preparation was reported previously28. Briefly, CTPI2 was prepared in 0.47% sodium bicarbonate (NaHCO3, Biochrom, L-1703) at a final concentration of 11.2 mM; 0.47% NaHCO3 was used as vehicle control. For in vitro studies, CTPI2 was dissolved either in 0.47% NaHCO3 or in DMSO.

IACS-010759 (Stratech Scientific, S8731-SEL-100mg) was prepared in 0.5% methylcellulose (Sigma, M0262) as reported before29. Briefly, the drug was dissolved in 0.5% methylcellulose solution and sonicated for 8–15 cycles (30 s on, 30 s off) in a Bioruptor Pico instrument (Diagenode), followed by homogenization at 5,000 rpm for 3 min using an IKA Ultra‐Turrax homogenizer. IACS-010759 was applied by oral gavage daily (five days on, two days off) at dose 8 mg kg1; 0.5% methylcellulose was used as vehicle control. For in vitro studies, IACS-010759 was prepared in DMSO.

Metformin hydrochloride (Sigma, PHR1084 or TRC-M258815-100G) was dissolved in PBS and administrated at dose 125 mg kg1 by means of intraperitoneal injection daily (five days on, two days off) as previously reported32. Treatment was performed for six weeks in total. For in vitro studies, metformin was prepared in sterile water. A dose of 125 mg kg1 per day, when translated to human equivalents using the conversion coefficient proposed by ref. 91, corresponds to 10.4 mg kg1 per day. Notably, this pharmacological dosage is lower than the levels commonly prescribed for the treatment of T2D.

Dot blots for 5mC and 5hmC

Genomic DNA was extracted with a Puregene Cell Kit (8 × 108) (Qiagen, 158043). The DNA was denatured and spotted onto a Hybond NX membrane (Amersham Biosciences, RPN303 T). The membrane was air dried for 15 min and incubate in 2× saline-sodium citrate buffer for 5 min. The membrane was air dried for 30 min, wrapped in cling film (Bakewell) and UV-crosslinked using the automatic setting on a Stratagene UV Stratalinker 2400 (120,000 µJ cm2 for 150 s). The membrane was then blocked with 1% BSA and 5% skim milk in PBS containing 0.2% Tween (PBST) for 1 h at room temperature and washed three times for 5 min with PBST. For 5hmC detection, the membrane was incubated with anti-5hmC Rb antibody (Active Motif, 39769, 1:10,000 dilution) in 3% BSA in PBST overnight at 4 °C. For 5mC detection, the membrane was incubated with anti-5mC Ms antibody (Active Motif, 61479, 1:2 000). The membrane was washed three times for 5 min with PBST and incubated in anti-rabbit (Jackson ImmunoResearch Laboratories Inc., 111-035-003) or anti-mouse (Jackson ImmunoResearch Laboratories Inc., 115-035-146) HRP-conjugated secondary antibody (diluted 1:20,000 in 1% BSA and 5% skim milk in PBST) for 1 h at room temperature. The membrane was then washed three times for 5 min with PBST and imaged using a SuperSignal West Dura Extended Duration Substrate (Thermo Scientific, 37071). For total DNA detection, the membrane was washed with water and incubated in a 0.1% methylene blue solution containing 0.5 M sodium acetate, pH 5.2 overnight at room temperature. The membrane was then washed three times for 5 min with water and air dried for 30 min. The membrane was scanned and analysed using Image Studio Lite software. Uncropped dot blots and gel images are available in Supplementary Figs. 57.

LC–MS analysis of aqueous metabolites

Liquid chromatography–mass spectrometry (LC–MS) was used to measure metabolites in aqueous extracts from cells. To this end, a Q Exactive Plus orbitrap mass spectrometer coupled to a Vanquish Horizon ultra-high performance liquid chromatography system (both Thermo Fisher Scientific) was used. Customized methods were used to separate and detect distinct metabolites of interest, as detailed below.

LC–MS sample preparation

Cells were collected into 2-ml tubes, spun down and washed twice with cold PBS. The cell pellet was then resuspended in 500 µl 4:1 methanol water. Aqueous metabolites were extracted from cells using a method modified from ref. 92. Briefly, 800 µl chloroform (Fisher Scientific, 15643700) and 380 µl Milli-Q water was added to cells suspended in 500 µl 4:1 methanol water in 2 ml tubes. This was followed by vortexing, sonication and centrifugation for 5 min at 21,300g. The top, aqueous phase was then collected from each sample into eppendorf tubes and subsequently dried using an SC210A SpeedVac vacuum centrifuge (Savant, Thermo Fisher Scientific). Dried metabolite extracts were reconstituted in 10 mM ammonium acetate (Fisher Chemical, 10598410) LC sample buffer or derivatized with diacetyl-l-tartaric anhydride (DATAN, Merck, 336040050) for enantiomer analysis (see ‘l- and d-2-HG enantiomer analysis’) as described previously93. When necessary, further internal standards were added to the LC sample buffer: universal 15N13C amino acid mix, succinate 13C4, AMP 15N513C10, ATP 15N513C10, putrescine D8, dopamine D4.

LC–MS measurement of 2HG

An ACE Excel C18-PFP column (150 × 2.1 mm, 2.0 µm, Avantor, EXL-1010-1502U) was used to separate species such as TCA cycle intermediates, amino acids and related compounds, as previously94. For positive ion mode analyses, mobile phase A consisted of water with 10 mM ammonium formate (Merck, 70221-100G-F) and 0.1% formic acid (Optima grade, Fisher Chemical, 10596814); and for negative ion mode analyses, mobile phase A consisted of water with 0.1% formic acid. Mobile phase B was acetonitrile (Chromasolv, Honeywell, 34851-2.5L) with 0.1% formic acid for both positive and negative ion modes. The C18pfp LC gradient used is detailed in Supplementary Table 11a. The flow rate was 0.5 ml min−1, and the injection volume was 3–3.5 μl. The needle wash used was 1:1 water to acetonitrile.

Source parameters used for the orbitrap were an auxiliary gas heater temperature of 450 °C, a capillary temperature of 275 °C, an ion spray voltage of 3.5 kV (2.5 kV for negative ion mode) and a sheath gas, auxiliary gas and sweep gas of 55, 15 and 3 arbitrary units, respectively, with an S-lens radio frequency of 50%. A full scan of 60–900 m/z was used at a resolution of 70,000 ppm in positive ion mode and a full scan of 55–825 m/z at a resolution of 140,000 ppm in negative ion mode.

l- and d-2-HG enantiomer analysis

An Acquity Premier HSS T3 column (1.8 µm, 2.1 × 100 mm, Waters, 186009468) was used for metabolite extracts derivatized with DATAN, for the purpose of measuring l and d enantiomers of 2-HG, as previously93. An internal standard of U-13C algal lyophilized cells (CK Isotopes, CLM-2065-1) was added to each sample before drying and derivatization; 10 µl of 0.5 µg µl−1 was added to each sample post-Folch extraction. Standards of l- and d-2-HG (Merck, 90790-10MG and H8378, respectively) were prepared (100 µM in Milli-Q water), to confirm the retention times of l- and d-2-HG. These standards (100 µl of each) were dried, derivatized and run in parallel alongside the samples. Mobile phase A was 1.5 mM ammonium formate (to pH 3.6 with formic acid), and mobile phase B was acetonitrile with 0.1% formic acid. The LC gradient used is detailed in Supplementary Table 11b. The flow rate was 0.4 ml min−1, and the injection volume was 5 µl for samples and 1 µl for l- and d-2-HG standards. The needle wash used was 1:1 water to acetonitrile.

Source parameters used for the orbitrap were as for the C18pfp method. In addition, parallel reaction monitoring of the transitions m/z 363.0569 (2-HG + DATAN) to 147.0299 (2-HG) and m/z 368.0737 (2-HG 13C5 + DATAN) to 152.0467 (2-HG 13C5) was performed with a collision energy of 20 in negative ion mode.

LC–MS data processing

Targeted analysis of LC–MS data was performed using Thermo Scientific Xcalibur (Quan Browser and Qual Browser) as previously94. Peak areas were normalized to an appropriate internal standard where possible. For 2HG, area ratios to 13C15N glutamate internal standard were calculated. For l- and d-2HG enantiomer analysis, the m/z of 147.0299 (2-HG) was extracted as previously93, in addition to the m/z of 152.0467 for the 2-HG 13C5 internal standard. Area ratios of l- and d-2-HG to respective 13C5 labelled l- and d-2HG were calculated.

TCGA-AML data analysis

Gene expression and survival data of the TCGA-AML cohort (151 samples) were downloaded from the GDA data portal95. To plot the Kaplan–Meier curves, samples were divided into two groups according to the gene’s expression level: high-expression group, expression levels above the median; and low expression group, expression below the median. Survival curves and statistics were generated by the ‘survival’ package in R96).

UKB analysis

UKB is a prospective cohort of about 500,000 adults, aged between 40 to 70 years and recruited between 2006 and 201041,97. We identified 419,389 individuals of genetically determined European ancestry from the UKB with whole-exome sequencing data passing quality control criteria98. Individuals with CH were identified by somatic variant calling of established driver mutations, as previously described72,98. Among the 419,226 individuals with CH data available, we retained 416,118 individuals without prerecruitment blood cancer diagnosis. Eight hundred and forty-six type 1 diabetes individuals were identified as previously described99 and were removed. Finally, 412,234 individuals with complete covariate data were included for analysis. Diabetes status and intake of antidiabetic medications (metformin, sulphonylureas, insulin, thiazolidinediones, meglitinides and acarbose) were ascertained from self-reported data. More individuals on metformin were ascertained from general practitioner records. In total, 11,190 individuals were identified as metformin users at recruitment and 19,287 individuals were identified as diabetic at recruitment, of whom 11,001 (57.0%) were on metformin, consistent with a recent study100. Circulating HbA1c levels were measured for each UKB participant at recruitment and used to identify undiagnosed diabetics (HbA1c ≥ 7%). We used logistic regression to model the association between antidiabetic medication at recruitment (predictor) and overall CH or gene-specific CH (outcome) and included age, sex, smoking and the first four genetic principal components as covariates. Metformin, sulphonylureas (glibenclamide, gliclazide, glipizide, glimepiride, tolbutamide) and insulin were included as antidiabetic medications in the association analysis. The UKB study has approval from the North-West Multi-centre Research Ethics Committee (11/NW/0382). The present study has been conducted under approved UKB application numbers 56844 and 26041. As we did not find an association between metformin use and DNMT3A non-R882 CH, we classified DNMT3A non-R882 mutations into high- and low-functionality variants as determined by ref. 40 and assessed the association between metformin and DNMT3A variants stratified by functionality.

MR analysis

For polygenic MR analysis, independent germline genetic variants associated with HbA1c, BMI, T2D and BMI-adjusted waist-to-hip ratio were retrieved from previous publications that included data from the UKB42,43,44,45. For HbA1c, we also used the instruments from ref. 42 with independent betas from the Meta-Analyses of Glucose and Insulin-related traits Consortium43 (two-sample MR). Genetic associations using these genetic instruments with CH as the outcome were performed using Firth logistic regression implemented by REGENIE software101, assuming an additive effect, adjusted for age, sex and the first 10 genetic principal components. MR analyses were performed using the TwoSampleMR v.0.5.6 R package102,103 (valid in both one-sample and two-sample settings104), with glycaemic-related traits as the exposures and CH as the outcome, and the test statistics reported were derived from inverse variance weighting. MR summary statistics are provided in Supplementary Table 8.

Illustrations and graphs

Schematic illustrations were made with BioRender under University of Cambridge – Central License’s Plan. The image of a seahorse in Fig. 3c and Extended Data Fig. 4g was hand-drawn by Kazimierz Gozdecki. Data were analysed and plotted using GraphPad Prism v.10. IGV software105 was used to visualize the data shown in Fig. 1b.

Statistical analysis

All statistical analyses were performed with two-sided Student’s t-test, log-rank (Mantel–Cox), one-way analysis of variance (ANOVA), two-way ANOVA or Chi-square test, as specified in figure legends. Samples were tested for normal distribution before statistical analysis. Error bars represent the s.e.m. or the s.d. P values ≤ 0.05 were considered statistically significant. Representative data/images were replicated as specified in the relevant figure legend.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.