Introduction

Clonal hematopoiesis (CH) is a phenomenon where a proportion of blood cells are derived from a single hematopoietic stem cell clone that has acquired a proliferative advantage, normally as a consequence of somatic mutations1. CH is a highly prevalent phenomenon of normal aging and also occurs in a spectrum of disorders such as Clonal Hematopoiesis of Indeterminate Potential (CHIP), Clonal Cytopenia of Undetermined Significance (CCUS), and overt myeloid neoplasms, with marked differences in clinical implications.

CHIP is defined as a clonal expansion of hematopoietic stem cells secondary to the acquisition of somatic mutations that confer a proliferative advantage, in the absence of alterations in blood counts or hematologic malignancy2,3. The incidence of CHIP increases with age, affecting around 10% of individuals 65 years of age, and CHIP is associated with an increased risk of hematologic neoplasms2,4,5, osteoporosis6 and cardiovascular diseases7,8,9,10 such as acute myocardial infarction (AMI) and ischemic stroke.

The association of CHIP with the development of hematologic malignancies is well described, with an estimated absolute risk increase of 0.5-1.0% per year. The genes most frequently involved in CHIP are DNMT3A, TET2, and ASXL111. However, mutations in these genes are associated with a lower risk of malignant transformation, while others such as TP53 and U2AF1 are associated with an increased risk of developing myeloid neoplasms, including acute myeloid leukemia (AML).

Another well-characterized entity is CCUS, which is defined as clonal hematopoiesis associated with persistent cytopenia(s), including anemia (Hb < 12 g/dL in females and < 13 g/dL in males), neutropenia (< 1.8 × 10⁹/L), or thrombocytopenia (< 150 × 10⁹/L)12,13,14, that are unexplained by hematological or non-hematological conditions. Concerning clinical relevance, it is acknowledged that CHIP and CCUS are precursor conditions that can develop into myelodysplastic syndromes (MDS) and AML, CCUS being at higher risk14,15. Thus, in addition to cytopenias, to estimate the risk of progression to hematologic neoplasms, it is important to analyze the affected genes, the number of driver mutations, and their respective allelic frequencies.
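The cytopenia thresholds quoted above can be written as a short rule set. The following Python sketch is illustrative only: the BloodCount record, its field names, and the cytopenias and classify functions are hypothetical, and the clinical requirements that a cytopenia be persistent and otherwise unexplained are not modeled.

```python
from dataclasses import dataclass


@dataclass
class BloodCount:
    hemoglobin_g_dl: float    # hemoglobin, g/dL
    neutrophils_e9_l: float   # absolute neutrophil count, x10^9/L
    platelets_e9_l: float     # platelet count, x10^9/L
    sex: str                  # "F" or "M"


def cytopenias(cbc: BloodCount) -> list[str]:
    """Return the cytopenias present, using the thresholds quoted in the text."""
    found = []
    hb_cutoff = 12.0 if cbc.sex == "F" else 13.0
    if cbc.hemoglobin_g_dl < hb_cutoff:
        found.append("anemia")
    if cbc.neutrophils_e9_l < 1.8:
        found.append("neutropenia")
    if cbc.platelets_e9_l < 150:
        found.append("thrombocytopenia")
    return found


def classify(has_ch_mutation: bool, cbc: BloodCount) -> str:
    """Label as 'no-CH', 'CHIP' or 'CCUS' from mutation status and blood counts.

    Simplification: persistence of the cytopenia and exclusion of other
    causes are clinical judgments that this sketch does not capture.
    """
    if not has_ch_mutation:
        return "no-CH"
    return "CCUS" if cytopenias(cbc) else "CHIP"
```

Under these assumptions, a mutation carrier with hemoglobin 10.9 g/dL, neutrophils 0.62 × 10⁹/L and platelets 138 × 10⁹/L would be labeled CCUS, whereas a carrier with normal counts would be labeled CHIP.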

Until a few years ago, most findings described in the literature on this topic originated from retrospective exome sequencing studies, systematic reviews, and analyses of prospective databases16,17,18. However, more recently, there has been a growing body of large-scale studies evaluating the longitudinal dynamics of CH, including in older adult populations1,19,20,21,22,23.

This study aimed to evaluate the prevalence of CHIP and CCUS in a prospective cohort of individuals over 75 years old without prior hematologic disease and also monitor clinical outcomes and changes in allelic frequency over time. Furthermore, it aimed to investigate mortality rates between carriers and non-carriers of clonal hematopoiesis and clinical differences between patients with CHIP and CCUS over the study period. Lastly, the study intended to explore the potential correlations between CHIP and conditions such as osteoporosis, dyslipidemia, cardiovascular disease, hematopoietic neoplasms, and any type of cancer.

Materials and methods

Study population

The inclusion criteria for the study comprised patients aged 75 years or older with no history of myeloid hematologic neoplasms who were receiving medical care for a wide range of aging-related conditions, including chronic cardiovascular, neurological, psychiatric, orthopedic, and other diseases, at the Geriatrics Outpatient Clinic of the Federal University of São Paulo (UNIFESP) or at the Residential Israelita Albert Einstein. All patients were recruited between 2015 and early 2016 and underwent regular follow-up visits at these institutions. This recruitment strategy enabled consistent longitudinal monitoring, as some individuals at the Residential Israelita Albert Einstein were permanent residents, and the majority of patients at UNIFESP were also enrolled in other research projects conducted by the geriatrics department, which further facilitated long-term follow-up. These patients had their names anonymized and were further identified by a numeric code. Informed consent was obtained from all patients or their legal guardian.

Sample collection and study design

Initially, peripheral blood samples (3 EDTA tubes) were collected from all 173 patients at the beginning of the study. These initial samples underwent complete blood count analysis and plasma troponin measurement. After sequencing analysis, patients were divided into 2 groups: the CH group (patients with clonal hematopoiesis-related mutations with a variant allele frequency (VAF) above 1.5% for substitutions and above 2% for insertions and deletions) and the no clonal hematopoiesis group (no-CH).

After a 2-year interval, the study’s second phase collected peripheral blood samples from patients with CH in the first analysis. These samples underwent complete blood count analysis and next-generation sequencing, following the same protocol as the initial phase.

Sample processing and DNA extraction

The details regarding sample processing and DNA extraction are described in the Supplementary Material.

Library Preparation

Libraries were prepared from 50 ng of extracted genomic DNA from each patient and subjected to the TruSight Myeloid Sequencing Panel protocol (Illumina), following the manufacturer’s instructions. The panel covers 54 genes, including: ABL1, ASXL1, ATRX, BCOR, BCORL1, BRAF, CALR, CBL, CBLB, CBLC, CDKN2A, CEBPA, CSF3R, CUX1, DNMT3A, ETV6/TEL, EZH2, FBXW7, FLT3, GATA1, GATA2, GNAS, HRAS, IDH1, IDH2, IKZF1, JAK2, JAK3, KDM6A, KIT, KRAS, MLL, MPL, MYD88, NOTCH1, NPM1, NRAS, PDGFRA, PHF6, PTEN, PTPN11, RAD21, RUNX1, SETBP1, SF3B1, SMC1A, SMC3, SRSF2, STAG2, TET2, TP53, U2AF1, WT1, and ZRSR2. The protocol was used as described previously24.

Next-generation sequencing and bioinformatics pipeline

The description of the next-generation sequencing and Bioinformatics Pipeline is provided in the Supplementary Material.

Identification of CH mutations

After generating the annotated VCF tables, we classified the variants found. We considered CH mutations to be those occurring in genes commonly related to hematologic diseases and with a minimum VAF of 1.5% for substitutions and 2% for insertions and deletions. The additional filtering criteria included a minimum read depth of 250 reads, with an average coverage of 4,000 reads, and the exclusion of candidate variants with a population allele frequency greater than or equal to 1% in public databases such as ExAC and gnomAD to filter out potential germline variants. After this classification, patients were divided into CH and no-CH groups. We considered patients with CH to be those who had oncogenic or likely oncogenic variants according to the Variant Interpretation for Cancer Consortium (VICC) classification25.
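The numeric thresholds above can be expressed as a small filter. The Python sketch below is not the study's pipeline (which is described in the Supplementary Material): the Variant record, its field names, and passes_ch_filters are hypothetical, VAF and population frequency are assumed to be fractions, and "minimum VAF" is read as an inclusive threshold. The VICC oncogenicity classification is not modeled.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    gene: str
    kind: str              # "substitution", "insertion" or "deletion"
    vaf: float             # variant allele frequency as a fraction (0.02 = 2%)
    depth: int             # read depth at the variant position
    population_af: float   # highest allele frequency in ExAC/gnomAD (0.0 if absent)


# Thresholds taken from the text; the names are assumptions of this sketch.
MIN_VAF = {"substitution": 0.015, "insertion": 0.02, "deletion": 0.02}
MIN_DEPTH = 250
MAX_POPULATION_AF = 0.01   # variants at or above 1% are treated as likely germline


def passes_ch_filters(v: Variant) -> bool:
    """Apply the VAF, read-depth and population-frequency thresholds."""
    return (
        v.vaf >= MIN_VAF[v.kind]
        and v.depth >= MIN_DEPTH
        and v.population_af < MAX_POPULATION_AF
    )
```

With these settings, a substitution at a VAF of 1.6% and adequate depth is retained, while an insertion or deletion at the same VAF is not.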

High-risk CH classification

The classification for high-risk CH used in this study was based on both the definition of VAF higher than 10% proposed by the study by Abelson et al. (2018)11 and the definition of the Clonal Hematopoiesis Risk Score (CHRS) defined in the article by Weeks et al. (2023)15, which defined high-risk CH as patients scoring 12.5 points or more on the CHRS. Details on the calculation of CHRS can be found in the Supplementary Material.
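The two high-risk definitions can be evaluated side by side. The Python sketch below uses a hypothetical function name and dictionary keys, takes the CHRS score as an input rather than computing it, and deliberately reports the two criteria separately, since how they are combined is not specified here.

```python
def high_risk_flags(max_vaf: float, chrs_score: float) -> dict[str, bool]:
    """Evaluate the two high-risk CH criteria described in the text.

    max_vaf is the largest variant allele frequency as a fraction
    (0.10 = 10%); chrs_score is a precomputed CHRS value.
    """
    return {
        "vaf_above_10_percent": max_vaf > 0.10,
        "chrs_high_risk": chrs_score >= 12.5,
    }
```

For example, a maximum VAF of 40.5% together with a CHRS of 14 satisfies both criteria.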

Clinical data collection

Clinical and demographic data were analyzed by comparing CH and no-CH groups. Collected data included age, weight, height, BMI, sex, smoking status, diabetes, hypertension, osteoporosis, and cardiovascular events such as acute myocardial infarction, coronary insufficiency, stroke, transient ischemic attack, congestive heart failure, and dyslipidemia.

Survival analysis

Patients in both groups (CH and no-CH) were followed for 8 years (between 2016 and 2024). The primary outcome was all-cause mortality, with patients censored at the date of last follow-up or loss to follow-up. Survival was analyzed using age-adjusted predicted survival curves from multivariate Cox regression models.

Statistical analysis

The detailed description of the statistical analysis of this study can be found in the Supplementary Material.

Results

Follow-up and demographics

In 2016, 173 patients were recruited and followed until 2024. Follow-up included medical record analysis from care centers and direct contact with patients or family members through researcher-led telephone calls at various times throughout the study.

The CH group included 54 patients and the no-CH group 119. Of the 54 patients with CH, only 27 samples could be re-collected due to deaths (n = 13), loss to follow-up (n = 11), and discontinuation of care (n = 3), with additional constraints imposed by the COVID-19 pandemic. Ultimately, sequencing was performed for 17 of these, as 10 samples were excluded due to insufficient variant classification confidence.

The median follow-up time for the no-CH group was 59.63 months, while for the CH group it was 39.43 months, with no statistically significant difference between the two groups (Table 1).

The overall median age was 86.21 years, with a statistically significant difference between the CH (87.49 years) and no-CH (85.84 years) groups (p = 0.048), although the absolute age difference was less than two years. No statistically significant differences were found between the CH and no-CH groups in weight, height, BMI, serum troponin levels, sex distribution, smoking or follow-up duration (Table 1).

Table 1 Characteristics of the study population at the beginning of the study.

Prevalence of clonal hematopoiesis in the study population

The prevalence of CH in the study population was 30.6% (54 patients). Complete blood count data were not fully available for 7 of these patients (12.9%), who were therefore not included in the subtype analysis; of the remaining CH patients, 44.6% were classified as CCUS and 55.3% as CHIP. The overall prevalence of CCUS and CHIP was 13.5% and 16.7%, respectively. The most frequently mutated genes were DNMT3A, TET2, and ASXL1, in that order (Fig. 1A).

Prevalence of cytopenias in the study population

An analysis was also conducted to compare the presence of cytopenias (1 or more, including anemia, neutropenia, or thrombocytopenia) within the study population (Fig. 1D). While 29.6% of patients in the no-CH group exhibited cytopenias in the initial complete blood count, the prevalence increased to 44.6% in the CH group. In the subset of CH patients with VAF > 10% (n = 19), the prevalence of cytopenias rose to 57.8%. Nevertheless, statistical analysis revealed no significant difference in the prevalence of cytopenias between the CH and no-CH groups (Fig. 1B). However, a statistically significant difference was found when comparing the CH subgroup with VAF > 10% (p = 0.03) using Fisher’s exact test (Fig. 1C). We also conducted an analysis of CH prevalence stratified by 5-year age intervals (Fig. 1A), presenting the relative prevalence percentages of clonal hematopoiesis according to each age group.
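For readers unfamiliar with Fisher's exact test, the following Python sketch computes a two-sided p-value for a 2×2 table by summing the hypergeometric probabilities of all tables that are no more likely than the observed one. It is a generic illustration, not the study's statistical code; the function name is hypothetical, and because the underlying counts are not reported in this section, the usage example relies on a toy table rather than study data.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test p-value for the table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / total

    observed = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables as or less likely than the observed
    # one; the small tolerance guards against floating-point ties.
    return sum(
        p
        for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= observed * (1 + 1e-9)
    )


# Toy table (not study data): rows are groups, columns are
# cytopenia present / absent.
p_value = fisher_exact_two_sided(3, 0, 0, 3)
```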

A prospective comparative analysis of complete blood count parameters was performed in CH patients with follow-up samples using paired Student’s t-tests. However, no statistically significant changes were detected in any of the blood count variables.

Fig. 1

(A) Clonal hematopoiesis prevalence and distribution of mutations in the elderly. (B) and (C) Prevalence of cytopenias in the study population; (D) Prevalence of cytopenias in the study population divided by subtype; Statistical significance: *p < 0.05.

CH and clinical associations

The prevalence of osteoporosis in the study population evaluated at the end of the study was 56.8%, with 68.6% of these patients being CH carriers and 31.4% belonging to the no-CH group. This indicates a statistically significant higher prevalence of osteoporosis in the CH group compared to the no-CH group (p = 0.039) (Table 2).

Table 2 Diverse clinical associations with clonal hematopoiesis at the end of the study.

No significant correlation was found between CH and diabetes, hypertension, dyslipidemia, myocardial infarction, coronary insufficiency, cerebrovascular accident, transient ischemic attack, or heart failure. Additionally, baseline serum troponin levels were analyzed, with the hypothesis that CH patients might have elevated levels. However, no statistically significant difference in troponin levels was observed between the CH and no-CH groups (p = 0.647) (Table 1).

An analysis of solid tumor prevalence between the CH and no-CH groups showed no significant difference (p = 0.991). The most common cancers in the study population were skin cancers, followed by prostate and breast cancers (Table 2).

No multiple testing corrections were applied, as each variable was considered an independent hypothesis. Nonetheless, the potential for false positives is acknowledged. For transparency, Benjamini-Hochberg adjusted p-values are provided in Supplementary Table 3, although these corrections were not applied in the primary analyses.
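As a reminder of how the Benjamini-Hochberg adjustment works, each p-value is multiplied by the number of tests divided by its rank, and the results are then made monotone from the largest p-value downward. The Python sketch below is a generic implementation with a hypothetical function name, and the input p-values are illustrative rather than taken from the study.

```python
def benjamini_hochberg(p_values: list[float]) -> list[float]:
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Illustrative p-values only (not study results).
adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
```

For these four illustrative values the adjusted p-values are approximately 0.02, 0.04, 0.04 and 0.02, in the original order.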

CH and development of hematologic malignancy

During the study, it was observed that one out of 53 patients with CH developed hematological disease, corresponding to an incidence of 1.88%. On the other hand, no patient without CH developed hematological neoplasms.

One patient with mutations in DNMT3A (VAF = 40.5%) and U2AF1 (VAF = 36.7%) developed leukemia. This patient was classified as high-risk both by VAF values above 10% and by the CHRS, with a score of 14. The patient met criteria for both the CH cohort and CCUS, presenting with anemia, neutropenia, and thrombocytopenia at the study’s onset (hemoglobin 10.9 g/dL, neutrophil count 620 cells/µL, platelet count 138,000/µL). The patient was diagnosed with leukemia in April 2021 and passed away in May 2021.

Survival analysis

Survival analysis was conducted comparing the CH and no-CH groups adjusted for age. This analysis did not demonstrate a statistically significant difference (p = 0.094) in survival between the two groups over the 8-year follow-up period of the study (with a hazard ratio of 1.62 and a 95% confidence interval ranging from 0.92 to 2.83) (Fig. 2).
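The reported hazard ratio, confidence interval and p-value can be checked for internal consistency with a Wald approximation: the standard error of log(HR) is recovered from the width of the 95% interval on the log scale, and a two-sided normal p-value follows. The Python sketch below is an approximation under a normality assumption, not the study's Cox model, and the function name is hypothetical.

```python
import math


def wald_p_from_hr_ci(hr: float, lower: float, upper: float, z_crit: float = 1.96) -> float:
    """Approximate two-sided Wald p-value from a hazard ratio and its 95% CI."""
    # Standard error of log(HR), recovered from the CI width on the log scale.
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = math.log(hr) / se
    # Two-sided tail probability of the standard normal distribution.
    return math.erfc(abs(z) / math.sqrt(2))


p = wald_p_from_hr_ci(1.62, 0.92, 2.83)
```

With the values reported above, this gives a p-value of roughly 0.09, in line with the reported p = 0.094; small differences are expected from rounding of the published interval.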

Survival data were available for 44 of 54 patients in the CH group and for 98 of 119 patients in the no-CH group, with 10 and 21 patients, respectively, lost to follow-up.

Fig. 2

(A) Box plot illustrating the age comparison between the CH and no-CH groups. (B) Age-adjusted predicted survival curve comparison between the CH and no-CH groups.

Prospective sequencing analysis

Prospective sequencing of CH patients revealed 10 mutations with increased allele frequency and 13 with decreased frequency when comparing VAFs from the study’s beginning to its conclusion (Supplementary Fig. 1), with a median interval of 4.35 years between the initial and follow-up sample collections.

Further details on this aspect can be found in the Supplementary Material.

Discussion

This is a prospective follow-up cohort that studied a genetically diverse population of Brazilian patients aged over 75 years. In our study, we observed a 30.6% prevalence of CH, which corroborates the results reported by Rossi et al. (2021)19 but diverges from the study by van Zeventer et al. (2021)1, which found a prevalence of 62% in individuals over 80 years of age. It is important to note that the classification of CH considered a minimum allele frequency of 1.5% for substitutions and 2% for insertions and deletions in our study, whereas both of the cited studies employed a minimum allele frequency of 1%. In addition, we observed a higher prevalence of variants in the DNMT3A (34.2%), TET2 (24.7%), and ASXL1 (8.2%) genes, which also agrees with previous reports1,19,20,26.

No significant difference in cytopenia prevalence was found between CH and no-CH groups (Fig. 1B). A difference was observed only when comparing patients with CH harboring high variant allele frequencies (VAF > 10%), highlighting the need for cautious interpretation of mutational screening results and unexplained cytopenias, particularly when mutations are detected at lower VAFs. Importantly, findings related to cytopenias may have been influenced by the small cohort size, population heterogeneity, and the limited scope of the sequencing panel.

With respect to the link between CH and the onset of hematologic malignancies, the research by Abelson et al.11 revealed that 39% of pre-AML cases in individuals over 50 years old had a driver mutation with a VAF greater than 10%. Our study identified one leukemia case in a high-risk CH patient, classified by CHRS and VAF > 10%. This patient had anemia (Hb 10.9 g/dL), neutropenia (620/µL), thrombocytopenia (138 × 10³/µL), elevated RDW (16.4%), and increased MCV (107.5 fL). Mutational analysis revealed a mutation in the splicing factor gene U2AF1 (VAF 36.7%), alongside a mutation in DNMT3A with a VAF of 40.5%, meeting high-risk criteria across all established risk scores described to date15,19. For this patient, it was not possible to definitively exclude the presence of myelodysplastic syndrome (MDS) at the time of study enrollment due to the lack of bone marrow biopsy data. Although pancytopenia was observed at her last follow-up in 2017, no abnormal circulating cells were detected, and the diagnostic workup was likely delayed due to limited resources in the public healthcare setting where she was followed. The patient was later reported to have been hospitalized with a diagnosis of leukemia in 2021; however, additional clinical details and diagnostic data from that hospitalization could not be accessed, as she was admitted to a non-affiliated institution and medical records were unavailable.

The relationship between CHIP and an increased incidence of cardiovascular disease has been demonstrated in several studies27,28. Mutations in DNMT3A, TET2 and ASXL1 (DTA mutations) have been associated with an increased risk of atherosclerotic disease, worse outcomes in aortic stenosis and heart failure28. Regarding the pathway through which CH correlates with cardiovascular outcomes, several experimental studies in mice have linked CH to increased inflammatory mediators and pathological cardiac remodeling, as well as a heightened association with the development of atherosclerosis and age-related cardiomyopathies29,30,31,32,33. Interestingly, we did not find a correlation between CH and cardiovascular diseases, a finding consistent with the study by van Zeventer et al.1, although this result may be influenced by the small cohort size. Our finding is also in line with the study by Kessler et al.34, which, although reporting a modest association between CHIP and cardiovascular disease, showed that this association was absent in cases involving mutations in DNMT3A. This suggests that the link between CH and cardiovascular disease may be gene-specific and requires further investigation to be fully established.

The retrospective study by Scolari et al.35 found elevated troponin I levels in hypertrophic cardiomyopathy patients with CH, suggesting a link between CH and troponin elevation. Our study aimed to compare serum troponin levels between CH and no-CH individuals but found no significant difference. Our study’s median age is approximately 30 years higher than that of Scolari et al.’s cohort, which may increase the baseline prevalence of cardiovascular and non-cardiac conditions known to elevate troponin levels. While this age difference could enhance the statistical power to detect an association, the overall cohort size likely remains insufficient to observe statistically significant differences.

On the other hand, an association between CH and osteoporosis was confirmed by our study, consistent with the findings of Kim et al. (2021), which showed that patients with CHIP had a 1.44-fold increase in the incidence of osteoporosis diagnosis compared to the general population, particularly in cases with DNMT3A mutation with an allelic fraction ≥ 10%6.

Regarding changes in allelic frequency over time, with a median interval of 4.35 years between baseline and follow-up sampling, no consistent increase in mutation allelic frequency was observed. This suggests that VAFs may remain relatively stable over short-term periods, as supported by previous studies36,37. However, this finding could be influenced by the limited sample size of our cohort and the potentially insufficient follow-up duration.

This study presents several limitations. The relatively small sample size (n = 173) restricted statistical power, particularly for subgroup analyses and detection of subtle associations. Furthermore, only a limited number of participants had follow-up DNA samples available, constraining the evaluation of clonal evolution. Recruitment from geriatric outpatient clinics, rather than through a population-based strategy, may have introduced selection bias and limits the generalizability of the findings, including the prevalence of cytopenia.

A significant difference in median age between the CH and no-CH groups was observed, with the CH group being older - a potential confounder that may have influenced results and further limited statistical power. The study was also conducted across two distinct healthcare settings - one public and one private - which led to inconsistencies in medical record documentation. In particular, many records from the public hospital (UNIFESP) were paper-based and manually completed by different professionals, which often resulted in missing information and incomplete clinical data collection. Due to the unavailability of precise dates for clinical events, it was not feasible to perform a longitudinal analysis - consequently, clinical outcomes were evaluated solely based on their prevalence at the conclusion of the study, when data collection for all patients had been completed and verified.

Additionally, the COVID-19 pandemic disrupted the second phase of recruitment and follow-up, as many patients declined in-person visits or sample recollection due to safety concerns. Although home sample collection was offered to mitigate this issue, uptake was low. Moreover, the presence of current or prior solid tumors, not excluded in this study, may have influenced CH detection - particularly therapy-related cases. Furthermore, the PPM1D gene, commonly mutated in such contexts, was not included in the 2016 institutional sequencing panel, potentially leading to underestimation of CH prevalence.

Finally, although the survival curve analysis did not reach statistical significance (p = 0.09), the graphical trend suggests a potential difference in survival between the CH and no-CH groups - a finding that has been reported in previous studies2,18,19,38 - thus warranting further investigation in larger prospective cohorts.