Extended Data Fig. 3: Clinical Relevance of LiFT - Supporting Material.
From: Longitudinal dynamics of clonal hematopoiesis identifies gene-specific fitness effects

a. Distribution of fitness by gene category. Genes are grouped according to their biological function into DNA methylation (TET2, DNMT3A), Splicing (SF3B1, U2AF1, SRSF2, U2AF2, ZRSR2, LUC7L2, DDX41), mitogenic function (KRAS, NF1, JAK2, JAK3, SH2B3, PTEN, PTPN11, NRAS), cohesin (RAD21, STAG2), DNA damage (TP53, CDKN2A, PPM1D, ATRX) and Transcription factors (TF) important during development (GATA2, RUNX1, NOTCH1, CUX1, ETV6). The sample size, n, of each gene category is denoted in brackets. For each gene category we display a boxplot showing the maximum a posteriori (MAP) estimates of fitness for variants in the category, as well as the median and exclusive interquartile range.
b. Analysis of variance of the maximum posterior fitness estimates across gene categories. Heatmap of all statistically significant (p < 0.05) Kruskal-Wallis H statistics, labelled by effect size, computed for all combinations of pairs of genes. The effect size is only shown for statistically significant relations. Variants with a fitness below 2% were left out of this study as our prediction classifies them as conferring no or a negligible fitness advantage.
c. Influence of the number of time-points in a trajectory on the inferred fitness distributions. We show the maximum posterior estimates for genes DNMT3A and TET2 and for all LiFT variants split according to the number of time-points.
d. Survival analysis (Cox proportional hazards regression model) broken down by cohort and covariates. LBC1921 and LBC1936 are analysed separately given their difference in age during the observed time-span. (left) Error bars showing the inferred hazard ratio coefficient and 95% CI for each regression study, as well as the sample size, n, and the number of observed events in each analysis. Note that none of the survival analyses shown are statistically significant. The complete summary for each analysis is found in Supplementary Table 7. (right) Kaplan-Meier survival plots for the LBC cohort stratified using 2 standard deviations of the analysed covariate.
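
The pairwise comparison described in panel b can be sketched as follows. This is a minimal illustration, not the authors' code: the function names, the group labels and all fitness values are hypothetical, and the effect size used here (eta-squared, (H - k + 1)/(n - k)) is one common choice that the caption does not confirm. The p-value uses the chi-square approximation with one degree of freedom, which is rough for very small groups.

```python
import math
from itertools import combinations


def average_ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def kruskal_wallis_pair(x, y):
    """Kruskal-Wallis H, p-value and eta-squared for two groups."""
    pooled = list(x) + list(y)
    n = len(pooled)
    if not x or not y or n < 3:
        raise ValueError("need two non-empty groups and at least 3 values")
    ranks = average_ranks(pooled)
    rank_sum_x = sum(ranks[:len(x)])
    rank_sum_y = sum(ranks[len(x):])
    h = 12.0 / (n * (n + 1)) * (
        rank_sum_x ** 2 / len(x) + rank_sum_y ** 2 / len(y)
    ) - 3 * (n + 1)
    # Standard correction for tied observations.
    counts = {}
    for v in pooled:
        counts[v] = counts.get(v, 0) + 1
    correction = 1 - sum(c ** 3 - c for c in counts.values()) / (n ** 3 - n)
    if correction == 0:
        raise ValueError("all values are identical")
    h = max(h / correction, 0.0)
    # Chi-square survival function with 1 degree of freedom (k - 1 = 1).
    p = math.erfc(math.sqrt(h / 2))
    # Assumed effect size: eta-squared with k = 2 groups.
    eta_sq = (h - 1) / (n - 2)
    return h, p, eta_sq


def pairwise_tests(fitness_by_group, min_fitness=0.02, alpha=0.05):
    """Return {(group_a, group_b): effect_size} for significant pairs only."""
    # Drop variants below the 2% fitness threshold before testing.
    kept = {
        name: [f for f in values if f >= min_fitness]
        for name, values in fitness_by_group.items()
    }
    kept = {name: v for name, v in kept.items() if len(v) >= 2}
    significant = {}
    for a, b in combinations(sorted(kept), 2):
        _, p, eta_sq = kruskal_wallis_pair(kept[a], kept[b])
        if p < alpha:
            significant[(a, b)] = eta_sq
    return significant


# Hypothetical MAP fitness estimates, for illustration only.
example = {
    "DNA methylation": [0.01, 0.05, 0.08, 0.10, 0.12],
    "Splicing": [0.20, 0.25, 0.30, 0.35],
    "Cohesin": [0.06, 0.09, 0.11, 0.30],
}
for pair, effect in pairwise_tests(example).items():
    print(pair, round(effect, 3))
```

Only pairs that pass the significance threshold are returned, mirroring the heatmap, which shows the effect size for significant relations only.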
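
For readers unfamiliar with the survival curves in panel d (right), the Kaplan-Meier estimator can be sketched in a few lines. This is an illustrative sketch under stated assumptions rather than the analysis pipeline used for the figure: the function name and the follow-up data are hypothetical, and the caption does not specify how the 2 standard deviation cut is applied, so stratification is left to the caller.

```python
def kaplan_meier(durations, events):
    """Return [(time, survival_probability)] at each distinct event time.

    durations: follow-up time per participant.
    events: truthy if the event (death) was observed, falsy if censored.
    """
    if len(durations) != len(events):
        raise ValueError("durations and events must have the same length")
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(set(durations)):
        deaths = sum(1 for d, e in zip(durations, events) if d == t and e)
        leaving = sum(1 for d in durations if d == t)
        if deaths:
            # Product-limit step: multiply by the fraction surviving time t.
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        # Everyone with follow-up ending at t (event or censored) leaves the risk set.
        at_risk -= leaving
    return curve


# Hypothetical follow-up times in years; 0 marks a censored observation.
curve = kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1])
for time, prob in curve:
    print(time, prob)
```

A stratified plot would call `kaplan_meier` once per stratum, for example once for participants above the chosen covariate cut-off and once for the rest.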