Extended Data Fig. 2: Analysis of clonal architecture by disease type and gene mutation.
From: Single-cell mutation analysis of clonal evolution in myeloid malignancies

a, scDNA-seq data processing and analysis workflow. FASTQ sequencing files for each sample were uploaded to and processed through the Mission Bio Tapestri Insights platform for variant calling and cell finding (Commercial Platform). Samples included for further analysis harboured ≥ 1 variant leading to a protein sequence change (non-synonymous/insertion/deletion) and included 50 cells with definitive genotyping for all protein-coding variants within the sample (n = 146). These data were used for the analysis in Fig. 1. Clones present in each sample were identified, and samples containing fewer than 2 clones were removed from clonal analysis studies. Samples were subjected to random resampling of cells using a bootstrapping approach to assess the stability of identified clones (n = 132). Following bootstrapping, clones with a lower 95% confidence interval bound <10 were removed, as were variants identified only within those clones. Samples that harboured only 1 variant or presented with <2 clones after bootstrapping analysis were removed (n = 111). The number of samples at each step of processing is shown below the corresponding step of the workflow. b, Number of mutations in the most dominant clone identified in each sample (n = 111 biologically independent samples), stratified by cohort. The mean value for each cohort is shown by the height of the bar, with the standard error of the mean (SEM) depicted by error bars. A two-sided t-test with FDR correction was used to determine statistical significance pairwise between all groups. For clarity, only significant P values referenced in the text are shown. *P < 0.1; **P < 0.01; ***P < 0.001. c, Association between clone size and the number of mutant alleles in the clone. Every clone identified in the clinical cohort (n = 111 biologically independent samples) is depicted by a black circle. Centre line: median; box: IQR; whiskers: 1.5 × IQR. d, Bar plot depicting the prevalence of dominant clones for each DTAI gene across patient cohorts.
The colour of each bar indicates whether the mutation occurs in the dominant clone (red) or a subclone (grey). The absence of a bar denotes that no clones with the indicated mutation were identified in a given cohort. e, Association of VAF with the presence of a mutation in either the dominant clone (red) or a subclone (grey) for select genes (n = 101 biologically independent samples). The standard error of the mean is depicted by error bars. A two-sided t-test with FDR correction was used to determine statistical significance pairwise between all groups. *P < 0.1; **P < 0.01; ***P < 0.001. P values are absent for IDH2 and JAK2 owing to a lack of samples with subclonal mutations. f, Pairwise interaction matrix of mutually exclusive (red squares) and inclusive (blue squares) mutations on a per-sample basis. Pairwise interactions with no colour did not reach a significant P value.
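
The bootstrapping filter described in panel a can be illustrated with a minimal Python sketch. It is not the authors' pipeline. It assumes that each cell has already been assigned a clone label, that cells are resampled with replacement within one sample, that the 95% confidence interval is a percentile interval on the number of cells per clone, and that the threshold of 10 is a cell count. The function names, the number of bootstrap replicates and the example genotype labels are hypothetical.

```python
import random
from collections import Counter


def bootstrap_clone_ci(cell_clones, n_boot=1000, alpha=0.05, seed=0):
    """Percentile confidence intervals on the cell count of each clone.

    cell_clones: one clone label per genotyped cell (assumed input format).
    Returns {clone: (lower, upper)} from n_boot resamples with replacement.
    """
    rng = random.Random(seed)
    n_cells = len(cell_clones)
    clones = sorted(set(cell_clones))
    counts = {clone: [] for clone in clones}
    for _ in range(n_boot):
        # Resample the same number of cells, with replacement.
        resample = Counter(rng.choices(cell_clones, k=n_cells))
        for clone in clones:
            counts[clone].append(resample[clone])  # 0 if the clone is absent
    lo_idx = int((alpha / 2) * n_boot)
    hi_idx = int((1 - alpha / 2) * n_boot) - 1
    intervals = {}
    for clone, values in counts.items():
        values.sort()
        intervals[clone] = (values[lo_idx], values[hi_idx])
    return intervals


def stable_clones(cell_clones, min_lower=10, **kwargs):
    """Keep clones whose lower confidence bound is at least min_lower cells."""
    intervals = bootstrap_clone_ci(cell_clones, **kwargs)
    return {c: ci for c, ci in intervals.items() if ci[0] >= min_lower}


# Hypothetical sample: two large clones and one clone seen in only 5 cells.
cells = ["WT"] * 300 + ["DNMT3A"] * 150 + ["DNMT3A/NPM1"] * 5
kept = stable_clones(cells, n_boot=500, seed=1)
print(sorted(kept))
```

In this toy sample the 5-cell clone is dropped because its lower bound falls below 10 cells. A sample left with fewer than 2 clones after this step would then be excluded, as described in panel a.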