Extended Data Fig. 3: Supporting information for somatic mtDNA mutation calling via mgatk.
From: Massively parallel single-cell mitochondrial DNA genotyping and chromatin profiling

(a) Venn diagrams depicting comparisons of heteroplasmic mutations identified by mgatk and samtools/bcftools, and (b) by mgatk and FreeBayes. (c) Comparison of heteroplasmy estimated from reads aligned to either strand. The top row shows three variants called specifically by mgatk; 3549 C > A was identified only by FreeBayes; 7399 C > G and 546 A > C were called specifically by bcftools. (d) Identification of 67 and (e) 36 heteroplasmic variants from previously published Smart-seq2 hematopoietic colony data. Blue variants represent known RNA-editing events. (f) Comparison of population heteroplasmy values for variants replicated by mgatk from a previous supervised approach. Boxplots: center line, median; box limits, first and third quartiles; whiskers, 1.5× interquartile range. Statistical test: two-sided Mann-Whitney U test. (g) Concordance between the identification of cells sharing a clonal origin based on colony-specific mtDNA mutations and their unsupervised identification using the indicated algorithms (mgatk, bcftools, FreeBayes) and the previously described supervised approach (ref. 6). Receiver operating characteristic (ROC) curves use the per-cell-pair mtDNA similarity metric to identify pairs of cells sharing a clonal origin based on sets of mtDNA variants. The number of variants in each set is also depicted. (h) Area under the ROC curve (AUROC) for each donor group and indicated variant caller, as depicted in (g). Each bar represents the statistic from one evaluation per donor per tool. (i) Estimated sensitivity (y axis, left) and positive predictive value (y axis, right), and (j) estimated % dropout (y axis) for mtscATAC-seq at different simulated levels of heteroplasmy (x axis; Methods). Vertical line: 5% heteroplasmy for a subclonal mutation. The in-graph numbers indicate the values from the curve at a single-cell heteroplasmy of 5%, with colors corresponding to different per-cell coverage values in the simulation.
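
The pair-level evaluation summarized in (g) and (h) can be illustrated with a minimal sketch. The code below is not the authors' implementation: the cosine similarity, the toy heteroplasmy values, and the names `cosine_similarity`, `pairwise_scores`, and `auroc` are all assumptions introduced for illustration, and the similarity metric actually used is the one described in the Methods. The sketch scores every pair of cells, labels a pair as positive when both cells come from the same colony, and computes AUROC as the probability that a same-colony pair scores higher than a different-colony pair.

```python
from itertools import combinations
from math import sqrt


def cosine_similarity(u, v):
    """Hypothetical stand-in for the per-cell-pair mtDNA similarity metric."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0


def pairwise_scores(heteroplasmy, colony):
    """Score every cell pair; the label is True when both cells share a colony."""
    scores, labels = [], []
    for c1, c2 in combinations(sorted(heteroplasmy), 2):
        scores.append(cosine_similarity(heteroplasmy[c1], heteroplasmy[c2]))
        labels.append(colony[c1] == colony[c2])
    return scores, labels


def auroc(scores, labels):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half, which matches the rank-based (Mann-Whitney) form.
    """
    pos = [s for s, is_pos in zip(scores, labels) if is_pos]
    neg = [s for s, is_pos in zip(scores, labels) if not is_pos]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative pair")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (assumed): per-cell heteroplasmy at three variant positions.
heteroplasmy = {
    "cell1": [0.30, 0.00, 0.00],
    "cell2": [0.25, 0.02, 0.00],
    "cell3": [0.00, 0.40, 0.01],
    "cell4": [0.01, 0.35, 0.00],
}
colony = {"cell1": "A", "cell2": "A", "cell3": "B", "cell4": "B"}

scores, labels = pairwise_scores(heteroplasmy, colony)
print(auroc(scores, labels))
```

In this toy example both same-colony pairs score higher than all four different-colony pairs, so the AUROC is 1.0. With real data, each variant caller would supply its own variant set and the resulting AUROC values could be compared per donor, as in (h).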