Fig. 3

Regional distributions of somatic mutations and phylogenetic trees of typical cases. a–f Heat maps showing the regional distribution of all mutations among the samples (normal epithelial tissue, dysplasia and carcinoma) in each case. Mutations are divided into three categories: present in all samples (dark blue), present in more than one but not all samples (orange), and present in only one sample (green). Phylogenetic trees were constructed for each case using the maximum parsimony algorithm. The color of each line corresponds to the mutation categories shown in the heat map. The lengths of the trunk and branches are proportional to the number of mutations in each sample. Some mutations in the heat maps do not fit the phylogenetic trees; possible causes include the absence of mutations due to copy number loss, geographically low-level clone intermixing, and technical noise. g–i Box plots depicting the heterogeneity index (HI), the Euclidean distance and the density of intersecting mutations (plotted as negative values) for all sample pairs in each case. Sample pairs were divided into three groups: ESCC–ESCC (C-C), dysplasia–ESCC (C-D), and dysplasia–dysplasia (D-D). Kruskal–Wallis test, P = 4.50e−07 (HI), P = 0.00046 (Euclidean distance) and P = 2.45e−06 (−intersection density).
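
The mutation categories in panels a–f and the pair grouping in panels g–i can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the pipeline used in this study: the sample names, the toy presence/absence matrix and the helper functions `categorize` and `pairwise_distances` are hypothetical, and the Euclidean distance is computed here on binary presence/absence calls, which may differ from the input used for the figure.

```python
from itertools import combinations
from math import sqrt

# Hypothetical toy data: one row per mutation, one column per sample
# (1 = mutation detected in that sample, 0 = not detected).
SAMPLES = ["D1", "D2", "C1", "C2"]
SAMPLE_TYPE = {"D1": "D", "D2": "D", "C1": "C", "C2": "C"}  # D = dysplasia, C = ESCC
PRESENCE = {
    "mut_a": [1, 1, 1, 1],
    "mut_b": [0, 1, 1, 0],
    "mut_c": [0, 0, 0, 1],
    "mut_d": [1, 1, 0, 0],
}


def categorize(row):
    """Assign a mutation to one of the three heat-map categories."""
    n_present = sum(row)
    if n_present == len(row):
        return "all"      # present in all samples
    if n_present > 1:
        return "shared"   # present in more than one but not all samples
    if n_present == 1:
        return "private"  # present in only one sample
    return "absent"       # not detected in any sample


def pairwise_distances(presence, samples, sample_type):
    """Group pairwise Euclidean distances into C-C, C-D and D-D.

    Assumption: distances are taken between binary presence/absence
    vectors. Pairs involving other sample types are skipped.
    """
    groups = {"C-C": [], "C-D": [], "D-D": []}
    for i, j in combinations(range(len(samples)), 2):
        dist = sqrt(sum((row[i] - row[j]) ** 2 for row in presence.values()))
        label = "-".join(sorted((sample_type[samples[i]], sample_type[samples[j]])))
        if label in groups:
            groups[label].append(dist)
    return groups


categories = {name: categorize(row) for name, row in PRESENCE.items()}
groups = pairwise_distances(PRESENCE, SAMPLES, SAMPLE_TYPE)
```

The three resulting lists of distances could then be compared across groups with a rank-based test such as Kruskal–Wallis, as reported for panels g–i.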