Figure 7
From: Imaging-based representation and stratification of intra-tumor heterogeneity via tree-edit distance

Choice of \(\mu\): (a) construction of qualitative densities of the vertex heights in three example dendrograms. The speed at which leaves are merged in a dendrogram, i.e., the variability of edge lengths, reflects how heterogeneous the lesions are. For each dendrogram, the branch heights (rescaled to [0, 1] by dividing by the highest value) are annotated on the left and their associated density is inspected. The vertex heights of a patient with homogeneous lesions concentrate in a small interval [0, a], with \(a>0\) (blue tree); the vertex heights of a patient with heterogeneous lesions spread over a range of values far from zero, [a, b], with \(a,b>0\) (green tree); a patient showing groups of homogeneous lesions, with the groups heterogeneous with respect to one another, is associated with a dendrogram that has an explicit clustering structure, each cluster containing multiple close leaves (orange tree). The distribution of vertex heights then displays two components, reflecting both the homogeneity of similar lesions (values close to 0) and the heterogeneity of dissimilar clusters (values far from 0); (b) \(\mu\) provides the coefficients used to weight the different pruning cutoffs \(\varepsilon\), so as to neglect the homogeneity within clusters of lesions with similar phenotypes and bring out the informative heterogeneity between different phenotypes. To make the computation efficient, a parametric shape is used for \(\mu\), and the empirical distribution of heights across all patients (black line) is used to model it. In the population distribution of heights, we discern both homogeneous and heterogeneous phenotypes. The two components are separated by a saddle point at 0.15. Accordingly, low weights of \(\mu\) should be associated with \(\varepsilon \ll 0.15\) and \(\varepsilon \gg 0.15\), and high weights with \(\varepsilon \simeq 0.15\).
In fact, low \(\varepsilon\) values carry only homogeneity information, while high \(\varepsilon\) values would discard useful heterogeneity information. We therefore model \(\mu\) as an asymmetric bell-shaped density function with a single peak centered at the saddle point of the heights distribution. The Beta family of distributions, supported on [0, 1], meets these requirements well; it simplifies both the numerical integration and the interpretation of the results. The Beta-shaped \(\mu\) is centered on 0.15 (grey line) by suitably tuning the shape parameters \(\alpha\) and \(\beta\) (\(\alpha =2.5,\;\beta =15\)).
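
To make the Beta-shaped weighting concrete, the following minimal Python sketch evaluates a Beta density with the shape parameters quoted in the caption (\(\alpha =2.5,\;\beta =15\)) and turns it into weights over a grid of pruning cutoffs \(\varepsilon\). The function names, the grid size, and the midpoint rule are illustrative assumptions; the caption does not specify the integration scheme actually used.

```python
import math

# Shape parameters quoted in the caption.
ALPHA, BETA = 2.5, 15.0


def beta_pdf(x, a=ALPHA, b=BETA):
    """Density of a Beta(a, b) distribution at x; zero outside (0, 1)."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    # Log of the normalising constant 1 / B(a, b), computed via log-gamma.
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return math.exp(
        log_norm + (a - 1.0) * math.log(x) + (b - 1.0) * math.log1p(-x)
    )


def beta_mean(a=ALPHA, b=BETA):
    """Mean of Beta(a, b)."""
    return a / (a + b)


def beta_mode(a=ALPHA, b=BETA):
    """Mode of Beta(a, b); this formula is valid only for a > 1 and b > 1."""
    return (a - 1.0) / (a + b - 2.0)


def cutoff_weights(n=200, a=ALPHA, b=BETA):
    """Hypothetical helper: (epsilon, weight) pairs on a midpoint grid of [0, 1].

    Each weight is the density at the cell midpoint times the cell width,
    so the weights sum to approximately 1.
    """
    h = 1.0 / n
    return [((i + 0.5) * h, beta_pdf((i + 0.5) * h, a, b) * h) for i in range(n)]


if __name__ == "__main__":
    weights = cutoff_weights()
    print("sum of weights:", sum(w for _, w in weights))
    print("mean:", beta_mean(), "mode:", beta_mode())
    for eps in (0.01, 0.15, 0.60):
        print("mu(%.2f) = %.4f" % (eps, beta_pdf(eps)))
```

With these parameters the mean of the density is \(2.5/17.5 \approx 0.143\) and its mode is \(1.5/15.5 \approx 0.097\), so "centered on 0.15" appears to match the mean more closely than the peak. The density is much larger near \(\varepsilon \simeq 0.15\) than at large cutoffs such as \(\varepsilon = 0.6\), which is the weighting behaviour the caption describes.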