Fig. 3: Data characterization of the 174 molecular descriptors by means of a hierarchical clustering algorithm using their associated rank correlations.

The computation yields 37 dissimilar clusters, ranging in size from single-descriptor clusters (e.g., cluster 37) to a large cluster of 52 descriptors (cluster 1). The dendrogram on the left-hand side depicts the relationships among individual descriptors (single lines) and among clusters (blue groups). It also shows the cut that defines the number of clusters, which was determined by the L-method (see Supplementary Methods). The heat map further highlights the clusters and the relationships among them and among individual descriptors, expressed as a dissimilarity computed from the associated rank correlations. The descriptor type is indicated on the right-hand side by the color code shown in the legend.
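
As an illustration only, the kind of analysis summarized in this figure can be sketched in a few lines of Python. The caption does not specify the rank-correlation coefficient, the dissimilarity formula, or the linkage criterion, so the sketch assumes Spearman's rho, a dissimilarity of 1 - |rho|, and average linkage. The function name `cluster_descriptors` and the synthetic data are hypothetical. The L-method is not implemented here; the number of clusters is passed in by hand.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform
from scipy.stats import spearmanr


def cluster_descriptors(X, n_clusters):
    """Group the columns of X (samples x descriptors) by rank correlation.

    Assumptions (not taken from the caption): Spearman's rho,
    dissimilarity = 1 - |rho|, average linkage. Requires more than
    two columns so that spearmanr returns a full correlation matrix.
    """
    rho, _ = spearmanr(X)                 # pairwise rank correlations
    dissim = 1.0 - np.abs(rho)            # strongly correlated -> close
    dissim = (dissim + dissim.T) / 2.0    # remove floating-point asymmetry
    np.fill_diagonal(dissim, 0.0)
    condensed = squareform(dissim, checks=False)
    Z = linkage(condensed, method="average")   # dendrogram as a linkage matrix
    # Cut the tree into at most n_clusters groups (chosen by hand here).
    labels = fcluster(Z, t=n_clusters, criterion="maxclust")
    return labels, Z, dissim


# Synthetic example: three latent factors, each observed twice with small noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 3))
X = np.repeat(latent, 2, axis=1) + 0.05 * rng.normal(size=(200, 6))

labels, Z, dissim = cluster_descriptors(X, n_clusters=3)
print(labels)
```

The returned linkage matrix `Z` can be passed to `scipy.cluster.hierarchy.dendrogram` to draw the tree, and `dissim`, reordered by the dendrogram leaves, gives a heat map of the kind shown in the figure.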