Fig. 3: Selection of optimal clusterings from Markov Multiscale Community Detection using MCA-30 embeddings.

The optimal clusterings have 23, nine and six clusters. The disease similarity graphs obtained with CkNN are shown above for the three optimal clusterings; nodes represent diseases, coloured by cluster assignment, and edges represent strong similarities. In the trace below, the shaded areas indicate partitions across scales, with darker shading marking more robust partitions. The NVI (green line) represents the variation in the assignment of diseases to clusters within each Markov time step, t, and the purple line represents the block NVI across t; minima of these traces indicate robustness within and across scales, respectively (see Methods). NVI = normalised variation of information; MCA-30 = Multiple Correspondence Analysis retaining 30 dimensions.