Figure 1 | Scientific Reports

Figure 1

From: Surprise maximization reveals the community structure of complex networks

Global performance of the algorithms.

(a) Behavior of the algorithms in the LFR benchmark. To obtain this figure, the algorithms were first ranked according to the VI results obtained for each μ condition. We then plotted the results for the algorithm with the best VI value (black line, indicated with “1”), the average of the top five algorithms (red line), the average of the top ten (blue line) and the average of all 18 algorithms (green line). The grey region corresponds to the values of μ (0.1–0.7) chosen to perform the main comparative analyses (see text). Beyond that region, even the best algorithms obtain VI values considerably higher than zero, meaning that the original structure of the network has been significantly modified by the increase in intercommunity links. (b) An example showing the five largest communities in an LFR network (5000 units) when μ = 0.1. Nodes are laid out in two dimensions with a spring-embedded algorithm (ref. 46) and drawn using Cytoscape (ref. 47). Communities are well-isolated groups. (c) The five largest communities when μ = 0.7. They are barely distinguishable in this representation because the mixing of links was quite extreme. However, several algorithms were still often able to detect these fuzzy communities. (d)–(f) The same results for the RC benchmark (512 units). Panel e depicts the five largest communities when R = 10% and Panel f the same communities when R = 50%. Again, notice in Panel d the sharp increase in VI values when R > 50%. An extreme degree of superimposition among communities is already observed when R = 50% (Panel f). In the LFR benchmark, the rapid increase in VI values when the fraction of intercommunity links goes from μ = 0.7 to 0.8 (Panel a) is explained by all communities being of similar sizes, so they are destroyed at about the same time. In contrast, the more progressive increase in VI as R grows, observed in Panel d, is due to the heterogeneous sizes of the communities present in that benchmark, which break down at different times.
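
The ranking-and-averaging procedure behind the curves in Panels a and d can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the function name `summarize_vi`, the algorithm labels and the VI values are all hypothetical, and it assumes only that a lower VI is better, with VI = 0 meaning perfect recovery of the original structure.

```python
from statistics import mean


def summarize_vi(vi_by_algorithm, top_ks=(1, 5, 10)):
    """Summarize VI values for a single benchmark condition (one μ or R).

    vi_by_algorithm maps an algorithm label to its VI value.
    Values are sorted in ascending order (lower VI = better), then the
    best k values are averaged for each k in top_ks, and finally all
    values are averaged.
    """
    ranked = sorted(vi_by_algorithm.values())
    summary = {f"top{k}": mean(ranked[:k]) for k in top_ks if k <= len(ranked)}
    summary["all"] = mean(ranked)
    return summary


# Made-up VI values for four hypothetical algorithms at one μ condition.
example = {"alg_a": 0.0, "alg_b": 0.5, "alg_c": 1.0, "alg_d": 2.5}
result = summarize_vi(example, top_ks=(1, 2))
print(result)
```

Repeating this for every μ (or R) value and plotting each summary key against the mixing parameter would give one curve per group, analogous to the black, red, blue and green lines described above.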
