Figure 2

Hierarchical representation of the 35 real-world PSN sets that we use. Each oval represents a PSN set. The top line in an oval gives the name of the PSN set. The bottom line contains two numbers in the form “x; y”, where x is the number of categories (labels) in the PSN set and y is the average number of PSNs per category. For example, for the CATH database, PSN set CATH-primary has three categories, which contain 3,170 PSNs on average. All PSN sets at a given level form a PSN set group (see Methods). For example, PSN sets CATH-primary and SCOP-primary form PSN set group 1. A category of a PSN set in group \(i\) may itself be a PSN set in group \(i+1\). For example, each category of PSN set CATH-primary (in group 1), i.e., \(\alpha\), \(\beta\), and \(\alpha/\beta\), is a PSN set in group 2, namely CATH-\(\alpha\), CATH-\(\beta\), and CATH-\(\alpha/\beta\), respectively. Note that because we select a PSN set if and only if it has at least two categories, each with at least 30 PSNs (see Methods), not every category of a PSN set in group \(i\) is necessarily a PSN set in group \(i+1\). For example, PSN set CATH-\(\alpha\) (in group 2) has four categories, but only two of them are PSN sets in group 3, namely CATH-1.10 and CATH-1.20. Also note that, because of the same selection criterion, a PSN set in group \(i+1\) need not be a category of any PSN set in group \(i\). For example, PSN set CATH-3.20.20, which is in group 4, is not a category of any PSN set in group 3. This is because CATH-3.20 has only one category with at least 30 PSNs (i.e., 3.20.20), and hence CATH-3.20 is not considered a PSN set in our analysis.
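The selection criterion described in the caption can be illustrated with a minimal Python sketch. This is not the authors' code: the function names, the toy category names, and the toy counts are all hypothetical. The sketch also assumes that only categories with at least 30 PSNs are counted in the “x; y” label, which the caption does not state explicitly.

```python
from statistics import mean

MIN_CATEGORIES = 2  # from the caption: at least two categories
MIN_PSNS = 30       # from the caption: each with at least 30 PSNs


def qualifying_categories(category_sizes):
    """Return only the categories that contain at least MIN_PSNS PSNs."""
    return {name: n for name, n in category_sizes.items() if n >= MIN_PSNS}


def is_psn_set(category_sizes):
    """Apply the selection criterion to one candidate PSN set."""
    return len(qualifying_categories(category_sizes)) >= MIN_CATEGORIES


def oval_label(category_sizes):
    """Format the "x; y" line of an oval.

    Assumption: x and y are computed over qualifying categories only.
    """
    kept = qualifying_categories(category_sizes)
    return f"{len(kept)}; {round(mean(list(kept.values()))):,}"


# Hypothetical PSN counts per category, invented for illustration only.
one_large_category = {"cat_a": 120, "cat_b": 12, "cat_c": 7}
three_large_categories = {"cat_a": 40, "cat_b": 50, "cat_c": 60}

print(is_psn_set(one_large_category))      # rejected: only one category qualifies
print(is_psn_set(three_large_categories))  # selected: three categories qualify
print(oval_label(three_large_categories))
```

The first toy candidate mirrors the CATH-3.20 situation: it has a single category with at least 30 PSNs, so it is not selected as a PSN set, even though that one category may still qualify as a PSN set at the next level.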