Fig. 9: Results of simulation studies with randomized species combinations. | Nature Communications


From: Evolutionary sparse learning reveals the shared genetic basis of convergent traits


The Y axis of each panel shows the number of simulated convergent alignments that ranked in the top 100 of all proteins for each method; each point is an average over 100 species combinations and their sets of seed alignments. For each row of panels, 100 species combinations were chosen at random, each consisting of the number of simulated convergent species shown at the left along with a control sibling for each one that met the topological requirements for ESL-PSC and CCS (see Supplementary Methods). For the upper three rows of panels, species combinations were chosen from Laurasiatheria (50 species in the dataset). For the lower two rows, combinations were chosen from all of Theria. For each column of panels, the number of amino acid sites shown at the bottom was simulated using the CSUBST simulate function (ref. 35) for each of 25 randomly chosen alignments for each species combination. Simulations were repeated for each of 20 foreground scaling factors, which are adjusted for each column of panels so that the product of the number of sites and the scaling factor spans the same range in every column; we define this product as the “Evolutionary Impact” of convergence in each panel. For each panel, an additional run was conducted with a scaling factor of 1 and the convergent codon profile turned off as a negative control. For each species combination, number of sites and scaling factor, ESL-PSC and CCS were run on the full proteome dataset with the 25 chosen alignments having the given number of sites replaced by those simulated as described above (see Supplementary Methods). This required a total of 52,500 runs of ESL-PSC and CCS and ~1.3 million simulated alignment partitions. Dashed lines at Y = 5 represent a notional detection threshold of 5 genes that may be required to accept a significant enrichment, as we and others have used to reduce type I error due to many small ontology categories.
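
The run totals quoted in the caption can be checked with a short calculation. The sketch below is illustrative only and is not the authors' code. It assumes five columns of panels, a number the caption does not state; it is simply the value that reproduces the quoted 52,500 runs. It also assumes the negative-control run is performed once per species combination and number of sites. All names are hypothetical, and the site counts and scaling factors in the final check are made-up values.

```python
# Values stated in the caption.
PANEL_ROWS = 5               # three Laurasiatheria rows + two Theria rows
COMBINATIONS_PER_ROW = 100   # random species combinations per row
SCALING_FACTORS = 20         # foreground scaling factors per column
ALIGNMENTS_PER_RUN = 25      # simulated alignments per species combination

# ASSUMPTION: not stated in the caption; chosen because it reproduces 52,500.
PANEL_COLUMNS = 5
# ASSUMPTION: one negative-control run per combination and number of sites.
NEGATIVE_CONTROLS = 1


def total_runs() -> int:
    """Runs of each method across all rows, combinations, columns and settings."""
    settings_per_column = SCALING_FACTORS + NEGATIVE_CONTROLS
    return PANEL_ROWS * COMBINATIONS_PER_ROW * PANEL_COLUMNS * settings_per_column


def total_partitions() -> int:
    """Simulated alignment partitions: 25 simulated alignments per run."""
    return total_runs() * ALIGNMENTS_PER_RUN


def evolutionary_impact(n_sites: int, scaling_factor: float) -> float:
    """Product of the number of simulated sites and the scaling factor."""
    return n_sites * scaling_factor


print(total_runs())        # 52500
print(total_partitions())  # 1312500

# Hypothetical values: halving the scaling factor while doubling the number
# of sites leaves the Evolutionary Impact unchanged.
print(evolutionary_impact(10, 4.0) == evolutionary_impact(20, 2.0))  # True
```

Under these assumptions the product is 5 × 100 × 5 × 21 = 52,500 runs, and 52,500 × 25 = 1,312,500 partitions, consistent with the "~1.3 million" in the caption.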
