Fig. 5: Effective population size and genome young transposable element content relate to GC4 content.
From: Interpreting mammalian synonymous site conservation in light of the unwanted transcript hypothesis

a Neighbour-joining tree of 240 mammals based on genetic distances (sequence dissimilarity) calculated from 93,464 shared 4d sites. Significant negative correlations are seen between GC4 content at highly conserved sites and b historical Ne (two-tailed Pearson’s r = −0.48, P = 5.3 × 10⁻¹³) and c mean genetic distance of a species to all other species (two-tailed Pearson’s r = −0.87, P < 2.2 × 10⁻¹⁶). Indicated in (b), the edible dormouse (Glis glis, Order: Rodentia) has the lowest Ne and highest GC4 content of rodents, the grey whale (Eschrichtius robustus, Order: Artiodactyla) has the highest GC4 content of all species, and the Indochinese shrew (Crocidura indochinensis, Order: Eulipotyphla) has the greatest mean distance to all other species, the lowest GC content at conserved 4d sites, and the third highest Ne. Genome young TE (yTE) content (a proxy for unwanted transcript burden) negatively correlates with historical Ne (d two-tailed Pearson’s r = −0.20, P = 0.0034) and positively correlates with GC content at 4d sites (e two-tailed Pearson’s r = 0.16, P = 0.016). The two orders with the strongest correlations between yTE content and GC4 content are Rodentia (f n = 53, two-tailed Pearson’s r = 0.29, P = 0.035) and Chiroptera (g n = 30, two-tailed Pearson’s r = 0.40, P = 0.031). Trend lines in (b–g) were calculated using a linear model, with standard error intervals around the mean shown in grey. Species silhouettes in (f) and (g) were obtained from phylopic.org and are available under a Creative Commons licence. Source data are provided as a Source Data file.
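As a rough illustration of the statistics quoted in this caption, the sketch below computes a Pearson correlation coefficient and the t statistic that a two-tailed test of r is conventionally based on, t = r·sqrt((n − 2)/(1 − r²)) with n − 2 degrees of freedom. It is not the authors' analysis code: the function names `pearson_r` and `t_statistic` are hypothetical, and the input values in the demonstration are made up for illustration, not taken from the Source Data file.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need two sequences of equal length, n >= 3")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations about the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, n):
    """t statistic for testing r against zero, with n - 2 degrees of freedom."""
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and 1")
    return r * math.sqrt((n - 2) / (1.0 - r * r))


if __name__ == "__main__":
    # Made-up per-species values (NOT from the paper): a proxy for Ne and GC4.
    ne_proxy = [1.0, 2.0, 3.0, 4.0, 5.0]
    gc4 = [0.62, 0.60, 0.57, 0.58, 0.52]
    r = pearson_r(ne_proxy, gc4)
    print(f"r = {r:.3f}, t = {t_statistic(r, len(gc4)):.3f}")
```

Applied to the values reported for Rodentia in (f), `t_statistic(0.29, 53)` gives t ≈ 2.16 on 51 degrees of freedom, which is in line with the reported two-tailed P = 0.035; converting t to an exact P value requires a Student's t distribution function, which this sketch leaves out.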