Figure 3: Repeats in the human proteome. | Nature Communications

From: Positive and strongly relaxed purifying selection drive the evolution of repeats in proteins

(a) Distribution of the most frequent interval (MFI), which represents the period (repeat length). (b) Distribution of the number of repeats in a protein. The inset depicts the distribution of recurrence type (FT=fully tandem; PT=partially tandem, that is, at least two repeats recurring in tandem; NT=non-tandem). See examples in Supplementary Fig. 1. (c) Spearman correlation between the physical and evolutionary distances of repeats in a protein. (d) Distribution of dN/dS ratios of all valid pairwise comparisons in a protein, across the proteome, on a log scale (n=319K valid comparisons out of 510K possible pairwise comparisons). The black curve corresponds to the identified repeats (median=0.52) and the red curve corresponds to shuffled repeats (median=0.99). Bin width is 0.01. (e) Distribution of the mean dN/dS ratio of all valid pairwise comparisons in a protein, <dN/dS>, across the proteome, for real repeats (black) and shuffled repeats (red). Bin width is 0.1, shown on a linear scale. The inset shows a scatter plot of <dN/dS> of real repeats versus the respective shuffled repeats, with very short repeats (≤10 AA) superimposed (cyan). (f) The relationship between <dN/dS> and the error on the mean. The inset depicts the relative error.
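The per-protein quantities in panels (e) and (f) can be illustrated with a minimal sketch. It assumes that the "error on the mean" is the standard error of the mean (sample standard deviation divided by the square root of the number of comparisons) and that the relative error is that value divided by <dN/dS>; the caption does not spell out either definition. The function name summarize_dnds and the input values are hypothetical, not taken from the paper.

```python
import math
import statistics


def summarize_dnds(pairwise_dnds):
    """Summarize one protein's pairwise dN/dS ratios.

    Returns (mean, standard error of the mean, relative error).
    Assumption: "error on the mean" is taken to be the standard error.
    """
    n = len(pairwise_dnds)
    if n < 2:
        raise ValueError("need at least two pairwise comparisons")
    mean = statistics.fmean(pairwise_dnds)
    # Sample standard deviation (n - 1 in the denominator) over sqrt(n).
    sem = statistics.stdev(pairwise_dnds) / math.sqrt(n)
    # Relative error is undefined for a zero mean; report infinity.
    rel = sem / mean if mean else math.inf
    return mean, sem, rel


# Hypothetical dN/dS values for the valid repeat pairs of a single protein.
example = [0.4, 0.5, 0.6, 0.5]
mean, sem, rel = summarize_dnds(example)
print(f"<dN/dS> = {mean:.3f}, error = {sem:.3f}, relative error = {rel:.3f}")
```

Applying the same function to the shuffled repeats of each protein would give the paired values needed for a real-versus-shuffled comparison of the kind shown in the inset of panel (e).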
