Figure 9: Family abundance and sequence variability are correlated.

Each data point in the scatter graph represents a miRNA family, and each symbol represents a species. Reads are expressed as the sum of raw, non-normalized total reads across all sequencing libraries for a particular family in a particular species. The inset graph plots the same data on a linear scale, truncated at 1 million reads for clarity, to show that the correlation is nonlinear. Linear regression R² values for the log-transformed data range from 0.67 to 0.93, with a median of 0.88. In a randomized data set, R² values range from 0.01 to 0.31, with a median of 0.18.
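
The comparison summarized in this legend can be illustrated with a minimal Python sketch. It is not the analysis code used for the figure. It assumes that variability is measured as a count of distinct sequence variants per family, that both axes are log10-transformed before regression, and that the randomized control is made by shuffling the variability values among families; the legend does not specify any of these. The function names and the input values are hypothetical and made up for illustration only.

```python
import math
import random


def r_squared(xs, ys):
    """R^2 of a simple least-squares line y = a + b*x (squared Pearson r)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return (sxy * sxy) / (sxx * syy)


def log_log_r_squared(reads, variants):
    """R^2 after log10-transforming both axes (an assumption of this sketch)."""
    log_reads = [math.log10(r) for r in reads]
    log_variants = [math.log10(v) for v in variants]
    return r_squared(log_reads, log_variants)


def shuffled_r_squared(reads, variants, seed=0):
    """R^2 after shuffling variability among families (assumed randomization)."""
    rng = random.Random(seed)
    shuffled = list(variants)
    rng.shuffle(shuffled)
    return log_log_r_squared(reads, shuffled)


# Made-up values for one hypothetical species: one entry per miRNA family.
example_reads = [120, 950, 4300, 18000, 75000, 410000, 2600000]
example_variants = [4, 11, 25, 60, 130, 340, 900]

observed = log_log_r_squared(example_reads, example_variants)
randomized = shuffled_r_squared(example_reads, example_variants)
print(f"observed R^2 = {observed:.2f}, randomized R^2 = {randomized:.2f}")
```

Running the same two functions once per species would give one observed and one randomized R² value per species, from which a range and median like those reported in the legend could be taken.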