Extended Data Fig. 2: Paralog pairs expression analysis. | Nature


From: Solanum pan-genetics reveals paralogues as contingencies in crop engineering


(a) Schematic of dosage-constrained and dosage-unconstrained orthogroups reflecting different degrees of selection on the total dosage of paralog pairs across species. Orthogroup 1 has paralog pairs with identical total dosage across species, whereas orthogroup 2 has a different total dosage in each species. For each tissue, orthogroup and species, the total dosage of the two paralogs is compared with that of the two homologues in each of the remaining species, and deviations from the expected ratio of total dosages are classified as “unconstrained”. This is repeated for all species that share the orthogroup and express it in the tissue of interest, and the majority classification across species is taken as the classification for the entire orthogroup. Therefore, orthogroup 1 is classified as “dosage-constrained” while orthogroup 2 is classified as “dosage-unconstrained”. (b) The fraction of uniquely mapped reads for each tissue sample and species (left), and the average gene expression correlation with other samples from the same tissue and species (right). Red arrows in both cases point to the five outlier samples excluded from further analysis. For all boxplots, the bounds of the box represent the first and third quartiles, the thick line represents the median and the whiskers represent 1.5× the interquartile range. (c) Sankey plot showing the concordance between classifications of paralog pairs based on two independent approaches (total dosage conservation, and conservation of expression levels and profiles). The thickness of the line connecting each pair of groups shows the odds ratio of enrichment. (d) Line plots showing examples of paralog pairs in each of the four groups of paralog expression patterns. (e) Proportion of expressed paralog pairs classified into one of four expression groups at different coexpression and fold-change thresholds in 15 species. Individual bars are coloured by expression group. (f) Relationship of protein and cis-regulatory sequence conservation on the different paralog expression groups over increasing evolutionary pressure. For each expression group, the predicted mean, 95% confidence interval and residuals of the normalized LastZ score are shown (see Supplementary Table 5 for statistical analysis).
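
The majority-vote classification described in panel (a) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function name `classify_orthogroup`, the expected ratio of 1, the two-fold tolerance, the rule for combining pairwise comparisons within a species and the tie-breaking rule are all assumptions made for illustration, and the dosage values in the usage note are hypothetical.

```python
from collections import Counter
from math import log2


def classify_orthogroup(total_dosage, expected_ratio=1.0, tolerance=1.0):
    """Classify one orthogroup in one tissue by majority vote across species.

    total_dosage: dict mapping species name -> summed expression of the
    paralog pair in that species (hypothetical input format).
    expected_ratio, tolerance: assumed values; tolerance is in log2 units,
    so 1.0 corresponds to a two-fold deviation.
    """
    # Keep only species in which the pair is expressed (assumed: dosage > 0).
    species = [s for s, dosage in total_dosage.items() if dosage > 0]
    if len(species) < 2:
        return None  # nothing to compare against

    calls = {}
    for s in species:
        others = [o for o in species if o != s]
        # Count the other species whose total dosage ratio deviates
        # from the expected ratio by more than the tolerance.
        deviating = sum(
            abs(log2(total_dosage[s] / total_dosage[o]) - log2(expected_ratio))
            > tolerance
            for o in others
        )
        # Assumption: a species is "unconstrained" if it deviates from
        # more than half of the remaining species.
        calls[s] = "unconstrained" if deviating > len(others) / 2 else "constrained"

    # Majority classification across species; ties fall back to
    # "dosage-constrained" (an assumption of this sketch).
    counts = Counter(calls.values())
    if counts["unconstrained"] > counts["constrained"]:
        return "dosage-unconstrained"
    return "dosage-constrained"
```

With hypothetical total dosages, `classify_orthogroup({"sp_a": 10, "sp_b": 10, "sp_c": 10})` returns `"dosage-constrained"`, mirroring orthogroup 1 in the schematic, while `classify_orthogroup({"sp_a": 2, "sp_b": 10, "sp_c": 50})` returns `"dosage-unconstrained"`, mirroring orthogroup 2.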
