Extended Data Fig. 1: Pan-genomic analysis of orthogroup conservation and diversity of gene duplications. | Nature

From: Solanum pan-genetics reveals paralogues as contingencies in crop engineering

(a) Orthogroup expansions and contractions across the pan-genome. The orthogroup-based phylogeny is adapted from Fig. 1c. The estimated expansion (blue) and contraction (orange) rates of orthogroups are shown at each node. (b) Cumulative curves showing detection of the four orthogroup conservation groups as a function of the number of species available in the pan-genome. (c) Schematic of the potential mechanisms underlying the different gene duplication categories, with non-duplicated single-copy genes shown for context (left). Stacked bar chart showing the number of genes derived from each type of duplication, sorted by orthogroup conservation group (right). WGD: whole-genome duplication; TD: tandem duplication; PD: proximal duplication; TRD: transposed duplication; DSD: dispersed duplication; SC: single copy. (d) Functional enrichment of gene duplication types detected across the pan-genome. The top five enriched GO terms per duplication type are shown. Gene ratio represents the number of genes with a specific GO term divided by the total number of genes with GO terms in that category. (e) Divergence of protein and cis-regulatory sequences with increasing evolutionary pressure, as measured by Ka/Ks values, for the indicated types of gene duplication. BLASTP (protein sequence conservation) and LastZ (cis-regulatory sequence conservation from the Conservatory algorithm) normalized alignment scores were used to plot the predicted mean and 95% confidence interval (see Supplementary Table 5 for statistical analysis).
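
The gene ratio defined for panel (d) can be illustrated with a minimal Python sketch. This is not the authors' analysis code. The function name `gene_ratio`, the gene names, and the toy annotations are hypothetical and serve only to make the definition concrete: genes carrying a given GO term, divided by all genes in that duplication category that have at least one GO term.

```python
def gene_ratio(gene_to_terms, term):
    """Fraction of GO-annotated genes in one duplication category that carry `term`.

    gene_to_terms maps a gene name to the set of GO terms assigned to it.
    Genes with no GO terms are excluded from the denominator.
    """
    annotated = [gene for gene, terms in gene_to_terms.items() if terms]
    if not annotated:
        return 0.0  # no annotated genes, so the ratio is undefined; report 0
    with_term = sum(1 for gene in annotated if term in gene_to_terms[gene])
    return with_term / len(annotated)


# Hypothetical toy annotations for a single duplication category (for example TD).
td_genes = {
    "geneA": {"GO:0006952", "GO:0009607"},
    "geneB": {"GO:0006952"},
    "geneC": {"GO:0008152"},
    "geneD": set(),  # unannotated gene, not counted in the denominator
}

ratio = gene_ratio(td_genes, "GO:0006952")
print(f"gene ratio: {ratio:.3f}")
```

In this toy case two of the three annotated genes carry the term, so the ratio is 2/3. The unannotated gene does not affect the result.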
