Extended Data Fig. 1: Pan-genome of 45 potato accessions.
From: Genome evolution and diversity of wild and cultivated potatoes

a, Assembled size of monoploid assembled contigs (MTGs) and alternate assembled contigs (ATGs). b, Contig N50 of raw assembled contigs and improved contig N50 of MTGs. c, Correlation between raw assembly size and heterozygosity. The grey shaded region indicates the 95% confidence interval of a linear model fit (‘lm’). d, Simulation of pan- and core-genome sizes, in terms of number of gene clusters and pan-genome composition. For each number of genomes, 500 combinations were used, with 30 replications. e, Percentage of genes in core, soft-core, shell and accession-specific gene subsets with annotated InterPro protein domains. Orange bars show the proportion of genes with InterPro domains, whereas red bars show the proportion of genes without those domains. f, Expression profiles of genes belonging to core (13,123), soft-core (5,732), shell (5,009) and accession-specific (134) gene families. g, Non-synonymous/synonymous substitution ratios (Ka/Ks) within core, soft-core and shell genes. A Kruskal-Wallis test was used to determine significance. Multiple comparisons were performed using Fisher's least significant difference, with a significance level of 0.001 in the post hoc test. The numbers of gene pairs used for core, soft-core and shell genes are 52,148, 28,363 and 31,654, respectively. In d, f and g, the upper and lower edges of the boxes represent the 75th and 25th percentiles, the central line denotes the median and the whiskers extend to 1.5 × IQR. h, InterPro protein domain enrichments of core and soft-core (upper panel) and shell and accession-specific (lower panel) genes relative to pan genes. i, Pfam protein families enriched in core and soft-core (upper panel) and shell and accession-specific (lower panel) genes, relative to pan genes.
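The resampling behind panel d can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function name `simulate_pan_core`, the input layout (a dictionary mapping each accession to its set of gene-cluster IDs) and the toy data are assumptions made for illustration only. For every number of genomes k, it draws random combinations of k accessions and records the pan-genome size (clusters present in at least one sampled accession) and the core-genome size (clusters present in all of them). The 30 replications mentioned in the legend are not modelled here; repeating the call with different seeds would be one way to approximate them.

```python
import random
from statistics import median


def simulate_pan_core(presence, n_combinations=500, seed=0):
    """Resample accessions and record pan- and core-genome sizes.

    presence: dict mapping accession name -> set of gene-cluster IDs
              (hypothetical input layout, assumed for this sketch).
    Returns a dict mapping k -> (list of pan sizes, list of core sizes).
    """
    rng = random.Random(seed)
    names = sorted(presence)
    results = {}
    for k in range(1, len(names) + 1):
        pan_sizes, core_sizes = [], []
        for _ in range(n_combinations):
            # Draw k distinct accessions at random.
            subset = rng.sample(names, k)
            cluster_sets = [presence[name] for name in subset]
            # Pan: clusters found in at least one sampled accession.
            pan_sizes.append(len(set.union(*cluster_sets)))
            # Core: clusters shared by every sampled accession.
            core_sizes.append(len(set.intersection(*cluster_sets)))
        results[k] = (pan_sizes, core_sizes)
    return results


# Toy data: three made-up accessions with made-up cluster IDs.
toy = {
    "acc_A": {1, 2, 3},
    "acc_B": {2, 3, 4},
    "acc_C": {3, 5},
}

sizes = simulate_pan_core(toy, n_combinations=20, seed=1)
for k, (pan, core) in sizes.items():
    print(k, median(pan), median(core))
```

As more accessions are sampled, the pan-genome size can only grow or stay constant while the core-genome size can only shrink or stay constant, which gives the diverging pair of curves usually drawn for such simulations.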