Fig. 2

Gene-based pan-genome. Annotated genes (genomic sequence) from all genomes were clustered, and a single representative from each cluster was selected to create a gene-based pan-genome. a Number of pan-gene clusters represented in a given number of inbred line annotations. b Number of core, softcore, shell, and cloud pan-genes for individual inbred lines. c Number of core, softcore, and shell pan-genes in the high-confidence pan-genome. Shell pan-genes are divided into reference (R) and non-reference (N). d Percent coverage of 37,886 high-confidence pan-genes by short-read data sets from 49 lines. Color-coded bars on the lower axis indicate the population groups described in Fig. 4; pan-genome categories are labeled on the vertical axis. Note that short-read coverage supports the classification of the pan-genome compartments and that the clustering of lines by short-read coverage matches the population groups identified in Fig. 4.