Fig. 5: Pan-genome analysis of 69 A. thaliana accessions. | Nature Genetics

From: A pan-genome of 69 Arabidopsis thaliana accessions reveals a conserved genome structure throughout the global species range

a, The annotated protein-coding genes and the composition of core (red), softcore (blue), dispensable (green) and private (purple) genes in each individual accession. b, The number and proportion of core (red), softcore (blue), dispensable (green) and private (purple) gene families in the pan-genome. The split–merge cases are indicated by light colors. c–f, The protein length (c), number of annotated Pfam domains (d), gene expression landscape (e) and Ka/Ks (f) of core (red), softcore (blue), dispensable (green) and private (purple) genes of Col-0. The expression level of each gene is defined as its median value across the 79 organs and developmental stages. The expected dataset was generated from 1,000 simulations with the same data size as the testing dataset. A two-sided Mann–Whitney test was performed, and the P value is indicated for each comparison. Intervals for boxplots: center, median (50th percentile); lower bound of box, 25th percentile (Q1); upper bound of box, 75th percentile (Q3); lower whisker, maximum of (minimum, Q1 − 1.5 × IQR); upper whisker, minimum of (maximum, Q3 + 1.5 × IQR). IQR, interquartile range (range from Q1 to Q3). g, The increase in pan-genome size and the decrease in core-genome size in the whole population (gray) and in the Europe (red), Asia (blue) and Africa (orange) groups. Accessions were sampled as 2,000 random combinations for each given number of accessions, ranging from 2 to 67. The mean number of gene families is shown with the standard deviation. Source data are provided as a source data file.
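The boxplot intervals and the sampling scheme in g can be illustrated with a minimal Python sketch. This is not the authors' code: the function names `whisker_bounds` and `pan_core_curve`, the input layout (a dict mapping each accession to its set of gene family IDs), the inclusive quartile convention and the use of the population standard deviation are all assumptions made for illustration. Here the core genome of a sample is taken as the families shared by every sampled accession, and the pan-genome as the families found in at least one.

```python
import random
import statistics


def whisker_bounds(values):
    """Boxplot intervals as defined in the caption.

    Returns (lower_whisker, q1, median, q3, upper_whisker).
    The quartile interpolation method is an assumption; the caption
    does not say which convention was used.
    """
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower = max(min(values), q1 - 1.5 * iqr)  # maximum of (minimum, Q1 - 1.5 x IQR)
    upper = min(max(values), q3 + 1.5 * iqr)  # minimum of (maximum, Q3 + 1.5 x IQR)
    return lower, q1, median, q3, upper


def pan_core_curve(families_by_accession, sizes, n_samples=2000, seed=0):
    """Mean and s.d. of pan- and core-genome size per number of accessions.

    families_by_accession: hypothetical input, dict of accession name to a
    set of gene family IDs. For each size k, n_samples random combinations
    of k accessions are drawn without replacement within a combination.
    """
    rng = random.Random(seed)
    names = sorted(families_by_accession)
    curve = {}
    for k in sizes:
        pan_counts, core_counts = [], []
        for _ in range(n_samples):
            sets = [families_by_accession[name] for name in rng.sample(names, k)]
            pan_counts.append(len(set.union(*sets)))          # in any sampled accession
            core_counts.append(len(set.intersection(*sets)))  # in all sampled accessions
        curve[k] = {
            "pan_mean": statistics.mean(pan_counts),
            "pan_sd": statistics.pstdev(pan_counts),  # population s.d. (assumption)
            "core_mean": statistics.mean(core_counts),
            "core_sd": statistics.pstdev(core_counts),
        }
    return curve


# Toy data for illustration only; not taken from the study.
toy = {"A": {1, 2, 3}, "B": {2, 3, 4}, "C": {3, 5}}
toy_curve = pan_core_curve(toy, sizes=[2, 3], n_samples=200, seed=1)
```

With real data, `sizes` would be `range(2, 68)` to match the 2 to 67 accessions stated in the caption, and the function would be run once on the whole population and once per regional group.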
