Fig. 6: Large-scale GWAS and genomic prediction for 247 sets of phenotypes using SV and SNP markers.
From: A graph-based genome and pan-genome variation of the model plant Setaria

a, Phenotype collection from 13 geographic locations across 11 years. The numbers in parentheses are the numbers of years and traits evaluated at the corresponding locations. The map was created with QGIS software using source data from the National Earth System Science Data Center, National Science & Technology Infrastructure of China. b, Phenotypic variation among different growth conditions. Different letters in the heatmap represent significant differences (P < 0.05) according to Duncan’s multiple comparisons test, which was conducted using two-sided ANOVA. Heatmap color represents the scaled phenotype values. Phenotypes 1 to 41 correspond to those listed in Supplementary Table 13. c, Manhattan plots of SV-GWAS (top) and SNP-GWAS (bottom) for 247 sets of phenotypes. The dashed horizontal lines indicate the Bonferroni-corrected significance threshold (α = 0.05). The triangles indicate association signals detected only by SV-GWAS. d, Frequency of phenotype-associated loci detected by different markers. e, Linkage analysis between SVs from the graph-based genome, using 680 accessions, and their flanking (±50 kb) SNPs. f, Prediction precision for different phenotypes with different subsets of markers. Gray lines represent different phenotypes, and colored points denote prediction precision values for which the corresponding markers outperformed the others. The suffixes cg and gwas denote high-effect marker panels selected on the basis of feature importance by CropGBM and GWAS, respectively (Methods). g, Percentage improvement of yield (n = 46) and grain quality-related traits (n = 17) using base substitution of the top 20 most effective variants. In boxplots, the 25th and 75th percentiles are shown as the lower and upper edges of the boxes, respectively, and the central lines denote the median. The whiskers extend to 1.5× the interquartile range.
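
As a rough illustration of how a Bonferroni-corrected threshold such as the dashed line in panel c is typically obtained, the sketch below divides α by the number of tests and converts the result to the −log10(P) scale used on a Manhattan plot. The marker count is a hypothetical placeholder, not the number used in the study, and `bonferroni_threshold` is a name introduced here for illustration only.

```python
import math


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Return the per-test P value cutoff under Bonferroni correction."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


# Hypothetical marker count, chosen only for illustration (assumption).
n_markers = 100_000

# Per-marker P value cutoff at a family-wise alpha of 0.05.
cutoff = bonferroni_threshold(0.05, n_markers)

# Position of the threshold line on the -log10(P) axis of a Manhattan plot.
line_height = -math.log10(cutoff)
```

With a different marker set (for example, SVs versus SNPs) the number of tests changes, so each GWAS would generally have its own threshold.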