Fig. 8: GWAS of seed shape parameters.

a, Manhattan plots showing the results of the GWAS. Ratios of length to width and horizontal width to vertical width in 221 soybean lines (Fig. 6) were used as seed shape (roundness) parameters. The −log10(P values) were calculated by the same method as described in Fig. 7. The positions of significant −log10(P value) peaks are indicated by square brackets and lines together with candidate gene IDs (if present). b,c, Box plots showing the effect of the candidate variant within Glyma.08G168400.v4.JE1 (b) and Glyma.09G048300.v4.JE1 (c), which are soybean homologs of the Arabidopsis genes EPIDERMAL PATTERNING FACTOR 2 (EPF2) and GSO1/SGN3, respectively. The frequencies of Japanese and other world soybeans are indicated separately by different colors. P values calculated by two-tailed Student’s t-test indicate significant differences between groups. ‘Large seed’ indicates eight Japanese landraces with particularly large seeds that were selected by the following criteria: averaged area_size (H) > 80 mm² and averaged area_size (V) > 60 mm² (see also Supplementary Data 7). d, Photographs of the seeds of induced soybean mutants. Unlike wild-type Enrei (WT), the EnT-6455 mutant carries the A560T amino acid substitution within Glyma.09G048300.v4.JE1. e, Box plot showing the distribution of seed shape in the segregating progenies of the EnT-6455 mutant. The genotype of each segregant is also shown in the plot: their parent is homozygous for the mutation (ALT), heterozygous (Htz) or carries no mutation (REF). The ratio of horizontal length to width was quantified in biologically independent individual seeds with SmartGrain software (ref. 33) (n = 30 for wild-type Enrei, n ≥ 15 for ALT and Htz genotypes, n = 10 for REF genotype). f, Cartoon illustrating possible components of seed-specific TWS1-GSO1/SGN3 signaling in soybeans. Four genes are described as possible candidates that may have a role in the regulation of seed shape in soybeans.
The 3D structure images of soybean GSO1/SGN3 and TWS1 homologs were obtained using AlphaFold 3 (ref. 45). SERK, SOMATIC EMBRYOGENESIS RECEPTOR-LIKE KINASE. For box plots in b, c and e, the center line denotes the median value; the box contains the 25th to 75th percentiles of the dataset; whiskers extend to 1.5 times the interquartile range; and dots represent individual phenotype values.
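
The box plot convention stated above can be made concrete with a minimal sketch. It assumes the common Tukey reading of the whisker rule (whiskers end at the most extreme observation lying within 1.5 times the interquartile range of the box) and linear interpolation for the quartiles; the legend does not specify either choice, and the plotting software used for the figure is not stated. The function name `box_plot_summary` and the example ratios are hypothetical and are not data from the study.

```python
from statistics import quantiles


def box_plot_summary(values):
    """Summarize values the way the legend describes box plot elements.

    Assumptions (not stated in the legend): quartiles use linear
    interpolation, and whiskers follow the Tukey convention.
    """
    if len(values) < 2:
        raise ValueError("at least two values are required")
    data = sorted(values)

    # 25th, 50th and 75th percentiles: box edges and center line.
    q1, med, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1

    # Fences at 1.5 x IQR beyond the box edges.
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr

    # Whiskers stop at the most extreme observation inside the fences.
    whisker_low = min(v for v in data if v >= lower_fence)
    whisker_high = max(v for v in data if v <= upper_fence)

    # Observations beyond the whiskers.
    outliers = [v for v in data if v < lower_fence or v > upper_fence]

    return {
        "median": med,
        "q1": q1,
        "q3": q3,
        "whisker_low": whisker_low,
        "whisker_high": whisker_high,
        "outliers": outliers,
    }


# Hypothetical length-to-width ratios, for illustration only.
example_ratios = [1.05, 1.08, 1.10, 1.12, 1.15, 1.18, 1.20, 1.60]
summary = box_plot_summary(example_ratios)
```

In this sketch the value 1.60 falls outside the upper fence and is reported as an outlier, so the upper whisker ends at 1.20 rather than at the maximum.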