Abstract
Flour color influences the quality of end-use products of common wheat (Triticum aestivum L.). To analyze the genetic basis of flour color, the flour brightness (FL*), red-green level (Fa*), yellow-blue level (Fb*), and whiteness (W) of 341 winter wheat materials grown in Emin and Qitai during the 2019–2020 and 2020–2021 growing seasons were measured. A genome-wide association study (GWAS) was conducted using a wheat 40 K breeding chip with the mixed linear model (MLM). The coefficient of variation and broad-sense heritability of the flour color traits ranged from 0.62% to 22.23% and from 55.52% to 83.48%, respectively. There were highly significant correlations among the flour color traits. GWAS identified 20 significant and stable SNP markers distributed across 16 loci, including 1 for FL*, located on chromosome 5D; 6 for Fa*, located on chromosomes 1A, 5A, 1B (2), 6B, and 4D; 6 for Fb*, located on chromosomes 2A (2), 4A, 4B, 6B, and 5D; and 3 for W, located on chromosomes 2A, 4A, and 5D. Two KASP markers were developed for Fa*, which showed clear genotype clustering and significant phenotypic differences among materials with different genotypes. Seven candidate genes that may affect flour color during grain development were screened: TraesCS5D02G01340.1, TraesCS5D02G013100, and TraesCS5D02G014300.1 on chromosome 5D may simultaneously influence W, FL*, and Fa*; TraesCS1B02G269100.1 and TraesCS1B02G269500.1 on chromosome 1B may affect Fa*; and TraesCS4A02G307200 and TraesCS6B02G034100.1 on chromosomes 4A and 6B, respectively, may affect Fb*. The results provide useful information for improving the color quality of wheat flour.
Introduction
Flour color is an important characteristic for evaluating flour quality in many end products1; it reflects flour quality and milling accuracy and serves as a significant indicator for flour grading. Studying flour color is therefore necessary to improve the quality of wheat products and meet market demand2. The color of flour and its products primarily depends on the accumulation of pigments, such as yellow pigments and carotenoids, in the flour. Previous studies have shown that phytoene synthase (Psy)3, polyphenol oxidase (Ppo)4, lipoxygenase (Lox)5,6, and peroxidases in wheat grains affect the color, processing quality, milling quality, storage characteristics, and other quality traits of flour or flour products through the biosynthesis or oxidative degradation of pigments. The yellowness of flour is mainly influenced by the quantity of carotenoids, lutein, and flavonoids7. The yellow pigment content of wheat flour is affected by Lox activity: Lox catalyzes the oxidation of carotenoids, which whitens the flour8. Psy1 catalyzes the first step of carotenoid biosynthesis and serves as a crucial regulatory point; it is strongly correlated with carotenoid accumulation (r = 0.8)9 and directly affects the color of grain endosperm and flour. In the presence of molecular oxygen, Ppo catalyzes the oxidation of phenols to form quinones10, which can polymerize to create high molecular weight black or brown pigments, thereby affecting the color of wheat products, especially yellow alkaline and white salted noodles11. Peroxidase (POD) is an oxidoreductase that can oxidize primary phenolic acids, such as ferulic acid, leading to the formation of chromogenic groups and brown substances12. High POD activity may cause the flour to darken, subsequently leading to suboptimal noodle color13. The browning index of noodle products is significantly correlated with POD activity (r = 0.84–0.97)14.
The color of flour has a significant impact on the quality of the final wheat product. Miskelly7 found a significant correlation between the color of flour and the yellowing of noodles in China and Japan. Flour with a high yellow pigment content is the preferred choice for producing alkaline noodles in China and Japan. In many Asian countries, noodles are made with specially selected flour to enhance the color of the final product15. Therefore, yellow noodles from Japan and China require high b* value flour16. However, other end products, such as bread, steamed bread, dry white noodles, and dumplings, require white flour with a very low b* value. Similarly, in the United States and other parts of the world, white flour with extremely low b* values (0) and high L* values (100) is ideal for bread baking17.
Flour color is a quantitative trait controlled by multiple genes with high heritability18. Environmental conditions and management practices may also affect flour color. Grain protein content, hardness, seed coat color, grain size, and shape may all contribute to variation in flour color19. Previous studies have reported QTLs affecting flour color near the hardness loci (Pina and Pinb) on chromosome 5DS2,20,21,22, and it has been speculated that these QTLs may control flour color through grain hardness. At the same milling extraction rate, flour from low-protein wheat is whiter than flour from high-protein wheat. The a* value of flour from wheat with a red seed coat is higher than that of flour from wheat with a white seed coat19.
Improving flour color is an important breeding goal for wheat. Under conventional breeding methods, selection for this trait is relatively inefficient. Genetic improvement is the most effective method for enhancing the color quality of wheat. A genome-wide association study (GWAS) is an effective tool for dissecting the genetic loci underlying quantitative traits and for supporting marker-assisted selection (MAS) in wheat. This study identified loci significantly associated with flour color through GWAS, providing a reference for future MAS breeding for flour color and the cloning of color-related genes.
Materials and methods
Plant materials and field trials
The association panel consists of 341 winter wheat varieties and advanced breeding lines from various wheat regions in China excluding Libellula and Strampellula (Table S1). The validation of KASP markers was conducted using 200 winter wheat materials, referred to as the validation panel in this study. This panel included various wheat varieties and advanced breeding lines from both domestic and international sources (Table S2). Among them, 193 materials were different from those in the association panel, while the other 7 materials were part of the association panel. The genotypes of the 7 materials at the KASP marker development site were known, which was used to confirm the accuracy of KASP marker genotyping. The validation panel was used to verify the success of developing KASP markers.
The association panel was evaluated at two sites, the Institutes of Agriculture Sciences in Emin and Qitai, Xinjiang, China, during the 2019–2020 and 2020–2021 growing seasons (hereafter referred to as 2020EM, 2020QT, 2021EM, and 2021QT, respectively). The validation panel was planted in Shihezi, Xinjiang, China, during the 2020–2021 and 2021–2022 growing seasons. The experiment was conducted using an alpha-lattice design with two replications. Each replication consisted of 18 incomplete blocks, with each block comprising 19 genotypes. Each genotype was grown in a plot 1.8 m long with 8 rows spaced 0.25 m apart, and 40 seeds were sown in each row. Recommended management practices were applied to the trials at their respective locations. Plots were hand-harvested at maturity, and the grain was stored at 4 °C. Using the MLU202 Mill (Wuxi, China), the grain was ground and passed through a 0.1-micron sieve. The flour samples were stored in airtight containers.
Phenotype test
The whiteness of the flour was measured using an intelligent whiteness tester (WSB-V, Zhejiang, China). The sample was placed into the sample box, compacted with a pressure pad, and covered with the lid; the sample box was then inverted, the bottom cover was unscrewed, the glass plate was removed, and the sample box was positioned on the measurement base for whiteness assessment.
Color parameters were measured using a CR-410 colorimeter (Konica Minolta, Japan) and are expressed in the L*, a*, b* color space. L* represents brightness on a scale from 0 (black) to 100 (white). The a* value represents the red-green axis and the b* value represents the yellow-blue axis: +a* signifies a reddish hue, −a* a greenish hue, +b* a yellowish hue, and −b* a bluish hue.
Genome-Wide association study
SNP genotyping was conducted using the Wheat 40 K breeding array by the MolBreeding Company in Tianjin, China (http://www.molbreeding.com/). Markers with a minor allele frequency (MAF) below 5% or with more than 10% missing data were excluded from further analysis. Population structure was assessed using the Bayesian clustering software Structure v2.3.423. Principal component analysis (PCA) and linkage disequilibrium (LD) analysis were performed using TASSEL 5.024. The association panel was divided into two subgroups, with an LD decay distance of 4 Mb, as detailed in our previous article25.
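The marker filter described above can be sketched as follows. This is a minimal illustration, not the pipeline used in the study: the genotype encoding (0, 1, or 2 copies of the alternate allele, with None for a missing call) and the function names are assumptions made for the example.

```python
from typing import Optional, Sequence

def maf_and_missing(calls: Sequence[Optional[int]]) -> tuple[float, float]:
    """Return (minor allele frequency, missing rate) for one marker.

    calls holds alternate-allele counts (0, 1, 2) per line; None marks a
    missing call. This encoding is assumed for illustration only.
    """
    observed = [c for c in calls if c is not None]
    missing_rate = 1 - len(observed) / len(calls)
    if not observed:
        return 0.0, missing_rate
    alt_freq = sum(observed) / (2 * len(observed))
    return min(alt_freq, 1 - alt_freq), missing_rate

def keep_marker(calls, min_maf=0.05, max_missing=0.10) -> bool:
    """Keep a marker only if MAF >= 5% and missing data <= 10%."""
    maf, missing = maf_and_missing(calls)
    return maf >= min_maf and missing <= max_missing
```

A marker typed as `[0] * 9 + [2]` has an alternate-allele frequency of 0.1 and no missing calls, so it passes both thresholds.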
We used the MLM model in TASSEL 5.024 for GWAS, with the population structure (Q matrix) and kinship matrix (K matrix) as covariates to reduce false positives. Following Sheoran et al.26, a marker was considered significantly associated with a trait when the significance test reached P < 0.0001 (-log10(P) ≥ 4). If multiple SNPs associated with a trait fell within one LD interval, they were treated as one locus25. Based on previous research reports, loci associated with multiple phenotypic traits were termed pleiotropic loci, while loci consistently identified in at least two environments were considered stable loci27.
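The significance threshold and the merging of nearby SNPs into loci can be sketched as below. The chaining rule (a SNP joins a locus when it lies within the 4 Mb LD decay distance of the previous SNP in that locus) and all function names are assumptions; the text does not specify how the LD interval was applied.

```python
from collections import defaultdict

P_THRESHOLD = 1e-4          # significance cutoff stated in the text
LD_DISTANCE_BP = 4_000_000  # LD decay distance reported for the panel

def significant(hits, threshold=P_THRESHOLD):
    """Keep (chrom, pos, p) records whose p-value is below the threshold."""
    return [h for h in hits if h[2] < threshold]

def group_into_loci(snps, max_gap=LD_DISTANCE_BP):
    """Merge (chrom, pos) SNPs on the same chromosome into loci.

    Assumed rule: a SNP joins the current locus when it lies within
    max_gap of the previous SNP already placed in that locus.
    """
    by_chrom = defaultdict(list)
    for chrom, pos in snps:
        by_chrom[chrom].append(pos)
    loci = []
    for chrom in sorted(by_chrom):
        positions = sorted(by_chrom[chrom])
        current = [positions[0]]
        for pos in positions[1:]:
            if pos - current[-1] <= max_gap:
                current.append(pos)
            else:
                loci.append((chrom, current))
                current = [pos]
        loci.append((chrom, current))
    return loci
```

Under this rule, two significant SNPs 0.5 Mb apart on the same chromosome count as one locus, while a third SNP more than 4 Mb away starts a new one.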
Development and verification of KASP marker
SNPs that were significantly and stably identified across environments were selected for conversion into KASP markers. We utilized the online software Polymarker (http://www.polymarker.info/) to design two allele-specific forward primers and one common reverse primer. Standard FAM tags (5’ GAAGGTGAGTCATGCT 3’) and HEX tags (5’ GAAGGTCGAGTCAACGGATT 3’) were attached to the 5’ ends of the two allele-specific primers, respectively.
According to the descending thermal cycle protocol described by the manufacturer (LGC Genomics, Beverly, MA, USA), SNP genotyping was performed on an ABI7500 instrument using a 96-well plate. The genotyping outcomes were assessed using ABI7500 software, supplemented by manual assessment based on fluorescence values28. A t-test was conducted to detect significant differences in phenotypic traits between alleles. The primer pair sequences for PCR amplification are listed in Table S3.
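A comparison of a phenotype between two genotype classes can be sketched with a two-sample t statistic. The text does not say which t-test variant was used; Welch's unequal-variance form is assumed here, and the function name is hypothetical.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(a, b):
    """Return Welch's t statistic and its approximate degrees of freedom.

    a and b are phenotype values (e.g. Fa*) for two genotype classes.
    The Welch variant is an assumption, not the study's stated method.
    """
    va = variance(a) / len(a)  # squared standard error of group a
    vb = variance(b) / len(b)  # squared standard error of group b
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df
```

The statistic would then be compared against a t distribution with `df` degrees of freedom to obtain the P value.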
Identification of candidate genes
In order to identify potential candidate genes related to flour color traits, the IWGSC online database (http://www.wheatgenome.org/) was used to search for all genes within the stable SNP marker LD region, defined as 2 Mb upstream and 2 Mb downstream of the SNP flanking region, based on the Chinese spring reference genome (IWGSC RefSeq v1.0). Protein function prediction of candidate genes was performed using the UniProt protein database (https://www.uniprot.org/) and the Ensembl plants database (http://plants.ensembl.org/Triticum_aestivum/Gene). The transcriptome data from different stages of seed maturation in the publicly available Expression Atlas database (https://www.ebi.ac.uk/gxa/Experiments/E-MTAB-4484/Results) were used to study the expression characteristics of these genes29. Based on functional annotation, we selected genes that were highly expressed at different stages of grain maturation, as well as genes that overlap with stable SNPs for further analysis. We utilized qRT-PCR to detect the expression levels of candidate genes in 5-day, 10-day, 15-day, 20-day, 25-day, and 30-day seeds after anthesis of extreme flour color materials. The extreme color materials were “Hongzhitou” and “Yupilaina”, and their color traits are detailed in Table S4.
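The window search for candidate genes can be sketched as a simple interval-overlap test. The tuple layout for gene records and the function name are assumptions for illustration; real coordinates would come from the IWGSC RefSeq v1.0 annotation.

```python
WINDOW_BP = 2_000_000  # 2 Mb on each side of the SNP, as stated in the text

def genes_in_window(snp_chrom, snp_pos, genes, window=WINDOW_BP):
    """Return IDs of genes overlapping [snp_pos - window, snp_pos + window].

    genes is an iterable of (gene_id, chrom, start, end) tuples with
    1-based inclusive coordinates (an assumed layout).
    """
    lo, hi = max(1, snp_pos - window), snp_pos + window
    return [gid for gid, chrom, start, end in genes
            if chrom == snp_chrom and start <= hi and end >= lo]
```

A gene is counted if any part of it overlaps the window, so genes that straddle the window boundary are included.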
Total RNA was isolated from seeds at various days after anthesis using the TRIzol reagent (Invitrogen, USA), following the manufacturer’s instructions. The concentration of total RNA was measured spectrophotometrically using a NanoDrop instrument (Thermo Scientific NanoDrop 2000 C Technologies, Wilmington, USA), and the purity was assessed using the A260/A280 and A260/A230 ratios provided by NanoDrop. Reverse transcription was carried out using a PrimeScriptTM first-strand complementary DNA (cDNA) Synthesis Kit (TaKaRa, Japan).
qRT-PCR was performed using an iCycler iQTM Multicolor PCR Detection System (Bio-Rad, Hercules, CA, USA). qPCR was performed with cDNA in triplicate on 96-well plates using SYBR® Premix Ex TaqTM II (TaKaRa). Each reaction (20 µL) consisted of 10 µL of SYBR® Premix Ex TaqTM II, 1 µL of diluted cDNA, 0.4 µL each of the forward and reverse primers, and 8.2 µL of H2O. qPCR cycling conditions were as follows: 95 °C for 2 min, followed by 40 cycles of 95 °C for 5 s and 57 °C for 32 s. Fluorescence data were collected during the 57 °C step. Wheat Actin (GenBank ID: LOC123114174) was used as the reference gene. The gene IDs and primer sequences are listed in Table S5.
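The text does not state how relative expression was calculated from the Ct values. One common choice is the 2^-ΔΔCt method, sketched below purely as an assumption; the function name, argument names, and the choice of calibrator sample are hypothetical.

```python
def relative_expression(ct_target, ct_ref, ct_target_cal, ct_ref_cal):
    """Fold change by the 2^-ddCt method (assumed, not stated in the text).

    ct_target / ct_ref: Ct of the candidate gene and the reference gene
    (Actin) in the sample of interest.
    ct_target_cal / ct_ref_cal: the same two Ct values in the calibrator.
    """
    delta_sample = ct_target - ct_ref
    delta_calibrator = ct_target_cal - ct_ref_cal
    return 2 ** -(delta_sample - delta_calibrator)
```

For example, a sample whose ΔCt is two cycles lower than the calibrator's gives a fourfold higher relative expression.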
Statistical analysis
A multi-environment trial analysis was conducted using R software to perform the analysis of variance (ANOVA).
The best linear unbiased predictor (BLUP) value30 was calculated using the R package lme431.
Broad-sense heritability ($h^2$) was estimated from variance components using the formula $h^2 = \sigma_G^2 / (\sigma_G^2 + \sigma_{GE}^2/E + \sigma_e^2/(rE))$, where $\sigma_G^2$ represents the genetic variance, $\sigma_{GE}^2$ represents the genotype × environment interaction variance, $\sigma_e^2$ represents the residual variance, $E$ represents the number of environments, and $r$ represents the number of replicates per line32.
Pearson’s correlation between phenotypic traits was computed using SPSS 22 (http://www.brothersoft.com/ibm-spss-statistics-469577.html).
Other ANOVA and plots were conducted in SPSS 22 and Origin 8.0, respectively.
Manhattan and Q-Q plots were created using the “qqman” package in R software33.
The gene expression heatmap was drawn using TBtools34.
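The broad-sense heritability formula given in this section can be evaluated directly once the variance components are available. The sketch below assumes the components have already been estimated (for example from a mixed model); the function name and the numbers in the usage note are hypothetical.

```python
def broad_sense_heritability(var_g, var_ge, var_e, n_env, n_rep):
    """h2 = var_g / (var_g + var_ge / E + var_e / (r * E)).

    var_g: genetic variance; var_ge: genotype x environment variance;
    var_e: residual variance; n_env: number of environments (E);
    n_rep: number of replicates per line (r).
    """
    return var_g / (var_g + var_ge / n_env + var_e / (n_rep * n_env))
```

With four environments and two replicates, as in this trial, hypothetical components of 4, 2, and 8 for the genetic, interaction, and residual variances give 4 / (4 + 0.5 + 1), or about 0.73.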
Results
Phenotyping
Analysis of variance (ANOVA) of the association panel showed highly significant differences in the flour color traits among genotypes, environments, and years, except that whiteness did not differ significantly between environments. In addition, Fa* and Fb* showed highly significant genotype × environment interactions, and whiteness showed highly significant genotype × environment and genotype × year interactions (Table 1). Across environments, FL*, Fa*, Fb*, and whiteness ranged from 88.17 to 93.36, −1.76 to −0.34, 5.74 to 11.95, and 68.60 to 82.20, respectively. The corresponding coefficients of variation were 0.62%–0.68%, 19.09%–22.23%, 8.36%–11.88%, and 2.25%–2.44%, and the h² values were 55.52%, 55.52%, 82.71%, and 83.48%, respectively (Table 2). The frequency distribution of flour color traits based on BLUP values was approximately normal (Fig. 1).
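The coefficient of variation reported for each trait is the standard deviation expressed as a percentage of the mean. A minimal sketch follows; dividing by the absolute value of the mean is an assumption made here so that traits with a negative mean, such as Fa*, yield a positive percentage.

```python
from statistics import mean, stdev

def coefficient_of_variation(values):
    """CV in percent: sample standard deviation over the absolute mean.

    Using abs(mean) is an assumption so that negative-mean traits
    (e.g. Fa*) still produce a positive CV.
    """
    return 100 * stdev(values) / abs(mean(values))
```

Because the CV is scaled by the mean, a trait with a mean near zero such as Fa* can show a much larger CV than FL* even when its absolute spread is small.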
Correlation analysis was conducted on the color traits of flour based on BLUP values (Table 3). Highly significant correlations were observed between FL* and Fa*, Fb*, and whiteness, with correlation coefficients of −0.251, −0.316, and 0.738, respectively. Additionally, highly significant correlations were found between Fa* and Fb*, as well as whiteness, with correlation coefficients of −0.645 and 0.225, respectively. Furthermore, a highly significant negative correlation was identified between Fb* and whiteness, with a correlation coefficient of −0.764.
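The correlation coefficients above are Pearson coefficients computed on per-line BLUP values. The study used SPSS; the plain-Python version below is only an illustrative sketch, and the function name is hypothetical.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)
```

Passing the Fb* and whiteness BLUP values for the same lines, in the same order, would reproduce the kind of negative coefficient reported in Table 3.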
Genome-wide association study for flour color traits
Based on BLUP values and using 23,143 SNPs after filtering out low-quality SNPs, the GWAS identified 31 marker-trait associations (MTAs) related to flour color traits (Fig. 2; Table S6), including 2 MTAs for FL*, 16 MTAs for Fa*, 10 MTAs for Fb*, and 3 MTAs for whiteness. These MTAs were distributed on chromosomes 1A, 1B, 2A, 3B, 4A, 4B, 4D, 5A, 5B, 5D, 6A, 6B, 6D, and 7B, explaining 5.52% to 19.91% of phenotypic variation (Table S6). GWAS was also performed in each environment, and a total of 20 stable association signals were detected, distributed across 16 loci (Figures S1, S2, S3, S4; Table 4), including 1 locus for FL*, located on chromosome 5D; 6 loci for Fa*, located on chromosomes 1A, 5A, 1B (2), 6B, and 4D; 6 loci for Fb*, located on chromosomes 2A (2), 4A, 4B, 6B, and 5D; and 3 loci for whiteness, located on chromosomes 2A, 4A, and 5D. The marker 6B_20151350 was significantly and stably associated with both Fa* and Fb*; the marker 4A_601242583 was significantly and stably associated with both Fb* and whiteness; and the marker 5D_6525346 was significantly and stably associated with FL*, Fb*, and whiteness. These markers were considered to be pleiotropic loci (Table 4).
Manhattan and quantile-quantile (Q-Q) plots for flour color traits identified through genome-wide association analysis using BLUP values. (A–D) Manhattan and Q-Q plots for FL*, Fa*, Fb*, and whiteness, respectively. The horizontal line represents the significance threshold at which markers were considered associated with a trait (P < 1E-4, −log10(P) = 4).
Development and validation of KASP markers
Materials were classified based on genotypes of significant and stable SNPs, and t-tests were used to detect the significance of genotype effects on phenotypic traits (Table S7). SNPs with highly significant genotypic effects in all four environments (2020EM, 2020QT, 2021EM, and 2021QT) were selected to develop KASP markers (Table S7). The KASP marker developed for SNP 1A_236720351 effectively grouped the validation panel by genotype (Fig. 3A). Only AA homozygotes and AG heterozygotes were found in the validation panel, and no GG homozygotes were present. The Fa* of the AA homozygous genotype was significantly higher (P < 0.05) than that of the AG heterozygous genotype (Fig. 3B). The KASP marker developed for SNP 1B_473486955 clearly separated the AA, AG, and GG genotypes in the validation panel (Fig. 3C), and the Fa* of the GG homozygous genotype was significantly higher than that of the AA homozygous genotype (Fig. 3D).
Kompetitive allele-specific PCR (KASP) verification of a significant single nucleotide polymorphism (SNP) related to the color quality. (A, C), Scatter plots of KASP markers for Fa*; (B, D), the variance of Fa* for accessions with different alleles; Red dots and blue triangles represent the homozygous genotypes, green dots represent heterozygous genotypes, and black squares on the bottom left of the plot indicate the no-template control; *** indicate significant differences at P < 0.001; * indicates significant differences at P < 0.05.
Candidate genes for flour color traits
SNPs that were significantly and stably associated with flour color traits, and showed significant differences in phenotype between different haplotypes across four environments (2020EM, 2020QT, 2021EM, and 2021QT; P < 0.001; Table S7) were selected to search for candidate genes. A total of 410 genes were detected within 2 Mb upstream and 2 Mb downstream sequences flanking those SNPs. Among them, genes before and after SNPs, and genes overlapping with SNPs were shown in Table S8, with 17 SNPs located in the intergenic region and 3 SNPs located within genes. GO enrichment analysis showed that these genes were involved in a total of 439 GO terms, classified into 15 biological processes, 2 cellular components, and 10 molecular functions. Most genes were mainly concentrated in metabolic processes and cellular processes of biological processes, cellular anatomical entities and protein-containing complexes of cellular components, and binding and catalytic activities of molecular functions (Figure S5C). KEGG35,36,37 analysis showed that these genes were mainly enriched in metabolic pathways, biosynthesis of secondary metabolites, steroid biosynthesis, photosynthesis, oxidative phosphorylation, ribosome, and other processes (Figure S5B).
Based on the RNA-seq data from a public expression database and functional annotations of those genes, and also considering genes that overlap with significant SNPs, the present study selected 7 candidate genes for qRT-PCR analysis. The relative quantitative data were transformed by log10, and a heatmap was drawn (Figure S5A). Six candidate genes were differentially expressed in seeds of flour color extreme materials (Fig. 4; Table S9). TraesCS5D02G013100 encodes PMA1, which was differentially expressed in 30-day seeds after anthesis of extreme materials; TraesCS5D02G014300.1 encodes P450, which was differentially expressed in 15-day seeds after anthesis of extreme materials; TraesCS1B02G269500.1 encodes isopentenyl diphosphate delta isomerase, which was differentially expressed in 25-day and 30-day seeds after anthesis of extreme materials; TraesCS6B02G034100.1 encodes the DExH-box ATP-dependent RNA helicase DExH12, which was differentially expressed in 20-day, 25-day, and 30-day seeds after flowering of extreme materials; This study did not annotate the protein function of gene TraesCS4A02G307200, which was differentially expressed in seeds of extreme materials at 20, 25, and 30 days after flowering; TraesCS1B02G269100.1 encodes glutathione transferase, which was differentially expressed in 25-day and 30-day seeds of materials after flowering (Fig. 4; Table S9).
Discussion
Analysis of flour color traits
The whiteness and color-related traits of flour are crucial factors that determine the quality of the final wheat product. Therefore, it is essential to identify the main and stable loci for these traits and then transfer the favorable alleles into commercial varieties2. In this study, significant differences were observed in the flour color traits among different genotypes, environments, and years, except for whiteness among different environments (Table 1), indicating that flour whiteness was mainly controlled by genetic makeup rather than environmental factors38. Flour whiteness and FL* showed high heritability in this study (Table 2), suggesting that these traits were primarily controlled by genetics, making them easier to improve through breeding39. All color traits exhibited a continuous distribution in the association panel (Fig. 1), displaying typical quantitative trait characteristics and indicating that they were controlled by polygenes, consistent with a previous report2. Correlation analysis revealed highly significant correlations between flour color traits (Table 3), which aligns with previous research results2,22,39,40.
Genome-wide association study for FL*
This study consistently identified one locus on chromosome 5D (5D_6525346) significantly associated with FL*, explaining up to 19.91% of the phenotypic variation. This agrees with the finding of Chen et al.41, who identified a significant marker, wPt-0853-cfd18, associated with FL* at 5,597,656 bp on chromosome 5D. Previous research has also identified the primary QTL/genes influencing FL* on chromosomes 2A, 3A, 4A, 6A, 7A, 1B, 3B, 4B, 5B, 7B, 2D, 4D, and 5D2,42,43,44,45,46. For instance, Schmidt et al.45 identified a QTL associated with FL* on chromosome 4B (R2 = 5%). Johnson et al.44 reported that QTLs for FL* were located on chromosomes 2A (189.8 cM) and 6A (0.1–3.1.1.1 cM). Zhao et al.46 detected 13 QTLs for FL* on chromosomes 1A (5 cM), 6A (1–5 cM), 1B (3–4 cM), 2B (10 cM), 3B (2 cM), 5B (9–11 cM), 7B (cM), 2D (2, 8, 15 cM), and 5D (0 cM) using the recombinant inbred line population ‘Chuan 35050 × Shannong 483’.
Genome-wide association study for Fa*
This study identified 6 loci significantly and stably associated with Fa* on chromosomes 1A, 5A, 1B, 6B, and 4D (Table 4). Similarly, previous studies identified QTL/genes linked to flour Fa* on chromosomes 1A, 3A, 4A, 6A, 7A, 3B, 5B, 6B, 7B, 4D, 5D, and 7D2,42,44. Zhao et al.46 identified 12 QTL/genes associated with Fa* on chromosomes 3A (20 cM), 5A (30 cM), 6A (0 and 6 cM), 1B (0 cM), 2B (1, 4, 11 cM), 4B (11 cM), 6B (4 cM), 7B (12 cM), and 5D (0 and 2 cM). Zhang et al.39 used 240 recombinant inbred lines (RILs) derived from a cross between the Chinese wheat varieties PH82-2 and Neixiang 188 to map QTLs for Fa* on chromosomes 1A (Xwmc120 - Xbarc269), 4A (Xwmc468 - Xbarc170), 7A (Xwmc809 - YP7A), 1B (Sect. 1 - HVM23), and 3B (Xbarc84 - Xbarc77), explaining up to 35.9% of the phenotypic variation. Additionally, Zhang et al.40 employed 168 doubled haploid (DH) lines derived from a cross between Huapei 39 and Yumai 57 to identify a major QTL, qa1B, for Fa* on chromosome 1B (0.1 cM), which accounted for 25.64% of the phenotypic variation.
Genome-wide association study for Fb*
The yellowness index (Fb*) of flour is mainly influenced by the quantity of carotenoids, luteins, and flavonoids7. This study identified 6 loci significantly associated with Fb* on chromosomes 2A, 4A, 4B, 6B, and 5D (Table 4). Johnson et al.44 conducted GWAS on 243 varieties and advanced breeding lines selected from the past 20 years and identified loci significantly associated with yellowness on chromosomes 4A (159.5 cM), 4B (22.5–26.4 cM), and 7B (120.4–123.2 cM and 138.3–140.4 cM). Mares and Campbell47 identified QTLs associated with Fb* on chromosome 7A in two of three doubled haploid Australian wheat populations. Schmidt et al.45 identified two QTLs associated with Fb* on chromosomes 3A (R2 = 5%) and 4B (R2 = 12%). Kuchel et al.48 identified the main QTL for Fb* on chromosome 7B (Xgwm283 - Xgwm146), explaining up to 77% of phenotypic variation. Zhao et al.46 detected 13 QTLs for Fb* on chromosomes 1A (0, 2, 3, and 7 cM), 4A (4 cM), 6A (0, 1, and 4 cM), 1B (0 cM), 5B (1 and 3 cM), 7B (0 cM), 2D (12 cM), 4D (9 cM), 5D (9 cM), and 6D (0 cM). Parker et al.49 reported two QTLs on chromosomes 3A (Xbcd828-3A) and 7A (Xcdo347-7A, Xcdo347-7A), explaining 60% and 13% of the total phenotypic variation in Fb*, respectively. Zhang et al.42 detected the main QTL for Fb* on chromosome 7A, accounting for 12.1% to 37.6% of phenotypic variation in five environments. Zhang et al.39 used 240 recombinant inbred lines (RILs) derived from a cross between the Chinese wheat varieties PH82-2 and Neixiang 188 to map QTLs for Fb* on chromosomes 7A (Xwmc809-YP7A) and 1B (Sect. 1-HVM23). In summary, loci associated with Fb* have been detected multiple times on chromosomes 1B, 3A, 7A, and 7B; however, they were not identified in our study. This discrepancy may be due to the different SNP chip and association panel used in this study compared with previous studies.
The loci identified in this study on chromosomes 2A and 6B have not been reported previously and may be new genetic loci related to Fb*.
Genome-wide association study for flour whiteness
This study identified three loci significantly and stably associated with flour whiteness on chromosomes 2 A, 4 A, and 5D (Table 4). Among these, the locus on chromosome 5D (5D_6525346) was consistently detected in all environments, with the highest phenotype interpretation rate of 16.66%. In line with previous research, Ji et al.2 found a significant SNP BS0000020_51 associated with flour whiteness in multiple environments, with the highest phenotypic variation explanatory rate of 15.95%. Notably, this locus was only 3 Mb away from the SNP (5D_6525346) identified in this study.
Notably, SNP marker 5D_6525346 was significantly associated with FL*, Fb*, and whiteness (Table 4). Previous studies have also detected QTLs associated with flour color near this marker, indicating that there might be important genes affecting flour color at this locus50. Flour whiteness is also affected by milling characteristics51, which in turn are influenced by grain hardness52,53; grain hardness therefore affects the whiteness and color of flour2. Pina and Pinb are genes related to grain hardness, and a deletion or mutation in either gene can lead to a hard grain texture; Pinb-D1 has multiple types of mutations47,54. Tsilo et al.21 found a QTL associated with Fb* and FL* on chromosome 5D, which was consistent with the hardness (Ha) locus reported by Matter et al.20 on 5DS. Zhai et al.22 reported a distance of 2.1 cM between the QTL QFL.caas-5D-1 and the Pinb gene. The SNP marker BS0000020_51 identified by Ji et al.2 on chromosome 5D was significantly associated with flour whiteness, FL*, and Fb*; by comparison with the results of Zhai et al.22, it was inferred that this SNP marker may also control flour color through grain hardness. The marker 5D_6525346 identified in this study lies about 3.5 Mb from the Pinb-D1 gene (chromosome 5D, 3,031,551–3,032,419 bp), indicating that Pinb-D1 might also affect flour color, which is consistent with the results of Zhai et al.55.
Development of KASP markers
SNPs are currently the most widely used molecular markers because they are ubiquitous in the genome and cost less than other marker technologies56. KASP uses endpoint fluorescence detection to distinguish labeled alleles, making it an advanced SNP genotyping technique57. If multiple SNPs are identified within the same LD interval, theoretically only one SNP needs to be developed as a KASP marker. However, some SNPs may be located in regions where good primers cannot be designed, some primers may genotype poorly, and the phenotypic difference may not be significant even when the genotyping is clear. Therefore, multiple SNPs can be selected for KASP conversion simultaneously, and the KASP marker with good genotyping and a significant phenotypic difference can ultimately be chosen for breeding. This study developed two KASP markers from significant and stable SNPs associated with Fa* identified by GWAS (Fig. 3). Among them, the KASP marker converted from 1B_473486955 gave distinct genotype clusters, and notable phenotypic differences were observed among the genotypes, so it can be used for marker-assisted selection.
Identification of candidate genes
Previous studies have reported that the color of flour mainly depends on the accumulation of pigments such as yellow pigments and carotenoids in flour, as well as the oxidative degradation of pigments by oxidases such as polyphenol oxidase, peroxidase, and lipoxygenase18. The yellow pigment content in wheat flour is influenced by the activity of lipoxygenase (LOX), which catalyzes the oxidation of carotenoids, resulting in whiter flour58. The primary gene responsible for LOX activity in wheat is located at the TaLox-B1 locus on chromosome 4BS59. Complementary dominant functional markers, LOX16 and LOX18, were developed by Geng et al.60 based on this locus and have been used extensively to detect LOX activity in wheat germplasm resources61. This research identified the candidate gene Lipoxygenase LOX2.1, associated with whiteness, on chromosome 5D through GWAS (Fig. 4; Table S7). This gene may represent a novel gene that influences wheat LOX activity, warranting further investigation. In addition, TraesCS5D02G013100 encodes the plasma membrane H+-ATPase (LHA1), which is the primary pump responsible for establishing the plant cell membrane potential. This enzyme not only governs fundamental plant cell functions but also plays a role in responding to diverse environmental stimuli and signaling events62. TraesCS5D02G014300.1 encodes a plant cytochrome P450, which plays a crucial role in many biosynthetic pathways, particularly those involving the production of multiple secondary metabolites63. TraesCS1B02G269100.1 encodes glutathione transferase, which belongs to an ancient, multi-member, and diverse enzyme class. Plant glutathione transferases have multiple effects on plant development, endogenous metabolism, stress tolerance, and exogenous detoxification64. TraesCS1B02G269500.1 encodes isopentenyl diphosphate delta isomerase.
Plant hormones (such as abscisic acid, gibberellin, and cytokinin), plant alcohols, carotenoids, and monoterpenes are obtained through the plastid non-methylhydroxyvaleric acid pathway65. Isopentenyl diphosphate serves as a common precursor for all isoprenoids in this pathway66. TraesCS6B02G034100.1 encodes the DExH box ATP-dependent RNA helicase DExH12 (BRR2a). In Arabidopsis, a missense mutation in BRR2a leads to splicing defects in FLC (Flowering Locus C), resulting in reduced FLC transcript levels and premature flowering67. The function of TraesCS4A02G307200 remains unknown based on the available database information. Further investigation is necessary to understand the roles of these genes in flour coloration.
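The candidate gene identifiers discussed above carry their chromosome in the name, which makes it easy to check them against the chromosomes of the associated loci. The following is a minimal sketch, assuming the identifiers follow the layout "TraesCS" + chromosome + two-digit annotation release + "G" + numeric gene index + optional transcript suffix; the regular expression and the helper name `chromosome_of` are illustrative assumptions, not part of the study's pipeline.

```python
import re
from collections import defaultdict

# Assumed ID layout: TraesCS + chromosome (e.g. 5D) + two-digit release
# + "G" + numeric gene index + optional ".N" transcript suffix.
GENE_ID = re.compile(
    r"^TraesCS(?P<chrom>[1-7][ABD])(?P<release>\d{2})G(?P<index>\d+)"
    r"(?:\.(?P<transcript>\d+))?$"
)


def chromosome_of(gene_id):
    """Return the chromosome encoded in a gene ID, or None if it does not match."""
    match = GENE_ID.match(gene_id)
    return match.group("chrom") if match else None


# Gene IDs as written in the discussion above.
candidates = [
    "TraesCS5D02G013100",
    "TraesCS5D02G014300.1",
    "TraesCS1B02G269100.1",
    "TraesCS1B02G269500.1",
    "TraesCS6B02G034100.1",
    "TraesCS4A02G307200",
]

# Group the candidates by the chromosome parsed from each identifier.
by_chromosome = defaultdict(list)
for gene in candidates:
    by_chromosome[chromosome_of(gene)].append(gene)

for chrom in sorted(by_chromosome):
    print(chrom, by_chromosome[chrom])
```

An identifier that fails to match returns `None`, which is a cheap way to flag IDs that were truncated or mistyped during transcription.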
Conclusion
The coefficient of variation of the flour color traits ranged from 0.62% to 22.23%, and the generalized (broad-sense) heritability ranged from 55.52% to 83.48%. All traits followed an approximately normal distribution, and the correlations among them were highly significant. GWAS identified 20 SNP markers significantly and stably associated with flour color traits, spread across 16 loci, including 1 for FL*, 6 for Fa*, 6 for Fb*, and 3 for whiteness. Two KASP markers were successfully developed for Fa*. Seven candidate genes that may influence flour color were identified. This study provides valuable information on genes and genetic loci related to flour color and offers KASP markers for marker-assisted selection to improve wheat flour color quality in China.
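For readers who want to see how summary statistics of this kind are typically obtained, the sketch below computes a coefficient of variation and an entry-mean broad-sense heritability. It is an illustration under stated assumptions: the heritability formula H² = Vg / (Vg + Vge/e + Ve/(e·r)) is a common choice for multi-environment trials but is not confirmed by this section as the one used in the study, and the whiteness values, variance components, and replicate number are hypothetical. The four environments correspond to the two seasons and two sites described in the abstract.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Sample standard deviation as a percentage of the absolute mean."""
    return stdev(values) / abs(mean(values)) * 100.0


def broad_sense_heritability(var_g, var_ge, var_e, n_env, n_rep):
    """Entry-mean H^2 = Vg / (Vg + Vge/e + Ve/(e*r)), returned in percent."""
    phenotypic = var_g + var_ge / n_env + var_e / (n_env * n_rep)
    return var_g / phenotypic * 100.0


# Hypothetical whiteness (W) values and hypothetical variance components.
whiteness = [76.1, 77.4, 75.8, 78.0, 76.9, 77.2]
cv = coefficient_of_variation(whiteness)
h2 = broad_sense_heritability(var_g=2.0, var_ge=1.0, var_e=3.0, n_env=4, n_rep=2)

print(f"CV  = {cv:.2f}%")
print(f"H^2 = {h2:.2f}%")
```

In practice the variance components would come from a mixed model fitted to the multi-environment data rather than being supplied by hand.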
Data availability
The data presented in the study are deposited in Figshare (DOI: https://doi.org/10.6084/m9.figshare.26360368).
References
Parker, G. D. & Langridge, P. Development of a STS marker linked to a major locus controlling flour color in wheat (Triticum aestivum L.). Mol. Breed. 6, 169–174 (2000).
Ji, M. et al. Genome wide association study of the whiteness and colour related traits of flour and dough sheets in common wheat. Sci. Rep. 11, 8790 (2021).
He, X., He, Z., Ma, W., Appels, R. & Xia, C. Allelic variants of phytoene synthase 1 (Psy1) genes in Chinese and CIMMYT wheat cultivars and development of functional markers for flour colour. Mol. Breed. 23, 553–563 (2009).
Baik, B. K., Czuchajowska, Z. & Pomeranz, Y. Discoloration of dough for Oriental noodles. Cereal Chem. 72, 198–205 (1995).
Borrelli, G. M., Troccoli, A., Di Fonzo, N. & Fares, C. Durum wheat lipoxygenase activity and other quality parameters that affect pasta color. Cereal Chem. 76, 335–340 (1999).
Borrelli, G. M., De Leonardis, A. M., Platani, C. & Troccoli, A. Distribution along durum wheat kernel of the components involved in semolina color. J. Cereal Sci. 48, 494–502 (2008).
Miskelly, D. M. Flour components affecting paste and noodle colour. J. Sci. Food Agr. 35, 463–471 (1984).
Dong, Z. et al. Grain-specific reduction in lipoxygenase activity improves flour color quality and seed longevity in common wheat. Mol. Breed. 35, 150 (2015).
Zhai, S. et al. Genetic analysis of phytoene synthase 1 (Psy1) gene function and regulation in common wheat. BMC Plant. Biol. 16, 228 (2016).
Mayer, A. M. & Harel, E. Polyphenol oxidases in plants. Phytochemistry 18, 193–215 (1979).
Morris, C. F., Jeffers, H. C. & Engle, D. A. Effect of processing, formula and measurement variables on alkaline noodle color-toward an optimized laboratory system. Cereal Chem. 77, 77–85 (2000).
Fraignier, M. P., Michaux-Ferriere, N. & Kobrehel, K. Distribution of peroxidases in durum wheat (Triticum durum). Cereal Chem. 77, 11–17 (2000).
Kobrehel, K., Laignelet, B. & Feillet, P. Study of some factors of macaroni brownness. Cereal Chem. 51, 675–683 (1974).
Hemalatha, M. S., Manu, B. T., Bhagwat, S. G., Leelavathi, K. & Rao, U. J.S.P. Protein characteristics and peroxidase activities of different Indian wheat varieties and their relationship to chapati-making quality. Eur. Food Res. Technol. 225, 463–471 (2007).
Kruger, J. E., Hatcher, D. W. & Depauw, R. A whole seed assay for polyphenol oxidase in Canadian prairie spring wheats and its usefulness as a measure of noodle darkening. Cereal Chem. 71, 324–326 (1994).
Kruger, J. E., Matsuo, R. R. & Preston, K. A comparison of methods for the prediction of Cantonese noodle colour. Can. J. Plant. Sci. 72, 1021–1029 (1992).
Kruger, J. E. & Reed, G. Enzymes and Colour 441–487 (AACC International, 1988).
Roncallo, P. F. et al. QTL analysis of main and epistatic effects for flour color traits in durum wheat. Euphytica 185, 77–92 (2012).
Zhang, X. & Tian, J. C. The color advantage of Chinese wheat with high whiteness and analysis of factors affecting color formation. Scientia Agricultura Sinica. 41, 347–353 (2008).
Mattern, P. J., Morris, R., Schmidt, J. W. & Johnson, V. A. Location of genes for kernel properties in the wheat variety ‘Cheyenne’ using chromosome substitution lines. Columbia, 703–708. (1973).
Tsilo, T. J., Hareland, G. A., Chao, S. & Anderson, J. A. Genetic mapping and QTL analysis of flour color and milling yield related traits using recombinant inbred lines in hard red spring wheat. Crop Sci. 51, 237–246 (2011).
Zhai, S. et al. Genome-wide linkage mapping of flour color-related traits and polyphenol oxidase activity in common wheat. Theor. Appl. Genet. 129, 377–394 (2015).
Pritchard, J. K., Stephens, M., Rosenberg, N. A. & Donnelly, P. Association mapping in structured populations. Am. J. Hum. Genet. 67, 170–181 (2000).
Bradbury, P. J. et al. TASSEL: software for association mapping of complex traits in diverse samples. Bioinformatics 23, 2633–2635 (2007).
Tian, Y. et al. Genome-wide association analysis and KASP markers development for protein quality traits in winter wheat. BMC Plant Biol. 25, 1–17 (2025).
Sheoran, S. et al. Uncovering genomic regions associated with 36 Agro-Morphological traits in Indian spring wheat using GWAS. Front. Plant. Sci. 10, 527 (2019).
Ghimire, B. et al. Genetics of Fusarium head blight resistance in soft red winter wheat using a genome-wide association study. Plant. Genome. 15, e20222 (2022).
Santos, J. V. M. D. et al. Association mapping of a locus that confers Southern stem canker resistance in soybean and SNP marker development. BMC Genom. 20, 798 (2019).
Pingault, L. et al. Deep transcriptome sequencing provides new insights into the structural and functional organization of the wheat genome. Genome Biol. 16, 29 (2015).
Henderson, C. R. Best linear unbiased estimation and prediction under a selection model. Biometrics 31, 423–447 (1975).
Bates, D. et al. lme4: linear mixed-effects models using Eigen and S4. J. Stat. Softw. 67 (2015).
Marcotuli, I. et al. Genetic diversity and genome wide association study of β-glucan content in tetraploid wheat grains. PLoS One. 11, e0152590 (2016).
Turner, S. D. qqman: an R package for visualizing GWAS results using Q-Q and Manhattan plots. bioRxiv 005165 (2014).
Chen, C. et al. TBtools-II: a “one for all, all for one” bioinformatics platform for biological big-data mining. Mol. Plant 16, 1733–1742 (2023).
Kanehisa, M. & Goto, S. KEGG: Kyoto encyclopedia of genes and genomes. Nucleic Acids Res. 28, 27–30 (2000).
Kanehisa, M. Toward understanding the origin and evolution of cellular organisms. Protein Sci. 28, 1947–1951 (2019).
Kanehisa, M., Furumichi, M., Sato, Y., Kawashima, M. & Ishiguro-Watanabe, M. KEGG for taxonomy-based analysis of pathways and genomes. Nucleic Acids Res. 51, D587–D592 (2023).
Wu, P. P., Song, S., Zhang, F. Y., Chen, F. & Cui, D. Q. The allelic variation of lipoxygenase genes in bread wheat cultivars from the Yellow and Huai wheat areas of China. Scientia Agricultura Sinica. 48, 207–214 (2015). (In Chinese).
Zhang, Y. L. et al. QTL mapping for flour and noodle colour components and yellow pigment content in common wheat. Euphytica 165, 435–444 (2009a).
Zhang, K. P. et al. Molecular genetic analysis of flour color using a doubled haploid population in bread wheat (Triticum aestivum L.). Euphytica 165, 471–484 (2009b).
Chen, B. R. et al. Genome-wide association study for starch content and constitution in sorghum (Sorghum bicolor (L.) Moench). J. Integr. Agric. 18, 2446–2456 (2019).
Zhang, L. P., Yan, J., Xia, X. C., He, Z. H. & Sutherland, M. W. QTL mapping for kernel yellow pigment content in common wheat. Acta Agron. Sinica. 32, 41–44 (2006).
Martin, J. M., Frohberg, R. C., Morris, C. F., Talbert, L. E. & Giroux, M. J. Milling and bread baking traits associated with puroindoline sequence type in hard red spring wheat. Crop Sci. 41, 228–234 (2001).
Johnson, M. et al. Association mapping for 24 traits related to protein content, gluten strength, color, cooking, and milling quality using balanced and unbalanced data in Durum wheat [Triticum turgidum L. var. Durum (Desf)]. Front. Genet. 10, 717 (2019).
Schmidt, A. L., Liu, C., Matin, D., Kelly, A. & McIntyre, L. Molecular Markers for Selected Quality Traits in Australian Hexaploid Bread Wheat (Cirql Pty Ltd, 2004).
Zhao, Y., Sun, H. Y., Wang, Y. Y. & Pu, Y. Y. QTL mapping for the color, carotenoids and polyphenol oxidase activity of flour in recombinant inbred lines of wheat. Aust. J. Crop Sci. 7, 328–337 (2013).
Mares, D. J. & Campbell, A. W. Mapping components of flour and noodle colour in Australian wheat. Aust. J. Agric. Res. 52, 1297–1309 (2001).
Kuchel, H., Langridge, P., Mosionek, L., Williams, K. & Jefferies, S. P. The genetic control of milling yield, dough rheology, and baking quality of wheat. Theor. Appl. Genet. 112, 1487–1495 (2006).
Parker, G. D., Chalmers, K. J., Rathjen, A. J. & Langridge, P. Mapping loci associated with flour colour in wheat (Triticum aestivum L.). Theor. Appl. Genet. 97, 238–245 (1998).
Sourdille, P. et al. Linkage between RFLP markers and genes affecting kernel hardness in wheat. Theor. Appl. Genet. 93, 580–586 (1996).
Giroux, M. & Morris, C. F. A glycine to serine change in puroindoline b is associated with wheat grain hardness and low levels of starch-surface friabilin. Theor. Appl. Genet. 95, 857–864 (1997).
Morris, C. F., Lillemo, M., Simeone, M. C., Giroux, M. J. & Babb, S. L. Prevalence of puroindoline grain hardness genotypes among historically significant North American spring and winter wheats. Crop Sci. 41, 218–228 (2001).
Xia, L., Chen, F., He, Z., Chen, X. & Morris, C. F. Occurrence of puroindoline alleles in Chinese winter wheats. Cereal Chem. 82, 38–43 (2005).
Chen, F. et al. Molecular and biochemical characterization of puroindoline a and b alleles in Chinese landraces and historical cultivars. Theor. Appl. Genet. 112, 400–409 (2006).
Zhai, S. et al. A Genome-wide association study reveals a rich genetic architecture of flour color-related traits in bread wheat. Front. Plant. Sci. 9, 1136 (2018).
Cheng, A. et al. Molecular Marker Technology for Genetic Improvement of Underutilised Crops (Springer, 2017).
He, C., Holme, J. & Anthony, J. SNP Genotyping: the KASP Assay Vol. 1145, 75–86 (Humana, 2014).
Leenhardt, F. et al. Genetic variability of carotenoid concentration, and lipoxygenase and peroxidase activities among cultivated wheat species and bread wheat varieties. Eur. J. Agron. 25, 170–176 (2006).
Geng, H. et al. Molecular markers for tracking variation in lipoxygenase activity in wheat breeding. Mol. Breed. 28, 117–126 (2011).
Geng, H. W., Xia, X. C., Zhang, L. P., Qu, Y. & He, Z. Development of functional markers for a lipoxygenase gene TaLox-B1 on chromosome 4BS in common wheat. Crop Sci. 52, 568–576 (2012).
Pshenichnikova, T. A. & Permyakov, A. V. Effects of limited introgressions from Triticum timopheevii Tausch. into the genome of bread wheat (Triticum aestivum L.) on physiological and biochemical traits under normal watering and drought. Russ J. Genet. 6, 553–559 (2016).
Ferrol, N., Pozo, M. J., Antelo, M. & Azcón-Aguilar, C. Arbuscular mycorrhizal symbiosis regulates plasma membrane H+-ATPase gene expression in tomato plants. J. Exp. Bot. 53, 1683–1687 (2002).
Xiang, W., Wang, X. & Ren, T. Expression of a wheat cytochrome p450 monooxygenase cDNA in yeast catalyzes the metabolism of sulfonylurea herbicides. Pestic Biochem. Physiol. 85, 1–6 (2006).
Nianiou-Obeidat, I. et al. Plant glutathione transferase-mediated stress tolerance: functions and biotechnological applications. Plant. Cell. Rep. 36, 791–805 (2017).
Lichtenthaler, H. K. The 1-deoxy-d-xylulose-5-phosphate pathway of isoprenoid biosynthesis in plants. Annu. Rev. Plant. Biol. 50, 47–65 (1999).
Nakamura, A., Shimada, H., Masuda, T., Ohta, H. & Takamiya, K. Two distinct isopentenyl diphosphate isomerases in cytosol and plastid are differentially induced by environmental stresses in tobacco. FEBS Lett. 506, 61–64 (2001).
Mahrez, W. et al. BRR2a affects flowering time via FLC splicing. PLoS Genet. 12, e1005924 (2016).
Funding
This work was funded by the Key Industrial Innovation Technology Projects in Southern Xinjiang of the Xinjiang Production and Construction Corps (2022DB013), Agricultural Technology Public Relations Project of the Xinjiang Production and Construction Corps (2023AA201), the Science and Technology Innovation Engineering Technology Cooperation Project of the Xinjiang Production and Construction Corps (2024BA001), the National Natural Science Foundation of China (31560391), the Innovation and Entrepreneurship Base Construction Project of Xinjiang Production and Construction Corps (2022CA006), and the International Science and Technology Cooperation Program Project of Xinjiang Production and Construction Corps (2019BC003).
Author information
Contributions
W.L. and W.S. conceived and planned the research; P.L.; Z.X.; L.Y.; D.K.; Y.N.; H.X. and X.H. conducted the research; Y.T. analyzed the data and wrote the manuscript; All authors have read and approved the final manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Tian, Y., Liu, P., Zhang, X. et al. Genome-wide association study and KASP marker development for flour color in winter wheat. Sci Rep 15, 39673 (2025). https://doi.org/10.1038/s41598-025-23358-4