Joint genetic control of isoflavones and soyasaponins revealed by mGWAS, genomic prediction, and SHAP-guided allele stacking
  • Article
  • Open access
  • Published: 25 April 2026

  • Hakyung Kwon1,2,
  • Seung Yeob Song3,
  • Yeonghun Cho1,
  • Ji Eun Ra3 &
  • …
  • Jungmin Ha1,2 

Scientific Reports (2026)

  • 624 Accesses

We are providing an unedited version of this manuscript to give early access to its findings. Before final publication, the manuscript will undergo further editing. Please note there may be errors present which affect the content, and all legal disclaimers apply.

Subjects

  • Computational biology and bioinformatics
  • Genetics
  • Plant sciences

Abstract

Isoflavones and soyasaponins are two classes of health-promoting specialized metabolites in soybean. Emerging evidence indicates that the two classes can act synergistically in vivo and in vitro, making their simultaneous enhancement an increasingly important breeding objective. However, although each pathway has been studied extensively on its own, the genetic basis of joint variation in isoflavones and soyasaponins remains poorly understood. Here, we profiled 17 metabolites (12 isoflavones and 5 soyasaponins) across 376 accessions of the Korean soybean core collection using UPLC. We characterized metabolite distributions, correlations, and presence–absence patterns, and performed a multi-metabolite genome-wide association study (GWAS), identifying 70 high-confidence loci. These included previously reported major loci as well as eight novel loci for isoflavones and thirteen for soyasaponins. Five genomic regions showed shared linkage disequilibrium (LD) structure between the two pathways, and we identified candidate genes for the high-confidence loci. We next compared FT-IR–based phenomic prediction with GWAS-informed genomic prediction. Genomic prediction consistently outperformed phenomic prediction and achieved moderate to high accuracy, indicating strong genetic determinism. Finally, we applied an XGBoost–SHapley Additive exPlanations (SHAP) framework to estimate the extent to which favorable alleles could be combined in silico. Single-trait allele stacking pointed to CMJ_115, CMJ_068, and CMJ_236 as the best-performing accessions for Acetyl-daidzin, Malonyl-daidzin, and Soyasaponin-ab, respectively. Multi-trait optimization produced a virtual genotype most similar to CMJ_317, suggesting this accession as a practical parent for jointly improving both metabolite classes. Overall, our findings provide a population-scale map of diversity, genetic factors, and achievable breeding gains for functional soybean improvement.
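
The idea of in silico allele stacking, followed by a search for the real accession closest to the resulting virtual genotype, can be illustrated with a toy sketch. This is not the authors' XGBoost–SHAP pipeline: per-marker effects are approximated here by a simple difference in mean phenotype between the two alleles, and the accession names, genotype coding (0/1), phenotype values, and function names are all hypothetical, chosen only for illustration.

```python
from statistics import mean

# Hypothetical toy data: four accessions genotyped at four biallelic markers,
# coded 0 (reference allele) or 1 (alternative allele).
genotypes = {
    "ACC_A": [0, 1, 1, 0],
    "ACC_B": [1, 1, 0, 0],
    "ACC_C": [1, 0, 1, 1],
    "ACC_D": [0, 0, 0, 1],
}
# Hypothetical metabolite content per accession (arbitrary units).
phenotype = {"ACC_A": 5.0, "ACC_B": 7.0, "ACC_C": 9.0, "ACC_D": 3.0}


def marker_effects(genotypes, phenotype):
    """Per-marker effect: mean phenotype of allele-1 carriers minus allele-0 carriers.

    A simple stand-in for model-based attributions such as SHAP values.
    """
    n_markers = len(next(iter(genotypes.values())))
    effects = []
    for j in range(n_markers):
        alt = [phenotype[acc] for acc, g in genotypes.items() if g[j] == 1]
        ref = [phenotype[acc] for acc, g in genotypes.items() if g[j] == 0]
        # A marker with only one allele present carries no information.
        effects.append(mean(alt) - mean(ref) if alt and ref else 0.0)
    return effects


def virtual_genotype(effects):
    """Stack favorable alleles: take allele 1 wherever its effect is positive."""
    return [1 if e > 0 else 0 for e in effects]


def closest_accession(genotypes, target):
    """Return the accession sharing the most alleles with the target genotype."""
    def shared(g):
        return sum(a == b for a, b in zip(g, target))
    return max(genotypes, key=lambda acc: shared(genotypes[acc]))


effects = marker_effects(genotypes, phenotype)
ideal = virtual_genotype(effects)
best_parent = closest_accession(genotypes, ideal)
print(effects, ideal, best_parent)
```

In this sketch the accession returned by `closest_accession` plays the role that a candidate parent plays in the abstract: the real line whose marker profile lies nearest to the stacked, virtual one. A multi-trait version would need some rule for combining effects across traits, which the abstract does not specify.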

Funding

This work was carried out with the support of the "Cooperative Research Program for Agriculture Science and Technology Development (Project No. RS-2025-00853272)", Rural Development Administration, Republic of Korea. This research was also supported by the Regional Innovation System & Education (RISE) program through the Gangwon RISE Center, funded by the Ministry of Education (MOE) and Gangwon State (G.S.), Republic of Korea (2025-RISE-10-005).

Author information

Authors and Affiliations

  1. Department of Agriculture, Forestry and Bioresources and Research Institute of Agriculture and Life Sciences, Seoul National University, Seoul, 08826, Republic of Korea

    Hakyung Kwon, Yeonghun Cho & Jungmin Ha

  2. Plant Genomics and Breeding Institute, Seoul National University, Seoul, 08826, Republic of Korea

    Hakyung Kwon & Jungmin Ha

  3. Food Tech Resources Research Division, Department of Food Sciences, Rural Development Administration, National Institute of Crop and Food Science, Wanju-Gun, Republic of Korea

    Seung Yeob Song & Ji Eun Ra

Corresponding author

Correspondence to Jungmin Ha.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Supplementary Information 1 (DOCX)

Supplementary Information 2 (XLSX)

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

About this article

Cite this article

Kwon, H., Song, S.Y., Cho, Y. et al. Joint genetic control of isoflavones and soyasaponins revealed by mGWAS, genomic prediction, and SHAP-guided allele stacking. Sci Rep (2026). https://doi.org/10.1038/s41598-026-50166-1

  • Received: 02 December 2025

  • Accepted: 20 April 2026

  • Published: 25 April 2026

  • DOI: https://doi.org/10.1038/s41598-026-50166-1

Keywords

  • Soybean
  • Metabolites
  • Isoflavone
  • Soyasaponin
  • mGWAS
  • Genomic prediction
Associated content

Collection

Secondary metabolites in plants
