Scientific Data
High-quality chromosome-scale genome assemblies of 29 maize inbred lines of European breeding relevance
  • Data Descriptor
  • Open access
  • Published: 19 March 2026

  • Camille Marcuzzo (1; contributed equally),
  • Clément Birbes (2; contributed equally),
  • Camille Eché (1; contributed equally),
  • Arnaud Di Franco (3),
  • Thomas Faraut (3),
  • Erwan Denis (1),
  • Claire Kuchly (1; ORCID: orcid.org/0000-0001-8994-550X),
  • Caroline Vernette (1),
  • Sébastien Praud (4),
  • Alain Charcosset (5),
  • Christine Gaspin (2),
  • Denis Milan (1, 3; ORCID: orcid.org/0000-0002-8062-5072),
  • Stéphane D. Nicolas (5),
  • Cécile Donnadieu (1; ORCID: orcid.org/0000-0002-5164-3095),
  • Clémentine Vitte (6),
  • Christophe Klopp (7; ORCID: orcid.org/0000-0001-7126-5477) &
  • Carole Iampietro (1; ORCID: orcid.org/0000-0002-8148-4785)

Scientific Data (2026)

We are providing an unedited version of this manuscript to give early access to its findings. Before final publication, the manuscript will undergo further editing. Please note there may be errors present which affect the content, and all legal disclaimers apply.

Subjects

  • Agricultural genetics
  • Genetic variation
  • Genomics
  • Next-generation sequencing
  • Plant genetics

Abstract

Although several maize genome assemblies are publicly available, those of lines important to European breeding programs are underrepresented. Using PacBio long-read sequencing, we assembled high-quality chromosome-level genomes of 29 key lines of European breeding relevance, encompassing Northern flint and European flint lines used for adaptation to Northern European climates, lines derived from European landraces of tropical origin, and American temperate dent lines adapted to European regions. Genome assembly sizes range from 2.17 to 2.35 gigabases, with scaffold N50s ranging from 219 to 254 megabases. Completeness assessment yielded BUSCO scores ranging from 97.7% to 98.5% and Merqury completeness scores ranging from 96.62% to 98.30%. Calling structural variants and SNPs relative to the B73 reference sequence revealed the expected separation of inbred groups. Flint lines contribute the highest number of novel variants, which emphasizes the importance of sequencing flint material to complete the maize pangenome. These high-quality genome assemblies therefore provide new opportunities to understand the dynamics of maize structural variation and to identify the functional variants underlying maize phenotypic diversity.
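
For readers less familiar with the scaffold N50 statistic reported in the abstract, the following minimal Python sketch illustrates the usual definition: the length of the scaffold at which the cumulative length, counted from the longest scaffold downwards, first reaches half of the total assembly size. The function name n50 and the toy lengths are illustrative assumptions; they are not taken from the authors' pipeline or data.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Lengths are sorted from longest to shortest and accumulated until
    the running sum reaches at least half of the total length; the
    length of the sequence that crosses this threshold is the N50.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical toy assembly (arbitrary units), total length 300.
toy_scaffolds = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_scaffolds))  # 80 + 70 = 150 reaches half of 300, so this prints 70
```

A scaffold N50 above 200 megabases, as reported here, is what one would expect when most of the assembly sits in chromosome-length scaffolds.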

Data availability

All raw sequencing data, assembled genomes, and variant data (VCF files) have been deposited in publicly accessible repositories. The PacBio HiFi and Hi-C sequencing reads, as well as the genomes assembled from these data, have been uploaded to the European Nucleotide Archive (ENA) at www.ebi.ac.uk/ena as part of the SeqOccIn project, PRJEB60075 [16, 34], and are accessible under project PRJEB67812 [31]. Structural variants and SNPs are available from the European Variation Archive (EVA) under accession PRJEB106599 [32]. Variant data are linked to the nucleotide data by sharing a single BioSample ID. Variant data are also available in the data.gouv.fr repository (https://doi.org/10.57745/7AUTOL) [33].
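
As a hedged illustration of how the ENA accessions listed under Data availability might be queried programmatically, the sketch below builds a file-report URL for the ENA Portal API. Only the accession PRJEB67812 comes from the article; the helper name filereport_url and the choice of fields are assumptions, and the query may need adjusting depending on how the reads were submitted. The code only constructs the URL and performs no network request.

```python
from urllib.parse import urlencode

# Public ENA Portal API endpoint for per-run file listings.
ENA_FILEREPORT = "https://www.ebi.ac.uk/ena/portal/api/filereport"


def filereport_url(accession,
                   fields=("run_accession", "fastq_ftp", "submitted_ftp")):
    """Build a file-report URL for a study accession (hypothetical helper).

    The returned URL asks ENA for a tab-separated table with one row per
    sequencing run and the requested columns.
    """
    query = urlencode({
        "accession": accession,
        "result": "read_run",
        "fields": ",".join(fields),
        "format": "tsv",
    })
    return f"{ENA_FILEREPORT}?{query}"


# Project accession quoted in the Data availability statement.
print(filereport_url("PRJEB67812"))
```

The resulting URL can be opened in a browser or passed to any HTTP client to obtain the list of downloadable files for the project.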

Code availability

All code used for the analysis is available on the SeqOccIn project’s GitHub page, under the path Data paper/Zea mays data paper: https://github.com/GeTPlaGe/SeqOccIn/tree/main/Data%20paper/Zeamays. The pipeline used for aligning reads and calling variants is available at https://github.com/SeqOccin-SV/SeqOccinVariants.

References

  1. Wrigley, C. W. & Nirmal, R. C. The major cereal grains: Corn, rice, and wheat, https://doi.org/10.1002/0471238961.23080501.a01.pub3 (2017).

  2. Wang, Q. & Dooner, H. K. Remarkable variation in maize genome structure inferred from haplotype diversity at the bz locus. Proceedings of the National Academy of Sciences 103, 17644–17649, https://doi.org/10.1073/pnas.0603080103 (2006).

  3. Stitzer, M. C., Anderson, S. N., Springer, N. M. & Ross-Ibarra, J. The genomic ecosystem of transposable elements in maize. PLOS Genetics 17, e1009768, https://doi.org/10.1371/journal.pgen.1009768 (2021).

  4. Ou, S. et al. Differences in activity and stability drive transposable element variation in tropical and temperate maize. Genome Research 34, 1140–1153, https://doi.org/10.1101/gr.278131.123 (2024).

  5. Wallace, J. G. et al. Association mapping across numerous traits reveals patterns of functional variation in maize. PLoS Genetics 10, e1004845, https://doi.org/10.1371/journal.pgen.1004845 (2014).

  6. Zhou, P., Hirsch, C. N., Briggs, S. P. & Springer, N. M. Dynamic patterns of gene expression additivity and regulatory variation throughout maize development. Molecular Plant 12, 410–425, https://doi.org/10.1016/j.molp.2018.12.015 (2019).

  7. Ricci, W. A. et al. Widespread long-range cis-regulatory elements in the maize genome. Nature Plants 5, 1237–1249, https://doi.org/10.1038/s41477-019-0547-0 (2019).

  8. Marand, A. P. et al. The genetic architecture of cell type-specific cis regulation in maize. Science 388, https://doi.org/10.1126/science.ads6601 (2025).

  9. Fagny, M. et al. Identification of key tissue-specific, biological processes by integrating enhancer information in maize gene regulatory networks. Frontiers in Genetics 11, https://doi.org/10.3389/fgene.2020.606285 (2021).

  10. Springer, N. M. et al. The maize W22 genome provides a foundation for functional genomics and transposon biology. Nature Genetics 50, 1282–1288, https://doi.org/10.1038/s41588-018-0158-0 (2018).

  11. Sun, S. et al. Extensive intraspecific gene order and gene structural variations between Mo17 and other maize genomes. Nature Genetics 50, 1289–1295, https://doi.org/10.1038/s41588-018-0182-0 (2018).

  12. Yang, N. et al. Genome assembly of a tropical maize inbred line provides insights into structural variation and crop improvement. Nature Genetics 51, 1052–1059, https://doi.org/10.1038/s41588-019-0427-6 (2019).

  13. Lin, T., Song, Y., Lawrence, P., Kheshgi, H. S. & Jain, A. K. Worldwide maize and soybean yield response to environmental and management factors over the 20th and 21st centuries. Journal of Geophysical Research: Biogeosciences 126, https://doi.org/10.1029/2021jg006304 (2021).

  14. Chen, J. et al. A complete telomere-to-telomere assembly of the maize genome. Nature Genetics 55, 1221–1231, https://doi.org/10.1038/s41588-023-01419-6 (2023).

  15. Darracq, A. et al. Sequence analysis of European maize inbred line F2 provides new insights into molecular and chromosomal characteristics of presence/absence variants. BMC Genomics 19, https://doi.org/10.1186/s12864-018-4490-7 (2018).

  16. Haberer, G. et al. European maize genomes highlight intraspecies variation in repeat and gene content. Nature Genetics 52, 950–957, https://doi.org/10.1038/s41588-020-0671-9 (2020).

  17. Hufford, M. B. et al. De novo assembly, annotation, and comparative analysis of 26 diverse maize genomes. Science 373, 655–662, https://doi.org/10.1126/science.abg5289 (2021).

  18. Mayjonade, B. et al. Extraction of high-molecular-weight genomic DNA for long-read sequencing of single molecules. BioTechniques 61, 203–205, https://doi.org/10.2144/000114460 (2016).

  19. Workman, R. et al. High molecular weight DNA extraction from recalcitrant plant species for third generation sequencing v1. https://doi.org/10.1038/protex.2018.059 (2018).

  20. Cabanettes, F. & Klopp, C. D-GENIES: dot plot large genomes in an interactive, efficient and simple way. PeerJ 6, https://doi.org/10.7717/peerj.4958 (2018).

  21. Cheng, H., Concepcion, G. T., Feng, X., Zhang, H. & Li, H. Haplotype-resolved de novo assembly using phased assembly graphs with hifiasm. Nature Methods 18, 170–175, https://doi.org/10.1038/s41592-020-01056-5 (2021).

  22. Durand, N. C. et al. Juicer provides a one-click system for analyzing loop-resolution Hi-C experiments. Cell Systems 3, 95–98 (2016).

  23. Dudchenko, O. et al. De novo assembly of the Aedes aegypti genome using Hi-C yields chromosome-length scaffolds. Science 356, 92–95, https://doi.org/10.1126/science.aal3327 (2017).

  24. Durand, N. C. et al. Juicebox provides a visualization system for Hi-C contact maps with unlimited zoom. Cell Systems 3, 99–101, https://doi.org/10.1016/j.cels.2015.07.012 (2016).

  25. Alonge, M. et al. Automated assembly scaffolding using RagTag elevates a new tomato system for high-throughput genome editing. Genome Biology 23, https://doi.org/10.1186/s13059-022-02823-7 (2022).

  26. Bradnam, K. R. et al. Assemblathon 2: evaluating de novo methods of genome assembly in three vertebrate species. Gigascience 2, https://doi.org/10.1186/2047-217x-2-10 (2013).

  27. Seppey, M., Manni, M. & Zdobnov, E. M. BUSCO: Assessing Genome Assembly and Annotation Completeness, 227–245 (Springer New York, 2019).

  28. Rhie, A., Walenz, B. P., Koren, S. & Phillippy, A. M. Merqury: reference-free quality, completeness, and phasing assessment for genome assemblies. Genome Biology 21, https://doi.org/10.1186/s13059-020-02134-9 (2020).

  29. Smolka, M. et al. Detection of mosaic and population-level structural variants with Sniffles2. Nature Biotechnology 42, https://doi.org/10.1038/s41587-023-02024-y (2024).

  30. Kirsche, M. et al. Jasmine and Iris: population-scale structural variant comparison and analysis. Nature Methods 20, https://doi.org/10.1038/s41592-022-01753-3 (2023).

  31. European Nucleotide Archive. https://www.ebi.ac.uk/ena/browser/view/PRJEB67812 (2025).

  32. European Variation Archive. https://www.ebi.ac.uk/eva/?eva-study=PRJEB106599 (2026).

  33. The 29 maize lines SNP and SV variant set. https://doi.org/10.57745/7AUTOL (2025).

  34. Germplasm Resources Information Network (GRIN). https://doi.org/10.15482/USDA.ADC/1212393.

  35. European Nucleotide Archive. https://www.ebi.ac.uk/ena/browser/view/PRJEB60075 (2023).

  36. Byrne, P. F. et al. Sustaining the future of plant breeding: The critical role of the USDA-ARS National Plant Germplasm System. Crop Science 58, 451–468, https://doi.org/10.2135/cropsci2017.05.0303 (2018).

  37. Camus-Kulandaivelu, L. et al. Maize adaptation to temperate climate: Relationship between population structure and polymorphism in the dwarf8 gene. Genetics 172, 2449–2463, https://doi.org/10.1534/genetics.105.048603 (2006).

  38. Bouchet, S. et al. Adaptation of maize to temperate climates: Mid-density genome-wide association genetics and diversity patterns reveal key genomic regions, with a major contribution of the vgt2 (zcn8) locus. PLoS ONE 8, e71377, https://doi.org/10.1371/journal.pone.0071377 (2013).

Acknowledgements

We thank “La Région Occitanie” and the European Union for funding the project as part of the Occitanie Region’s “Regional Research and Innovation Platforms” call for projects under the FEDER-FSE MIDI-PYRENEES ET GARONNE 2014-2020 Operational Program. We thank KWS, Maisadour, Euralis, Caussade semences, Syngenta, RAGT and Limagrain for their financial support and their input in choosing the genetic material analyzed. We thank Valérie Combes for sample preparation, Delphine Madur and Nathalie Rivière for genotype validation, Gaëtan Givry for EVA data submission, and Jorge Duarte and Johann Joets for insightful discussions on maize genome scaffolding. We are grateful to Cyril Bauland for expertise in maize germplasm accession nomenclature. We thank Carine Palaffre and the French maize inbred lines seed bank (CRB, INRAE Saint Martin de Hinx), the U.S. National Plant Germplasm System (NPGS) [35] and the USDA Agricultural Research Service Germplasm Resources Information Network (GRIN) [36] for providing seeds with traced seedlots, as well as Adrienne Ressayre and Christine Dillmann (GQE-Le Moulon) for providing seeds of F252 and MBS847, Silvio Salvi (University of Bologna) for early access to seeds from the GF111 inbred line, Carlotta Balconi (CREA-Research Centre for Cereal and Industrial Crops) for providing access to Lo3, and CSIC (Consejo Superior de Investigaciones Científicas) for authorizing the use of EM1197. The GeT core facility (https://doi.org/10.15454/1.5572370921303193E12) is supported by the France Génomique National infrastructure, funded as part of the “Investissement d’avenir” program managed by the French Agence Nationale pour la Recherche (contract ANR-10-INBS-09). We are grateful to the genotoul bioinformatics platform Toulouse Occitanie (Bioinfo Genotoul, https://doi.org/10.15454/1.5572369328961167E12) for providing computing and storage resources.

Author information

Author notes
  1. These authors contributed equally: Camille Marcuzzo, Clément Birbes, Camille Eché.

Authors and Affiliations

  1. INRAE, GeT-PlaGe, Genotoul, 31326, Castanet-Tolosan, France

    Camille Marcuzzo, Camille Eché, Erwan Denis, Claire Kuchly, Caroline Vernette, Denis Milan, Cécile Donnadieu & Carole Iampietro

  2. Université Fédérale de Toulouse, INRAE, MIAT, BioinfOmics, 31326, Castanet-Tolosan, France

    Clément Birbes & Christine Gaspin

  3. Université de Toulouse, INRAE, GenPhySE, 31326, Castanet-Tolosan, France

    Arnaud Di Franco, Thomas Faraut & Denis Milan

  4. Groupe Limagrain, Centre de Recherche, Route d’Ennezat, Chappes, France

    Sébastien Praud

  5. Université Paris-Saclay, INRAE, AgroParisTech, GQE, Le Moulon, 91190, Gif-sur-Yvette, France

    Alain Charcosset & Stéphane D. Nicolas

  6. Université Paris-Saclay, INRAE, CNRS, AgroParisTech, GQE, Le Moulon, EMR GEvAD, 91190, Gif-sur-Yvette, France

    Clémentine Vitte

  7. Université Fédérale de Toulouse, INRAE, MIAT, Sigenae, BioInfo Genotoul, BioinfOmics, 31326, Castanet-Tolosan, France

    Christophe Klopp

Contributions

C.D., D.M. and Ch.G. conceived and supervised the whole “SeqOccIn” project. Cl.V. and A.C. conceived the maize-related sub-project of the “SeqOccIn” project. C.D., D.M., Ch.G., Cl.V. and A.C. secured funding. C.I. coordinated data generation and quality control. C.I., C.M., C.E. and E.D. produced sequence data. Ch.K., T.F. and Cl.K. supervised bioinformatic analyses. C.B., A.D.F., T.F., J.D., S.N., Cl.V. and Ch.K. analysed the results. S.P. and A.C. coordinated the selection of the inbred lines with private partners. Cl.K. and Ca.V. secured data and submitted them to public databases. C.I., Cl.V., Ch.K., S.N. and T.F. wrote the original draft of the manuscript. All authors reviewed the manuscript.

Corresponding authors

Correspondence to Cécile Donnadieu, Clémentine Vitte, Christophe Klopp or Carole Iampietro.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary file (PDF)

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

About this article

Cite this article

Marcuzzo, C., Birbes, C., Eché, C. et al. High-quality chromosome-scale genome assemblies of 29 maize inbred lines of European breeding relevance. Sci Data (2026). https://doi.org/10.1038/s41597-026-07055-z

  • Received: 07 July 2025

  • Accepted: 09 March 2026

  • Published: 19 March 2026

  • DOI: https://doi.org/10.1038/s41597-026-07055-z

Scientific Data (Sci Data)

ISSN 2052-4463 (online)
