Communications Biology
Molecular QTL are enriched for structural variants in a cattle long-read cohort
  • Article
  • Open access
  • Published: 21 January 2026

  • Xena Marie Mapel (affiliation 1; contributed equally),
  • Alexander S. Leonard (ORCID: 0000-0001-8425-5630; affiliation 1; contributed equally) &
  • Hubert Pausch (ORCID: 0000-0002-0501-6760; affiliation 1)

Communications Biology (2026)

We are providing an unedited version of this manuscript to give early access to its findings. Before final publication, the manuscript will undergo further editing. Please note there may be errors present which affect the content, and all legal disclaimers apply.

Subjects

  • Agricultural genetics
  • Gene expression
  • Genome-wide association studies
  • Quantitative trait

Abstract

Sequencing cohorts with long-read technology is crucial to understand the impact of structural variants (SVs) on complex traits. Here, we obtain 4.86 terabases of HiFi reads with an average read N50 of 16.3 Kb from 120 Bos taurus taurus bulls, yielding a mean coverage depth of 13.5-fold. We genotype 23.8 M small variants (SNPs and short INDELs) and 79.3 k SVs to perform association testing with molecular phenotypes derived from a subset of 117 bulls with total RNA sequencing data from testis tissue. We identify 27.3 k molecular QTL (molQTL) including 316 for which SVs were the most significant variant. This corresponds to a 2.1- and 5.6-fold enrichment of SVs among expression and splicing QTL, respectively. When considering SVs in perfect LD with the lead small variant, the enrichment increases to 6.1- and 12-fold for expression and splicing QTL in testis, respectively. Imperfect genotyping for SVs limits our ability to detect all SV molQTL, suggesting that the true enrichment of SVs among molQTL may be even higher. These results demonstrate that SVs have a profound impact on gene expression and splicing variation but highlight the necessity of improved SV genotyping to fully leverage long-read sequencing cohorts for dissecting complex traits.
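The fold-enrichment figures can be read as a ratio of two proportions: the share of SVs among lead molQTL variants divided by the share of SVs among all genotyped variants. The sketch below is a minimal illustration of that arithmetic using only the rounded, pooled counts quoted in the abstract. The function name `fold_enrichment` and the choice of all genotyped variants as the background set are assumptions made here; the exact definition used in the paper and the separate eQTL and sQTL counts are not given in this excerpt, so the pooled value is not a result of the study.

```python
def fold_enrichment(sv_lead, total_lead, sv_variants, total_variants):
    """Share of SVs among lead variants divided by share of SVs among all variants.

    Hypothetical helper for illustration; the background set (all genotyped
    variants) is an assumption, not the paper's stated method.
    """
    if total_lead <= 0 or total_variants <= 0 or sv_variants <= 0:
        raise ValueError("counts must be positive")
    observed = sv_lead / total_lead          # SV fraction among molQTL lead variants
    expected = sv_variants / total_variants  # SV fraction among genotyped variants
    return observed / expected


# Rounded counts taken from the abstract
n_small_variants = 23.8e6   # SNPs and short INDELs
n_svs = 79.3e3              # structural variants
n_molqtl = 27.3e3           # all molQTL (expression and splicing pooled)
n_sv_lead = 316             # molQTL whose most significant variant is an SV

pooled = fold_enrichment(n_sv_lead, n_molqtl, n_svs, n_small_variants + n_svs)
print(f"pooled SV fold enrichment: {pooled:.2f}")
```

With these rounded inputs the pooled ratio comes out at roughly 3.5, which sits between the 2.1-fold (expression QTL) and 5.6-fold (splicing QTL) values reported separately in the abstract. A ratio of 1 would mean SVs are lead variants exactly as often as their share of the variant catalogue predicts.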

Data availability

DNA and RNA sequencing data of the analysed cohort are available in the ENA database under the study accessions PRJEB42335 (Long-read sequencing data from cattle for the purpose of de-novo genome assembly), PRJEB28191 (Short read sequencing of cattle) and PRJEB46995 (Testis transcriptome of mature bulls). Accession identifiers for all samples are available as Supplementary Data 1. Gene expression and splicing matrices, a VCF file of genome-wide small and structural variant genotypes used for e/sQTL mapping, a cross-table linking genotype and transcriptome data, and the results from e/sQTL mapping have been archived at Zenodo (https://zenodo.org/records/15431126; ref. 79). The source data behind the graphs in the paper can be found in Supplementary Data 2 and in the same Zenodo record (ref. 79).

Code availability

Computational workflows are available through https://github.com/AnimalGenomicsETH/HiFi_cohort and are archived at Zenodo (https://zenodo.org/records/18172395; ref. 80).

References

  1. Daetwyler, H. D. et al. Whole-genome sequencing of 234 bulls facilitates mapping of monogenic and complex traits in cattle. Nat. Genet. 46, 858–865 (2014).

  2. McVean, G. A. et al. An integrated map of genetic variation from 1,092 human genomes. Nature 491, 56–65 (2012).

  3. Abecasis, G. R., Cherny, S. S. & Cardon, L. R. The impact of genotyping error on family-based analysis of quantitative traits. Eur. J. Hum. Genet. 9, 130–134 (2001).

  4. Sollis, E. et al. The NHGRI-EBI GWAS Catalog: knowledgebase and deposition resource. Nucleic Acids Res. 51, D977–D985 (2023).

  5. Hu, Z.-L., Park, C. A. & Reecy, J. M. Bringing the Animal QTLdb and CorrDB into the future: meeting new challenges and providing updated services. Nucleic Acids Res. 50, D956–D961 (2022).

  6. Canela-Xandri, O., Rawlik, K. & Tenesa, A. An atlas of genetic associations in UK Biobank. Nat. Genet. 50, 1593–1599 (2018).

  7. Weischenfeldt, J., Symmons, O., Spitz, F. & Korbel, J. O. Phenotypic impact of genomic structural variation: insights from and for human disease. Nat. Rev. Genet. 14, 125–138 (2013).

  8. Sedlazeck, F. J. et al. Accurate detection of complex structural variations using single-molecule sequencing. Nat. Methods 15, 461–468 (2018).

  9. Cameron, D. L., Di Stefano, L. & Papenfuss, A. T. Comprehensive evaluation and characterisation of short read general-purpose structural variant calling software. Nat. Commun. 10, 3240 (2019).

  10. Huddleston, J. et al. Discovery and genotyping of structural variation from long-read haploid genome sequence data. Genome Res. 27, 677–685 (2017).

  11. Kosugi, S. & Terao, C. Comparative evaluation of SNVs, indels, and structural variations detected with short- and long-read sequencing data. Hum. Genome Var. 11, 18 (2024).

  12. Ahsan, M. U., Liu, Q., Perdomo, J. E., Fang, L. & Wang, K. A survey of algorithms for the detection of genomic structural variants from long-read sequencing data. Nat. Methods 20, 1143–1158 (2023).

  13. Mikheyev, A. S. & Tin, M. M. Y. A first look at the Oxford Nanopore MinION sequencer. Mol. Ecol. Resour. 14, 1097–1102 (2014).

  14. Eid, J. et al. Real-time DNA sequencing from single polymerase molecules. Science 323, 133–138 (2009).

  15. Alonge, M. et al. Major impacts of widespread structural variation on gene expression and crop improvement in tomato. Cell 182, 145–161.e23 (2020).

  16. Beyter, D. et al. Long-read sequencing of 3,622 Icelanders provides insight into the role of structural variants in human diseases and other traits. Nat. Genet. 53, 779–786 (2021).

  17. Schloissnig, S. et al. Structural variation in 1,019 diverse humans based on long-read sequencing. Nature 644, 442–452 (2025).

  18. Duan, X., Pan, M. & Fan, S. Comprehensive evaluation of structural variant genotyping methods based on long-read sequencing data. BMC Genom. 23, 324 (2022).

  19. Yang, Q. et al. SVLearn: a dual-reference machine learning approach enables accurate cross-species genotyping of structural variants. Nat. Commun. 16, 2406 (2025).

  20. Kalleberg, J., Rissman, J. & Schnabel, R. D. Overcoming limitations to customize DeepVariant for domesticated animals with TrioTrain. Genome Res. 35, 1859–1874 (2025).

  21. Lloret-Villas, A., Bhati, M., Kadri, N. K., Fries, R. & Pausch, H. Investigating the impact of reference assembly choice on genomic analyses in a cattle breed. BMC Genom. 22, 363 (2021).

  22. Leonard, A. S. et al. Structural variant-based pangenome construction has low sensitivity to variability of haplotype-resolved bovine assemblies. Nat. Commun. 13, 3012 (2022).

  23. Leonard, A. S., Crysnanto, D., Mapel, X. M., Bhati, M. & Pausch, H. Graph construction method impacts variation representation and analyses in a bovine super-pangenome. Genome Biol. 24, 124 (2023).

  24. Talenti, A. et al. A cattle graph genome incorporating global breed diversity. Nat. Commun. 13, 910 (2022).

  25. Bhati, M., Mapel, X. M., Lloret-Villas, A. & Pausch, H. Structural variants and short tandem repeats impact gene expression and splicing in bovine testis tissue. Genetics 225, iyad161 (2023).

  26. Lee, Y.-L. et al. High-resolution structural variants catalogue in a large-scale whole genome sequenced bovine family cohort data. BMC Genom. 24, 225 (2023).

  27. Grant, J. R. et al. A large structural variant collection in Holstein cattle and associated database for variant discovery, characterization, and application. BMC Genom. 25, 903 (2024).

  28. Lee, Y.-L. et al. A 12 kb multi-allelic copy number variation encompassing a GC gene enhancer is associated with mastitis resistance in dairy cattle. PLoS Genet. 17, e1009331 (2021).

  29. Trigo, B. B. et al. Variants at the ASIP locus contribute to coat color darkening in Nellore cattle. Genet. Sel. Evol. 53, 40 (2021).

  30. Rothammer, S. et al. The 80-kb DNA duplication on BTA1 is the only remaining candidate mutation for the polled phenotype of Friesian origin. Genet. Sel. Evol. 46, 44 (2014).

  31. Milia, S. et al. Taurine pangenome uncovers a segmental duplication upstream of KIT associated with depigmentation in white-headed cattle. Genome Res. 35, 1041–1052 (2025).

  32. Durkin, K. et al. Serial translocation by means of circular intermediates underlies colour sidedness in cattle. Nature 482, 81–84 (2012).

  33. Küttel, L. et al. A complex structural variant at the KIT locus in cattle with the Pinzgauer spotting pattern. Anim. Genet. 50, 423–429 (2019).

  34. Kadri, N. K. et al. A 660-Kb deletion with antagonistic effects on fertility and milk production segregates at high frequency in Nordic Red cattle: additional evidence for the common occurrence of balancing selection in livestock. PLoS Genet. 10, e1004049 (2014).

  35. Venhoranta, H. et al. Ectopic KIT copy number variation underlies impaired migration of primordial germ cells associated with gonadal hypoplasia in cattle (Bos taurus). PLoS ONE 8, e75659 (2013).

  36. Mapel, X. M. et al. Molecular quantitative trait loci in reproductive tissues impact male fertility in cattle. Nat. Commun. 15, 674 (2024).

  37. Leonard, A. S., Mapel, X. M. & Pausch, H. Pangenome genotyped structural variation improves molecular phenotype mapping in cattle. Genome Res. 34, 300–309 (2024).

  38. Olagunju, T. A. et al. Telomere-to-telomere assemblies of cattle and sheep Y-chromosomes uncover divergent structure and gene content. Nat. Commun. 15, 8277 (2024).

  39. Crysnanto, D., Wurmser, C. & Pausch, H. Accurate sequence variant genotyping in cattle using variation-aware genome graphs. Genet. Sel. Evol. 51, 21 (2019).

  40. Jansen, S. et al. Assessment of the genomic variation in a cattle population by re-sequencing of key animals at low to medium coverage. BMC Genom. 14, 446 (2013).

  41. English, A. C., Cunial, F., Metcalf, G. A., Gibbs, R. A. & Sedlazeck, F. J. K-mer analysis of long-read alignment pileups for structural variant genotyping. Nat. Commun. 16, 3218 (2025).

  42. Sudmant, P. H. et al. An integrated map of structural variation in 2,504 human genomes. Nature 526, 75–81 (2015).

  43. Ebert, P. et al. Haplotype-resolved diverse human genomes and integrated analysis of structural variation. Science 372, eabf7117 (2021).

  44. Scott, A. J., Chiang, C. & Hall, I. M. Structural variants are a major source of gene expression differences in humans and often affect multiple nearby genes. Genome Res. 31, 2249–2257 (2021).

  45. Zhang, Y. et al. Structural variation reshapes population gene expression and trait variation in 2,105 Brassica napus accessions. Nat. Genet. 56, 2538–2550 (2024).

  46. Chiang, C. et al. The impact of structural variation on human gene expression. Nat. Genet. 49, 692–699 (2017).

  47. Billingsley, K. J. et al. Long-read sequencing of hundreds of diverse brains provides insight into the impact of structural variation on gene expression and DNA methylation. bioRxiv https://doi.org/10.1101/2024.12.16.628723 (2024).

  48. Marchini, J. & Howie, B. Genotype imputation for genome-wide association studies. Nat. Rev. Genet. 11, 499 (2010).

  49. Huang, L., Wang, C. & Rosenberg, N. A. The relationship between imputation error and statistical power in genetic association studies in diverse populations. Am. J. Hum. Genet. 85, 692–698 (2009).

  50. Noyvert, B. et al. Imputation of structural variants using a multi-ancestry long-read sequencing panel enables identification of disease associations. eLife 14, RP106115 (2025).

  51. Gong, J. et al. Long-read sequencing of 945 Han individuals identifies structural variants associated with phenotypic diversity and disease susceptibility. Nat. Commun. 16, 1494 (2025).

  52. Garimella, K. V. et al. Population-scale long-read sequencing in the All of Us Research Program. Preprint at medRxiv https://doi.org/10.1101/2025.10.02.25336942 (2025).

  53. Tang, L. et al. GWAS reveals determinants of mobilization rate and dynamics of an active endogenous retrovirus of cattle. Nat. Commun. 15, 2154 (2024).

  54. Adelson, D. L., Raison, J. M. & Edgar, R. C. Characterization and distribution of retrotransposons and simple sequence repeats in the bovine genome. Proc. Natl. Acad. Sci. USA 106, 12855–12860 (2009).

  55. Cui, Y. et al. Multi-omic quantitative trait loci link tandem repeat size variation to gene regulation in human brain. Nat. Genet. 57, 369–378 (2025).

  56. Li, H. Minimap2: pairwise alignment for nucleotide sequences. Bioinformatics 34, 3094–3100 (2018).

  57. Li, H. et al. The sequence alignment/map format and SAMtools. Bioinformatics 25, 2078–2079 (2009).

  58. Li, H. Aligning sequence reads, clone sequences and assembly contigs with BWA-MEM. Preprint at http://arxiv.org/abs/1303.3997 (2013).

  59. Vasimuddin, Md., Misra, S., Li, H. & Aluru, S. Efficient Architecture-Aware Acceleration of BWA-MEM for Multicore Systems. in Proc. 2019 IEEE International Parallel and Distributed Processing Symposium (IPDPS) 314–324 (IEEE, 2019).

  60. Quinlan, A. R. & Hall, I. M. BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics 26, 841–842 (2010).

  61. Yun, T. et al. Accurate, scalable cohort variant calls using DeepVariant and GLnexus. Bioinformatics 36, 5582–5589 (2021).

  62. Smolka, M. et al. Detection of mosaic and population-level structural variants with Sniffles2. Nat. Biotechnol. 42, 1571–1580 (2024).

  63. Benson, G. Tandem repeats finder: a program to analyze DNA sequences. Nucleic Acids Res. 27, 573–580 (1999).

  64. McLaren, W. et al. The Ensembl Variant Effect Predictor. Genome Biol. 17, 122 (2016).

  65. Chang, C. C. et al. Second-generation PLINK: rising to the challenge of larger and richer datasets. Gigascience 4, 7 (2015).

  66. Kirsche, M. et al. Jasmine and Iris: population-scale structural variant comparison and analysis. Nat. Methods 20, 408–417 (2023).

  67. Pedersen, B. S. & Quinlan, A. R. Mosdepth: quick coverage calculation for genomes and exomes. Bioinformatics 34, 867–868 (2018).

  68. Chen, S., Zhou, Y., Chen, Y. & Gu, J. fastp: an ultra-fast all-in-one FASTQ preprocessor. Bioinformatics 34, i884–i890 (2018).

  69. Bray, N. L., Pimentel, H., Melsted, P. & Pachter, L. Near-optimal probabilistic RNA-seq quantification. Nat. Biotechnol. 34, 525–527 (2016).

  70. Soneson, C., Love, M. I. & Robinson, M. D. Differential analyses for RNA-seq: transcript-level estimates improve gene-level inferences. F1000Res 4, 1521 (2015).

  71. Dobin, A. et al. STAR: ultrafast universal RNA-seq aligner. Bioinformatics 29, 15–21 (2013).

  72. van de Geijn, B., McVicker, G., Gilad, Y. & Pritchard, J. K. WASP: allele-specific software for robust molecular quantitative trait locus discovery. Nat. Methods 12, 1061–1063 (2015).

  73. Cotto, K. C. et al. Integrated analysis of genomic and transcriptomic data for the discovery of splice-associated variants in cancer. Nat. Commun. 14, 1589 (2023).

  74. Li, Y. I. et al. Annotation-free quantification of RNA splicing using LeafCutter. Nat. Genet. 50, 151–158 (2018).

  75. Delaneau, O. et al. A complete tool set for molecular QTL discovery and analysis. Nat. Commun. 8, 15452 (2017).

  76. Zhou, H. J., Li, L., Li, Y., Li, W. & Li, J. J. PCA outperforms popular hidden variable inference methods for molecular QTL mapping. Genome Biol. 23, 210 (2022).

  77. Salavati, M. et al. Improving the annotation of the cattle genome by annotating transcription start sites in a diverse set of tissues and populations using Cap Analysis Gene Expression sequencing. G3 13, jkad108 (2023).

  78. Karolchik, D. et al. The UCSC genome browser database. Nucleic Acids Res. 31, 51–54 (2003).

  79. Pausch, H., Mapel, X. & Leonard, A. A bovine long read cohort to identify SVs and quantify their impact on gene expression [Data set]. Zenodo https://doi.org/10.5281/zenodo.17338532 (2025).

  80. Leonard, A. & Mapel, X. AnimalGenomicsETH/HiFi_cohort: v.Paper Manuscript version. Zenodo https://doi.org/10.5281/zenodo.18172395 (2026).

Acknowledgements

This study was supported by an ETH Research Grant and a grant from the Swiss National Science Foundation (SNSF, grant-ID 204654). The funding bodies were involved neither in the design of the study, nor in the collection, analysis, and interpretation of data, nor in writing the manuscript. We thank Eirini Lampraki from Pacific Biosciences for DNA fragment analysis and sequencing. We thank Audald Lloret-Villas and Qiongyu He for valuable discussions.

Author information

Author notes
  1. These authors contributed equally: Xena Marie Mapel, Alexander S. Leonard.

Authors and Affiliations

  1. Animal Genomics, ETH Zurich, Zurich, Switzerland

    Xena Marie Mapel, Alexander S. Leonard & Hubert Pausch

Contributions

X.M.M. sampled tissue and purified HMW DNA, aligned RNA reads against the reference, developed and applied workflows to quantify gene expression and splicing variation, conducted molecular QTL mapping, interpreted results, and drafted the manuscript; A.S.L. aligned DNA reads against the reference, called variants from short- and long-read alignments, interpreted results, and drafted the manuscript; H.P. conceived the study, contributed to the analysis of expression and splicing QTL, interpreted results, and contributed to the writing of the manuscript. All authors approved the final version of the manuscript.

Corresponding author

Correspondence to Hubert Pausch.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Communications Biology thanks Tuan V. Nguyen and the other anonymous reviewer(s) for their contribution to the peer review of this work. Primary handling editors: Ani Manichaikul and Laura Rodríguez Pérez. A peer review file is available.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Information

Description of Additional Supplementary Files

Supplementary Data 1

Supplementary Data 2

Reporting summary

Transparent Peer Review file

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Mapel, X.M., Leonard, A.S. & Pausch, H. Molecular QTL are enriched for structural variants in a cattle long-read cohort. Commun Biol (2026). https://doi.org/10.1038/s42003-026-09596-w

  • Received: 19 June 2025

  • Accepted: 14 January 2026

  • Published: 21 January 2026

  • DOI: https://doi.org/10.1038/s42003-026-09596-w

Communications Biology (Commun Biol)

ISSN 2399-3642 (online)
