Mitochondrial genomes of Middle Pleistocene horses from the open-air site complex of Schöningen

Weingarten, Arianna; Häusler, Meret; Serangeli, Jordi; Verheijen, Ivo; Reiter, Ella; Radzevičiūtė, Rita; Stoessel, Alexander; Krause, Johannes; Spyrou, Maria A.; Conard, Nicholas J.; Nieselt, Kay; Posth, Cosimo

doi:10.1038/s41559-025-02859-5

Download PDF

Article
Open access
Published: 01 October 2025

Mitochondrial genomes of Middle Pleistocene horses from the open-air site complex of Schöningen

Nature Ecology & Evolution (2025)Cite this article

1151 Accesses
97 Altmetric
Metrics details

Subjects

Abstract

Deep-time palaeogenomics offers rare insights into macroevolutionary events for both extant and extinct species. Aside from a Middle Pleistocene genome from North American permafrost (780–560 ka) and a number of Late Pleistocene specimens, most ancient horse DNA studies have focused on tracing the origins of domestication and subsequent periods. Here we present mitochondrial genomes from two Equus mosbachensis specimens from Schöningen, Germany, a Middle Pleistocene archaeological site complex with direct and repeated evidence of hominin–horse interactions on the shore of a palaeolake. Using petrous bone sampling, targeted enrichment and damage-aware and polarization-free mitochondrial DNA reconstruction methods, we extend the range of genome recovery in open-air sites to ~300,000 years ago. Phylogenetic analyses position these mitochondrial DNAs in two distinct, deeply divergent lineages, basal to both previously sequenced ancient Eurasian specimens and all modern-day horses. The Schöningen horse mitochondrial DNA data reveal a previously unrecognized diversification event within the clade, ultimately giving rise to modern-day horses, that is molecularly dated to ~570 ka and provides genetic support for the morphological species assignment. By extending the recoverable limits of ancient DNA from Middle Pleistocene open-air sites, our molecular findings bridge a temporal and geographic gap, providing insights on early evolutionary events within the genus Equus.

Equus mitochondrial pangenome reveals independent domestication imprints in donkeys and horses

Article Open access 25 February 2025

Comparative analysis of the mitochondrial genomes of the soft-shelled turtles Palea steindachneri and Pelodiscus axenaria and phylogenetic implications for Trionychia

Article Open access 28 February 2025

Mitochondrial genomes reveal mid-Pleistocene population divergence, and post-glacial expansion, in Australasian snapper (Chrysophrys auratus)

Article Open access 03 December 2022

Main

The Pleistocene epoch from ~2.6 million years ago (Ma) to 11.7 thousand years ago (ka) is characterized by extreme climatic shifts, diverse megafaunal extinctions and the emergence and radiation of archaic and modern humans^1,2,3,4. Recent innovations in laboratory and computational techniques have enabled the recovery of highly degraded ancient biomolecules, allowing a deeper investigation of the Pleistocene fossil record^5,6. While fossils from the Late Pleistocene have yielded important genomic results, samples from deep-time palaeogenomics, specifically organisms from the Early and Middle Pleistocene (~2.6 Ma to 126 ka), remain limited⁷. This type of data has the potential to provide direct insights into the processes of divergence, speciation and adaptation, especially for long-extinct species. However, the analysis of genomes from this period presents major analytical challenges related to extra-short fragment lengths, high levels of chemical modifications, low copy numbers and phylogenetically distant reference genomes⁸. Although optimized experimental approaches can mitigate these issues, the depositional environment can have a pronounced influence on post-mortem DNA survival⁹. Preservation of ancient DNA (aDNA) from skeletal remains is favoured by low temperatures in permafrost and glacial environments^10,11. These conditions have yielded the oldest sequenced specimens, including genome-wide data of two mammoths from Siberia (Russia) dated to 1.2–1.1 Ma (ref. ¹²) and a low-coverage horse genome from the Yukon (Canada) dated to 780–560 ka (ref. ¹³). The latter was used to establish the most recent common ancestor of the genus Equus (horses, zebras and donkeys) at 4.5–4.0 Ma, doubling previous estimates based on fossils. In temperate environments, cave sites appear to best preserve DNA, possibly due to minimal temperature variation and protection from external disturbances, with the oldest genetic data published so far dated to 430 ka from Sima de los Huesos (Spain)^14,15,16. Open-air sites, although usually considered less ideal for DNA retrieval, have also provided genomic results from the Middle Pleistocene, such as the elephant DNA from Neumark-Nord and Weimar-Ehringsdorf dated to 120 ka and 240 ka, respectively¹⁷. While deep-time genomes recovered thus far align with these expected preservation trends (Fig. 1), with environmental DNA from permafrost extending as far back as 2 Ma (ref. ⁹), the limits of DNA preservation at temperate environments are still unexplored.

**Fig. 1: Environmental contexts and temporal distributions of pre-100 ka nuclear genomic data and/or mitogenomes.**

The Equidae family, of which Equus is the only extant genus, has a fossil record spanning 55 Ma, making it one of the best-documented examples of macroevolution and an ideal candidate for deep-time palaeogenomic research. Today, Equus consists of three subgenera: Hippotigris (zebras), Asinus (donkeys) and Equus (horses). However, the palaeontological record, dating back to the Eocene (55–34 Ma), reveals a far greater diversity, with over 35 genera and hundreds of extinct species¹⁸. Most of the macroevolutionary history of horses so far has been reconstructed through morphological studies. Equus originated in North America, and the Pleistocene fossil record indicates several successful dispersals into Eurasia across the Bering Land Bridge¹⁹. Two major migrations from North America into Eurasia occurred: the first, ~2.6 Ma, involved stenoid horses, the ancestors of all modern zebras and donkeys²⁰. Except for within Africa, many of these early lineages were eventually replaced by a second migration of caballine horses across the Bering Land Bridge ~0.9–0.8 Ma. Although several non-caballine lineages in Eurasia persisted well into the Holocene and exhibited their own diversification and phylogeographic structure, the caballine lineage expanded widely and became dominant. This lineage further diversified and became strongly phylogeographically structured until the early Holocene, when the diversity of caballine outside of Eurasia was entirely lost²¹. As a result, all living horses today are descendants of the sole suriving Eurasian clade associated with the second major dispersal out of North America into Eurasia²².

While ancient horse genomics research has predominantly examined Holocene domestication origins, understanding these developments required reconstructing predomestication population histories^23,24,25. Such reconstructions extended beyond radiocarbon dating limits and led to an even deeper chronology of human–horse interactions, one in which horses played a pivotal role in the subsistence strategies of archaic hominins²⁶. Some of the earliest and best-documented evidence of this relationship comes from excavations at the Middle Pleistocene site complex of Schöningen in Lower Saxony, Germany, dating back 320–300 ka, which unearthed the world’s oldest complete wooden spears alongside the remains of 20–25 butchered horses^26,27,28 (Extended Data Fig. 1 and Supplementary Note 1). Despite the site’s extensive faunal record, which includes over 20,000 large mammal remains²⁹, its unique environmental context³⁰ and published stable isotopic data³¹, aDNA results from Schöningen have never been published.

In this study, we present the analysis of two largely complete mitochondrial genomes from an extinct horse species (Equus mosbachensis) excavated at Schöningen to investigate their phylogenetic positioning and to explore the diversification events within the equid mitochondrial DNA (mtDNA) evolutionary history.

Results

Mitochondrial genomes of Middle Pleistocene horses

We generated nearly complete mitochondrial genomes (94% and 82% coverage at 3×) from DNA extracted from the petrous portions of the temporal bone of two specimens morphologically identified as a large Equus, which in Schöningen are associated with E. mosbachensis (SCEN001; archaeological ID: 4709 and SCEN002; archaeological ID: 25189) (Extended Data Fig. 2). Using optimized approaches to recover degraded, short DNA fragments^14,32, we prepared a total of six non-uracil-DNA glycosylase (UDG) single-stranded libraries per sample³³. Two genetic libraries generated for both specimens underwent an initial shallow shotgun screening for ~15 million raw reads. Mapping against the E. caballus nuclear genome yielded 0.8% (SCEN001) and 1.66% (SCEN002) endogenous DNA. The average fragment length of the mapped reads was ~34 bp with a C-to-T substitution rate of 72% (SCEN001) and 66% (SCEN002) at the 5′ end, and an excess of upstream guanine residues indicative of depurination-driven fragmentation, both consistent with expectations for deep-time palaeogenomes (Extended Data Fig. 3 and Supplementary Data 3). Despite the low genome-wide coverage, we were able to perform sex determination with SCEN001 identified as male and SCEN002 as female (Supplementary Data 4).

We subsequently enriched each library with one or two rounds of capture using DNA probes that encompass the entire horse mtDNA^34,35,36. After sequencing, the EAGER (v1.92.55) pipeline was utilized to map our data to the E. caballus reference mitogenome, yielding a coverage of 11.2× for SCEN001 and 7× for SCEN002 (ref. ³⁷). The post-mtDNA capture average fragment length ranged between 36 bp and 37 bp and 5′-end deamination between 51% and 73% across all libraries from both samples. The post-capture enrichment factor of 2,121 was calculated for SCEN001, reflecting highly efficient mtDNA recovery (Supplementary Data 5).

Ancient mitogenome reconstruction

The Schöningen data present a particularly challenging case due to their short fragment lengths, low coverage and high rates of deamination-induced substitutions. In such contexts, existing genome reconstruction approaches may have limitations. aDNA damage is usually mitigated through trimming sequence ends³⁸, masking C-to-T mismatches³⁹ or rescaling correction^22,40. However, the first two approaches can discard substantial amounts of data, while the latter generally relies on the reference genome; all may lead to overly stringent data filtering. At the same time, they can also overlook the identification of damaged reads, which can introduce bias or lead to the loss of valuable signals in highly divergent or poorly preserved ancient samples, such as those from Schöningen.

To address these limitations, we developed and implemented three complementary approaches designed to be explicitly damage-aware and perform well on low-coverage data to maximize sequence recovery while minimizing miscalling errors (Fig. 2). Each method leverages the characteristic damage patterns of single-stranded libraries (C-to-T transitions on forward mapping reads, G-to-A on reverse-mapping reads; Supplementary Note 2). We evaluated the damage-aware reconstructions by comparing them with a non-damage-aware reconstruction that performs consensus base calling solely based on coverage and base frequency. For the reconstructions of SCEN001 and SCEN002, we required a minimum coverage of 3× and at least 65% support for the most frequent base to make a confident base call.

**Fig. 2: Schematic overview of the damage-aware genome reconstruction method.**

Polarization-based damage silencing

This classical approach uses a modern reference genome to identify damage. At any site where the reference has a C, forward mapping reads showing a T are considered as damaged and replaced with an uninformative base (N). Similarly, As on reverse-mapping reads are silenced when the reference shows a G.

Using this method, SCEN001 exhibited 2,973 potentially damaged positions, while SCEN002 had 1,733. The consensus resulted in 1,079 (6.5%) non-informative base calls for SCEN001 and 3,282 (19.8%) for SCEN002, of which 91 and 173, respectively, occurred at positions that were identified as damaged (Supplementary Data 12). The high number of N calls in SCEN002, despite fewer detected damaged positions, reflects its lower coverage. While similar methods have been used successfully in other ancient mtDNA studies, the evolutionary distance between these Middle Pleistocene horses and modern horses can make a reference genome less reliable for accurately identifying damage¹⁶. We therefore implemented two further damage-aware methods, as explained below.

Polarization-free damage silencing

To identify damage patterns without reference genome polarization, this method analyses variation patterns directly among aligned reads. A position is flagged as potentially damaged when the following two conditions are met: (1) at least one T occurs in a forward-oriented read, and (2) a C is observed at the same position in any read (regardless of orientation). This read-internal comparison detects C-to-T (and G-to-A in reverse-oriented reads) damage patterns while remaining independent of reference genome comparisons. At flagged positions, Ts (forward reads) and As (reverse reads) are then silenced (that is, replaced by N) during consensus calling.

This approach identified 3,226 damaged sites in SCEN001 and 2,147 in SCEN002. In the consensus sequence of SCEN001, 1,060 (6.4%) positions were non-informative base calls, of which 95 were at positions identified as damaged. The consensus sequence of SCEN002 showed 3,281 (19.8%) non-informative base calls; 198 of these occurred at putatively damaged sites (Supplementary Data 12). Although it detected more damage than the polarization-based method, it produced slightly fewer Ns because it more precisely targets likely damaged bases.

Polarization-free damage weighting

This approach weights base calls during consensus generation: damaged bases (Ts from C-to-T transitions on forward strands; As from G-to-A on reverse strands) are downweighted by (1 − damage frequency), while their undamaged counterparts (Cs and Gs, respectively) are upweighted by (1 + damage frequency), maintaining probabilistic equilibrium. These position and sample-specific weights, derived from DamageProfiler⁴¹, are applied at the consensus-calling stage to simultaneously suppress damage artefacts while amplifying authentic signals, independently of base quality scores.

Although this method considers the same number of damaged positions as the polarization-free damage silencing approach, it resulted in even fewer non-informative calls: 1,014 (6.1%) for SCEN001, 49 of these at damaged positions, and 3,186 (19.2%) for SCEN002, of which 103 occurred at positions identified as damaged (Supplementary Data 12). Thus, in comparison with polarization-based damage silencing and polarization-free damage silencing, polarization-free damage weighting reduces the number of non-informative base calls at damaged positions by more than 40%. Simulations and empirical comparisons confirmed that this method performs best in low-coverage datasets, enabling more confident calling and retaining more usable data (Supplementary Note 2). Manual inspection of private substitutions across all methods also showed that this approach yielded the most reliable results, with fewer erroneous or ambiguous calls (Supplementary Data 11). For all downstream analyses, we thus focus on the results obtained using the polarization-free damage weighting approach, but results from the other reconstruction methods are presented in Extended Data Figs. 4–7 and discussed in Supplementary Note 3.

Mitochondrial genome phylogeny

To investigate the phylogenetic position and evolutionary history of the reconstructed Schöningen mitogenomes, we compiled a dataset of 146 previously published pre-Holocene ancient mtDNAs (one Middle Pleistocene (TC21) and 145 Late Pleistocene) alongside the two newly generated Middle Pleistocene mitogenomes (Fig. 3a and Supplementary Data 2). It has been shown that pre-Holocene caballine horse mtDNA clades are phylogeographically structured, reflecting two major intercontinental dispersal events across the Bering Land Bridge during the Pleistocene⁴². The first was from North America (clade B) to Eurasia (clades A and C), during the Middle Pleistocene²⁰. The second, extending into the Late Pleistocene, was a back migration from Eurasia (clade A) into North America (clades A1 and A2)²². We built individual maximum likelihood phylogenetic trees for the four consensus versions obtained from each reconstruction method for the Schöningen samples (Extended Data Figs. 4–7). Across all reconstructions, the resulting topologies consistently reproduced the previously established phylogeographic structure^22,42 with the Schöningen mitogenomes occupying basal positions within clade A. Specimen SCEN001 diverges earlier than SCEN002, with both individuals falling on distinct clades, supported by high bootstrap values (Fig. 3b). In addition, the inclusion of modern mitogenomes in the ancient dataset (resulting in a combined dataset of n = 171) demonstrated that clade A encompasses the full extent of mitochondrial genome diversity present in modern-day horses (Extended Data Fig. 8).

**Fig. 3: Ancient horse mitochondrial genome phylogeography and evolution.**

In addition, we built phylogenetic trees using a maximum parsimony approach to assess the robustness of the tree topology using an alternative method (Extended Data Fig. 9). The placement of the Schöningen samples is consistent across both reconstruction approaches, appearing basal to all known clade A mtDNA diversity. However, the position of clade C differed: it grouped with clade B in the maximum parsimony tree but with clade A in the maximum likelihood analysis. Notably, the maximum likelihood topology received stronger bootstrap support for this relationship (96% for the maximum likelihood tree versus 52% for the maximum parsimony tree). This instability in the placement of clade C has been reported previously and probably reflects limited phylogenetic resolution at the root of the tree, consistent with a potential polytomy during the early diversification of clades A, B and C. Nevertheless, the maximum likelihood tree, which places clade C as sister to Eurasian clade A (Fig. 3b), aligns with earlier studies and supports a single major dispersal of caballine horses from North America into Eurasia^22,42,43.

Molecular dating and divergence-time estimates

To estimate the timing of evolutionary events within the equid mtDNA tree, and to infer the molecular age of Schöningen mtDNA, we conducted a dating analysis using BEAST v2.6.6⁴⁴. For this analysis, we use the mitochondrial genome from SCEN001 reconstructed with the polarization-free damage weighting method, as it has higher coverage and carries fewer missing sites. In addition, we focused on a subset of the dataset that includes both modern-day horses and Pleistocene horses with known dates and a low occurrence of missing data, resulting in a multiple sequence alignment including 113 mitogenomes (Supplementary Data 6). For the dating with BEAST, we used a previously calculated mtDNA mutation rate⁴⁵ and radiocarbon dates for ancient individuals <50 ka and the geological age of TC21¹³ to anchor the tree. For SCEN001, we provided a temporal prior between 500 ka and 100 ka to not constrain its date to previously proposed age estimations. The results of the BEAST analysis revealed a molecular age for the SCEN001 branch length of 359,860 years (95% highest posterior density (HPD) 500,00–191,690). Despite the large HPD interval, the mean age is broadly in agreement with the U-series dates and biostratigraphic proxies of the corresponding Schöningen archaeological layers, 13II-4 b/c. In addition, we estimate that clades (A,C) and B diverged ~800 ka (95% HPD 956,970–678,990 years) (Fig. 4 and Table 1), corroborating previously published approximations for the coalescence time of the Eurasian and North American mtDNA clades^13,22,42,46. Similar to what was previously reported²², but with achieved BEAST run convergence, the coalescence age between all previously sequenced mtDNA of extant and extinct horses belonging to clade A was dated to ~230 ka (95% HPD 314,540–160,690 years). Finally, the inclusion of the Schöningen mitogenome revealed a previously undescribed deep split within clade A, with the divergence between SCEN001 and all other clade A lineages estimated at ~573,310 years ago (95% HPD 752,330–380,280 years). This finding establishes an upper boundary for the origin of clade A and, by extension, the maternal ancestry of modern-day horses.

**Fig. 4: The Bayesian phylogeny of horse mtDNA sequences is represented as a maximum clade credibility tree.**

Table 1 Divergence times of mtDNA clades and molecular ages estimated in BEAST v2.6.6

Full size table

To independently assess divergence time estimates and evaluate the potential impact of population structure assumptions on molecular dating, we conducted a complementary analysis using least-squares dating (LSD2)⁴⁷. This analysis was performed on the same dataset (n = 113), with a root prior of 4.25 Ma and identical tip dates. The resulting divergence estimates for key nodes, ~553,483 years ago (95% confidence interval 702,028–445,173) for the SCEN001 and clade A split and ~269,373 years ago (95% confidence interval 381,203–173,407) for clade A, are highly comparable to the BEAST results, supporting the consistency of the inferred divergence times across different modelling approaches.

Discussion

We present the mitochondrial genomes and genetic sexing of two E. mosbachensis specimens, reconstructing their matrilineal evolutionary history within the context of equine macroevolution. These genetic data were sequenced from morphologically identified petrous bones recovered at Schöningen using micro-computed tomography (micro-CT) scans to enhance sampling accuracy and laboratory techniques used to recover highly fragmented aDNA. Shallow shotgun sequencing revealed both a male and a female individual. Morphological analysis identified the male (SCEN001) as a subadult, found in direct association with a wooden spear.

Reconstructing the ancient mitogenomes of the Schöningen horses posed challenges due to extensive DNA fragmentation and damage, particularly C-to-T and G-to-A transitions. Although UDG treatment is often used to reduce such damage, it was not applied here due to the observation of borderline biomolecular preservation of these Middle Pleistocene, open-air samples, where enzymatic treatment could further reduce already limited endogenous DNA proportions and library complexity.

To evaluate the impact of damage, we initially tested a basic, no-correction reconstruction approach using a simple minimum coverage of 3 and 65% support threshold. Although this method is prone to introduce erroneous base calls due to uncorrected damage, the resulting phylogeny is nearly identical to those generated with damage-aware approaches (Extended Data Figs. 4–7), supporting the robustness of our reconstruction framework despite the expected longer branches when damage is not explicitly modelled.

To more accurately account for damage, we developed and applied two polarization-free computational methods (polarization-free damage silencing and polarization-free damage weighting) and compared these with a more classical polarization-based damage silencing approach. The latter method identifies damaged positions by comparing reads with a modern reference genome, resulting in a substantially higher number of non-informative base calls due to the evolutionary distance with the ancient mitogenomes. To overcome this limitation, the polarization-free damage silencing method detects damaged positions solely on the basis of read alignment characteristics, reducing non-informative calls through a more targeted damage masking. The polarization-free damage weighting method further expands data recovery by downweighting damaged bases and upweighting undamaged ones using sample-specific damage profiles. This approach minimized non-informative calls, especially in low-coverage datasets, and outperformed silencing methods with both simulated and empirical datasets. Together, the polarization-free approaches enhance the accuracy of ancient mitogenome reconstruction and highlight the importance of more unbiased and better-calibrated methods for highly degraded DNA. Therefore, caution is warranted when using the more classical polarization-based damage silencing approach in downstream analyses such as molecular dating, where damage or imbalanced masking can bias divergence and tip-dating estimates.

The Schöningen Spear Horizon from which the analysed horse skeletal remains derive is biostratigraphically dated to ~320–300 ka (ref. ²⁹). This age determination aligns with geochronological, faunal and botanical evidence placing the site with the interglacial phase of MIS 9⁴⁸. However, recent amino acid geochronology of snail opercula suggests that the Spear Horizon dates to approximately 200 ka, placing it in association with MIS 7⁴⁹. Despite the large uncertainty, our molecular dating with a mean estimate for the Schöningen mitogenome of ~360 ka is more in line with the older chronology of the assemblage. While permafrost has yielded the oldest DNA retrieved from skeletal remains dating to ~1.2 Ma (ref. ¹²) and cave contexts the next oldest to ~400 ka (ref. ¹⁴), the recovery of Schöningen horse mitogenomes extends the known limit of DNA preservation in open-air sites beyond what was known so far (that is, ~240 ka)¹⁷.

Although cave environments are widely recognized for promoting DNA preservation due to their constant humidity and stable, low temperatures^16,50,51, the Schöningen site complex demonstrates that comparable preservation can occur under open-air conditions. Its permanently waterlogged and anoxic sediments might have created a microenvironment that inhibited oxygen exposure, microbial activity and thermal fluctuations, possibly permitting aDNA survival⁵². The preservation of genetic material in open-air sites is rare. However, when conditions are favourable, these sites can yield aDNA of quality and, occasionally, age comparable to that found in caves, although still not matching that from permafrost.

In the context of palaeontology, E. mosbachensis is recognized as the first true caballoid horse to emerge in Europe during the Middle Pleistocene, endemic to central Europe^53,54. Morphological homogeneity in Middle Pleistocene equids contrasts with the variability in earlier specimens, but this reduced variability led to taxonomic over splitting, with numerous species names reflecting a lack of consensus on species, subspecies or ecomorph designations⁵³. Archaeogenetic studies have addressed these taxonomic issues across extinct equid lineages, reclassifying Onohippidium as Hippidion devillei⁵⁵, revising the placement of Hippidion⁵⁶ and identifying Harringtonhippus as a distinct genus⁴⁶. At the mtDNA level, the divergence of mtDNA clades A, B and C at ~800 ka, combined with the ancient distribution of clade C in East Asia and western Beringia, suggests that the split between A and C, dated here to ~688 ka, probably occurred in northeastern Siberia shortly after horses dispersed from North America. Clade A then expanded westwards across Eurasia, with subsequent reintroductions of sublineages A1 and A2 into the Americas. Both Schöningen mitogenomes occupy a basal position relative to all extant horse mtDNA diversity (clade A), representing deeply divergent and previously unknown lineages. Although mtDNA does not provide insights into genetic admixture or contributions to later populations, this phylogenetic placement suggests a common ancestor for Schöningen and modern horses more recent than their divergence from other extinct equine clades. We date this divergence to ~570 ka (HPD ~750–380 ka), which represents the upper boundary for the origin of clade A. Furthermore, the Schöningen mitogenome provides an additional data point for dating later evolutionary events within clade A. Our BEAST analysis supports the start of major horse mtDNA diversification after ~230 ka (HPD ~314–160 ka). This period overlaps with the interglacial MIS 7, potentially reflecting environmental changes that contributed to shaping equine evolution^48,57,58. Clade A includes specimens assigned to multiple Equus species, such as E. caballus, E. ferus and E. przewalskii. With the SCEN001 lineage diverging on average more than 300,000 years before the differentiation of clade A, its assignment to E. mosbachensis is supported genetically.

In conclusion, our study offers a molecular perspective on early Equus evolution, uncovering the spatial and temporal dynamics of the genus from the Middle Pleistocene onwards. As technological advances continue to improve aDNA recovery and analysis, new insights from deep-time palaeogenomics will further enrich our understanding of extinct species and their interactions with early humans.

Methods

Equus mosbachensis samples and micro-CT scans

In conducting this research, we sampled two morphologically identified Equus petrous bones from the collection at the Research Museum Schöningen, Lower Saxony, Germany. The samples were selected on the basis of visibly good preservation, completeness and availability from the collection. SCEN001 belongs to specimen ID 4709, a petrous bone associated with a complete skull of a young E. mosbachensis male horse discovered from the Spear Horizon, in close proximity to some of the wooden spears (Extended Data Fig. 1c). SCEN002 is an isolated petrous bone with ID 25189, morphologically identical to SCEN001, recovered from the archaeological layer 3b, located approximately 3 m below the Spear Horizon. The archaeological context of the samples can be found in (Supplementary Data 7). Before sampling, both petrous samples were micro-CT scanned using the Bruker SkyScan 2211 X-ray nanotomograph at the former Max Planck Institute for the Science of Human History (MPI-SHH), Jena, Germany. This was carried out to create a high-resolution three-dimensional image of the inner ear morphology before destructive sampling. The image also served as a guide during sampling, improving the chances of drilling closer to the bony labyrinth. Photographs of the samples and their three-dimensional reconstructions can be found in Extended Data Fig. 2.

Sampling, DNA extraction, library preparation and shotgun sequencing

This laboratory work took place at the aDNA lab of the former MPI-SHH, Jena, Germany. Both samples underwent ultraviolet irradiation for 30 min to reduce surface contamination. SCEN001 was cut in half with a saw blade, to more easily access the inner ear, based on the orientation provided by the micro-CT scans. Instead, SCEN002 was sampled from the outside of the bone, removing the exposed surface of the petrous with a dentistry drill. We then produced ~50 mg of bone powder (SCEN001 49.9 mg, SCEN002 53.7 mg) from the protected bony labyrinth region. DNA extraction, including lysis, binding and purification steps, was performed according to Dabney et al.¹⁴ with modifications by Rohland et al.³² on an Agilent Bravo Automated Liquid Handling Platform robot. This protocol produces 1,000 µl of lysate, of which 125 µl are used to prepare 30 µl of extract, which is used as input for one library. Here, six libraries were produced. Library preparation was also carried out on the robot, according to the single-stranded protocol published in the work of Gansauge et al.³³, with no UDG treatment applied. Libraries were double-indexed using unique combinations of sample-specific P5 and P7 primers during an initial indexing PCR performed with AccuPrime Pfx (Invitrogen) DNA polymerase for 35 cycles. After indexing, libraries were purified using solid-phase reversible immobilization beads to remove excess primers and small fragments. To increase library yield, an additional 15-cycle amplification using Herculase II Fusion DNA Polymerase was carried out. Finally, a single-cycle reconditioning PCR with Herculase was performed. One library per sample was then sequenced on an Illumina HiSeq4000 platform producing 15 million reads across two sequencing runs of single-end 75 cycles.

Bait production for mitochondrial genome enrichment

This laboratory work was performed in the modern laboratory of the Archaeo- and Paleogenetics group at the University of Tübingen, Tübingen, Germany. The amplified single-stranded libraries were enriched for E. caballus mtDNA with baits created in-house. Using previously published primers⁴², we performed long-range PCR on a modern-day horse skeletal muscle tissue sample. DNA was extracted using DNeasy Blood & Tissue Kit (Qiagen), following the manufacturer’s instructions. For purification, the NEB monarch kit was used, and the samples were eluted in 100 µl of the kit’s elution buffer. The modern purified extract concentration was 30.6 ng µl⁻¹, measured on a NanoDrop 8000 spectrophotometer (Thermo Scientific). Long-range PCR was carried out with the Expand Long Range dNTPack kit (Roche), following the instruction manual (version 08). The temperature profile used on the thermoblock, along with the elongation time and annealing temperature (Ta) for each primer pair, can be found in Supplementary Data 8. The PCR fragment sizes were then checked by running an aliquot on a 1% agarose gel. PCR products were purified with the NEB monarch kit and eluted in 120 µl of low-TE buffer. The purified PCR products (to be used as baits) and the horse DNA extract (to be used as a positive control) were then sheared to 400–550-bp fragments using Covaris microTUBEs at the Max Planck Institute for Biology in Tübingen, Germany. Fragment lengths were verified on a TapeStation 4150 (Agilent). The sheared (<600-bp) fragments of the long-range PCR products were used to generate baits for mtDNA capture following the protocol outlined by Furtwängler et al.³⁵, which converted the fragmented products to an immortalized bait library by ligating double-stranded adaptors APL5 and APL6. Single-stranded biotinylated probes were then generated by using APL2 primers as in the work of Fu et al.³⁶.

Positive control for capture enrichment

The sheared (<600 bp) DNA from the modern horse extract was converted into a double-stranded, double-indexed Illumina library^59,60 at the molecular biology laboratory of the University of Tübingen. Indexing PCR was performed using AccuPrime Pfx DNA polymerase (Invitrogen) for ten cycles, followed by purification with the Monarch PCR & DNA Cleanup Kit (NEB). The indexed library was then amplified with Herculase II Fusion DNA Polymerase to achieve a final concentration exceeding 200 ng µl⁻¹, as measured with a NanoDrop 8000 spectrophotometer (Thermo Scientific). The number of cycles and reaction splits were calculated to stay below a total yield of 1.0 × 10¹³ molecules. The amplified library was then used for mtDNA capture.

Mitochondrial DNA capture

The six libraries then underwent one or two rounds of in-solution target enrichment capture³⁵, with a modification to the hybridization and wash temperatures, which were lowered to 60 °C and 55 °C (ref. ¹⁶). The hybridization time was also modified from 48 to 24 h. After the washing steps, the captured samples had 15 µl of TET (Tris-HCI, EDTA, Tween-20) added, from which a 1:10 dilution was made to perform quantitative PCR to determine the copy number for a final amplification with Herculase II, directly on the beads. Each sample was then split into three reactions (5 µl template per run) and amplified, followed by a purification with MinElute columns, and eluted in 11 µl of TET (Tris-HCI, EDTA, Tween-20). The enriched samples were then quantified using a TapeStation 4150 (Agilent) and diluted to 10-nM pools for sequencing on a HiSeq4000 at the Max Planck Institute for Evolutionary Anthropology in Leipzig.

Data processing

Raw sequencing reads were demultiplexed, allowing for a maximum of one mismatch per index. Subsequent data processing was performed using the EAGER pipeline (v1.92.55)³⁷. AdapterRemoval (v2.2.0) was used to trim adapters from both read ends and to discard fragments shorter than 30 bp (ref. ⁶¹). Reads were then aligned to the EquCab 3.0 (GCF_002863925.1) horse reference genome using BWA (Burrows-Wheeler Alignment tool, v0.7.12) with parameters -n 0.01, -l 16500 and -q 30. Enriched mtDNA reads were further realigned to the NC_001640 mtDNA reference genome using CircularMapper⁶². Duplicates arising from PCR amplification were removed using DeDup (v0.12.2)³⁷. Post-mortem DNA damage patterns were assessed with DamageProfiler⁴¹. Depurination signatures were assessed in mapDamage (v2.0.9)⁴⁰.

Genome assembly

Consensus sequences were generated using a damage-aware reconstruction method, tested in three modes. The polarization-free damage weighting approach, presented in the main analysis, was supplemented by additional analyses using the polarization-based damage silencing and polarization-free damage silencing detailed in Supplementary Note 3. Consensus genome calling incorporated thresholds of minimum coverage (≥3) and base support frequency (≥65%). Positions failing these thresholds were assigned as non-informative bases (‘N’). Resulting consensus sequences were output in FASTA format alongside a log file detailing individual base calls. Further information about the tool and its implementation is provided in Supplementary Note 2.

Sex determination

Sex determination was performed using BAM files from shotgun-sequenced reads mapped to the nuclear reference genome EquCab 3.0. The E. caballus reference genome is female and, therefore, lacking a Y chromosome. Instead of using ChrY, chromosome 3 (Chr3), which is similar in size to chromosome X (ChrX), was used for comparison following the method of Pečnerová et al.⁶³. Male horses, having only one X chromosome, are expected to exhibit a ChrX:Chr3 ratio of approximately 0.5, whereas females are expected to have a ChrX:Chr3 ratio of approximately 1. Read counts for ChrX and Chr3 were obtained using samtools idxstats and normalized to the respective chromosome lengths (Supplementary Data 3).

Mitochondrial genome phylogeny

Previously published ancient and modern mitogenomes were downloaded from the National Center for Biotechnology Information (NCBI), with accession codes and metadata detailed in Supplementary Data 9. The published data were not reprocessed through EAGER but were used as originally published. Two alignments were generated using MUSCLE (v3.8.425)⁶⁴: one included newly assembled genomes and all ancient pre-Holocene mitogenomes (n = 148), while the second added modern genomes (n = 171). Partial deletion (91%) was performed in MEGA (v11.0.11)⁶⁵ after evaluating the optimal retention of alignment sites across deletion thresholds to maximize data kept before considerable loss (Supplementary Data 10).

Phylogenetic trees were constructed for both datasets. A maximum parsimony tree was generated in MEGA (v11.0.11) with 1,000 bootstrap iterations for the pre-Holocene dataset (Extended Data Fig. 9). In addition, maximum likelihood phylogenies were constructed in IQTREE2⁶⁶ using ModelFinder, which selected the K3Pu + F + R4 substitution model, with 1,000 bootstrap replicates for the four different reconstruction modes (Extended Data Figs. 4–7). Lastly, a maximum likelihood tree was also constructed in IQTREE2 with the ancient and modern dataset (n = 171), again utilizing ModelFinder which selected the K3Pu + F + I + R3 substitution model, with 1,000 bootstrap replicates (Extended Data Fig. 8).

Coalescence time estimates and dating

Divergence times among caballine horse lineages were estimated using BEAST (v2.6.6)⁴⁴. The dataset, consisting of pre-Holocene and modern mitogenomes (n = 171), was first filtered to remove a repetitive region (positions 16,128–16,359 bp) and sequences with excessive missing data (Ns, gaps and inconclusive dates). The subset used for the BEAST dating analysis comprised sequences with less than 6% missing data, corresponding to a minimum of 15,700 aligned base pairs out of ~16,700 and resulting in a multiple sequence alignment of n = 113. After complete deletion, 14,739 positions remained for analysis.

BEAUti (v2.6.6) was used to set up the BEAST analysis. Radiocarbon or stratigraphic dates for samples were used to calibrate the molecular clock, with additional normal priors applied for SCEN001 (500–100 ka) and TC21 (780–560 ka) based on Orlando et al.¹³. A GTR + G substitution model with four gamma categories was assumed, paired with an estimated relaxed clock model (exponential prior mean: 4.68 × 10⁻⁸ substitutions per site per year)⁴⁵ and a Bayesian skyline coalescent prior with seven monophyletic clade groupings. The groupings were designated to obtain more precise divergence time estimates among major clades and the ancient samples: (1) A + A1 + A2; (2) A + A1 + A2 + SCEN; (3) B; (4) C; (5) A + A1 + A2 + SCEN + C; (6) B + TC21; and (7) outgroup.

Two independent Markov chain Monte Carlo runs were performed for 100 million iterations each, sampling every 10,000 steps. The first 25% of each chain was discarded as burn-in, and the remaining samples were combined using LogCombiner (v2.6.6). Convergence was assessed in Tracer (v1.7.2). Posterior trees were combined to generate a maximum clade credibility tree in TreeAnnotator (v2.6.6), and the results were visualized in FigTree (v1.4.4).

Additional BEAST runs were performed using alternative consensus reconstructions for SCEN001, including one using BModelTest with SCEN001 treated as an undated tip to allow the model and substitution parameters to be inferred jointly. We also ran a version with SCEN001 and SCEN002 constrained to the biostratigraphic age (300–230 ka) and a control analysis excluding SCEN001 entirely. All results and model parameters are provided in Supplementary Note 3.

To assess divergence time estimates without population structure assumptions we used least-squares dating (LSD2)⁴⁷. A maximum likelihood tree was generated with IQTREE2 from the BEAST dataset (n = 113), using a constrained topology ((A,C),B) to match the monophyletic priors of the BEAST analysis. The maximum likelihood tree, tip dates, outgroup specification, alignment length and a root prior of 4.25 Ma (caballine/stenonine split) were provided as input. Confidence intervals were estimated using LSD2’s internal resampling procedure.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.

Data availability

The raw reads and aligned nuclear and mitochondrial DNA sequences for the newly reported individuals are available at the European Nucleotide Archive under the accession number PRJEB84966. The assembled mtDNA fasta files, reconstructed with Polarization-Free Damage Weighting mode, are available on NCBI GenBank with accession numbers PX123152 and PX123153. The accession numbers and sources of previously published ancient data used in this study are available in Supplementary Data 2, 6 and 9.

Code availability

The source code for the damage-aware genome reconstruction is available via GitHub at https://github.com/meret-haeusler/DORIAN. The simulation and evaluation scripts of the Supplementary Information are available via GitHub at https://github.com/meret-haeusler/Supplementary_DORIAN_evaluation.

References

Green, R. E. et al. A draft sequence of the Neandertal genome. Science 328, 710–722 (2010).
Article CAS PubMed PubMed Central Google Scholar
Krause, J. et al. The complete mitochondrial DNA genome of an unknown hominin from southern Siberia. Nature 464, 894–897 (2010).
Article CAS PubMed PubMed Central Google Scholar
Wang, M.-S. et al. A polar bear paleogenome reveals extensive ancient gene flow from polar bears into brown bears. Nat. Ecol. Evol. 6, 936–944 (2022).
Article PubMed Google Scholar
Barlow, A. et al. Middle Pleistocene genome calibrates a revised evolutionary history of extinct cave bears. Curr. Biol. 31, 1771–1779.e7 (2021).
Article CAS PubMed Google Scholar
Orlando, L. et al. Ancient DNA analysis. Nat. Rev. Methods Prim. 1, 14 (2021).
Article CAS Google Scholar
Liu, Y., Mao, X., Krause, J. & Fu, Q. Insights into human history from the first decade of ancient human genomics. Science 373, 1479–1484 (2021).
Article CAS PubMed Google Scholar
Dalén, L., Heintzman, P. D., Kapp, J. D. & Shapiro, B. Deep-time paleogenomics and the limits of DNA survival. Science 382, 48–53 (2023).
Article PubMed PubMed Central Google Scholar
Orlando, L., Gilbert, M. T. P. & Willerslev, E. Reconstructing ancient genomes and epigenomes. Nat. Rev. Genet. 16, 395–408 (2015).
Article CAS PubMed Google Scholar
Kjær, K. H. et al. A 2-million-year-old ecosystem in Greenland uncovered by environmental DNA. Nature 612, 283–291 (2022).
Article PubMed PubMed Central Google Scholar
Lindahl, T. Instability and decay of the primary structure of DNA. Nature 362, 709–715 (1993).
Article CAS PubMed Google Scholar
Kistler, L., Ware, R., Smith, O., Collins, M. & Allaby, R. G. A new model for ancient DNA decay based on paleogenomic meta-analysis. Nucleic Acids Res. 45, 6310–6320 (2017).
Article CAS PubMed PubMed Central Google Scholar
Van Der Valk, T. et al. Million-year-old DNA sheds light on the genomic history of mammoths. Nature 591, 265–269 (2021).
Article PubMed PubMed Central Google Scholar
Orlando, L. et al. Recalibrating Equus evolution using the genome sequence of an early Middle Pleistocene horse. Nature 499, 74–78 (2013).
Article CAS PubMed Google Scholar
Dabney, J. et al. Complete mitochondrial genome sequence of a Middle Pleistocene cave bear reconstructed from ultrashort DNA fragments. Proc. Natl Acad. Sci. USA 110, 15758–15763 (2013).
Article CAS PubMed PubMed Central Google Scholar
Meyer, M. et al. Nuclear DNA sequences from the Middle Pleistocene Sima de los Huesos hominins. Nature 531, 504–507 (2016).
Article CAS PubMed Google Scholar
Meyer, M. et al. A mitochondrial genome sequence of a hominin from Sima de los Huesos. Nature 505, 403–406 (2014).
Article CAS PubMed Google Scholar
Meyer, M. et al. Palaeogenomes of Eurasian straight-tusked elephants challenge the current view of elephant evolution. Elife 6, e25413 (2017).
Article PubMed PubMed Central Google Scholar
MacFadden, B. J. & Hulbert, R. C. Explosive speciation at the base of the adaptive radiation of Miocene grazing horses. Nature 336, 466–468 (1988).
Article Google Scholar
Forsten, A. Middle Pleistocene replacement of stenonid horses by caballoid horses—ecological implications. Palaeogeogr. Palaeoclimatol. Palaeoecol. 65, 23–33 (1988).
Article Google Scholar
Azzaroli, A. Quaternary mammals and the “end-Villafranchian” dispersal event—a turning point in the history of Eurasia. Palaeogeogr. Palaeoclimatol. Palaeoecol. 44, 117–139 (1983).
Article Google Scholar
Rook, L. et al. Mammal biochronology (land mammal ages) around the world from Late Miocene to Middle Pleistocene and major events in horse evolutionary history. Front. Ecol. Evol. 7, 278 (2019).
Article Google Scholar
Vershinina, A. O. et al. Ancient horse genomes reveal the timing and extent of dispersals across the Bering Land Bridge. Mol. Ecol. 30, 6144–6161 (2021).
Article PubMed Google Scholar
Fages, A. et al. Tracking five millennia of horse management with extensive ancient genome time series. Cell 177, 1419–1435.e31 (2019).
Article CAS PubMed PubMed Central Google Scholar
Orlando, L. The evolutionary and historical foundation of the modern horse: lessons from ancient genomics. Annu. Rev. Genet. 54, 563–581 (2020).
Article CAS PubMed Google Scholar
Librado, P. et al. Widespread horse-based mobility arose around 2200 BCE in Eurasia. Nature 631, 819–825 (2024).
Article CAS PubMed PubMed Central Google Scholar
Hutson, J. M., Villaluenga, A., García-Moreno, A., Turner, E. & Gaudzinski-Windheuser, S. Persistent predators: zooarchaeological evidence for specialized horse hunting at Schöningen 13II-4. J. Hum. Evol. 196, 103590 (2024).
Article PubMed Google Scholar
Thieme, H. Lower Palaeolithic hunting spears from Germany. Nature 385, 807–810 (1997).
Article CAS PubMed Google Scholar
Conard, N. J. et al. Excavations at Schöningen and paradigm shifts in human evolution. J. Hum. Evol. 89, 1–17 (2015).
Article PubMed Google Scholar
Van Kolfschoten, T. The Palaeolithic locality Schöningen (Germany): a review of the mammalian record. Quat. Int. 326–327, 469–480 (2014).
Article Google Scholar
Urban, B. & Sierralta, M. New palynological evidence and correlation of Early Palaeolithic sites Schöningen 12 B and 13 II, Schöningen open lignite mine. in The Chronological Setting of the Palaeolithic Sites of Schöningen (ed Behre, K.-E.) 77–96 (Propylaeum, 2019).
Rivals, F. et al. Investigation of equid paleodiet from Schöningen 13 II-4 through dental wear and isotopic analyses: archaeological implications. J. Hum. Evol. 89, 129–137 (2015).
Article PubMed Google Scholar
Rohland, N., Glocke, I., Aximu-Petri, A. & Meyer, M. Extraction of highly degraded DNA from ancient bones, teeth and sediments for high-throughput sequencing. Nat. Protoc. 13, 2447–2461 (2018).
Article CAS PubMed Google Scholar
Gansauge, M.-T., Aximu-Petri, A., Nagel, S. & Meyer, M. Manual and automated preparation of single-stranded DNA libraries for the sequencing of DNA from ancient biological remains and other sources of highly degraded DNA. Nat. Protoc. 15, 2279–2300 (2020).
Article CAS PubMed Google Scholar
Maricic, T., Whitten, M. & Pääbo, S. Multiplexed DNA sequence capture of mitochondrial genomes using PCR products. PLoS ONE 5, e14004 (2010).
Article PubMed PubMed Central Google Scholar
Furtwängler, A. et al. Ratio of mitochondrial to nuclear DNA affects contamination estimates in ancient DNA analysis. Sci. Rep. 8, 14075 (2018).
Article PubMed PubMed Central Google Scholar
Fu, Q. et al. A revised timescale for human evolution based on ancient mitochondrial genomes. Curr. Biol. 23, 553–559 (2013).
Article CAS PubMed PubMed Central Google Scholar
Peltzer, A. et al. EAGER: efficient ancient genome reconstruction. Genome Biol. 17, 60 (2016).
Article PubMed PubMed Central Google Scholar
Posth, C. et al. Reconstructing the deep population history of Central and South America. Cell 175, 1185–1197 (2018).
Article PubMed PubMed Central Google Scholar
Lynch, V. J. et al. Elephantid genomes reveal the molecular bases of woolly mammoth adaptations to the Arctic. Cell Rep. 12, 217–228 (2015).
Article CAS PubMed Google Scholar
Jónsson, H., Ginolhac, A., Schubert, M., Johnson, P. L. F. & Orlando, L. mapDamage2.0: fast approximate Bayesian estimates of ancient DNA damage parameters. Bioinformatics 29, 1682–1684 (2013).
Article PubMed PubMed Central Google Scholar
Neukamm, J., Peltzer, A. & Nieselt, K. DamageProfiler: fast damage pattern calculation for ancient DNA. Bioinformatics 37, 3652–3653 (2021).
Article CAS PubMed Google Scholar
Yuan, J. et al. Mitochondrial genomes of Late Pleistocene caballine horses from China belong to a separate clade. Quat. Sci. Rev. 250, 106691 (2020).
Article Google Scholar
Librado, P. et al. The origins and spread of domestic horses from the Western Eurasian steppes. Nature 598, 634–640 (2021).
Article PubMed PubMed Central Google Scholar
Bouckaert, R. et al. BEAST 2.5: An advanced software platform for Bayesian evolutionary analysis. PLoS Comput. Biol. 15, e1006650 (2019).
Article CAS PubMed PubMed Central Google Scholar
Schubert, M. et al. Prehistoric genomes reveal the genetic foundation and cost of horse domestication. Proc. Natl Acad. Sci. USA https://doi.org/10.1073/pnas.1416991111 (2014).
Heintzman, P. D. et al. A new genus of horse from Pleistocene North America. Elife 6, e29944 (2017).
Article PubMed PubMed Central Google Scholar
To, T.-H., Jung, M., Lycett, S. & Gascuel, O. Fast dating using least-squares criteria and algorithms. Syst. Biol. 65, 82–97 (2016).
Article CAS PubMed Google Scholar
Tucci, M. et al. Evidence for the age and timing of environmental change associated with a Lower Palaeolithic site within the Middle Pleistocene Reinsdorf sequence of the Schöningen coal mine, Germany. Palaeogeogr. Palaeoclimatol. Palaeoecol. 569, 110309 (2021).
Article Google Scholar
Hutson, J. M. et al. Revised age for Schöningen hunting spears indicates intensification of Neanderthal cooperative behavior around 200,000 years ago. Sci. Adv. 11, eadv0752 (2025).
Article CAS PubMed PubMed Central Google Scholar
Zavala, E. I. et al. Pleistocene sediment DNA reveals hominin and faunal turnovers at Denisova Cave. Nature 595, 399–403 (2021).
Article CAS PubMed PubMed Central Google Scholar
Gelabert, P. et al. A sedimentary ancient DNA perspective on human and carnivore persistence through the Late Pleistocene in El Mirón Cave, Spain. Nat. Commun. 16, 107 (2025).
Article CAS PubMed PubMed Central Google Scholar
Evans, S., Llamas, B. & Wood, J. R. Sedimentary ancient DNA from caves: challenges and opportunities. J. Quat. Sci. 40, 565–578 (2025).
Article Google Scholar
Boulbes, N. & Van Asperen, E. N. Biostratigraphy and palaeoecology of European Equus. Front. Ecol. Evol. 7, 301 (2019).
Article Google Scholar
Eisenmann, V. Les métapodes d’Equus sensu lato (Mammalia, Périssodactyla). Geobios 12, 863–886 (1979).
Article Google Scholar
Orlando, L. et al. Revising the recent evolutionary history of equids using ancient DNA. Proc. Natl Acad. Sci. USA 106, 21754–21759 (2009).
Article CAS PubMed PubMed Central Google Scholar
Weinstock, J. et al. Evolution, systematics, and phylogeography of pleistocene horses in the new world: a molecular perspective. PLoS Biol. 3, e241 (2005).
Article PubMed PubMed Central Google Scholar
Urban, B. et al. Spatial interpretation of high‐resolution environmental proxy data of the Middle Pleistocene Palaeolithic faunal kill site Schöningen 13 II‐4, Germany. Boreas 52, 440–458 (2023).
Article Google Scholar
Krahn, K. J. et al. Temperature and palaeolake evolution during a Middle Pleistocene interglacial–glacial transition at the Palaeolithic locality of Schöningen, Germany. Boreas 53, 504–524 (2024).
Article Google Scholar
Kircher, M. in Ancient DNA (eds Shapiro, B. & Hofreiter, M.) vol. 840 197–228 (Humana Press, 2012).
Meyer, M. & Kircher, M. Illumina sequencing library preparation for highly multiplexed target capture and sequencing. Cold Spring Harb. Protoc. 2010, pdb.prot5448 (2010).
Article PubMed Google Scholar
Schubert, M., Lindgreen, S. & Orlando, L. AdapterRemoval v2: rapid adapter trimming, identification, and read merging. BMC Res. Notes 9, 88 (2016).
Article PubMed PubMed Central Google Scholar
Li, H. & Durbin, R. Fast and accurate short read alignment with Burrows–Wheeler transform. Bioinformatics 25, 1754–1760 (2009).
Article CAS PubMed PubMed Central Google Scholar
Pečnerová, P. et al. Genome-based sexing provides clues about behavior and social structure in the woolly mammoth. Curr. Biol. 27, 3505–3510 (2017).
Article PubMed Google Scholar
Edgar, R. C. MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res. 32, 1792–1797 (2004).
Article CAS PubMed PubMed Central Google Scholar
Kumar, S., Stecher, G., Li, M., Knyaz, C. & Tamura, K. MEGA X: molecular evolutionary genetics analysis across computing platforms. Mol. Biol. Evol. 35, 1547–1549 (2018).
Article CAS PubMed PubMed Central Google Scholar
Minh, B. Q. et al. IQ-TREE 2: new models and efficient methods for phylogenetic inference in the genomic era. Mol. Biol. Evol. 37, 1530–1534 (2020).
Article CAS PubMed PubMed Central Google Scholar
Diez-del-Molino, D., Heintzman, P. D. & Dalén, L. A list of representative paleogenomic datasets derived from human and faunal remains. Zenodo https://doi.org/10.5281/ZENODO.8270285 (2023).
Lang, J. et al. The Pleistocene of Schöningen, Germany: a complex tunnel valley fill revealed from 3D subsurface modelling and shear wave seismics. Quat. Sci. Rev. 39, 86–105 (2012).
Article Google Scholar

Download references

Acknowledgements

We thank L. Orlando for sharing ancient horse mtDNA sequences, the Archaeo- and Palaeogenetics group at the University of Tübingen for comments and discussions, the entire Schöningen excavation team and Henning Hassmann from the Cultural Heritage Office of Lower Saxony for assisting with the permission to sample, James A. Fellows Yates and Alexander Herbig for feedback during the start of the project, and C. Fotiadou for help with data analysis. The Ministry of Science of Lower Saxony funds the excavation at Schöningen. Funding for this study came from Senckenberg Centre for Human Evolution and Palaeoenvironment, the Max Planck Society, an Evolution of Cultural Modernity fellowship from the Ministry of Science of Baden-Württemberg, the Ministry of Science of Lower Saxony, and the Leibniz ScienceCampus ‘Geogenomic Archaeology Campus Tübingen (GACT)’ (project number W73/2022) supported by the Leibniz Association, the Ministry of Science, Research, and the Arts of Baden-Württemberg and the University of Tübingen. We acknowledge support from the Open Access Publication Fund of the University of Tübingen.

Funding

Open access funding provided by Eberhard Karls Universität Tübingen.

Author information

These authors contributed equally: Arianna Weingarten, Meret Häusler.
These authors jointly supervised this work: Nicholas J. Conard, Kay Nieselt, Cosimo Posth.

Authors and Affiliations

Archaeo- and Palaeogenetics, Institute for Archaeological Sciences, Department of Geosciences, University of Tübingen, Tübingen, Germany
Arianna Weingarten, Meret Häusler, Ella Reiter, Maria A. Spyrou & Cosimo Posth
Senckenberg Centre for Human Evolution and Palaeoenvironment, University of Tübingen, Tübingen, Germany
Arianna Weingarten, Maria A. Spyrou, Nicholas J. Conard & Cosimo Posth
Senckenberg Centre for Human Evolution and Palaeoenvironment, Schöningen, Germany
Arianna Weingarten, Jordi Serangeli & Ivo Verheijen
Integrative Transcriptomics, Institute for Bioinformatics and Medical Informatics, University of Tübingen, Tübingen, Germany
Meret Häusler & Kay Nieselt
Early Prehistory and Quaternary Ecology, Department of Geosciences, University of Tübingen, Tübingen, Germany
Jordi Serangeli, Ivo Verheijen & Nicholas J. Conard
Cultural Heritage Office of Lower Saxony, Hanover, Germany
Ivo Verheijen
Department of Archaeogenetics, Max Planck Institute for Evolutionary Anthropology, Leipzig, Germany
Rita Radzevičiūtė, Alexander Stoessel, Johannes Krause & Maria A. Spyrou
Institute of Zoology and Evolutionary Research, Friedrich Schiller University Jena, Jena, Germany
Alexander Stoessel

Authors

Arianna Weingarten
View author publications
Search author on:PubMed Google Scholar
Meret Häusler
View author publications
Search author on:PubMed Google Scholar
Jordi Serangeli
View author publications
Search author on:PubMed Google Scholar
Ivo Verheijen
View author publications
Search author on:PubMed Google Scholar
Ella Reiter
View author publications
Search author on:PubMed Google Scholar
Rita Radzevičiūtė
View author publications
Search author on:PubMed Google Scholar
Alexander Stoessel
View author publications
Search author on:PubMed Google Scholar
Johannes Krause
View author publications
Search author on:PubMed Google Scholar
Maria A. Spyrou
View author publications
Search author on:PubMed Google Scholar
Nicholas J. Conard
View author publications
Search author on:PubMed Google Scholar
Kay Nieselt
View author publications
Search author on:PubMed Google Scholar
Cosimo Posth
View author publications
Search author on:PubMed Google Scholar

Contributions

A.W., E.R., R.R., M.A.S. and C.P. performed or supervised laboratory work and computational analyses. I.V., J.S. and N.J.C. provided zooarchaeological assessments and archaeological context. A.S. reconstructed the micro-CT scans. M.H. and K.N. developed the mtDNA reconstruction tool with contributions from A.W., C.P. and M.A.S. A.W. and M.H. prepared the figures with input from C.P. and K.N. A.W. and C.P. wrote the manuscript with edits from K.N., M.H., I.V., J.S. and N.J.C. and all other co-authors. N.J.C. and J.K. initiated the study. C.P., N.J.C. and K.N. supervised the work.

Corresponding authors

Correspondence to Arianna Weingarten or Cosimo Posth.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature Ecology & Evolution thanks Ludovic Orlando and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 Archaeological context of SCEN001 and SCEN002.

a. Topographic map of the area around the town of Schöningen. The open-cast mine is outlined, and the Spear Horizon site is indicated by a purple dot. Top right shows the location of Schöningen within Germany. (Graphic: Utz Böhner, Dirk Fabian). b. Excavation map of the site Schöningen 13II. SCEN001, SCEN002 shown by purple dots. (Modified graphic: Jens Lehman). c. Petrous bone SCEN001 within the Spear Horizon, indicated by a purple dot. Bones are displayed in blue, flint tools in red, and wooden artefacts in green. Spear I is located south of the sampled horse skull. (Graphic: Ivo Verheijen). d. In-situ photo of spear I from the Spear Horizon. (Photo: Peter Pfarr). e. Photo of spear II from the Spear Horizon, surrounded by numerous remains of Equus mosbachensis. (Photo: Peter Pfarr).

Extended Data Fig. 2 Sample photos and micro-CT scans.

a. Photograph of the sample with a one-centimeter scale (Photo: Ivo Verheijen). b. 3D reconstruction of the petrous bone showing the exterior surface. (Graphic: Alexander Stoessel). c. 3D reconstruction of the petrous bone highlighting the inner ear region targeted during sampling. (Graphic: Alexander Stoessel).

Extended Data Fig. 3 Authentification of ancient DNA in shotgun sequenced libraries.

a. C to T damage profiles generated in Damage Profiler show characteristic peaks at fragment ends, with an uptick beyond ~20 bp reflecting damage from the opposite end of the short molecules. b. Base composition around fragment termini, generated in mapDamage, reveals guanine and adenine enrichment immediately upstream of reads (gray box), consistent with depurination driven strand breaks in ancient DNA. c. Read length distribution plot (DamageProfiler), with a minimum length threshold of 30 bp applied.

Extended Data Fig. 4 Best-scoring maximum likelihood tree using Polarization-Free Damage Weighting consensus.

Tree constructed in IQ-TREE2 using ModelFinder, which selected the K3Pu+F + R4 substitution model. The tree is based on an alignment with 91% partial deletion applied in MEGA, including all 146 previously published and two newly reported pre-Holocene horse mitochondrial genomes. Branch supports are tested using 1000 bootstrap iterations. SCEN reconstructions have the Polarization-Free Damage Weighting method applied.

Extended Data Fig. 5 Best-scoring maximum likelihood tree using no correction consensus.

Tree constructed in IQ-TREE2 using ModelFinder, which selected the K3Pu+F + R4 substitution model. The tree is based on an alignment with 91% partial deletion applied in MEGA, including all 146 previously published and two newly reported pre-Holocene horse mitochondrial genomes. Branch supports are tested using 1000 bootstrap iterations. SCEN reconstructions have no damage correction applied.

Extended Data Fig. 6 Best-scoring maximum likelihood tree using Polarization-Based Damage Silencing consensus.

Tree constructed in IQ-TREE2 using ModelFinder, which selected the K3Pu+F + R4 substitution model. The tree is based on an alignment with 91% partial deletion applied in MEGA, including all 146 previously published and two newly reported pre-Holocene horse mitochondrial genomes. Branch supports are tested using 1000 bootstrap iterations. SCEN reconstructions have Polarization-Based Damage Silencing applied.

Extended Data Fig. 7 Best-scoring maximum likelihood tree using Polarization-Free Damage Silencing consensus.

Tree constructed in IQ-TREE2 using ModelFinder, which selected the K3Pu+F + R4 substitution model. The tree is based on an alignment with 91% partial deletion applied in MEGA, including all 146 previously published and two newly reported pre-Holocene horse mitochondrial genomes. Branch supports are tested using 1000 bootstrap iterations. SCEN reconstructions have Polarization-Free Damage Silencing applied.

Extended Data Fig. 8 Best-scoring maximum likelihood tree using Polarization-Free Damage Weighting consensus, with modern genomes.

Tree constructed in IQ-TREE2 using ModelFinder, which selected the K3Pu+F + I + R3 substitution model. The tree is based on an alignment with 91% partial deletion applied in MEGA, including 171 pre-Holocene ancient and modern mitochondrial genomes. Branch supports are tested using 1000 bootstrap iterations. SCEN reconstructions have Polarization-Free Damage Weighting applied. All major clades are collapsed besides Clade A. Modern samples are highlighted and located within the diversity of Clade A.

Extended Data Fig. 9 Maximum parsimony tree using Polarization-Free Damage Weighting.

Tree estimated in MEGA for 148 pre-Holocene ancient Caballine mitogenomes with 91% partial deletion. Branch supports are tested using 1000 bootstrap iterations. SCEN reconstructions have Polarization-Free Damage Weighting applied.

Supplementary information

Supplementary Information

Supplementary Notes 1–3, Figs. 1 and 2 and Tables 1–10.

Reporting Summary

Supplementary Data 1–12

Supplementary Data 1. Deep-time palaeogenome metadata, modified from the database of Diez-del-Molino et al. (2023). Supplementary Data 2. Ancient horse palaeogenome metadata. Supplementary Data 3. Results of EAGER analyses for shotgun libraries of SCEN001 and SCEN002 against the nuclear horse reference genome (GCF_002863925.1). Supplementary Data 4. Sex determination of SCEN001 and SCEN002. Supplementary Data 5. Results of EAGER analyses for captured libraries of SCEN001 and SCEN002 against the mitochondrial horse reference genome (NC_001640). Supplementary Data 6. Ancient and modern horse metadata used in BEAST analysis. Supplementary Data 7. Archaeological context for SCEN001 SCEN002. Supplementary Data 8. Primers used for mtDNA capture, elongation time and annealing temperature (Ta) of each primer pair. Supplementary Data 9. Ancient and modern horse metadata. Supplementary Data 10. a, Polarization-based damage silencing consensus for SCEN001 and SCEN002. Sites retained across partial deletion of the multiple alignment (N = 148). b, Polarization-free damage silencing consensus for SCEN001 and SCEN002. Sites retained across partial deletion of the multiple alignment (N = 148). c, Polarization-free damage weighting consensus for SCEN001 and SCEN002. Sites retained across partial deletion of the multiple alignment (N = 148). d, Polarization-free damage weighting consensus for SCEN001 and SCEN002. Sites retained across partial deletion of the multiple alignment (N = 171). Supplementary Data 11. Single-nucleotide polymorphism (SNP) analysis of Schöningen mtDNA private substitutions among the three reconstruction methods. Supplementary Data 12. Number of positions that were identified as damaged during the respective damage-aware genome reconstruction and number of non-informative (N) base calls in the reconstructed genome, for SCEN001 and SCEN002, respectively.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Reprints and permissions

About this article

Cite this article

Weingarten, A., Häusler, M., Serangeli, J. et al. Mitochondrial genomes of Middle Pleistocene horses from the open-air site complex of Schöningen. Nat Ecol Evol (2025). https://doi.org/10.1038/s41559-025-02859-5

Download citation

Received: 27 January 2025
Accepted: 20 August 2025
Published: 01 October 2025
DOI: https://doi.org/10.1038/s41559-025-02859-5

Subjects

Abstract

Similar content being viewed by others

Main

Results

Mitochondrial genomes of Middle Pleistocene horses

Ancient mitogenome reconstruction

Polarization-based damage silencing

Polarization-free damage silencing

Polarization-free damage weighting

Mitochondrial genome phylogeny

Molecular dating and divergence-time estimates

Discussion

Methods

Equus mosbachensis samples and micro-CT scans

Sampling, DNA extraction, library preparation and shotgun sequencing

Bait production for mitochondrial genome enrichment

Positive control for capture enrichment

Mitochondrial DNA capture

Data processing

Genome assembly

Sex determination

Mitochondrial genome phylogeny

Coalescence time estimates and dating

Reporting summary

Data availability

Code availability

References

Acknowledgements

Funding

Author information

Authors and Affiliations

Contributions

Corresponding authors

Ethics declarations

Competing interests

Peer review

Peer review information

Additional information

Extended data

Supplementary information

Rights and permissions

About this article

Cite this article

Share this article

Search

Quick links