The spatiotemporal distribution of human pathogens in ancient Eurasia

Sikora, Martin; Canteri, Elisabetta; Fernandez-Guerra, Antonio; Oskolkov, Nikolay; Ågren, Rasmus; Hansson, Lena; Irving-Pease, Evan K.; Mühlemann, Barbara; Holtsmark Nielsen, Sofie; Scorrano, Gabriele; Allentoft, Morten E.; Valeur Seersholm, Frederik; Schroeder, Hannes; Gaunitz, Charleen; Stenderup, Jesper; Vinner, Lasse; Jones, Terry C.; Nystedt, Björn; Sjögren, Karl-Göran; Parkhill, Julian; Fugger, Lars; Racimo, Fernando; Kristiansen, Kristian; Iversen, Astrid K. N.; Willerslev, Eske

doi:10.1038/s41586-025-09192-8

Download PDF

Article
Open access
Published: 09 July 2025

The spatiotemporal distribution of human pathogens in ancient Eurasia

Nature volume 643, pages 1011–1019 (2025)Cite this article

74k Accesses
12 Citations
677 Altmetric
Metrics details

Subjects

Abstract

Infectious diseases have had devastating effects on human populations throughout history, but important questions about their origins and past dynamics remain¹. To create an archaeogenetic-based spatiotemporal map of human pathogens, we screened shotgun-sequencing data from 1,313 ancient humans covering 37,000 years of Eurasian history. We demonstrate the widespread presence of ancient bacterial, viral and parasite DNA, identifying 5,486 individual hits against 492 species from 136 genera. Among those hits, 3,384 involve known human pathogens², many of which had not previously been identified in ancient human remains. Grouping the ancient microbial species according to their likely reservoir and type of transmission, we find that most groups are identified throughout the entire sampling period. Zoonotic pathogens are only detected from around 6,500 years ago, peaking roughly 5,000 years ago, coinciding with the widespread domestication of livestock³. Our findings provide direct evidence that this lifestyle change resulted in an increased infectious disease burden. They also indicate that the spread of these pathogens increased substantially during subsequent millennia, coinciding with the pastoralist migrations from the Eurasian Steppe^4,5.

Insights into infectious diseases through ancient pathogen genomics

Article 17 November 2025

Metagenomic analysis of human, animal, and environmental samples identifies potential emerging pathogens, profiles antibiotic resistance genes, and reveals horizontal gene transfer dynamics

Article Open access 09 April 2025

Pseudomonas aeruginosa: ecology, evolution, pathogenesis and antimicrobial susceptibility

Article 29 May 2025

Main

Pathogens have been a constant threat to human health throughout our evolutionary history. Until around 1850, at least a quarter of all children died before the age of one, and around another quarter before turning 15. Infectious diseases are estimated to have been responsible for more than half of these deaths⁶. Larger disease outbreaks have profoundly affected human societies, sometimes devastatingly affecting entire civilizations⁷. Infectious diseases have left lasting impressions on human genomes, as selective pressures from pathogens have continuously shaped human genetic variation^8,9. Where and when different human pathogens first emerged, how and why they spread, and how they affected human populations are important but largely unresolved questions.

During the Holocene (beginning roughly 12,000 years ago), the agricultural transition created larger and more sedentary communities, facilitating pathogen transmission and persistence within populations¹⁰. Simultaneously, the rise of animal husbandry and pastoralism are thought to have increased the risk of zoonoses³. Technological advances, such as horses and carts, increased both mobility and the risk of disease transmission between populations¹¹. It has been proposed that these changes led to the so-called ‘first epidemiological transition’ characterized by increased infectious disease mortality³. However, direct evidence remains scarce, and the idea is debated¹². Palaeopathological examinations of ancient skeletons offer insights into past infectious disease burden¹³, but are limited to the few diseases identifiable from the available tissue. Recent advances in ancient DNA (aDNA) techniques allow for the retrieval of direct genomic evidence of past microbial infections, enabling the reconstruction of complete ancient pathogen genomes. These studies have typically focused on specific pathogens and have provided surprising insights into the evolutionary history of the causative agents of some of the most historically important infectious diseases affecting humans, including plague (Yersinia pestis)^14,15, tuberculosis (Mycobacterium tuberculosis)^16,17, smallpox (variola virus)^18,19, hepatitis B (hepatitis B virus, HBV)^20,21,22 and others^{23,24,25,26,27,28}. However, there is an unmet need to investigate the combined landscape of ancient bacteria, viruses and parasites that affected our ancestors across various regions and time periods. Here we use a new high-throughput computational workflow to screen for ancient microbial DNA and use our data to investigate long-standing questions in paleoepidemiology: when and where did important human pathogens arise? And what factors influenced their spatiotemporal distribution?

Ancient microbial DNA from 1,313 Eurasians

To understand the distribution of ancient pathogens, we developed an accurate and scalable workflow to identify ancient microbial DNA in shotgun-sequenced aDNA data (Extended Data Figs. 1–4 and Supplementary Information 1). The data (roughly 405 billion sequencing reads) are derived from 1,313 ancient individuals from western Eurasia (n = 1,015; 77%), central and north Asia (n = 265; 20%) and southeast Asia (n = 33; 3%), spanning a roughly 37,000-year period, from the Upper Palaeolithic to historical times (Fig. 1, Supplementary Table 1 and Supplementary Information 2). As burial practices varied across cultures and time, these samples represent a subset of groups within past societies. Nevertheless, the identified pathogens probably affected the broader population, as diseases spread easily in communities with poor sanitation and hygiene²⁹. Initial metagenomic classification showed a large fraction of reads classified as soil-dwelling taxa including genera such as Streptomyces or Pseudomonas, reflecting a predominantly environmental source of microbial DNA. Further characterization using a topic model, however, suggested that microbial DNA in ancient tooth samples often derives from genera commonly associated with the human oral microbiome such as Actinomyces or Streptococcus (Extended Data Fig. 1d–g).

We selected a set of 136 bacterial and protozoan genera (11,553 species total) containing human pathogenic species² as well as 1,356 viral genera (259,979 species total) for further authentication and detection of ancient taxa. We found that ancient microbial DNA was widely detected, with 5,486 authenticated individual hits identified across 1,005 samples (Z-score for aDNA damage rate from metaDMG of greater than or equal to 1.5; Fig. 2a, Supplementary Table 2 and Extended Data Fig. 4). Of those, 3,384 hits were found among 214 known human pathogen species², with the remaining 2,104 hits involving 278 other species. The highest numbers were observed in bacterial genera associated with the human oral microbiome, such as Actinomyces (380; 28.5% of samples) and Streptococcus (242; 18.1% of samples), or those commonly found in soil environments, such as Clostridium (252; 18.9% of samples) and Pseudomonas (111; 8.3% of samples).

**Fig. 2: Overview and characteristics of detected ancient microbial DNA.**

We observed marked differences in the distributions of the genetic similarity of the ancient microbial sequences to their reference assemblies, both among genera and between species within a genus (Fig. 2b and Extended Data Fig. 5). High average nucleotide identity (ANI) indicates that ancient microbial sequences are closely related to a reference assembly in the modern database, and was observed in hits across all species from some genera (for example, Yersinia, Fig. 2b). In other genera, only a few hits had a closely related database reference assembly match. An example is the genus Mycobacterium, in which only hits of the leprosy-causing bacterium Mycobacterium leprae were highly similar to their reference assembly (ANI > 99%, Fig. 2b). Low ANI indicates that the ancient microbial DNA is only distantly related to the reference assembly, for example, because of aDNA damage, poor representation of the diversity of the genus in the database or false-positive classification of ancient microbial reads deriving from a related genus (Extended Data Fig. 3). Alternatively, ANI can also be reduced when reads mapped to a particular reference assembly originate from many closely related strains or species in a sample. To test for such mixtures, we quantified the rate of observing different alleles at two randomly sampled reads at nucleotide positions across the genomes of hits with read depths greater than or equal to one. We found a high multi-allele rate in many species associated with the human oral microbiome, such as Streptococcus sanguinis or Treponema denticola. Hits for these species also showed lower ANI, consistent with the expectation for mixtures of ancient microbial DNA (Extended Data Fig. 6a).

The rate of read mapping varied by orders of magnitude between species, from hits in species with high read recruitment, such as M. leprae (greater than 100-fold enrichment over the median number of classified reads across target genera) to hits at the lower limits of detection, for example, for the louse-borne pathogen Borrelia recurrentis (lowest read recruitment roughly 100-fold less than the median number of classified reads across target genera; Fig. 2c and Extended Data Fig. 5b). Ancient microbial DNA from species commonly found in soil, such as Clostridium botulinum, was detected at similar rates in tooth and bone samples. Conversely, species associated with the human oral microbiome (for example, Fusobacterium nucleatum, Streptococcus mutans and Porphyromonas gingivalis) or pathogenic infections (for example, Y. pestis and HBV) were significantly more frequently identified in tooth samples (Extended Data Fig. 6b). To further verify hits with low read numbers, we performed a BLASTn search for all reads of each hit with n ≤ 100 final reads (n = 712 hits total; Supplementary Table 3). Most hits showed a high proportion (greater than or equal to 80%) of reads assigned to the same species using BLASTn, and the species with the most top-ranked BLASTn hits generally matched the inferred hit species (Extended Data Fig. 7a,b).

Our results show that ancient microbial DNA isolated from human remains originates from complex mixtures of distinct endogenous and exogenous sources. The high detection rate, high read recruitment, lower ANI and evidence of mixtures in genera such as Clostridium or Pseudomonas (Fig. 2 and Extended Data Figs. 5 and 6) suggest that a substantial fraction of this ancient microbial metagenome derives from environmental sources, possibly associated with the ‘necrobiome’ involved in post-mortem putrefaction processes^30,31 (Supplementary Information 3). By contrast, species from other frequently observed genera, including Actinomyces or Streptococcus, were predominantly identified from teeth and probably originated from the endogenous oral microbiome³⁰. Species representing likely cases of pathogenic infections (for example, Y. pestis and M. leprae) were often characterized by higher ANI and/or low multi-allele rate, consistent with pathogen load predominantly originating from a single dominant strain.

The ancient Eurasian pathogen landscape

Our dataset provides a unique opportunity to investigate the origins and spatiotemporal distribution of human pathogens in Eurasia, expanding the known range of some ancient pathogenic species and identifying others not previously reported using paleogenomic data (Supplementary Tables 3 and 5).

Considering bacterial pathogens, we found widespread distribution of the plague-causing bacterium Y. pestis, consistent with previous studies¹⁵. We identified 42 putative cases of Y. pestis (35 newly reported, Extended Data Fig. 6e), corresponding to a detection rate of roughly 3% in our samples. These newly identified cases expand the spatial and temporal extent of ancient plague over previous results (Fig. 3). The earliest three cases were dated between around 5,700–5,300 calibrated years before present (cal. bp), across a broad geographic area ranging from western Russia (NEO168, 5,583–5,322 cal. bp), to central Asia (BOT2016, 5,582–5,318 cal. bp) and to Lake Baikal in Siberia³² (DA342, 5,745–5,474 cal. bp). This broad range of detection among individuals predating 5,000 cal. bp challenges previous interpretations that early plague strains represent only isolated zoonotic spillovers³³. We replicated previously identified cases of plague in Late Neolithic and Bronze Age (LNBA) contexts across the Eurasian Steppe¹⁵ and identified many instances in which several individuals from the same burial context were infected (Afanasievo Gora, Russia; Kytmanovo, Russia; Kapan, Armenia; Arban 1, Russia) (Supplementary Table 2). These results indicate that the transmissibility and potential for local epidemic outbreaks for strains at those sites were probably higher than previously assumed³³. Finally, 11 out of 42 cases were identified in late medieval and early modern period individuals (800–200 bp) from two cemeteries in Denmark (Aalborg, Randers), highlighting the high burden of plague during this time in Europe. All but one hit (NEO627, n = 84 reads total) showed expected coverage for the virulence plasmids pCD1 and pMT1, with hits before 2,500 years bp characterized by the previously reported absence of a 19 kilobase region on pMT1 containing the ymt gene¹⁵ (Extended Data Fig. 8c and Supplementary Information 4).

**Fig. 3: Spatiotemporal distribution of selected ancient pathogens.**

Another bacterial pathogen frequently detected was the spirochaete bacterium B. recurrentis, the causative agent of louse-borne relapsing fever (LBRF), a disease with a mortality of 10–40% (Supplementary Information 5). Whereas previous paleogenomic evidence for LBRF is limited to a few cases from Scandinavia and Britain^23,34, we report 34 new putative cases (2.5% detection rate; Extended Data Fig. 6e), with wide geographic distribution across Europe, central Asia and Siberia (Fig. 3). We detected the earliest case in a Neolithic farmer individual from Scandinavia (NEO29, Lohals, 5,647–5,471 cal. bp), suggesting that human body lice were already vectors for infectious disease during the Neolithic period, supported by phylogenetic analyses of B. recurrentis³⁴. The highest detection rates were found during the Iron and Viking Ages. LBRF outbreaks were historically associated with crowded living conditions, poor personal hygiene and wet and cold seasons, but are rare today in most regions (Supplementary Information 5). Our results indicate that B. recurrentis infections exerted a substantial disease burden on past populations.

We also report new cases of other bacterial pathogens previously detected in paleogenomic data. The leprosy-causing bacterium M. leprae was identified in seven individuals (0.5% detection rate) from Scandinavia and only appeared from the Late Iron Age onwards (earliest case RISE174, 1,523–1,339 cal. bp). Because M. leprae can infect both red squirrels and humans³⁵, and archaeological evidence demonstrates that fur trade from Scandinavia, including squirrel fur, increased substantially during the late Iron and Viking Ages³⁶, our results support the suggestion that squirrel fur trade could have facilitated transmission³⁷. Our findings are also consistent with the widespread distribution of leprosy in medieval Europe³⁸. We further detected three putative cases of Treponema pallidum—subspecies of which are the causative agents of treponematoses such as yaws, and endemic and venereal syphilis—in three individuals from recent time periods (earliest case 101809T, Denmark, 600–500 bp; Extended Data Fig. 7). Two cases were identified in individuals from Borneo in southeast Asia (around 500–300 years bp), expanding the range of paleogenomic evidence for treponemal disease into this region.

Among the species newly reported using paleogenomic data, we identified 12 putative cases of Yersinia enterocolitica, the causative agent of yersiniosis, commonly contracted through consuming contaminated raw or undercooked meat (Fig. 3). The animal reservoirs for Y. enterocolitica include boars, deer, horses, cattle and sheep. As Y. enterocolitica rarely enters the bloodstream, our results probably underestimate the disease burden. This species includes some of the only identified putative zoonotic infections in individuals from Mesolithic hunter-gatherer contexts (NEO941, Denmark, 6,446–6,302 cal. bp). We also detected other members of the order Enterobacterales, transmitted through the faecal–oral route, including members of the genera Shigella, Salmonella and Escherichia (Supplementary Table 2). We report the earliest evidence for ancient leptospirosis (genus Leptospira) dating back to the Neolithic, 5,650–5,477 cal. bp (NEO46, Sweden; Leptospira borgpetersenii). Whereas earlier cases predominantly involved L. borgpetersenii (n = 5, 0.4% detection rate), most hits were Leptospira interrogans (n = 20, 1.5% detection rate), almost exclusively in Scandinavian contexts from the Viking Age onwards (Fig. 3). L. borgpetersenii is today primarily found in cattle, whereas L. interrogans is detected more broadly in both domestic and wild animals. Although the clinical manifestations are similar, with an untreated fatality rate of 1% today, transmission routes vary³⁹. Although host-to-host transmission predominates for L. borgpetersenii, transmission by means of urine-contaminated environments dominates for L. interrogans. We also report two putative cases of Corynebacterium diphtheriae, the causative agent of diphtheria; the oldest of which dates back to the Mesolithic (Sidelkino, 11,336–11,181 cal. bp) (Supplementary Table 2).

Other diseases associated with animals and livestock, such as listeriosis (Listeria monocytogenes) and brucellosis (genus Brucella), could not be reliably identified. Another main human pathogen not identified in our dataset is M. tuberculosis, which causes tuberculosis. However, as the M. tuberculosis load in blood is typically low in immunocompetent patients without advanced disease⁴⁰ and latent tuberculosis develops in 60% of cases and can persist for decades, it is, on the basis of current knowledge, unlikely to be readily identified using aDNA data from tooth and bone remains sampled for ancient human DNA.

Identifying eukaryotic pathogens is challenging as sequence contamination from other organisms frequently occurs in their large and often fragmented reference genomes⁴¹. An illustrative example in our dataset is the protozoan parasite Toxoplasma gondii, which we readily identified in hits with high ANI and aDNA damage but low support from coverage evenness statistics owing to reads mapping to short contigs representing human contamination (Extended Data Fig. 4a,b). Despite these challenges, we identified nine putative malaria infections across three different human-infecting species (Plasmodium vivax n = 5; Plasmodium malariae n = 3; Plasmodium falciparum n = 1; Extended Data Fig. 7 and Supplementary Table 2). The most widely detected parasite species was P. vivax, with the earliest evidence in a Bronze Age individual from central Europe (RISE564, 4,750–3,750 bp based on an archaeological context). Other cases include a medieval individual from central Asia (DA204, Kazakhstan; 1,053–1,025 cal. bp) and two Viking Age individuals from eastern Europe (VK224, 950–750 bp and VK253, 950–850 bp; Russia). The P. vivax malaria vector Anopheles atroparvus is at present widespread in Europe and nearby regions, including the Pontic Steppe, and our cases indicate this was also true in the past⁴². The single case of P. falciparum malaria was found in a sample from Armenia (NEO111; 463-0 cal. bp), where malaria was eliminated in the 1960s⁴³.

Among DNA viral species, we found widespread infections with HBV (28 cases, 2.1% detection rate), consistent with previous studies^20,21,22 (Extended Data Fig. 6e). Our newly reported HBV cases include individuals from Mesolithic (Kolyma River, n = 1) and Neolithic (Lake Baikal, n = 3) contexts in Siberia dating back to 9,906–9,665 cal. bp, expanding the spatiotemporal range of ancient HBV into those regions (Extended Data Fig. 7). We also report a putative ancient case (n = 1) of torque teno virus dating back roughly 7,000 years (NEO498, Ukraine; 7,161–6,950 cal. bp). Torque teno virus infects around 80% of the human population today and, although it is not associated with any particular disease, it replicates rapidly in immunocompromised individuals⁴⁴. Other ancient virus hits included viruses not known to infect humans, such as ancient phage DNA (for example, Escherichia phage T4, Proteus virus Isfahan; Supplementary Table 2) and one putative case of an ancient insect virus (Invertebrate iridescent virus 31 (IIV-31)) in a tooth sample of a Viking Age individual from Sweden (VK30, Varnhem; 950–650 bp)⁴⁵. The virus source is probably exogenous, potentially originating from aDNA of food sources in the tooth remains.

Coinfections with many pathogens can worsen disease progression and outcomes⁴⁶ and they were probably an important morbidity factor in ancient human populations. Searching for individuals showing co-occurrence of distinct ancient microbial species, we identified 15 cases of putative coinfections in our dataset (Supplementary Table 2). A striking case was a Viking Age individual from Norway (VK388), in which we replicated previous results of infection with a probably smallpox-causing variola virus¹⁹ and furthermore found evidence of infection with the leprosy-causing bacterium M. leprae. Another case of possible coinfection with M. leprae was found in VK366, a Viking Age individual from Denmark, who also showed evidence for leptospirosis (L. interrogans). Among the 15 cases, 6 involved coinfections of HBV with non-viral pathogens (Y. pestis n = 3; B. recurrentis n = 2; P. malariae n = 1; Supplementary Table 2). This suggests that some of these cases involved chronic hepatitis, possibly reflecting HBV infection during infancy, when hepatitis becomes chronic in 90–95% of modern cases compared to only 2–6% in adult infections. An intriguing early case of a possible coinfection was found in a Mesolithic hunter-gatherer from Russia (Sidelkino, 11,336–11,181 cal. bp). This individual showed evidence of the respiratory pathogen C. diphtheriae and Helicobacter pylori, usually restricted to gastric infections; however, rare contemporary examples of bacteraemia have been reported for both^47,48. Overall, our results show that coinfections can be detected using ancient metagenomic screening but are probably underestimated given methodological limitations such as differences in pathogen load, tissue availability and other factors affecting detectability of ancient microbial DNA.

Temporal dynamics and drivers of incidence

Understanding the factors affecting the dynamics of past epidemics is a main aim of paleoepidemiology. Our dataset allows us to address this question using direct molecular evidence for ancient pathogens across prehistory. To investigate changes in pathogen incidence over time, we performed Bayesian change-point detection and time series decomposition on two pathogens with high detection rates, Y. pestis (plague) and B. recurrentis (LBRF), using the detection rate of the respective pathogen as a proxy for its incidence (Methods). For plague, we inferred a gradual rise in detection rate starting from roughly 6,000 bp, about 1,000 years after the estimated time to the most recent common ancestor of now-known ancient strains (7,100 cal. bp)³³. It reached a first peak around 5,000 bp across Europe and the Eurasian Steppe, coinciding with the emergence and early spread of the LNBA− strains, believed to have had limited flea-borne transmissibility^15,49 (Fig. 4). Detection remained high with extra peaks for a roughly 3,000-year period, until an abrupt change around 2,800 bp led to a roughly 800-year period in which plague was only detected in one sample (VK522, Oland, Sweden 2,343–2,154 cal. bp). Starting at roughly 2,000 bp, plague reappeared in three samples from central Asia (DA92, DA101, DA104, Kazakhstan and Kyrgyzstan; Extended Data Fig. 7 and Supplementary Table 2), just before the first historically documented plague pandemic (Fig. 4). Another hiatus of roughly 600 years led to a rise and peak associated with the second plague pandemic roughly 600 bp (European late medieval cases, Denmark and previously published cases; Fig. 4). This pattern of change coincides with the extinction of the LNBA− strains roughly 2,700 bp (ref. ⁴⁹) and the second Y. pestis diversification event starting roughly 3,700 bp, which gave rise to an extinct Bronze Age lineage (RT5, LNBA+)⁵⁰ and present-day lineages; these had increased flea-mediated transmission adaptations favouring bubonic plague and led to all known later plague pandemics⁵¹. The adaptations included acquiring two plasmids: one with the ymt gene for survival in the flea midgut and another with the pla gene for invasiveness after transmission⁵². The lack of detection during both periods is also seen in publicly available ancient Y. pestis genomes from other Eurasian sites, suggesting that sampling bias is unlikely to substantially influence the observed dynamics.

**Fig. 4: Bayesian time series decomposition of main epidemic pathogens.**

The inferred temporal dynamics of LBRF show a first peak in detection around 5,500 bp, slightly more recent than for plague, but with more sporadic occurrences and sharper peaks during the first roughly 2,000 years (Fig. 4). The geographic extent during the early period ranges from Scandinavia (NEO29, Denmark, 5,647–5,471 cal. bp) to the Altai mountains (RISE503, Russia, 3,677–3,461 cal. bp) (Fig. 3 and Supplementary Table 2). From roughly 2,800 bp, LBRF was detected more consistently, peaking around 2,000 years ago, predominantly in the Eurasian Steppe region (Fig. 3). This change from epidemic outbreaks to endemicity overlaps in time with the estimated emergence of a distinct B. recurrentis Iron Age clade³⁴ (Supplementary Information 5). The period of high LBRF detection coincided with a time without detectable plague activity (Fig. 4), reinforcing that the absence of plague is not due to sample size limitations or poor DNA preservation. This opposing pattern is unlikely to result from any cross-immunity between Y. pestis and B. recurrentis but could plausibly, in part, be caused by population size decreases and behavioural and societal adjustments during plague epidemics. LBRF remained detectable until the end of the time series, particularly in Europe; the continued presence might have facilitated the emergence of a medieval B. recurrentis clade roughly 600 years bp (ref. ³⁴) (Supplementary Information 5) (Figs. 3 and 4).

A striking feature shared in the temporal dynamics of plague and LBRF was the absence of detectable cases before roughly 6,000 bp, coinciding with a transition of individuals in predominantly hunter-gatherer contexts to those in farming or pastoralist cultural contexts. It has been proposed that this transition led to a higher risk of zoonotic disease transmission and facilitated the spread of both old and new pathogens³. Our dataset allows us to test this hypothesis using molecular evidence for infectious disease burden. To increase power to detect changes in the load of different pathogen types, we focused on grouped ancient microbial hit categories (Supplementary Table 4).

We found that species associated with the ancient oral microbiome showed the highest relative detection rate, accounting for up to 50% of ancient hits across various periods (Fig. 5a and Extended Data Fig. 9). Species in the ‘environmental’ classes of probably exogenous origins were also detected at consistent rates throughout time. Species in the ‘infection’ classes occurred at low detection rates throughout (mostly less than 10%). We found that species in the ‘zoonotic’ reservoir classes were not detected until around 6,500 bp (Fig. 5a). Using Bayesian time series decomposition, we inferred an overall increase in the detection rates of the zoonotic reservoir classes from roughly 6,000 bp, remaining at elevated levels until the medieval period (Fig. 5b and Extended Data Fig. 9a). Whereas species in the ‘anthroponotic’ reservoir classes also occur earlier (predominantly species with human-to-human transmission, Extended Data Fig. 10a), we observe increased detection rates from roughly 2,500 bp onwards (Fig. 5b and Extended Data Fig. 9). Our results thus provide direct evidence for an epidemiological transition of increased infectious disease burden after the onset of agriculture through to historical times.

**Fig. 5: Time series of ancient microorganisms by microbial source.**

We used Bayesian spatiotemporal modelling to investigate possible drivers of the observed ancient microbial incidences. We modelled the presence or absence of either individual microbial species or combined species groups using sets of putative covariates, including spatiotemporal variables (longitude, latitude and sample age), paleoclimatic variables (mean annual temperature and precipitation), human mobility and ancestry, sample material (tooth or other) and a proxy for ‘detectability’ (the number of human-classified reads). In the models for the zoonotic or anthroponotic infection species classes, sample age was an important predictor, consistently negatively associated with incidence, and high effect sizes in the individual species models for B. recurrentis and L. interrogans (Extended Data Fig. 11 and Supplementary Table 6). Longitude was another important factor in the infection classes; it was positively associated with incidence rates for the combined anthroponotic class, and in individual models for Y. pestis and B. recurrentis. The positive effect of longitude suggests a higher detection rate in the eastern part of our spatiotemporal range, where samples from the Eurasian Steppe predominate.

The increased infection incidence in Steppe populations could reflect an increased genetic susceptibility or a higher risk of acquiring diseases associated with the pastoralist lifestyle. The latter suggestion seems more plausible as continued exposure to selective pressures from certain infectious diseases probably would reduce susceptibility in these populations. Human ancestry showed small but consistent positive effects in some models, particularly the infection classes, for the Caucasus hunter-gatherers. Across all models, the incidence of ancient microorganisms was positively associated with teeth as sample material; the highest effect sizes were found in the oral microbiome and infection classes (Extended Data Fig. 11). Teeth preserved ancient oral microbiome and pathogen DNA better than petrous bones (the source of 86% of our samples), probably because of oral cavity exposure and better access to microbial DNA in the bloodstream⁵³. These results support the notion that species detected in those classes are predominantly of endogenous origin.

Conclusions

During the Holocene, human lifestyles changed considerably as agriculture, animal husbandry and pastoralism became key practices but the impact on infectious disease incidence is debated. Our study represents a large-scale characterization of ancient pathogens across Eurasia, providing clear evidence that identifiable zoonotic pathogens emerged around 6,500 years ago and were consistently detected after 6,000 years ago. Although zoonotic cases probably existed before 6,500 years ago, the risk and extent of zoonotic transmission probably increased with the widespread adoption of husbandry practices and pastoralism. Today, zoonoses account for more than 60% of newly emerging infectious diseases⁵⁴.

We observed some of the highest detection rates at roughly 5,000 bp, a time of substantial demographic changes in Europe due to the migration of Steppe pastoralists and the displacement of earlier populations^4,5. Steppe pastoralists, through their long-term continuous exposure to animals, probably developed some immunity to certain zoonoses and their dispersals may have carried these diseases westwards and eastwards. Consequently, the genetic upheaval in Europe could have been facilitated by epidemic waves of zoonotic diseases causing population declines, with depopulated areas subsequently being repopulated by opportunistic settlers who intermixed with the remaining original population. This scenario would mirror the population decline of Indigenous people in the Americas following their exposure to diseases introduced by European colonists^55,56. Our findings support the interpretation of increased pathogen pressure as a likely driver of positive selection on immune genes associated with the risk of multiple sclerosis in Steppe populations roughly 5,000 years ago⁵⁷, and immune gene adaptations having occurred predominantly after the onset of the Bronze Age in Europe⁹.

Expanding our analyses to the broader pathogen landscape allowed us to infer and contrast incidence patterns between different species and types of pathogens to a greater extent than previously possible. If ancient pathogen DNA of a single species is not detected in a particular region or period, asserting whether this is due to low disease incidence or confounding factors such as differential DNA preservation between different periods and environments is challenging. Our analyses counter these limitations; we demonstrate that pathogens with known epidemic potential and high detection rates, such as Y. pestis (plague) and B. recurrentis (LBRF), show notable differences in their detection rate over time, suggesting that low detection rates in these cases represent an actual reduction in incidence. During the early period (roughly 5,700–2,700 years ago), the continuous detection of Y. pestis is suggestive of endemic disease. The succeeding pattern of distinct waves and periods without detection indicate epidemic outbreaks; these detection peaks match the historically described plague pandemics. This shift from endemic to epidemic is concurrent with important changes in the Y. pestis genome, particularly increased flea-transmissibility and pathogenicity^15,50. The pattern for B. recurrentis is almost entirely the opposite, with narrow peaks and long periods without detection, suggesting local epidemics before roughly 2,700 years ago and consistent detection afterwards. This later endemicity of LBRF could be driven by changes in the bacterial genome and by human and environmental factors known to increase the risk of louse infestation^34,58,59. Experimental studies have demonstrated that Y. pestis, like B. recurrentis, can infect body lice in the midgut and, sometimes, also the Pawlowsky gland, a putative salivary gland⁵⁹. Body lice infected in the Pawlowsky gland can transmit Y. pestis in concentrations sufficient to initiate disease in humans, possibly contributing to transmission during plague outbreaks. Infected body lice have higher mortality than uninfected lice, and it remains unknown whether coinfection of body lice with Y. pestis and B. recurrentis is possible.

Our study has some important limitations. Whereas ancient shotgun metagenomic data offer direct evidence of past infections, their usefulness depends on having a high pathogen load and the right tissue samples. Our ancient tooth and bone samples are well suited to detect high-load bloodstream infections such as Y. pestis and B. recurrentis, but pathogens with lower loads or different tissue preferences are underrepresented. Moreover, differentiating ancient infections from those arising from environmental sources, such as the necrobiome, is challenging. Finally, our dataset lacks information on RNA viruses, therefore underestimating the zoonotic disease burden. However, the timing is probably accurate as the conditions favouring zoonotic transmission of RNA viruses are similar to those of other zoonotic pathogens⁶⁰.

Our findings demonstrate how the nascent field of genomic paleoepidemiology can create a map of the spatial and temporal distribution of diverse human pathogens over millennia. This map will develop as more ancient specimens are investigated, as will our abilities to match their distribution with genetic, archaeological and environmental data. Our current map shows clear evidence that lifestyle changes in the Holocene led to an epidemiological transition, resulting in a greater burden of zoonotic infectious diseases. This transition profoundly affected human health and history throughout the millennia and continues to do so today.

Methods

Dataset

We compiled a dataset of aDNA shotgun-sequencing data from 1,313 ancient individuals previously sequenced for studies of human population history (references for previous publications describing laboratory procedures and sample and/or site descriptions in Supplementary Table 1). To facilitate ancient microbial DNA authentication, we excluded sequencing libraries subjected to uracil-DNA glycosylase (UDG) treatment that removes characteristic aDNA damage patterns from further analyses. Samples sequenced across several libraries were combined into single analysis units to maximize sensitivity for detection of ancient microbial DNA present in low abundance.

Ancient microbial DNA screening

We carried out screening for ancient microbial DNA using a computational workflow combining k-mer-based taxonomic classification, read mapping and aDNA authentication. We first performed taxonomic classification of the sequencing reads (minimum read length 30 base pairs (bp)) using KrakenUniq⁶¹, against a comprehensive database of complete bacterial, archaeal, viral, protozoan genomes in the RefSeq database (built with default parameters of k-mer size 31 and low-complexity sequences masked). To increase sensitivity for ancient viral DNA, we reran the classification on a viral-specific database of complete viral genomes and neighbour assemblies from RefSeq (https://www.ncbi.nlm.nih.gov/genome/viruses/about/assemblies/), using all reads classified as non-human from the previous run.

Following this initial metagenomic classification, a subset of genera was further processed in the genus-level read mapping and authentication stages. For bacterial pathogens, we selected genera with two or more established species of human pathogens from a recent survey of human bacterial pathogens² (n = 125 genera). Genera with a single pathogenic species were not included to balance between including genera responsible for substantial human pathogenic burden and computational feasibility. We further included genera including human protozoan pathogens (n = 11 genera), as well as all viral genera (n = 1,356).

For each genus of interest showing more than or equal to 50 unique k-mers assigned, all sequencing reads classified were collected and aligned in parallel against a representative reference assembly for each individual species within the genus. We selected the assembly with the most unique k-mers assigned as the representative reference genome for each species in a particular sample. If no reads were assigned to any assembly of the species in KrakenUniq, we selected the first assembly for mapping. Read mapping against the selected assembly was carried out using bowtie2 (ref. ⁶²), using the ‘very sensitive’ preset and allowing one mismatch in the seed (-N 1 option). Mapped BAM files were subjected to duplicate marking using ‘samtools markdup’⁶³, and filtered for mapping quality MAPQ ≥ 20. The aDNA damage rates were estimated using metaDMG⁶⁴.

Authentication of ancient microbial DNA

To authenticate ancient microbial DNA, we calculated sets of summary statistics quantifying expected molecular characteristics of true positive ancient microbial DNA hits⁶⁵:

Similarity to the reference assembly

Summary statistics in this category measure how similar sequencing reads are to a particular reference assembly, with true positive hits expected to show higher similarity than false-positive hits. Summary statistics used include the following.

Average edit distance

This describes the average number of mismatches in sequencing reads mapped to a particular reference (lower means more similar to the reference).

ANI

The ANI is the average number of bases in a mapped sequencing read matching the reference assembly, normalized by the read length (higher means more similar to the reference).

Number of unique k-mers assigned

The number of unique k-mers that are assigned to a particular reference assembly from the KrakenUniq classification (higher means more similar to the reference).

aDNA characteristics

Summary statistics in the aDNA characteristics category measure the evidence for sequencing reads deriving from an aDNA source. Summary statistics used include the following.

Average read length

This describes the average length in base pairs of sequencing reads mapped to a particular reference (shorter means more likely to be ancient).

Terminal aDNA substitution rates

The terminal aDNA substitution rate is the frequency of C>T (G>A) substitutions observed at the 5′ (3′) terminal base across all sequencing reads mapped to a particular reference (higher means more likely to be ancient).

Bayesian D _max

Bayesian D_max is an estimator of the aDNA damage rate from metaDMG (higher means more likely to be ancient).

Bayesian Z

Bayesian Z is an estimator of the significance of evidence for the aDNA damage rate from metaDMG (higher means more likely to be ancient).

Evenness of genomic coverage

Summary statistics in this category measure how evenly mapped sequencing reads are distributed across a reference assembly. Summary statistics used include the following.

Average read depth

The average read depth is the average number of reads covering a base in the reference assembly.

Breadth of coverage

The breadth of coverage describes the fraction of the reference assembly that is covered by one or more sequencing reads.

Expected breadth of coverage

The breadth of coverage expected for a particular average read depth is calculated⁶⁶ as

$$1-{{\rm{e}}}^{-({\rm{average\; read\; depth}})}$$

Ratio of observed to expected breadth of coverage

This is the ratio of the breadth of coverage observed in the mapping to the breadth of coverage expected given observed average read depth (higher means more even coverage).

Relative entropy of read start positions

This relative entropy is a measure for the information content of the genomic positions of mapped reads. To obtain this statistic, we calculate the frequency of read alignments with their start positions falling within windows along the reference assembly, using two different window sizes (100 and 1,000 bp). The obtained frequency vector is converted into Shannon information entropy, and normalized using the maximum entropy attainable if the same total number of reads were evenly distributed across the windows (higher means more even coverage).

Filtering of putative ancient microbial hits

From this initial screening, we then selected a subset of putative microbial ‘hits’ (sample–species combinations) for further downstream analysis based on a set of aDNA authentication summary statistics:

number of mapped reads greater than or equal to 20
5′ C>T deamination rate greater than or equal to 0.01
3′ G>A deamination rate greater than or equal to 0.01
ratio of observed to expected breadth of coverage greater than or equal to 0.8
relative entropy of read start positions greater than or equal to 0.9
ANI > 0.965
rank of number of unique k-mers assigned less than or equal to 2

For this initial filtered list of putative microbial hits, we ran metaDMG using the full Bayesian inference method to obtain Z scores measuring the strength of evidence for observing aDNA damage.

The final list of putative individual ancient microbial hits was then obtained using the filtering cutoffs

metaDMG Bayesian D_max ≥ 0.05
metaDMG Bayesian Z ≥ 1.5
rank of number of unique k-mers assigned = 1

For authentication of viral species, we used the same filtering cutoffs described above, except for a lower ANI cutoff (greater than 0.95), as well as a lower cutoff for relative entropy of read start positions (greater than 0.7) for short viral genomes (fewer than 10 kilobases).

The result of this filtering is a single best-matching species hit for each sample and genus of interest (Supplementary Table 2). We note that this approach will miss potential cases in which aDNA from many species of the same genus are present in the sample. However, because of the considerable challenges involved in distinguishing this scenario from false positives due to cross-mapping of ancient reads from a single source of DNA to reference assemblies of a closely related species (for example, Y. pestis/Yersinia pseudotuberculosis), we opted for the conservative option of retaining only the best hit for each genus.

To further authenticate putative hits with low read counts (n ≤ 100 final reads), we carried out a BLASTn analysis. We extracted the reads for a species hit from the final filtered BAM files and queried them against the ‘nt’ database (downloaded 18 August 2024) using ‘blastn -task blastn -max_hsps 1’. For the reads of each putative ancient microbial hit, we then tabulated the number of times and proportion of the highest scoring BLAST hits matched either the genus or species inferred from our workflow (Supplementary Table 3).

Simulations of ancient microbial DNA

We simulated aDNA fragments from microbial reference genomes in silico using gargammel⁶⁷. We chose nine species representing pathogens of interest, and for each selected an assembly not present in the pathogen screening workflow database:

Brucella melitensis (GCF_027625455.1)
H. pylori (NZ_CP134396.1)
M. tuberculosis (NZ_CP097110.1)
Salmonella enterica (NZ_CP103966.1)
Y. pestis (NZ_CP064125.2)
Y. pseudotuberculosis (NZ_CP130901.1)
P. vivax (GCA_900093555.2)
variola virus (GCA_037113635.1)
human betaherpesvirus 5 (GCA_027927465.1)

For each reference genome, we simulated 5 million single-end sequencing reads (100 bp read length) with adapter sequences, with read length distribution and damage patterns from a mapDamage2 results of a previously published ancient pathogen genome (RISE509, Y. pestis¹⁵). The full-length simulated reads were then adapter-trimmed using AdapterRemoval⁶⁸. To investigate the ability of the workflow to detect low abundance ancient microorganisms, we randomly down-sampled the full read set for each reference genome using seqtk (https://github.com/lh3/seqtk) (50, 100, 200, 500 reads; ten replicates each).

Topic model analysis

We carried out topic model analysis on taxonomic classification profiles for each sample using the R package fastTopics⁶⁹ (https://github.com/stephenslab/fastTopics). We used the number of unique k-mers assigned to non-human genera from KrakenUniq as the observed count data for each sample, excluding genera with fewer than 50 unique k-mers assigned. The analysis was carried out for L = 2 and L = 3 topics to capture broad structures in the classification profiles.

Ancient microbial groups

For combined analyses, we grouped the ancient microbial hits into three categories on the basis of the likely source of the microbial DNA (Supplementary Table 4):

1.
Environmental, to capture all hits derived from environmental sources including the necrobiome (labelled environment_background, environment_pathogen, to distinguish potential pathogenic species from non-pathogenic ones)
2.
Oral microbiome, including both commensal and pathogenic species (microbiome_oral)
3.
Probably pathogenic infections, further distinguished into different modes of transmission (infection_anthroponotic; infection_vector_borne; infection_zoonotic).

We define zoonotic pathogens here as those transmitted from animals to humans or which made such a host jump in our sampling time frame⁷⁰.

Time series

To infer temporal dynamics of ancient microbial species, we calculated detection rates in a sliding window of k = 21 temporally consecutive samples across the entire timeline of the 1,266 samples with dating information. For individual species, the detection rate for each window corresponds to the proportion of the 21 samples in each window that were positive for the species of interest. For analyses of species combined in classes, we calculated the detection rate as the ratio of the total number of hits within a class in the window over the total number of possible hits across all species in a window (21 samples × 258 species across all classes). For individual species with n ≥ 20 hits or combined species classes, we further performed Bayesian change-point detection and time series decomposition (BEAST)⁷¹ implemented in the R package Rbeast (https://github.com/zhaokg/Rbeast), using the detection frequencies described above as response variables. Data for previously published Y. pestis and B. recurrentis genomes were obtained from AncientMetagenomeDir (release v.23.12, https://doi.org/10.5281/zenodo.10580647)⁷².

Spatiotemporal models of species incidence

To identify possible drivers of the observed spatiotemporal ancient microbial incidence, we combined the individual microbial species and the combined species groups with palaeoclimatic variables, human mobility estimates and kriged estimates of ancestry composition for Holocene West Eurasia. Palaeoclimatic reconstructions were accessed using the CHELSA-Trace21k data, which provide global monthly climatologies for temperature and precipitation at 30-arcsec spatial resolution in 100-year time steps for the past 21,000 years (ref. ⁷³). To pair the microbial species and groups with the palaeoclimatic reconstructions, we took the average climatic value across all the time steps that fall within the microbial species and/or group age ± standard deviation at each of the sampling locations. Palaeoclimatic variables considered were annual mean temperature (BIO01) and annual precipitation (BIO12). Human mobility values were accessed from ref. ⁷⁴ and roughly represent the distance in kilometres between the burial location of the ancient human individual and its putative ancestral origin on the basis of patterns of genetic similarity derived from multidimensional scaling (MDS) analysis. Microbial species and/or groups were paired with the mobility estimate of the ancient human individual that occurs closest in space and time. Kriged ancestry estimates were extracted from ref. ⁷⁵, using the spatiotemporal ancestry kriging method from Racimo et al.⁷⁶, and paired to the closest spatiotemporal location of the ancient human remain where the corresponding microbial species and/or groups were sampled.

To determine the influence of the covariates on the microbial incidence, we used a hierarchical Bayesian model implemented in the inlabru R package^77,78, in which ancient microbial presence or absence follows a binomial distribution and the spatiotemporal variables (latitude, longitude and sample age), number of human-classified reads, sample material, palaeoclimatic variables, human mobility and human ancestry constitute the linear predictors. The sample material is a categorical variable indicating whether the material used for sequencing was a tooth or not (bone), which inlabru treats as a random effect variable. We followed the default inlabru priors, in which distributions are distributed as a Gaussian variable with mean μ and precision τ. The prior on the precision τ is a Gamma with parameters 1 and 0.00005. The mean is a linear combination of the covariates. By default, the prior on the intercept of the linear combination is a uniform distribution, whereas the priors on the coefficients are Gaussian with zero mean and precision 0.001. All covariates were normalized before the analyses. For each microbial species and group, we tested many models with different sets of covariates: (1) palaeoclimate + mobility + ancestry, (2) palaeoclimate + mobility, (3) palaeoclimate + ancestry, (4) only climate, (5) mobility + ancestry, (6) only mobility, (7) only ancestry and (8) no climate, mobility nor ancestry. Spatiotemporal variables, number of human-classified reads and sample material were included in all models. Because covariates were normalized, results indicate deviations from the mean. The effect size is interpreted in units of standard deviation. We used a criterion (DIC score) that prioritizes both fit and simplicity (low number of effective parameters) for evaluating the performance of hierarchical Bayesian models. Parameter distributions of the full model tended not to differ by much to those of the best-performing models under this score, when those parameters were included in the model. Results for both the full and best-performing models (that is, models with the lowest DIC score for each microbial species or combined species group) are shown in Extended Data Fig. 11. DIC scores as well as Watanabe–Akaike information criterion for all models we tested can be found in the Supplementary Table 6.

Geographic maps

All maps were created using the R project for statistical computing⁷⁹, using the sf package⁸⁰. Shape files for coastlines, rivers and lakes were obtained from Natural Earth (https://www.naturalearthdata.com) using the rnaturealearth package⁸¹. Elevation data for Fig. 1 were obtained from AWS Open Data Terrain Tiles (https://registry.opendata.aws/terrain-tiles) using the elevatr package⁸².

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.

Data availability

Data for the 907 individuals for whom sequencing data as trimmed read files (FASTQ) are released in this study have been deposited at the European Nucleotide Archive under accession no. PRJEB65256. Accession numbers for sequencing read data of the remaining individuals are provided in Supplementary Table 1. Processed analysis files including the KrakenUniq database file and metagenomic profiling results, microbial species read alignments (BAM format) as well as per-sample summary tables and plots from the screening pipeline are available at https://doi.org/10.17894/ucph.f0f75211-7bc3-445d-90c0-b72a22ba0930.

Code availability

A snakemake workflow implementing the computational screening pipeline is available at https://github.com/martinsikora/pathopipe.

References

Lewis, C. M. Jr, Akinyi, M. Y., DeWitte, S. N. & Stone, A. C. Ancient pathogens provide a window into health and well-being. Proc. Natl Acad. Sci. USA 120, e2209476119 (2023).
Article CAS PubMed PubMed Central Google Scholar
Bartlett, A., Padfield, D., Lear, L., Bendall, R. & Vos, M. A comprehensive list of bacterial pathogens infecting humans. Microbiology https://doi.org/10.1099/mic.0.001269 (2022).
Barrett, R., Kuzawa, C. W., McDade, T. & Armelagos, G. J. Emerging and re-emerging infectious diseases: the third epidemiologic transition. Annu. Rev. Anthropol. 27, 247–271 (1998).
Article Google Scholar
Haak, W. et al. Massive migration from the steppe was a source for Indo-European languages in Europe. Nature 522, 207–211 (2015).
Article CAS PubMed PubMed Central Google Scholar
Allentoft, M. E. et al. Population genomics of Bronze Age Eurasia. Nature 522, 167–172 (2015).
Article CAS PubMed Google Scholar
Volk, A. A. & Atkinson, J. A. Infant and child death in the human environment of evolutionary adaptation. Evol. Hum. Behav. 34, 182–192 (2013).
Harper, K. Plagues Upon the Earth: Disease and the Course of Human History (Princeton Univ. Press, 2021).
Karlsson, E. K., Kwiatkowski, D. P. & Sabeti, P. C. Natural selection and infectious disease in human populations. Nat. Rev. Genet. 15, 379–393 (2014).
Article CAS PubMed PubMed Central Google Scholar
Kerner, G. et al. Genetic adaptation to pathogens and increased risk of inflammatory disorders in post-Neolithic Europe. Cell Genom. 3, 100248 (2023).
Article CAS PubMed PubMed Central Google Scholar
Page, A. E. et al. Reproductive trade-offs in extant hunter-gatherers suggest adaptive mechanism for the Neolithic expansion. Proc. Natl Acad. Sci. USA 113, 4694–4699 (2016).
Article CAS PubMed PubMed Central Google Scholar
Rascovan, N. et al. Emergence and spread of basal lineages of Yersinia pestis during the Neolithic decline. Cell 176, 295–305.e10 (2019).
Article CAS PubMed Google Scholar
Fuchs, K. et al. Infectious diseases and Neolithic transformations: evaluating biological and archaeological proxies in the German loess zone between 5500 and 2500 BCE. Holocene 29, 1545–1557 (2019).
Article Google Scholar
Abegg, C., Desideri, J., Dutour, O. & Besse, M. More than the sum of their parts: reconstituting the paleopathological profile of the individual and commingled Neolithic populations of western Switzerland. Archaeol. Anthropol. Sci. 13, 59 (2021).
Bos, K. I. et al. A draft genome of Yersinia pestis from victims of the Black Death. Nature 478, 506–510 (2011).
Article CAS PubMed PubMed Central Google Scholar
Rasmussen, S. et al. Early divergent strains of Yersinia pestis in Eurasia 5,000 years ago. Cell 163, 571–582 (2015).
Article CAS PubMed PubMed Central Google Scholar
Bos, K. I. et al. Pre-Columbian mycobacterial genomes reveal seals as a source of New World human tuberculosis. Nature 514, 494–497 (2014).
Article CAS PubMed PubMed Central Google Scholar
Vågene, Å. J. et al. Geographically dispersed zoonotic tuberculosis in pre-contact South American human populations. Nat. Commun. 13, 1195 (2022).
Article PubMed PubMed Central Google Scholar
Duggan, A. T. et al. 17th Century variola virus reveals the recent history of smallpox. Curr. Biol. 26, 3407–3412 (2016).
Article CAS PubMed PubMed Central Google Scholar
Mühlemann, B. et al. Diverse variola virus (smallpox) strains were widespread in northern Europe in the Viking Age. Science 369, eaaw8977 (2020).
Article PubMed Google Scholar
Mühlemann, B. et al. Ancient hepatitis B viruses from the Bronze Age to the Medieval period. Nature 557, 418–423 (2018).
Kocher, A. et al. Ten millennia of hepatitis B virus evolution. Science 374, 182–188 (2021).
Article CAS PubMed Google Scholar
Krause-Kyora, B. et al. Neolithic and medieval virus genomes reveal complex evolution of hepatitis B. eLife 7, e36666 (2018).
Article PubMed PubMed Central Google Scholar
Guellil, M. et al. Genomic blueprint of a relapsing fever pathogen in 15th century Scandinavia. Proc. Natl Acad. Sci. USA 115, 10422–10427 (2018).
Article CAS PubMed PubMed Central Google Scholar
Schuenemann, V. J. et al. Genome-wide comparison of medieval and modern Mycobacterium leprae. Science 341, 179–183 (2013).
Article CAS PubMed Google Scholar
Guellil, M. et al. Ancient herpes simplex 1 genomes reveal recent viral structure in Eurasia. Sci. Adv. 8, eabo4435 (2022).
Article CAS PubMed PubMed Central Google Scholar
van Dorp, L. et al. Plasmodium vivax malaria viewed through the lens of an eradicated European strain. Mol. Biol. Evol. 37, 773–785 (2020).
Article PubMed Google Scholar
Key, F. M. et al. Emergence of human-adapted Salmonella enterica is linked to the Neolithization process. Nat. Ecol. Evol. 4, 324–333 (2020).
Article PubMed PubMed Central Google Scholar
Mühlemann, B. et al. Ancient human parvovirus B19 in Eurasia reveals its long-term association with humans. Proc. Natl Acad. Sci. USA 115, 7557–7562 (2018).
Article PubMed PubMed Central Google Scholar
Bonczarowska, J. H. et al. Pathogen genomics study of an early medieval community in Germany reveals extensive co-infections. Genome Biol. 23, 250 (2022).
Article CAS PubMed PubMed Central Google Scholar
Abdoun, A., Amir, N. & Fatima, M. Thanatomicrobiome in forensic medicine. New Microbiol. 46, 236–245 (2023).
CAS PubMed Google Scholar
Burcham, Z. M. et al. A conserved interdomain microbial network underpins cadaver decomposition despite environmental variables. Nat. Microbiol. 9, 595–613 (2024).
Article CAS PubMed PubMed Central Google Scholar
Macleod, R. et al. Lethal plague outbreaks in Lake Baikal Hunter-gatherers 5500 years ago. Preprint at bioRxiv https://doi.org/10.1101/2024.11.13.623490 (2024).
Susat, J. et al. A 5,000-year-old hunter-gatherer already plagued by Yersinia pestis. Cell Rep. 35, 109278 (2021).
Article CAS PubMed Google Scholar
Swali, P. et al. Ancient Borrelia genomes document the evolutionary history of louse-borne relapsing fever. Science 388, 836–846 (2025).
Avanzi, C. et al. Red squirrels in the British Isles are infected with leprosy bacilli. Science 354, 744–747 (2016).
Article CAS PubMed Google Scholar
Hennius, A. Outlanders?: Resource Colonisation, Raw Material Exploitation and Networks in Middle Iron Age Sweden. PhD thesis, Uppsala Univ. (2021).
Urban, C. et al. Ancient Mycobacterium leprae genome reveals medieval English red squirrels as animal leprosy host. Curr. Biol. 34, 2221–2230.e8 (2024).
Article CAS PubMed Google Scholar
Pfrengle, S. et al. Mycobacterium leprae diversity and population dynamics in medieval Europe from novel ancient genomes. BMC Biol. 19, 220 (2021).
Article CAS PubMed PubMed Central Google Scholar
Bulach, D. M. et al. Genome reduction in Leptospira borgpetersenii reflects limited transmission potential. Proc. Natl Acad. Sci. USA 103, 14560–14565 (2006).
Article PubMed PubMed Central Google Scholar
Rees, C. E., Swift, B. M. & Haldar, P. State-of-the-art detection of Mycobacterium tuberculosis in blood during tuberculosis infection using phage technology. Int. J. Infect. Dis. 141S, 106991 (2024).
Article PubMed Google Scholar
Steinegger, M. & Salzberg, S. L. Terminating contamination: large-scale search identifies more than 2,000,000 contaminated entries in GenBank. Genome Biol. 21, 115 (2020).
Article CAS PubMed PubMed Central Google Scholar
Hay, S. I., Guerra, C. A., Tatem, A. J., Noor, A. M. & Snow, R. W. The global distribution and population at risk of malaria: past, present, and future. Lancet Infect. Dis. 4, 327–336 (2004).
Article PubMed PubMed Central Google Scholar
Davidyants, V. A. et al. Role of malaria partners in malaria elimination in Armenia. Malar. J. 18, 178 (2019).
Article PubMed PubMed Central Google Scholar
Roberto, P. et al. Torque teno virus (TTV): a gentle spy virus of immune status, predictive marker of seroconversion to COVID-19 vaccine in kidney and lung transplant recipients. J. Med. Virol. 95, e28512 (2023).
Article CAS PubMed PubMed Central Google Scholar
İnce, İ. A., Özcan, O., Ilter-Akulke, A. Z., Scully, E. D. & Özgen, A. Invertebrate iridoviruses: a glance over the last decade. Viruses 10, 161 (2018).
Article PubMed PubMed Central Google Scholar
Singer, M., Bulled, N., Ostrach, B. & Mendenhall, E. Syndemics and the biosocial conception of health. Lancet 389, 941–950 (2017).
Article PubMed Google Scholar
Zasada, A. A., Zaleska, M., Podlasin, R. B. & Seferynska, I. The first case of septicemia due to nontoxigenic Corynebacterium diphtheriae in Poland: case report. Ann. Clin. Microbiol. Antimicrob. 4, 8 (2005).
Article PubMed PubMed Central Google Scholar
Han, X. Y., Tarrand, J. J., Dickey, B. F. & Esteva, F. J. Helicobacter pylori bacteremia with sepsis syndrome. J. Clin. Microbiol. 48, 4661–4663 (2010).
Article PubMed PubMed Central Google Scholar
Andrades Valtueña, A. et al. Stone Age genomes shed light on the early evolution, diversity, and ecology of plague. Proc. Natl Acad. Sci. USA 119, e2116722119 (2022).
Article PubMed PubMed Central Google Scholar
Spyrou, M. A. et al. Analysis of 3800-year-old Yersinia pestis genomes suggests Bronze Age origin for bubonic plague. Nat. Commun. 9, 2234 (2018).
Article PubMed PubMed Central Google Scholar
Demeure, C. et al. Yersinia pestis and plague: an updated view on evolution, virulence determinants, immune subversion, vaccination and diagnostics. Microbes Infect. 21, 202–212 (2019).
Article PubMed Google Scholar
Sun, Y.-C., Jarrett, C. O., Bosio, C. F. & Hinnebusch, B. J. Retracing the evolutionary path that led to flea-borne transmission of Yersinia pestis. Cell Host Microbe 15, 578–586 (2014).
Article CAS PubMed PubMed Central Google Scholar
Margaryan, A. et al. Ancient pathogen DNA in human teeth and petrous bones. Ecol. Evol. 8, 3534–3542 (2018).
Article PubMed PubMed Central Google Scholar
Jones, K. E. et al. Global trends in emerging infectious diseases. Nature 451, 990–993 (2008).
Article CAS PubMed PubMed Central Google Scholar
Meltzer, D. J. First Peoples in a New World: Colonizing Ice Age America (Univ. California Press, 2009).
Collen, E. J., Johar, A. S., Teixeira, J. C. & Llamas, B. The immunogenetic impact of European colonization in the Americas. Front. Genet. 13, 918227 (2022).
Article CAS PubMed PubMed Central Google Scholar
Barrie, W. et al. Elevated genetic risk for multiple sclerosis emerged in steppe pastoralist populations. Nature 625, 321–328 (2024).
Cutler, S. J. Relapsing fever–a forgotten disease revealed. J. Appl. Microbiol. 108, 1115–1122 (2010).
Article CAS PubMed Google Scholar
Bland, D. M., Long, D., Rosenke, R. & Hinnebusch, B. J. Yersinia pestis can infect the Pawlowsky glands of human body lice and be transmitted by louse bite. PLoS Biol. 22, e3002625 (2024).
Article CAS PubMed PubMed Central Google Scholar
Ellwanger, J. H. & Chies, J. A. B. Zoonotic spillover: understanding basic aspects for better prevention. Genet. Mol. Biol. 44, e20200355 (2021).
Article CAS PubMed PubMed Central Google Scholar
Breitwieser, F. P., Baker, D. N. & Salzberg, S. L. KrakenUniq: confident and fast metagenomics classification using unique k-mer counts. Genome Biol. 19, 198 (2018).
Article CAS PubMed PubMed Central Google Scholar
Langmead, B. & Salzberg, S. L. Fast gapped-read alignment with Bowtie 2. Nat. Methods 9, 357–359 (2012).
Article CAS PubMed PubMed Central Google Scholar
Danecek, P. et al. Twelve years of SAMtools and BCFtools. Gigascience 10, giab008 (2021).
Article PubMed PubMed Central Google Scholar
Michelsen, C. et al. MetaDMG—a fast and accurate ancient DNA damage toolkit for metagenomic data. Preprint at bioRxiv https://doi.org/10.1101/2022.12.06.519264 (2022).
Warinner, C. et al. A robust framework for microbial archaeology. Annu. Rev. Genomics Hum. Genet. 18, 321–356 (2017).
Article CAS PubMed PubMed Central Google Scholar
Lander, E. S. & Waterman, M. S. Genomic mapping by fingerprinting random clones: a mathematical analysis. Genomics 2, 231–239 (1988).
Article CAS PubMed Google Scholar
Renaud, G., Hanghøj, K., Willerslev, E. & Orlando, L. gargammel: a sequence simulator for ancient DNA. Bioinformatics 33, 577–579 (2017).
Article CAS PubMed Google Scholar
Schubert, M., Lindgreen, S. & Orlando, L. AdapterRemoval v2: rapid adapter trimming, identification, and read merging. BMC Res. Notes 9, 88 (2016).
Article PubMed PubMed Central Google Scholar
Carbonetto, P., Sarkar, A., Wang, Z. & Stephens, M. Non-negative matrix factorization algorithms greatly improve topic model fits. Preprint at https://doi.org/10.48550/ARXIV.2105.13440 (2021).
WHO. Zoonoses: key facts. World Health Organization https://www.who.int/news-room/fact-sheets/detail/zoonoses (2020).
Zhao, K. et al. Detecting change-point, trend, and seasonality in satellite time series data to track abrupt changes and nonlinear dynamics: a Bayesian ensemble algorithm. Remote Sens. Environ. 232, 111181 (2019).
Article Google Scholar
Fellows Yates, J. A. et al. Community-curated and standardised metadata of published ancient metagenomic samples with AncientMetagenomeDir. Sci. Data 8, 31 (2021).
Article PubMed PubMed Central Google Scholar
Karger, D. N., Nobis, M. P., Normand, S., Graham, C. H. & Zimmermann, N. E. CHELSA-TraCE21k–high-resolution (1 km) downscaled transient temperature and precipitation data since the Last Glacial Maximum. Clim. Past 19, 439–456 (2023).
Article Google Scholar
Schmid, C. & Schiffels, S. Estimating human mobility in Holocene western Eurasia with large-scale ancient genomic data. Proc. Natl Acad. Sci. USA 120, e2218375120 (2023).
Article CAS PubMed PubMed Central Google Scholar
Allentoft, M. E. et al. Population genomics of post-glacial western Eurasia. Nature 625, 301–311 (2024).
Racimo, F. et al. The spatiotemporal spread of human migrations during the European Holocene. Proc. Natl Acad. Sci. USA 117, 8989–9000 (2020).
Article CAS PubMed PubMed Central Google Scholar
Lindgren, F. & Rue, H. Bayesian spatial modelling with R-INLA. J. Stat. Softw. 63, 1–25 (2015).
Bachl, F. E., Lindgren, F., Borchers, D. L. & Illian, J. B. inlabru: an R package for Bayesian spatial modelling from ecological survey data. Methods Ecol. Evol. 10, 760–766 (2019).
Article Google Scholar
R Core Team. R: A Language and Environment for Statistical Computing (R Foundation for Statistical Computing, 2014); https://www.R-project.org/
Pebesma, E. Simple features for R: standardized support for spatial vector data. R J. 10, 439–446 (2018).
Article Google Scholar
Massicotte, P. & South, A. rnaturalearth: World Map data from Natural Earth. OpenSci https://docs.ropensci.org/rnaturalearth/ (2025).
Hollister, J. et al. elevatr: access elevation data from various APIs. Zenodo https://doi.org/10.5281/zenodo.8335450 (2023).

Download references

Acknowledgements

The Lundbeck Foundation GeoGenetics Centre is supported by the Lundbeck Foundation (grant nos. R302-2018-2155, R155-2013-16338), the Novo Nordisk Foundation (grant no. NNF18SA0035006), the Wellcome Trust (grant no. UNS69906), Carlsberg Foundation (grant no. CF18-0024), the Danish National Research Foundation (grant nos. DNRF94, DNRF174), the University of Copenhagen (KU2016 programme) and Ferring Pharmaceuticals A/S to E.W. Extra support was provided by Germany’s Excellence Strategy (EXC-2077), project no. 390741603 ‘The Ocean Floor – Earth’s Uncharted Interface’. We thank A. Razeto and P. Selmer Olsen, for administrative and technical assistance. We thankfully acknowledge Illumina Inc. for collaboration. E.W. thanks St. John’s College, Cambridge, for providing a stimulating environment of discussion and learning. This work was further supported by the Swedish Foundation for Humanities and Social Sciences grant (Riksbankens Jubileumsfond grant no. M16-0455:1) to K.K. M.S. was supported by Maritime encounters, Riksbankens Jubileumsfond, grant no. M21-0018. M.E.A. was supported by Marie Skłodowska-Curie Actions of the EU (grant no. 300554), The Villum Foundation (grant no. 10120) and Independent Research Fund Denmark (grant no. 7027-00147B). G.S. is supported by the Marie Skłodowska-Curie Individual Fellowship ‘PALAEO-ENEO’ (grant agreement number 751349). H.S. was supported by a Carlsberg Semper Ardens grant (no. CF19-0601) and a European Research Council (ERC) Consolidator Grant (grant no. 101045643). F.R. is supported by a Villum Young Investigator Grant (project no. 00025300), a Novo Nordisk Fonden Data Science Ascending Investigator Award (grant no. NNF22OC0076816) and by the ERC under the European Union’s Horizon Europe programme (grant agreement nos. 101077592 and 951385). F.V.S. was supported by the Lundbeck Foundation (grant no. R322-2019-2610). N.O., R.Å., L.H. and B.N. are financially supported by Knut and Alice Wallenberg Foundation as part of the National Bioinformatics Infrastructure Sweden at SciLifeLab. A.K.N.I. and L.F. thank the OAK Foundation.

Author information

Authors and Affiliations

Centre for Ancient Environmental Genomics and The Lundbeck Foundation GeoGenetics Centre, Globe Institute, University of Copenhagen, Copenhagen, Denmark
Martin Sikora, Antonio Fernandez-Guerra, Gabriele Scorrano, Morten E. Allentoft, Frederik Valeur Seersholm, Charleen Gaunitz, Jesper Stenderup, Lasse Vinner, Kristian Kristiansen, Astrid K. N. Iversen & Eske Willerslev
Section for Molecular Ecology and Evolution, Globe Institute, University of Copenhagen, Copenhagen, Denmark
Elisabetta Canteri, Evan K. Irving-Pease, Hannes Schroeder & Fernando Racimo
Department of Biology, National Bioinformatics Infrastructure Sweden, Science for Life Laboratory, Lund University, Lund, Sweden
Nikolay Oskolkov
Department of Biology and Biological Engineering, National Bioinformatics Infrastructure Sweden, Science for Life Laboratory, Chalmers University of Technology, Gothenburg, Sweden
Rasmus Ågren
Definitive Healthcare, Gothenburg, Sweden
Lena Hansson
Institute of Virology, Charité – Universitätsmedizin Berlin, corporate member of Freie Universität Berlin, Humboldt-Universität zu Berlin and Berlin Institute of Health, Berlin, Germany
Barbara Mühlemann & Terry C. Jones
German Centre for Infection Research (DZIF), Partner Site Charité, Berlin, Germany
Barbara Mühlemann & Terry C. Jones
Department of Bacteria, Parasites and Fungi, Statens Serum Institut, Copenhagen, Denmark
Sofie Holtsmark Nielsen
Center for Molecular Anthropology for the Study of Ancient DNA, Department of Biology, University of Rome ‘Tor Vergata’, Rome, Italy
Gabriele Scorrano
Trace and Environmental DNA (TrEnD) Laboratory, School of Molecular and Life Sciences, Curtin University, Perth, Western Australia, Australia
Morten E. Allentoft
Centre for Pathogen Evolution, Department of Zoology, University of Cambridge, Cambridge, UK
Terry C. Jones
Department of Cell and Molecular Biology, National Bioinformatics Infrastructure Sweden, Science for Life Laboratory, Uppsala University, Uppsala, Sweden
Björn Nystedt
Department of Historical Studies, University of Gothenburg, Gothenburg, Sweden
Karl-Göran Sjögren & Kristian Kristiansen
Department of Veterinary Medicine, University of Cambridge, Cambridge, UK
Julian Parkhill
Oxford Centre for Neuroinflammation, Nuffield Department of Clinical Neurosciences, John Radcliffe Hospital, University of Oxford, Oxford, UK
Lars Fugger
MRC Human Immunology Unit, John Radcliffe Hospital, University of Oxford, Oxford, UK
Lars Fugger
Nuffield Department of Clinical Neurosciences, Weatherall Institute of Molecular Medicine, University of Oxford, Oxford, UK
Lars Fugger & Astrid K. N. Iversen
Department of Genetics, University of Cambridge, Cambridge, UK
Eske Willerslev
MARUM Center for Marine Environmental Sciences and Faculty of Geosciences, University of Bremen, Bremen, Germany
Eske Willerslev

Authors

Martin Sikora
View author publications
Search author on:PubMed Google Scholar
Elisabetta Canteri
View author publications
Search author on:PubMed Google Scholar
Antonio Fernandez-Guerra
View author publications
Search author on:PubMed Google Scholar
Nikolay Oskolkov
View author publications
Search author on:PubMed Google Scholar
Rasmus Ågren
View author publications
Search author on:PubMed Google Scholar
Lena Hansson
View author publications
Search author on:PubMed Google Scholar
Evan K. Irving-Pease
View author publications
Search author on:PubMed Google Scholar
Barbara Mühlemann
View author publications
Search author on:PubMed Google Scholar
Sofie Holtsmark Nielsen
View author publications
Search author on:PubMed Google Scholar
Gabriele Scorrano
View author publications
Search author on:PubMed Google Scholar
Morten E. Allentoft
View author publications
Search author on:PubMed Google Scholar
Frederik Valeur Seersholm
View author publications
Search author on:PubMed Google Scholar
Hannes Schroeder
View author publications
Search author on:PubMed Google Scholar
Charleen Gaunitz
View author publications
Search author on:PubMed Google Scholar
Jesper Stenderup
View author publications
Search author on:PubMed Google Scholar
Lasse Vinner
View author publications
Search author on:PubMed Google Scholar
Terry C. Jones
View author publications
Search author on:PubMed Google Scholar
Björn Nystedt
View author publications
Search author on:PubMed Google Scholar
Karl-Göran Sjögren
View author publications
Search author on:PubMed Google Scholar
Julian Parkhill
View author publications
Search author on:PubMed Google Scholar
Lars Fugger
View author publications
Search author on:PubMed Google Scholar
Fernando Racimo
View author publications
Search author on:PubMed Google Scholar
Kristian Kristiansen
View author publications
Search author on:PubMed Google Scholar
Astrid K. N. Iversen
View author publications
Search author on:PubMed Google Scholar
Eske Willerslev
View author publications
Search author on:PubMed Google Scholar

Contributions

M.S. and E.W. conceptualized the study. M.S., E.C., A.F.-G., S.H.N., A.K.N.I. and F.V.S analysed data. M.S., E.C., A.F.-G., N.O., R.Å., L.H., E.K.I.-P., B.M., S.H.N. and H.S. were involved in method development and implementation. G.S., M.E.A., F.V.S., H.S., C.G., J.S. and L.V. were involved in data generation. M.S., M.E.A., K.-G.S. and K.K. curated bioarchaeological data. M.S. T.C.J., B.N., J.P., L.F., F.R. and E.W. supervised the research. M.S., A.K.N.I. and E.W. wrote the first draft of the paper. M.S., A.K.N.I., L.F., J.P. and E.W. were involved in reviewing drafts and editing. M.S., K.-G.S. and A.K.N.I. wrote the Supplementary Information. All co-authors read, commented on and agreed on the submitted manuscript.

Corresponding authors

Correspondence to Martin Sikora, Astrid K. N. Iversen or Eske Willerslev.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature thanks Jonathan Dushoff and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Peer reviewer reports are available.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data figures and tables

Extended Data Fig. 1 Workflow overview and metagenome composition.

a, Workflow overview. b, Distribution of total number of sequencing reads screened across the n = 1,313 study samples. c, Violin plots showing distributions of proportions of reads classified as human, non-human or not classified for the study samples. Median values for each genus are indicated by horizontal lines. d, Violin plots showing fraction of reads classified on the taxonomic level of genus, for the top 20 most abundant genera. e, Barplots showing inferred proportions for L = 3 topics (indicated by fill colour) from topic model analysis for n = 1,272 study samples with sample material information. f, Factor loadings for the 10 highest loading genera for each of the L = 3 topics from the topic model analysis. g, Boxplots showing distributions of proportions for topic K3 (associated with oral microbiome taxa) in different sample materials.

Extended Data Fig. 2 Reference genome similarity in simulated ancient microbial data.

a, Illustration showing phylogenetic context and expected average nucleotide identity (ANI) for a hypothetical sampled microbial species X and four genomes (A1, A2; B1, B2) of two genera (A, B) present in the reference database. b, Number of unique k-mers classified at the level of genus using KrakenUniq for replicates of different read numbers across all simulated species. Dashed line indicates cutoff used in analysis of real data (150 unique k-mers). c, Number of unique k-mers classified at the level of species as a function of average nucleotide identity for mappings against all individual species reference genomes in the genus of reads simulated for a particular species. Blue diamonds indicate results for the mapping against a reference genome from the same species as the simulated read data, whereas grey circles indicate reference genomes of other species. Selected individual species results are highlighted by species name. Dashed line indicates ANI ≥ 0.97 cutoff value. d, Barplots showing number of replicates where the true positive species reference genome was highest ranking in numbers of unique k-mers classified at level of species.

Extended Data Fig. 3 Read mappings across genera in simulated ancient microbial data.

a, Observed breadth of genomic coverage as a function of average read depth for distinct species hits (i.e., mappings with highest number of unique k-mers at species level for a genus; n ≥ 20 reads mapped). Each panel shows results for reads simulated from species indicated. Results for mappings against the simulated species are indicated by diamond shape, whereas mappings against species from other genera are indicated with circles. Symbol fill colour indicates average nucleotide identity for mapped reads (grey symbols ANI < 0.97). Solid black line shows theoretical expected breadth of coverage for a given average read depth. Vertical dashed line indicates 1X average read depth. b, Relative entropy statistic (1000 bp window size) as a function of average nucleotide identity. Blue diamonds indicate results for the mapping against reference genome from the same species as the simulated read data, whereas grey circles indicate reference genomes for species hits in other genera. Dashed lines indicate cutoffs used in analyses of real data (ANI ≥ 0.97, entropy ≥ 0.9). False positive hits of reads mapped to a reference genome from a different genome passing cutoffs and their final number of mapped reads (out of 5 million total simulated reads) are labelled. c, Illustration showing potential sources of false positive hits and expected results for authentication summary statistics. d, Matrix plot showing all microbial hits with n ≥ 20 reads mapped and their authentication statistics, for all simulated species and read numbers. Symbol colour and size indicates the number of replicates passing the cutoff for each of three summary statistics shown (ANI ≥ 0.97, ratio of observed/expected coverage breadth ≥ 0.8, entropy ≥ 0.9). Hits passing cutoffs for all three statistics are indicated with coloured outline and background lines (black - true positives; grey - cross-genus false positive mappings).

Extended Data Fig. 4 Examples of authentication for microbial hits.

a, Observed breadth of genomic coverage as a function of average read depth. Coloured symbols indicate hits in species Toxoplasma gondii (left panel) and Yersinia pestis (right panel), with symbol colour indicating relative entropy of read start positions. Solid black line shows theoretical expected breadth of coverage for a given average read depth⁶⁶. b, Lengths of contigs in the reference genome of Toxoplasma gondii and number of samples showing n ≥ 20 reads mapped. Symbol colour indicates the average number of reads mapped to a specific contig across samples. c, Bayesian estimator of aDNA damage (D max) and significance (Z-score) obtained from metaDMG, for hits in species Clostridium botulinum (left) and Yersinia pestis (right). Error bars indicate ± 1 standard deviation, and symbol fill colour indicates average read depth for mapped reads. Samples used as examples in aDNA damage curves (d) are labelled and indicated with black circles. d, aDNA damage patterns for four example hits in species Clostridium botulinum and Yersinia pestis. Plots show observed nucleotide misincorporation frequencies (red symbols and line) and metaDMG fit (black line) and 68% credible intervals (shaded region) for C > T transitions as a function of distance from the 5’ read end.

Extended Data Fig. 5 Ancient microbial hit ANI and read recruitment.

a, b, Distributions of ANI (a) and log10-fold change of mapped reads over median of reads classified at taxonomic rank of genus per sample (b) for individual species hits detected in n ≥ 5 samples. Symbol colour indicates species hit category.

Extended Data Fig. 6 Ancient microbial hit characteristics.

a, Odds ratios for association of ancient hits with sample material (tooth or bone) across n = 61 species with ≥ 20 ancient hits. Symbols indicate significance of association (p ≤ 0.01, Fisher’s exact test; white triangles - more frequently identified in tooth; grey circles - no significant association). Error bars indicate 95% confidence interval of odds ratio b, c, Rates of observing multiple alleles in 2 randomly sampled sequencing reads at genomic sites in n = 190 ancient hits (average read depth ≥ 1X) across n = 120 samples. b, Multi-allele rate as a function of ANI. Symbol colour indicates average read depth. c, Distribution of multi-allele rate across species hits. Symbol colour indicates ANI. d, e Barplots showing number of hits identified in each microbial species group (d) or each species within groups of likely infections (e). Novel and previously reported ancient pathogen hits are distinguished by bar colour, with total number in each category labelled.

Extended Data Fig. 7 Spatiotemporal distribution of selected ancient pathogens.

Each panel shows geographic distribution (top) and timeline (bottom) for identified cases of the respective pathogen (indicated by coloured circle). Geographic locations and age distributions of all n = 1,313 study samples are shown in each panel using white squares. The panel for Plasmodium combines the three species detected (P. vivax n = 5; P. malariae n = 3; P. falciparum n = 1).

Extended Data Fig. 8 Additional ancient microbial hit authentication.

a, Bar plots showing proportion of reads assigned to same species (dark blue) or genus (light blue) using BLASTn for all hits with n ≤ 100 final reads (n = 712), stratified by genus and microbial source groups b, Bar plots showing the proportion of ancient microbial hits with n ≤ 100 final reads matching the species with most reads assigned using BLASTn, stratified by microbial source group. c, Heatmap showing number of reads mapped to Yersinia pestis CO92 chromosome and plasmids, for n = 42 Yersinia pestis hits. Cell color indicates ratio of observed over expected breadth of coverage. Results for plasmid pMT1 are shown for full plasmid, as well as separately for the 19 kb region containing the ymt gene absent in the LNBA- strains. Samples are ordered by decreasing age from top to bottom.

Extended Data Fig. 9 Time series of detection rates for ancient microbial groups.

a-h, Panels show estimated trendlines and 95% credible interval for detection rates (top), probability distributions and locations (dotted lines) for change points (middle) and probability of trend slope (bottom) being positive (red), negative (blue) or zero (white), inferred using Bayesian change-point detection and time series decomposition.

Extended Data Fig. 10 Spatiotemporal distribution and host genetic structure for ancient microbial groups.

a, Panels showing geographic distributions (top) and timelines (bottom) for identified cases of ancient microbial hits in the oral microbiome and infection groups classes (indicated by coloured circle). Geographic locations and age distributions of all 1,313 study samples are shown in each panel using white squares. b, Principal component analyses showing ancient and modern human genetic population structure in non-African (left panels) and west Eurasian (right panels) individuals. Grey crosses indicate present-day individuals, whereas coloured symbols indicate ancient individuals (coloured by sample age). Diamonds with black outlines indicate position in PCA space for samples with hits in combined infection groups. Major clines of known ancient and modern human ancestry groups are indicated with labels.

Extended Data Fig. 11 Predictors of ancient microbial species incidence.

a, Matrix showing effect sizes and of n = 12 potential predictors (columns) for presence of selected combined ancient microbial species groups inferred from spatiotemporal modelling. Predictors with positive effect (defined here as 2.5% and 97.5% posterior quantiles both positive) are shown as red triangles, whereas predictors with negative effect (defined here as 2.5% and 97.5% posterior quantiles both negative) are shown as blue inverted triangles. Predictors where the 95% posterior quantile range spans zero are indicated using white circles. Posterior 95% quantiles are indicated by the error bars. b, Deviance information criterion values for each model and response variable. c, Matrix as in (a), showing effect sizes and of n = 12 potential predictors (columns) for presence of combined ancient microbial species groups and individual species from spatiotemporal modelling. For each class, estimates for the full model (gray background) and the model with lowest deviance information criterion (white background) is shown. Symbols and colors as in (a).

Supplementary information

Supplementary Information Sections 1–5 and references.

Reporting Summary

Peer Review file

Supplementary Tables

Supplementary Tables 1–6.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Reprints and permissions

About this article

Cite this article

Sikora, M., Canteri, E., Fernandez-Guerra, A. et al. The spatiotemporal distribution of human pathogens in ancient Eurasia. Nature 643, 1011–1019 (2025). https://doi.org/10.1038/s41586-025-09192-8

Download citation

Received: 18 August 2023
Accepted: 23 May 2025
Published: 09 July 2025
Version of record: 09 July 2025
Issue date: 24 July 2025
DOI: https://doi.org/10.1038/s41586-025-09192-8

This article is cited by

ParaRef: a decontaminated reference database for parasite detection in ancient and modern metagenomic datasets
- Jonas Niemann
- Yuejiao Huang
- Hannes Schroeder
Genome Biology (2025)
Outbreaks of SARS-CoV-2 in minks: public health, environmental and bioethical perspectives
- Dariusz Halabowski
- Willis Gwenzi
- Piotr Rzymski
Environmental Sciences Europe (2025)
Animal diseases leapt to humans when we started keeping livestock
- Smriti Mallapaty
Nature (2025)
Insights into infectious diseases through ancient pathogen genomics
- Arthur Kocher
- Johannes Krause
- Maria A. Spyrou
Nature Reviews Microbiology (2025)