Inferring DNA methylation in non-skeletal tissues of ancient specimens

Mathov, Yoav; Nissim-Rafinia, Malka; Leibson, Chen; Galun, Nir; Marques-Bonet, Tomas; Kandel, Arye; Liebergall, Meir; Meshorer, Eran; Carmel, Liran

doi:10.1038/s41559-024-02571-w


Article
Open access
Published: 20 November 2024


Nature Ecology & Evolution volume 9, pages 153–165 (2025)


An Author Correction to this article was published on 05 September 2025

This article has been updated

Abstract

Genome-wide premortem DNA methylation patterns can be computationally reconstructed from high-coverage DNA sequences of ancient samples. Because DNA methylation is more conserved across species than across tissues, and ancient DNA is typically extracted from bones and teeth, previous works utilizing ancient DNA methylation maps focused on studying evolutionary changes in the skeletal system. Here we suggest that DNA methylation patterns in one tissue may, under certain conditions, be informative about DNA methylation patterns in other tissues of the same individual. Using the fact that tissue-specific DNA methylation builds up during embryonic development, we identified the conditions that allow for such cross-tissue inference and devised an algorithm that carries it out. We trained the algorithm on methylation data from extant species and reached precisions of up to 0.92 for validation datasets. We then used the algorithm on archaic humans, and identified more than 1,850 positions for which we were able to observe differential DNA methylation in prefrontal cortex neurons. These positions are linked to hundreds of genes, many of which are involved in neural functions such as structural and developmental processes. Six positions are located in the neuroblastoma breakpoint family (NBPF) of genes, which probably played a role in human brain evolution. The algorithm we present here allows for the examination of epigenetic changes in tissues and cell types that are absent from the palaeontological record, and therefore provides new ways to study the evolutionary impacts of epigenetic changes.


Main

Changes in gene regulation often underlie phenotypic divergence^1,2,3,4, making the identification of regulatory differences between archaic humans (Neanderthals and Denisovans) and anatomically modern ones a major goal of palaeogenetics⁵. Because RNA is rarely present in palaeontological remains⁶, changes in gene regulation must be inferred indirectly from the ancient DNA (aDNA) sequences themselves. Previous works showed that aDNA degradation signals can be harnessed to computationally reconstruct premortem genome-wide DNA methylation maps^7,8,9. DNA methylation is a key epigenetic mark that strongly affects the activity level of regulatory regions such as promoters and enhancers¹⁰. Hence, its reconstruction provides information on premortem gene activity patterns in ancient individuals. Indeed, the reconstruction of DNA methylation in Neanderthals, a Denisovan and anatomically modern humans allowed us to identify regulatory differences between these human groups and associate them with phenotypic changes, opening the field of palaeoepigenetics^9,11,12.

However, DNA methylation is tissue-specific to the extent that methylation patterns in two different tissues of the same organism are often more diverged than methylation patterns in the same tissue in two different species^13,14. Although occasionally aDNA is extracted from soft tissues such as skin¹⁵ and liver¹⁶, this is limited to a small number of highly conserved and relatively young samples. Generally, bones and teeth are the main sources of aDNA. In particular, these tissues are the source of all archaic human aDNA. Accordingly, our previous palaeoepigenetic studies focused on the evolution of the skeletal system^9,12, but could provide only limited insights on other systems such as the nervous system⁸.

Yet, DNA methylation in one tissue still carries information on DNA methylation in other tissues of the same individual. The reason is that DNA methylation is erased almost completely in the zygote, and then re-established in a cell type-specific manner during development¹⁰. When a DNA methylation change that separates human groups is established during early developmental stages, it may propagate to all descendant cell types, and be simultaneously manifested in multiple tissues.

Here we show that for a certain class of DNA methylation changes, we are able to use parsimony considerations¹⁷ to predict the timing of the change during development, rendering skeletal DNA methylation changes informative on the methylation state in other tissues. Testing our method on modern primate tissues, we show that we achieve precision of 0.7–0.92 in using DNA methylation changes in one tissue to predict corresponding changes in another tissue.

We applied our method to predict DNA methylation changes in prefrontal neurons between modern humans, archaic humans and chimpanzees. We found 71 genes that are associated with differential neuronal DNA methylation separating modern and archaic humans, as well as 870 genes that are associated with differential neuronal DNA methylation separating all humans from chimpanzees. We show that some of these genes are involved in neural functions. Moreover, we discovered six methylation changes in genes that carry the Olduvai domain, whose copy number is associated with an increase in brain volume and cognitive function, suggesting that these changes might be important in human brain evolution.

Results

Given the almost complete reset of DNA methylation patterns in the zygote as part of the epigenetic reprogramming process¹⁸, evolutionary differences in DNA methylation patterns between species are manifested as changes that build up during embryonic development. A methylation change that occurs in cells that are not fully differentiated would propagate to all descendant cell lineages, unless reverted by a later change. For example, if a site goes through an evolutionary methylation change that builds up at a time that precedes the split of the mesodermal and the ectodermal germ layers, the difference in methylation level would show up both in bone and in neurons (Fig. 1a).

**Fig. 1: Conceptual framework of the algorithm.**

For any pair of tissues or cell types, we denote by ‘fundamental changes’ all those evolutionary changes to DNA methylation that were established before the developmental split between the two tissues or cell types. Evolutionary DNA methylation changes that occurred after this developmental time point are called tissue-specific changes (Fig. 1a). Note that the terms tissue-specific and fundamental methylation changes should always be understood in the context of two specific tissues or cell types. For example, a methylation change affecting all endodermal tissues is still considered tissue-specific if we compare lung with bone. The basis of the algorithm we develop below is the fact that if we are able to determine, for two tissues or cell types, whether an evolutionary change in DNA methylation is fundamental or tissue-specific, then the methylation level in one tissue or cell type is informative about the methylation level in the other tissue or cell type.

The inference procedure

We consider a three-species phylogeny, in which the species are marked as ‘reference’ (superscript r), ‘target’ (superscript t) and ‘outgroup’ (superscript o). For these three species, we examine five DNA methylation maps in two tissues (or cell types). In one tissue, denoted ‘all’ (superscript a), DNA methylation data are available for all three species. In the other tissue, denoted ‘partial’ (superscript p), DNA methylation data are available for the reference and the outgroup only (Fig. 1b). Our goal is to predict the methylation levels in the target in tissue ‘partial’.

A useful abstraction of the problem, which allows for convenient conceptualization of the approach, is to imagine that the DNA methylation in each CpG position is a binary variable that can be either high or low. We dub the combination of binary methylation levels across the five available DNA methylation maps in a specific CpG position a configuration. There are exactly three possible combinations of methylation states that describe CpG positions with differential methylation across the three species in tissue ‘all’ (Fig. 1c). An evolutionary methylation change in tissue ‘all’ can occur along the branch leading to the reference (reference-derived), along the branch leading to the target (target-derived) or along either of the two branches that separate the outgroup from the reference and target (outgroup-specific). Each of these three combinations may be associated with any of four different combinations of methylation states in the tissue ‘partial’, leading to a total of 12 possible configurations that should be examined (Fig. 1c).
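The count of 12 configurations can be checked with a short enumeration. The following is a minimal sketch of the binary abstraction only, not the published implementation. It assumes a polarity convention that is not in the text: the state of the species that differs in tissue ‘all’ is coded as 1 and the other two as 0, so that mirrored patterns are treated as equivalent. The names `ALL_PATTERNS` and `enumerate_configurations` are hypothetical.

```python
from itertools import product

# Assumed convention: the species carrying the odd state in tissue 'all' is coded 1.
ALL_PATTERNS = {
    "reference-derived": {"r": 1, "t": 0, "o": 0},
    "target-derived": {"r": 0, "t": 1, "o": 0},
    "outgroup-specific": {"r": 0, "t": 0, "o": 1},
}


def enumerate_configurations():
    """List every pairing of an 'all'-tissue pattern with 'partial'-tissue states."""
    configs = []
    # In tissue 'partial' only reference (r) and outgroup (o) are measured.
    partial_states = product((0, 1), repeat=2)
    for (name, all_states), (r_p, o_p) in product(ALL_PATTERNS.items(), partial_states):
        configs.append({"change": name, "all": all_states, "partial": {"r": r_p, "o": o_p}})
    return configs


configurations = enumerate_configurations()
print(len(configurations))  # 3 patterns x 4 partial-tissue states
```

Each entry holds the five binary values that make up one configuration: three in tissue ‘all’ and two in tissue ‘partial’.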

For some configurations, it is possible to determine the developmental timing of the methylation change using parsimony, whereas for others it is not. In a previous publication¹⁷ we outlined the procedure and demonstrated it in a particular configuration. Here we examine all possible configurations, identify those in which inference on the ‘target’ methylation can be made, develop a detailed algorithm to carry out the inference procedure and use it to predict DNA methylation levels in archaic human brains.

Of the 12 possible configurations, there are five for which parsimony considerations provide a prediction for the timing of the developmental change in methylation (Fig. 1c). In three configurations we predict that the change was tissue-specific and in two that it was fundamental; in all others, the timing of the change cannot be determined. In tissue-specific configurations we end up predicting that the methylation change detected in tissue ‘all’ does not hold in tissue ‘partial’. These cases are less relevant for the current study, where our goal is to identify methylation changes in tissue ‘partial’. In the remainder of the article, therefore, we focus solely on the two configurations in which we conclude that the methylation change was fundamental. Note that because data are missing in tissue ‘partial’ of the ‘target’, we cannot make any predictions regarding developmental timing of the methylation change in target-derived changes.

Quantifying the performance

Based on the abstraction above, we developed an algorithm that accounts for the non-binary nature of DNA methylation and identifies CpG positions that are consistent with having gone through a fundamental methylation change (Methods). In these positions, we predict that the methylation levels in ‘target’ in tissue ‘partial’ are closer to those in the outgroup (for a reference-derived methylation change) or to those in the reference (for an outgroup-specific methylation change) (Fig. 1c).
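In the binary abstraction, the prediction rule for the two fundamental configurations can be sketched as follows. This is an illustration under stated assumptions, not the algorithm described in Methods, which handles non-binary methylation levels. In particular, the condition that the reference/outgroup difference seen in tissue ‘all’ must be replicated in tissue ‘partial’ is our reading of the text, and the function name `predict_target_partial` is hypothetical.

```python
def predict_target_partial(all_states, partial_states):
    """Predict the binary target state in tissue 'partial'.

    all_states: binary states for 'r', 't' and 'o' in tissue 'all'.
    partial_states: binary states for 'r' and 'o' in tissue 'partial'.
    Returns 0 or 1 for a fundamental configuration, otherwise None.
    """
    r_a, t_a, o_a = all_states["r"], all_states["t"], all_states["o"]
    r_p, o_p = partial_states["r"], partial_states["o"]

    # Assumption: a fundamental change is called only when reference and outgroup
    # differ in tissue 'all' and show the same states in tissue 'partial'.
    replicated = r_a != o_a and r_p == r_a and o_p == o_a
    if not replicated:
        return None  # includes every target-derived pattern, where r_a == o_a
    if t_a == o_a:
        return o_p  # reference-derived: target follows the outgroup
    return r_p  # outgroup-specific: target follows the reference


print(predict_target_partial({"r": 1, "t": 0, "o": 0}, {"r": 1, "o": 0}))
print(predict_target_partial({"r": 0, "t": 0, "o": 1}, {"r": 0, "o": 1}))
```

Both calls print 0: in the first the target is predicted to share the outgroup state, in the second the reference state.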

Eventually, we would like to apply our algorithm to a triad of species made of modern human (reference), archaic human (target) and chimpanzee (outgroup). However, lacking DNA methylation in non-skeletal tissues of archaic humans, we optimized the parameters and quantified the performance based on a triad of extant species with available DNA methylation data in multiple tissues. To resemble the triad that we will eventually use, we used three modern primate species: modern humans as the ‘reference’, chimpanzees as the ‘target’ and rhesus macaques as the ‘outgroup’.

We trained the algorithm for the above triad of extant species, using heart as the tissue ‘all’ and kidney as the tissue ‘partial’ (Methods, Supplementary Tables 1 and 2 and Extended Data Figs. 1 and 2). For predicting reference-derived fundamental changes, we obtained a precision of 0.85, compared with an average of 0.22 in permuted data. For predicting outgroup-specific fundamental changes, we obtained a precision of 0.77, compared with an average of 0.40 in permuted data (Fig. 2a). Elevated precision for permuted data when predicting outgroup-specific fundamental changes is expected and, in fact, inherent to our algorithm. This stems from the fact that the evolutionary distance between the outgroup and the reference and/or target is, by definition, longer than the evolutionary distance between the target and the reference, leading to a greater similarity in the DNA methylation patterns of the reference and the target, and hence to an increased probability of finding by chance configurations that resemble outgroup-specific fundamental changes (‘Discussion’). In any case, for both reference-derived changes and outgroup-specific changes, the algorithm shows training set precision which is significantly higher than random.

**Fig. 2: Precision in detecting fundamental DNA methylation changes using endodermal and mesodermal tissues.**

We validated our algorithm using DNA methylation data from liver and lung in these three species, which are tissues on which the algorithm was not trained. In each comparison, the precision achieved using the real data significantly exceeded the precision that was achieved using permuted data. The observed precision in predicting fundamental changes was in the range 0.7–0.84, compared with 0.21–0.37 on permuted data (Fig. 2b and Supplementary Table 3). Flipping the roles of reference and target, namely picking modern human as the target and chimpanzees as the reference, yielded very similar performance estimations (Supplementary Table 3 and Extended Data Fig. 3). To evaluate the statistical significance of the precision levels we obtained, we performed 1,000 permutations; in each permutation we randomly shuffled the tags ‘reference’, ‘target’ and ‘outgroup’ across the samples, keeping the total number of reference, target and outgroup samples fixed. For reference-derived inference, a fraction of 0.004 of the permutations yielded higher precision than our algorithm, whereas this fraction was 0.021 for outgroup-specific inference. Combined, these results show that our algorithm achieves significantly higher precision than expected by chance, even when generalized to tissues and a species combination on which it was not trained.
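The label-shuffling test described above can be sketched generically. This is a minimal illustration, assuming a user-supplied `statistic` callable (hypothetical) that reruns the inference for a given assignment of tags and returns the resulting precision; the real pipeline and its inputs are described in Methods. Shuffling a list of tags in place keeps the number of each tag fixed, as the text requires.

```python
import random


def fraction_exceeding(observed, labels, statistic, n_permutations=1000, seed=0):
    """Fraction of tag permutations whose statistic is strictly above `observed`."""
    rng = random.Random(seed)
    shuffled = list(labels)
    exceed = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)  # reorders tags; the count of each tag is unchanged
        if statistic(shuffled) > observed:
            exceed += 1
    return exceed / n_permutations


# Toy usage: the stand-in statistic ignores the tags and returns a fixed value.
tags = ["reference"] * 3 + ["target"] * 2 + ["outgroup"] * 2
print(fraction_exceeding(0.85, tags, lambda shuffled_tags: 0.3, n_permutations=100))
```

The returned fraction plays the role of the values 0.004 and 0.021 reported above.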

Because of the need to achieve sufficient statistical power, reconstructing DNA methylation maps of aDNA requires the use of moving averages across windows of consecutive CpG positions⁸. Although we showed in the past that this smoothing does not substantially affect the detection of differentially methylated regions (DMRs)⁹, we nevertheless wished to examine the effect of smoothing on the precision of our algorithm. We therefore applied the algorithm to a smoothed version of the above data (Methods) and found that the precision remains very similar to that obtained for non-smoothed data, and that it is always significantly higher than the precision achieved on permuted data (Supplementary Table 3).
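The kind of smoothing referred to here can be illustrated with a centred moving average over consecutive CpG positions. This is a minimal sketch only: the window size, the handling of the ends of the series and the function name `smooth_methylation` are assumptions made for illustration, and the actual smoothing procedure is the one given in Methods.

```python
def smooth_methylation(levels, window=3):
    """Centred moving average over consecutive CpG positions.

    `levels` holds methylation values ordered by genomic position.
    Windows are truncated at both ends of the series (an assumption).
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    smoothed = []
    for i in range(len(levels)):
        neighbours = levels[max(0, i - half): i + half + 1]
        smoothed.append(sum(neighbours) / len(neighbours))
    return smoothed


print(smooth_methylation([0.0, 1.0, 0.0, 1.0, 0.0], window=3))
```

Comparing predictions made from the raw and smoothed values position by position is one way to check that smoothing does not change which positions are called.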

Identifying differential methylation in neurons

The method we developed can be used to infer aDNA methylation in any tissue. Because the large volume and high complexity of the human brain stand out as hallmarks of modern humans, we decided to demonstrate the applicability of our method to this organ first. To this end, we applied the algorithm with bone serving as the tissue ‘all’ and neuronal cell lines from the prefrontal cortex as the tissue ‘partial’.

For bones, we considered published DNA methylation data (including data generated by us in previous works), data of four human femora measured by us for the current study and reconstructed DNA methylation maps in ancient samples, including maps we reconstructed in previous works^10,17 and a new map of Mesolithic anatomically modern human from Sweden¹⁹ that we reconstructed for this study (Methods and Supplementary Table 4). DNA methylation in prefrontal cortex cell lines was taken from published studies^20,21. In total, neuronal DNA methylation data comprise 26 modern humans, 11 chimpanzees and 15 macaques, and bone DNA methylation data comprise 11 modern humans (5 measured, 6 reconstructed), 2 archaic humans, 7 chimpanzees and 10 macaques.

We first estimated the precision of our algorithm on a triad of extant species, composed of modern human as reference, chimpanzee as target, and rhesus macaque as outgroup. This yielded a precision of 0.81 in predicting outgroup-specific fundamental changes, and 0.92 in predicting reference-derived fundamental changes, both being significantly higher than the values achieved for random permutations (Fig. 3a).

**Fig. 3: Precision in detecting fundamental DNA methylation changes using brain and bone.**

We then applied the algorithm to the triad composed of modern human as ‘reference’, archaic human as ‘target’ and chimpanzee as ‘outgroup’, and identified 1,750 CpG positions presenting putative outgroup-specific fundamental changes, and 122 CpG positions presenting putative reference-derived fundamental changes. Of these, 1,131 CpG positions (64.6%), presenting putative outgroup-specific changes, are located in the promoter or gene body of 870 genes, and 80 CpG positions (61.1%), presenting putative reference-derived changes, are located in the promoter or gene body of 71 genes (Supplementary Table 5).

Archaic humans, which serve as ‘target’, comprise only two samples. To assess the possible reduction in power because of this small sample set, we repeated the analysis on the triad of extant species, but randomly downsampled the ‘target’ chimpanzee from 11 samples to only 2. We still detected 7 of the original 8 CpG positions associated with reference-derived changes, and 483 of the 531 original CpG positions associated with outgroup-specific changes. This suggests that the use of only two samples in the target set reduces the power of our analysis by approximately 10–12%.
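The estimated loss of power follows directly from the retained counts quoted above, as the short calculation below shows. The helper name `power_loss` is hypothetical.

```python
def power_loss(retained, original):
    """Fraction of originally detected positions lost after downsampling."""
    return 1 - retained / original


reference_loss = power_loss(7, 8)      # reference-derived positions
outgroup_loss = power_loss(483, 531)   # outgroup-specific positions
print(f"{reference_loss:.1%}, {outgroup_loss:.1%}")
```

The two losses come to 12.5% and about 9.0%, which bracket the approximate 10–12% figure given in the text.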

We accounted for batch effects and evaluated the false discovery rate (FDR) for our actual triad by performing 1,000 permutations, shuffling the tags ‘reference’, ‘target’ and ‘outgroup’ of the bone samples and counting the number of detected reference-derived and outgroup-specific changes. On average, we observed 0.49 reference-derived CpG positions (FDR = 0.005) and 68.6 outgroup-specific CpG positions (FDR = 0.044).

A considerable number of the CpG positions associated with fundamental changes are grouped in clusters. This observation probably reflects the known correlation between adjacent CpG positions, and indicates regional changes in methylation in regulatory regions, where one often finds a close grouping of multiple differentially methylated CpGs^22,23. We hypothesized that even isolated CpG positions showing fundamental changes would represent regional methylation changes. Namely, that the adjacent CpG positions would show changes in DNA methylation compatible with the same fundamental changes, despite not crossing the significance threshold. To test this, we computed the average methylation in such adjacent CpG positions, and compared it with both the nearby isolated CpG positions showing a fundamental change and the average methylation level in the genome. As expected, we found that in 89.1% of the cases the methylation level in adjacent CpGs clustered with the neighbouring isolated CpG position, suggesting that even loci featuring an isolated CpG position with a fundamental change might, in fact, represent a regional methylation change.

CpG positions with fundamental methylation changes are expected to preferentially fall inside regions we detected in a previous work as differentially methylated across the same triad, using a subset of the bone samples used in the current study⁹. Indeed, 172 of the 1,750 outgroup-specific positions (9.8%) fall in the list of 2,031 DMRs that separate chimpanzees and humans (P < 10⁻²⁰⁰ hypergeometric test), and 3 of the 122 reference-derived positions (2.5%) fall in the list of 873 anatomically modern human-derived DMRs (P = 2.8 × 10⁻⁶). Likewise, 137 of the 870 (15.7%) genes associated with outgroup-specific changes overlap with the list of 1,185 differentially methylated genes separating chimpanzees and humans (P = 7.9 × 10⁻⁴⁰), and 5 of the 71 (7%) genes associated with reference-derived changes overlap with the list of 588 differentially methylated genes that have changed in modern humans (P = 4.8 × 10⁻³). Notably, we do not expect a complete overlap, because we used an upgraded version of the reconstruction algorithm, including new filtering techniques and histogram matching²⁴.
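An overlap test of this kind asks how likely it is to see at least the observed number of shared items when one list is drawn at random from a background population. The following is a minimal sketch of the upper-tail hypergeometric probability using only the standard library. The size of the background population is not stated in this section, so the example uses toy numbers and does not attempt to reproduce the reported P values. The function name `hypergeom_upper_tail` is hypothetical.

```python
from math import comb


def hypergeom_upper_tail(k, population, successes, draws):
    """P(X >= k) for `draws` items sampled without replacement from `population`
    items, of which `successes` are marked."""
    upper = min(successes, draws)
    total = comb(population, draws)
    tail = sum(
        comb(successes, i) * comb(population - successes, draws - i)
        for i in range(k, upper + 1)
    )
    return tail / total


# Toy numbers only: 20 background items, 5 marked, 5 drawn, at least 3 shared.
print(hypergeom_upper_tail(3, population=20, successes=5, draws=5))
```

With real data, `population` would be the number of positions or genes that could have been detected, `successes` the size of the previously published list, `draws` the size of the new list and `k` the observed overlap.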

Testing for enrichment in biological functions of our gene lists yielded no significant terms. This is expected, given that a fundamental methylation change is, by definition, a change that is shared across multiple tissues. Specifically, fundamental methylation changes that are shared between bone and brain could be associated with brain, bone or any other tissue (Fig. 1a). Therefore, these changes could be related to a wide range of functions, making it unlikely that specific functional enrichment would be observed. However, given the importance of the brain in human evolution, and the lack of methylation data from brain in archaic human, we focus here on changes that might have a relation to the nervous system.

We focused on genes associated with a particularly large number of positions with fundamental changes (Fig. 4 and Supplementary Table 6). In this context, we identified 26 outgroup-specific changes and 3 reference-derived changes localized in the locus containing the genes ZNF707 and CCDC166. Notably, our knowledge about the functions of these genes is limited, with scant evidence pointing at potential relevance to cancer^25,26. Nonetheless, given its classification as a zinc-finger protein, it is reasonable to hypothesize that ZNF707 probably possesses an as-yet undiscovered regulatory role.

**Fig. 4: Genes featuring multiple fundamental DNA methylation changes.**

Next in terms of the number of fundamental changes is the gene ADAMTS2, which exhibits 15 outgroup-specific changes. This gene product regulates neuronal migration by cleaving reelin²⁷, and has been implicated in schizophrenia because it activates dopaminergic signalling and exhibits overexpression in the blood of people during episodes of psychosis (a phenomenon that can be reversed by antipsychotic medications)^28,29. Moreover, in the Human Phenotype Ontology³⁰, ADAMTS2 shows associations with language impairment, intellectual disability, microcephaly, neurodevelopmental delay and other brain-related terms. The gene NF-YA harbours five outgroup-specific changes. NF-YA is a component of the NF-Y transcription factor, which protects neurons from cell death³¹. Inactivation of NF-Y has been linked to various forms of neuronal pathologies^32,33. PIWIL1 exhibits seven outgroup-specific fundamental changes and regulates neuronal polarization and migration³⁴. BRCA1 harbours five outgroup-specific fundamental changes. Apart from being one of the most well-studied tumour suppressors, BRCA1 plays a role in the survival of neural progenitors³⁵. It is also expressed in the hippocampus, and has been observed to be under-expressed in patients with Alzheimer’s disease. Knocking down this gene in the dentate gyrus leads to learning and memory deficits, impaired synaptic plasticity and neuronal shrinkage³⁶. Notably, hypomethylation in a specific region in BRCA1, which does not overlap any of the positions with fundamental changes, has been associated with Alzheimer’s disease³⁷. Moreover, a variant of BRCA1 has been associated with intellectual disability³⁸. AHRR, which also harbours five changes, represses aryl hydrocarbon receptor (AHR), an important mediator for many cellular events. The main hallmarks of brain ageing, including oxidative stress, neuroinflammation and neurogenesis, are affected by AHR, suggesting its function as a regulator for ageing in the brain, together with its effect on the nervous system development^39,40. The gene encoding the deubiquitylase OTUB1 also exhibits five changes. This gene mediates neuronal survival after intracerebral haemorrhage^41,42. The effects of this protein on neuroimmune response contribute to pathologies associated with brain diseases such as Parkinson’s disease and multiple sclerosis^43,44.

Turning our attention to genes associated with reference-derived changes, of eight such genes with three altered positions (the highest number of such alterations in a gene), three show a relationship to the brain. Mutations in TMEM216 have been linked to syndromes that involve brain abnormalities^45,46. TMEM216 regulates ciliogenesis, a process with substantial importance in brain development, and disruptions in TMEM216 function lead to impairments in various neurodevelopmental processes⁴⁷. Another gene linked to ciliary function is CROCC. CROCC encodes rootletin, a protein involved in the formation of ciliary rootlets. Mutations in this gene in Drosophila have been associated with sensory deficits⁴⁸. PSMC2 is a component of the 26S proteasome complex that plays a role in various neural processes, including synaptic plasticity^49,50 and brain autophagy⁵¹. Notably, impairments in the 26S proteasome complex have been strongly associated with the presence of brain inclusions such as alpha-synuclein and tau tangles⁵².

Five CpG positions associated with outgroup-specific fundamental changes, and another one associated with reference-derived changes, are located in genes from the neuroblastoma breakpoint family (NBPF), inside the LOC100288142 locus (Fig. 4). NBPF proteins are enriched with a protein domain called the Olduvai domain that is key to human brain evolution. This domain, which is found almost exclusively in NBPF proteins, went through a human-specific increase in copy number, reaching approximately 300 copies. Chimpanzees show fewer than half that number of copies, and the number drops to several dozen copies in Old World monkeys, and to around a single copy in non-primate mammals. The domain is absent in non-mammals⁵³. Olduvai domain copy number variation is correlated with brain volume^54,55, cognitive functions⁵⁶ and autistic symptoms^57,58,59. It is also negatively correlated with schizophrenia⁶⁰. NBPF genes in general have been found to enhance neural stem cell proliferation⁶¹. Four of the five outgroup-specific fundamental changes are located in NBPF10, one of the NBPF genes with the highest number of human-specific Olduvai domain copies. NBPF10 is adjacent to NOTCH2NL, an additional gene linked to human brain volume^62,63, and it has been suggested that the two evolved in tandem⁶⁴.

To further explore potential regulatory effects of the differential methylation, we examined whether CpG positions with fundamental methylation changes tend to overlap enhancers associated with bones and neurons more than would be expected by chance (Methods). In bones, we observed a significant overlap for both outgroup-specific (298 intersecting positions; P = 9.04 × 10⁻¹⁰) and reference-derived CpG positions (28 intersecting positions; P = 2.66 × 10⁻⁴). In neurons, significant overlap was only evident for outgroup-specific CpG positions (81 intersecting positions; P = 5.37 × 10⁻⁷), whereas no significant overlap was observed for reference-derived CpG positions (4 intersecting positions). Next, we examined genes that are associated with enhancers that overlap CpG positions with differential methylation. Although there are no such genes for reference-derived CpG positions, there are 330 and 480 genes for outgroup-specific CpG positions in bones and neurons, respectively. Enrichment analysis of these genes revealed similar patterns, despite representing two distinct tissues, highlighting organelle functions over tissue-specific ones (Supplementary Table 7). This aligns with expectations, considering that fundamental changes affect multiple tissues.

We next searched for motifs that overlap these enhancers and transcription factors that bind these motifs (Supplementary Table 8). In outgroup-specific changes we found enrichment of transcription factors that are related to brain development (GLI1, GLI2, ZIC1, NHLH2, NR2F2, ZIC2 and ZNF148; FDR = 2.3 × 10⁻³) and specifically forebrain development (GLI1, GLI2, ZIC1, NHLH2 and NR2F2; FDR = 0.01).

Application to bulk prefrontal cortex

In addition to prefrontal neurons, DNA methylation data are also available from bulk prefrontal cortex tissue from humans, chimpanzees and rhesus macaques (Methods and Extended Data Fig. 4)^65,66. To examine whether the methylation changes we identified in neurons can be detected in whole tissues, we repeated the analysis with these bulk DNA methylation measurements.

We first evaluated the performance of our algorithm using the same triad of extant species as we used before, setting modern human as ‘reference’, chimpanzee as ‘target’ and rhesus macaque as ‘outgroup’, with bone serving as the tissue ‘all’ and bulk prefrontal cortex as the tissue ‘partial’. Unfortunately, there is a small number of samples of DNA methylation in bulk prefrontal cortex (three humans, three chimpanzees and two rhesus macaques) with relatively lower coverage (Supplementary Table 4), compromising the power of our approach to detect differentially methylated positions. Indeed, although the precision we achieved is significantly higher than for permuted data (P < 10⁻⁴), we observed lower values than in previous analyses. We obtained a precision of 0.66 in predicting reference-derived changes (compared with a mean of 0.32 for permuted data) and a precision of 0.55 in predicting outgroup-specific changes (compared with a mean of 0.36 for permuted data) (Fig. 3b).

We then considered archaic humans as ‘target’ and chimpanzees as ‘outgroup’ and applied our algorithm to predict differential methylation in bulk brain tissue. We found 396 CpG positions associated with putative outgroup-specific changes and 53 CpG positions associated with putative reference-derived changes. The outgroup-specific CpG positions are associated with 248 genes and the reference-derived CpG positions are associated with 35 genes (Supplementary Table 9). Reassuringly, we found high consistency between these CpG positions and those identified using prefrontal neurons. In total, 156 of the 396 outgroup-specific CpG positions appear among the 1,750 positions identified using prefrontal neurons (P < 10⁻³⁰⁰, hypergeometric test) and 8 of the 53 reference-derived CpG positions appear among the 122 positions identified using prefrontal neurons (P = 4.7 × 10⁻⁴⁴). Similarly, of the 248 genes associated with outgroup-specific changes in bulk prefrontal cortex, 116 were found in the corresponding list of 870 genes identified using prefrontal neurons (P < 10⁻¹⁰⁰) and 8 of the 35 genes associated with reference-derived changes were also identified using neurons (P = 3.54 × 10⁻¹⁶).

Discussion

Based on the embryonic developmental timing of an evolutionary methylation change, we defined two families of changes: fundamental changes are those that occurred before the developmental split between a pair of tissues, whereas tissue-specific changes are those that occurred after the split. In this work, we developed an algorithm to detect CpG positions that underwent fundamental changes, because they result in DNA methylation differences across the examined species that are also replicated in the tissue ‘partial’ (Fig. 1c). A similar methodology can be used to detect tissue-specific changes as well, although these result in no change in DNA methylation across the species in the tissue ‘partial’ (Fig. 1c), and are therefore potentially of more limited evolutionary interest. The main algorithmic modification that is required would be to use statistical tests for equivalence, such as the two one-sided t-tests (TOST) procedure⁶⁷.

The algorithm can be used to determine the methylation state in non-skeletal tissues of archaic humans, as long as methylation data are available for this tissue in both modern humans and chimpanzees (or any other non-human ape, in the case that bone methylation data are available for this species as well). This ability to obtain information on DNA methylation in archaic tissues that are not accessible in the palaeontological record opens new ways to examine changes in gene regulation and their potential effect on evolutionary adaptations in humans.

To quantify performance, we tested the algorithm on three extant species, comprising modern humans, chimpanzees and rhesus macaques (Fig. 2a). The longer branches on this phylogenetic tree (median divergence time between modern humans and rhesus macaques is 28.9 million years ago (Ma), compared with the median divergence time between modern humans and chimpanzees of 6.4 Ma (ref. ⁶⁸)) result in an underestimation of the performance of the algorithm, because the likelihood of independent reversal of the methylation state along each branch increases. We therefore estimate that our algorithm works with even higher precisions than those reported here. We used the algorithm to find genomic positions in which the methylation state can be determined in the brains of archaic humans. We focused on the brain because of its central role in human evolution and the potential implications of brain-related differences between human groups, and between humans and chimpanzees. However, the algorithm we present is general and can be used to determine the methylation state in other archaic tissues. In fact, brain is probably among the most challenging choices, given its large developmental distance from bone (Fig. 1a). A tissue that is developmentally closer to bone, such as muscle or heart, is likely to result in a larger number of positions whose methylation state can be determined.

We showed that we are able to detect two types of fundamental DNA methylation changes. Reference-derived changes are those in which DNA methylation levels in modern humans are different from those in both archaic humans and chimpanzees, representing methylation patterns that are unique to modern humans. Outgroup-specific changes are those in which DNA methylation levels in both modern and archaic humans are different from those in chimpanzees, representing methylation patterns that are shared across modern and archaic humans. In all analyses, we detect fewer reference-derived positions than outgroup-specific ones, although with higher precision. This is expected considering the much shorter evolutionary time span in which the methylation change could have occurred for reference-derived changes compared with outgroup-specific ones (Fig. 1b). Shorter distances mean fewer methylation change events, which result in lower numbers of detected events, but also in lower probability for reversal of methylation changes, and therefore higher precision of parsimony-based inference.

We present here evidence for a strong connection between genes associated with fundamental changes and neuronal functions. Among the most interesting findings is the identification of six fundamental changes in NBPF genes carrying Olduvai domains, which seem to be associated with human brain evolution. Whereas the association of Olduvai domains with brain evolution has focused thus far on their copy number, our findings suggest that methylation changes might also have a role in this; in particular, NBPF10 has four fundamental methylation changes and is also one of the genes with the largest number of Olduvai domains. Nevertheless, future research is needed to establish causality and validate the functional importance of these methylation changes for phenotypes.

The number of CpG positions that can be tested for differential methylation is constrained by the limited availability of DNA methylation data from primate bone, because part of the data was produced using Illumina Infinium MethylationEPIC BeadChip methylation arrays (Supplementary Table 4), which represent approximately 3% of the CpG positions in the genome (although enriched for functional ones). As a result, our set of CpG positions associated with fundamental changes represents only a subset of the positions that show such an association. In addition, restrictions come from the fact that we considered a CpG position only if data were available in at least two samples from each group. This criterion resulted in a restricted set of CpGs available for analysis, ranging from 1% to 17% of autosomal CpGs, depending on the tissues selected for each analysis. These limitations underscore the need for an expanded dataset of bone methylation.

For reference-derived changes, we require that the mean methylation in the ‘target’ be closer to the mean methylation of the ‘outgroup’ than to that of the ‘reference’, whereas for outgroup-specific changes we require the opposite. Adding this requirement was important for reducing the level of false predictions, which result from the small number of samples in the ‘target’ compared with the other groups. However, when this additional requirement is used, the algorithm is unable to detect CpG positions that have gone through gradual methylation change (for example, positions where the ‘outgroup’ is unmethylated, the ‘target’ is partially methylated and the ‘reference’ is highly methylated). Such positions could potentially be detected as being simultaneously reference-derived and outgroup-specific methylation changes. However, mean methylation in the ‘target’ group will be closer to either the ‘outgroup’ or the ‘reference’. Therefore, the additional requirement will lead the algorithm to detect only one of the two types of fundamental changes, and information on gradual changes will be lost. We hope that in the future more methylation maps of archaic humans will be published, making use of the additional requirement unnecessary, and allowing for the detection of gradual fundamental methylation changes.

Overall, we present an algorithm to infer DNA methylation in non-skeletal archaic tissues. The algorithm can be applied to any triad of species or groups and could be useful whenever DNA methylation in a certain tissue is unavailable or difficult to obtain.

Methods

DNA methylation data

We collected and generated DNA methylation data from seven tissues and cell types (neurons, bone, kidney, heart, liver, lung and bulk brain) across three species (modern human, chimpanzee and rhesus macaque). In addition, we used reconstructed DNA methylation data in bones of modern and archaic humans. Data were collected only from autosomes, to avoid the unique methylation patterns that characterize sex chromosomes.

Bone DNA methylation of present-day individuals published in this study

DNA methylation maps from four femur head bones from present-day humans were generated using whole-genome bisulfite sequencing (WGBS). Patients were females with osteoarthritis. Samples were extracted during total hip replacement surgery and were taken from the healthy part of the bone. All four patients signed a form according to Helsinki approval 0178-13-HMO. DNA extraction and the WGBS protocol are described in a previous publication⁹ (Supplementary Table 4). In brief, DNA was extracted from bones using a QIAamp DNA Investigator kit (Qiagen, catalogue no. 56504) and DNA libraries were built with the Illumina TruSeq Sample Preparation kit. Bisulfite treatment was applied in two rounds using the EpiTect Bisulfite kit (Qiagen) and paired-end sequencing was performed on an Illumina Hi-Seq 2000 instrument. We aligned the reads to the hg19 genome assembly using Bismark v.0.23.0 with the following parameters: --bowtie2 --non_bs_mm -p 4 --multicore 4. We then used bismark_methylation_extractor to extract methylation calls with the following parameters: -p --parallel 4 --bedGraph. We did not filter CpG positions by coverage, as we used the default Bismark setting of including positions with a minimum coverage of 1×.

Previously published modern DNA methylation

Neuronal data include WGBS DNA methylation data measured in neurons isolated from prefrontal cortices of humans, chimpanzees and macaques. Data for humans were downloaded from Gene Expression Omnibus (GEO) accession number GSE107638 (ref. ²⁰), taking only the 25 healthy controls. Data for chimpanzees and macaques were downloaded from GEO accession number GSE151768 (ref. ²¹). WGBS DNA methylation data for kidneys, hearts, livers and lungs of four specimens for each of these species were downloaded from GEO accession number GSE112356 (refs. ⁶⁹,⁷⁰). Bulk WGBS brain data that were generated from prefrontal cortices of three modern humans and three chimpanzees were downloaded from GEO accession number GSE37202 (ref. ⁶⁶). Bulk WGBS data of two macaques were downloaded from GEO accession number GSE77124 (ref. ⁶⁵). For bone, we used ten Illumina Infinium MethylationEPIC BeadChip (850K) methylation arrays from rhesus⁷¹ and data acquired by Gokhman et al.⁹ including one modern human and one chimpanzee WGBS map, one chimpanzee reduced representation bisulfite sequencing map and four chimpanzee 850K methylation arrays (Supplementary Table 4).

Most non-human DNA methylation data were already mapped to the human hg19 reference genome. However, this was not done for rhesus bulk brain samples, and we therefore mapped these data to hg19 using liftOver⁷². This mapped ~30% of the rhesus CpG positions to a corresponding CpG position in human. However, in a significantly enriched number of cases, the rhesus CpG positions were mapped to a position preceding a human CpG position by a single base (Extended Data Fig. 4). We considered these as valid mappings, ending up with ~40% of the rhesus CpG positions being successfully mapped to the human hg19 genome reference.
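The acceptance rule for off-by-one mappings can be sketched as follows. This is an illustrative Python fragment rather than the pipeline used in the study: the function name `resolve_lifted_cpg` and the representation of human CpG coordinates as a set of integers are assumptions made for the example.

```python
def resolve_lifted_cpg(lifted_pos, human_cpgs):
    """Return the human CpG coordinate matched by a lifted rhesus CpG, or None.

    lifted_pos : hg19 coordinate produced by liftOver for a rhesus CpG
    human_cpgs : set of hg19 CpG coordinates on the same chromosome
    """
    if lifted_pos in human_cpgs:       # exact hit on a human CpG
        return lifted_pos
    if lifted_pos + 1 in human_cpgs:   # lands one base before a human CpG
        return lifted_pos + 1
    return None                        # no acceptable human counterpart
```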

Ancient DNA methylation maps

We used our newest version of the RoAM software tool²⁴ to reconstruct the methylome of a Mesolithic anatomically modern human individual from Stora Karlsö, Sweden (~9,000 years ago) that had been previously sequenced¹⁹ (Supplementary Table 4). In addition, we used our previously published methylation maps of five ancient anatomically modern humans⁹, one Neanderthal and one Denisovan⁸, which at the time of writing are the only archaic samples that are both uracil DNA glycosylase-treated and sequenced to high coverage, the two prerequisites for computational reconstruction of aDNA methylation¹⁷.

Batch effects

To minimize batch effects stemming from combining data from different sources, we took several measures. First, all data used for optimization and cross-validation were taken from the same laboratory⁶⁹,⁷⁰. Similarly, all the brain data, including both prefrontal cortex neurons and bulk brain tissue, were taken from the same laboratory²⁰,²¹,⁶⁵,⁶⁶.

Most of the bone data were produced by us, either specifically for this paper or in previous work⁹. To address consistency between the modern bone samples and the ancient ones, we used histogram matching during the DNA methylation reconstruction, fitting the reconstructed histogram to that of previously published modern bone data⁹,⁷³.

Finally, remaining batch effects are accounted for in the estimated false discovery rates of our method.

Detecting CpG positions with differential DNA methylation

The algorithm receives methylation data from three species and two tissues (Fig. 1b). Only CpG positions where, for each tissue and species, data are available from at least two samples were considered. In addition, only positions where the variance in methylation levels in each tissue and species was <0.08 were considered. This threshold was selected to guarantee that a minimum set of two samples will have at most a 40% difference in methylation level.
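The 0.08 threshold can be checked directly. Assuming the unbiased (n - 1) sample variance, two samples with levels a and b have variance (a - b)²/2, so a difference of 0.4 corresponds to a variance of 0.08. The Python sketch below applies both filters to a single CpG position; the function name `passes_filters` and the dictionary layout are hypothetical and chosen only for illustration.

```python
from statistics import variance  # unbiased (n - 1) sample variance

MAX_VARIANCE = 0.08  # threshold quoted in the text
MIN_SAMPLES = 2      # minimum number of samples per tissue and species


def passes_filters(groups):
    """Decide whether one CpG position is retained for analysis.

    groups : dict mapping (species, tissue) to the list of methylation
             levels (each between 0 and 1) measured at this position
    """
    for levels in groups.values():
        if len(levels) < MIN_SAMPLES:
            return False  # too few samples in this tissue and species
        if variance(levels) >= MAX_VARIANCE:
            return False  # samples disagree too strongly
    return True
```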

Identifying reference-derived fundamental changes

Let us have methylation data for N CpG positions and let $0\le {m}_{i,\;j}^{s,u}\le 1$ be the measured methylation level of sample j at CpG position i (i = 1, …, N), in species s and tissue u. We say that the methylation level at CpG position i is reference-derived in tissue ‘all’ if

$$\left|{t}_{i}^{r,a}\right|\ge {t}^{r,a}+{k}_{a}^{{rd}}{\sigma }^{r,a}.$$

Here, ${t}_{i}^{r,a}$ is the statistic of an equal-variance t-test at CpG position i in tissue ‘all’, contrasting the ‘reference’ with the ‘target’ and ‘outgroup’, and ${t}^{r,a}$ and ${\sigma }^{r,a}$ are the estimators of the mean and the standard deviation of the statistic across the N positions. ${k}_{a}^{{rd}}$ is a parameter.

Similarly, we say that CpG position i shows differential methylation between the ‘reference’ and the ‘outgroup’ in tissue ‘partial’ if

$$\left|{t}_{i}^{{ro},p}\right|\ge {t}^{{ro},p}+{k}_{p}^{{rd}}{\sigma }^{{ro},p}.$$

Here, ${t}_{i}^{{ro},p}$ is the statistic of an equal-variance t-test at CpG position i in tissue ‘partial’, contrasting the ‘reference’ with the ‘outgroup’, and ${t}^{{ro},p}$ and ${\sigma }^{{ro},p}$ are the estimators of the mean and the standard deviation of the statistic across the N positions. ${k}_{p}^{{rd}}$ is a parameter.

To identify reference-derived fundamental changes (left-hand column in Fig. 1c), we look for CpG positions i that satisfy:

$$\left|{t}_{i}^{\,r,a}\right|\ge {t}^{r,a}+{k}_{a}^{{rd}}{\sigma }^{r,a}$$

(1a)

$$\left|{t}_{i}^{\,{ro},p}\right|\ge {t}^{{ro},p}+{k}_{p}^{{rd}}{\sigma }^{{ro},p},$$

(1b)

$${t}_{i}^{\,r,a}\cdot {t}_{i}^{\,{ro},p} > 0.$$

(1c)

Inequality (equation (1c)) guarantees that the methylation difference between the ‘reference’ and the ‘outgroup’ in both tissues is in the same direction.

We found it useful to add a fourth condition to equation (1a–c), verifying that the mean methylation in ‘target’ is closer to the mean methylation of the ‘outgroup’ than to the mean methylation of the ‘reference’. If $\overline{{m}_{i}^{r,a}}$, $\overline{{m}_{i}^{t,a}}$ and $\overline{{m}_{i}^{o,a}}$ are the mean methylation across samples in tissue ‘all’ in position i of the ‘reference’, ‘target’ and ‘outgroup’, respectively, then we require

$$\left|\overline{{m}_{i}^{t,a}}-\overline{{m}_{i}^{r,a}}\right| > \left|\overline{{m}_{i}^{t,a}}-\overline{{m}_{i}^{o,a}}\right|.$$

(1d)

In the implementation, we kept this additional condition optional because the algorithm detects methylation changes with high precision even without it. However, it is advisable to use it when the number of samples in ‘target’ is small compared with the other groups, because it prevents mispredictions that arise as a result of the lower weight of the ‘target’ samples in the t-test. Because only two archaic human samples serve as ‘target’ in the current study, we used this additional condition throughout the paper. Positions that satisfy equation (1a–d) are those for which we predict that the methylation in ‘target’ in tissue ‘partial’ clusters with that in ‘outgroup’, and differs from that in ‘reference’ (Fig. 1c).
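Conditions (1a–d) can be sketched in Python as follows. This is a minimal illustration and not the published MATLAB implementation: the function names are hypothetical, the per-position t statistics and group means are assumed to have been computed beforehand (for example with `t_stat`), and the use of the sample standard deviation for σ is an assumption.

```python
from math import sqrt
from statistics import mean, stdev, variance


def t_stat(x, y):
    """Equal-variance (pooled) two-sample t statistic for groups x and y."""
    nx, ny = len(x), len(y)
    pooled = ((nx - 1) * variance(x) + (ny - 1) * variance(y)) / (nx + ny - 2)
    return (mean(x) - mean(y)) / sqrt(pooled * (1 / nx + 1 / ny))


def exceeds(stats, k):
    """Flag positions with |t_i| >= mean(t) + k * sd(t) over all N positions."""
    cutoff = mean(stats) + k * stdev(stats)
    return [abs(t) >= cutoff for t in stats]


def reference_derived(t_all, t_par, m_ref, m_tgt, m_out, k_a, k_p, use_1d=True):
    """Indices of positions predicted as reference-derived fundamental changes.

    t_all : per-position t statistics in tissue 'all'
            ('reference' versus 'target' and 'outgroup' pooled)
    t_par : per-position t statistics in tissue 'partial'
            ('reference' versus 'outgroup')
    m_ref, m_tgt, m_out : per-position mean methylation in tissue 'all'
    """
    pass_a = exceeds(t_all, k_a)  # condition (1a)
    pass_p = exceeds(t_par, k_p)  # condition (1b)
    hits = []
    for i in range(len(t_all)):
        same_direction = t_all[i] * t_par[i] > 0  # condition (1c)
        # Condition (1d): 'target' lies closer to 'outgroup' than to 'reference'.
        near_outgroup = abs(m_tgt[i] - m_ref[i]) > abs(m_tgt[i] - m_out[i])
        if pass_a[i] and pass_p[i] and same_direction and (near_outgroup or not use_1d):
            hits.append(i)
    return hits
```

Passing `use_1d=False` drops the optional fourth condition, which mirrors the choice described above.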

Identifying outgroup-specific fundamental changes

We follow a similar approach to identify fundamental changes in which the methylation change in tissue ‘all’ is outgroup-specific (right-hand column in Fig. 1c). CpG positions that we predict have gone through outgroup-specific fundamental changes should satisfy:

$$\left|{t}_{i}^{o,a}\right|\ge {t}^{o,a}+{k}_{a}^{{os}}{\sigma }^{o,a}$$

(2a)

$$\left|{t}_{i}^{{ro},p}\right|\ge {t}^{{ro},p}+{k}_{p}^{{os}}{\sigma }^{{ro},p},$$

(2b)

$${t}_{i}^{o,a}\cdot {t}_{i}^{{ro},p} > 0.$$

(2c)

Here, ${t}_{i}^{o,a}$ is the statistic of an equal-variance t-test at CpG position i in tissue ‘all’ contrasting the ‘outgroup’ with the ‘target’ and ‘reference’, ${t}^{o,a}$ and ${\sigma }^{o,a}$ are the estimators of the mean and the standard deviation of the statistic across the N positions, and ${k}_{a}^{{os}}$ and ${k}_{p}^{{os}}$ are parameters. Note that the t-test for tissue ‘partial’, equation (2b), is identical to the one for the reference-derived fundamental changes, equation (1b), because in both cases we wish to find a significant difference in tissue ‘partial’ between the ‘outgroup’ and the ‘reference’. Yet, we allow the parameters ${k}_{p}^{{rd}}$ and ${k}_{p}^{{os}}$ to be potentially different.

Here, too, we added a fourth condition that guarantees that the mean methylation in ‘target’ is closer to ‘reference’ than to ‘outgroup’,

$$\left|\overline{{m}_{i}^{t,a}}-\overline{{m}_{i}^{o,a}}\right| > \left|\overline{{m}_{i}^{t,a}}-\overline{{m}_{i}^{r,a}}\right|.$$

(2d)

In the implementation, this condition is optional, but we used it throughout this work.

Measuring performance

Whenever the true value of the methylation in the ‘target’ at tissue ‘partial’ is known, we can use it to estimate the precision of our algorithm. To this end, we define CpG positions for which there is truly a reference-derived methylation change in tissue ‘partial’ as those that satisfy

$$\left|{t}_{i}^{\,r,p}\right|\ge {t}^{r,p}+{k}_{t}^{{rd}}{\sigma }^{r,p}.$$

(3)

Here, ${t}_{i}^{\,r,p}$ is the statistic of an equal-variance t-test at CpG position i in tissue ‘partial’, contrasting the ‘reference’ with the ‘target’ and ‘outgroup’, ${t}^{r,p}$ and ${\sigma }^{r,p}$ are the estimators of the mean and the standard deviation of the statistic across the N positions, and ${k}_{t}^{{rd}}$ is a parameter. Similarly, we define CpG positions for which there is truly an outgroup-specific methylation change in tissue ‘partial’ as those that satisfy

$$\left|{t}_{i}^{\,o,p}\right|\ge {t}^{\,o,p}+{k}_{t}^{{os}}{\sigma }^{\,o,p},$$

(4)

where ${t}_{i}^{o,p}$ is the statistic of an equal-variance t-test at CpG position i in tissue ‘partial’, contrasting the ‘outgroup’ with the ‘target’ and ‘reference’, ${t}^{o,p}$ and ${\sigma }^{o,p}$ are the estimators of the mean and the standard deviation of the statistic across the N positions, and ${k}_{t}^{{os}}$ is a parameter.

Let S_p be the set of CpG positions for which we predict a reference-derived fundamental change, namely, those positions that satisfy equation (1a–d). Let S_e be the set of reference-derived fundamental changes that are considered ‘true’ changes, defined as the collection of positions that satisfy

$$\left|{t}_{i}^{r,a}\right|\ge {t}^{r,a}+{k}_{a}^{{rd}}{\sigma }^{r,a}$$

(5a)

$$\left|{t}_{i}^{r,p}\right|\ge {t}^{r,p}+{k}_{t}^{{rd}}{\sigma }^{r,p},$$

(5b)

$${t}_{i}^{r,a}\cdot {t}_{i}^{r,p} > 0,$$

(5c)

where equation (5a) is identical to equation (1a) and (5b) is identical to equation (3). Then, the precision of our algorithm that predicts reference-derived fundamental changes is

$${{\mathrm{Prc}}}=\frac{|{S}_{p}\cap {S}_{e}|}{|{S}_{p}|},$$

(6)

where |S| is the size of set S.

Analogously, for outgroup-specific fundamental changes we define S_p as the set of positions that satisfy equation (2a–d), and S_e as the set of positions that are taken as ‘true’ outgroup-specific fundamental change, namely the set of positions that satisfy

$$\left|{t}_{i}^{o,a}\right|\ge {t}^{o,a}+{k}_{a}^{{os}}{\sigma }^{o,a},$$

(7a)

$$\left|{t}_{i}^{o,p}\right|\ge {t}^{o,p}+{k}_{t}^{{os}}{\sigma }^{o,p},$$

(7b)

$${t}_{i}^{o,a}\cdot {t}_{i}^{o,p} > 0,$$

(7c)

where equation (7a) is identical to equation (2a) and (7b) is identical to equation (4). Precision is defined as in equation (6).
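Equation (6) amounts to a simple set computation. In the Python sketch below, `predicted` plays the role of S_p and `true` the role of S_e; the function name and the choice to return NaN when nothing is predicted are assumptions made for illustration.

```python
def precision(predicted, true):
    """Fraction of predicted positions that are also 'true' changes.

    predicted : iterable of CpG indices in S_p
    true      : iterable of CpG indices in S_e
    """
    predicted, true = set(predicted), set(true)
    if not predicted:
        return float("nan")  # precision is undefined without predictions
    return len(predicted & true) / len(predicted)
```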

Parameter estimation

Overall, the algorithms that predict reference-derived changes and outgroup-specific changes are independent of each other, and each respective set of parameters can be optimized separately. Each algorithm uses two parameters: ${k}_{a}^{{rd}}$ and ${k}_{p}^{{rd}}$ for predicting reference-derived changes (equation (1a–d)), and ${k}_{a}^{{os}}$ and ${k}_{p}^{{os}}$ for predicting outgroup-specific changes (equation (2a–d)). Moreover, the performance of each algorithm is determined using an additional parameter, either ${k}_{t}^{{rd}}$ for the reference-derived changes or ${k}_{t}^{{os}}$ for the outgroup-specific changes. All three parameters for each algorithm were optimized together. Permuted data were generated by pairing a random CpG position in tissue ‘partial’ to each position in tissue ‘all’. For each analysis, 10,000 permuted datasets were generated. The optimized parameters were chosen as the values that gave the largest difference between the precision of the algorithm on the real data and its precision on permuted data.

We performed the optimization process by using chimpanzee as ‘target’, modern humans as ‘reference’ and rhesus macaque as ‘outgroup’ and choosing heart as tissue ‘all’ and kidney as tissue ‘partial’. We scanned all values of the parameters in a grid of values from 2 to 4 using linear spacing of 0.2 for each parameter. Notably, for each value of the parameters, both in the reference-derived and the outgroup-specific comparisons, the precision of the algorithm when applied to the real data was higher than the average precision of the corresponding permuted data (Extended Data Fig. 1). Moreover, not a single permutation in any set of parameters achieved precision that exceeded that of the real data. The optimized values of the parameters are given in Supplementary Table 1.

Some of the optimized parameters take their value at the edge of the grid, pointing at the possibility of achieving even higher precision using values that are outside the current grid. However, we noticed that in these cases precision reaches a plateau, so that the improvement is minor when changing the value of the parameter. Combined with the fact that values of the parameters outside the grid decreased the number of detections and hence reduced the power of the analysis, we decided to bound the value of the parameters and not extend the grid (Extended Data Fig. 2). The set of optimized parameters is remarkably robust to the selection of the training dataset and changed only slightly when using other tissue combinations (Supplementary Table 2).
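The grid scan and permutation baseline described in this section can be outlined in Python as follows. This is a sketch under stated assumptions, not the published code: `precision_fn` is a hypothetical callback that runs the detection and returns a precision, permuted data are produced by shuffling the ‘partial’ rows without replacement, and the permuted precision is summarized by its mean.

```python
import random
from itertools import product


def grid(lo=2.0, hi=4.0, step=0.2):
    """Parameter values from lo to hi (inclusive) with linear spacing."""
    n = round((hi - lo) / step)
    return [round(lo + i * step, 10) for i in range(n + 1)]


def permute_partial(partial_rows, rng):
    """Pair each 'all' position with a randomly chosen 'partial' position."""
    shuffled = list(partial_rows)
    rng.shuffle(shuffled)
    return shuffled


def optimize(precision_fn, all_rows, partial_rows, n_perm=10_000, seed=0):
    """Return (best gain, (k_a, k_p, k_t)) over the three-parameter grid.

    precision_fn(all_rows, partial_rows, k_a, k_p, k_t) -> float
    The gain is the precision on real data minus the mean precision
    over the permuted datasets.
    """
    rng = random.Random(seed)
    perms = [permute_partial(partial_rows, rng) for _ in range(n_perm)]
    best = None
    for k_a, k_p, k_t in product(grid(), repeat=3):
        real = precision_fn(all_rows, partial_rows, k_a, k_p, k_t)
        null = sum(precision_fn(all_rows, p, k_a, k_p, k_t) for p in perms) / n_perm
        if best is None or real - null > best[0]:
            best = (real - null, (k_a, k_p, k_t))
    return best
```

With a spacing of 0.2 between 2 and 4, each parameter takes 11 values, so the joint scan covers 11³ = 1,331 parameter combinations.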

Smoothing data

Smoothing of methylation data in modern samples was performed in the cross-validation part only. It was done using a moving average with a fixed window size over CpG positions. We used a window size of 31 CpG positions, to match typical window sizes used for reconstructing ancient methylation⁸,⁹.
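A centred moving average over CpG positions can be sketched in Python as shown below. The treatment of chromosome ends is not specified in the text; truncating the window at the edges is an assumption of this sketch, as is the function name `smooth`.

```python
def smooth(levels, window=31):
    """Centred moving average over consecutive CpG positions.

    levels : methylation levels ordered by genomic position
    window : odd number of CpG positions per window (31 in the text)
    """
    if window % 2 == 0:
        raise ValueError("window must be odd so that it can be centred")
    half = window // 2
    smoothed = []
    for i in range(len(levels)):
        lo = max(0, i - half)                # truncate at the left edge
        hi = min(len(levels), i + half + 1)  # truncate at the right edge
        smoothed.append(sum(levels[lo:hi]) / (hi - lo))
    return smoothed
```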

Gene enrichment analysis

A CpG position with a predicted fundamental change is associated with a gene if it resides in the gene body or in the promoter region, defined as 5,000 bp upstream of the transcription start site to 1,000 bp downstream of it. Gene body coordinates and gene names were determined using the hg19 genome reference downloaded from UCSC genome assembly⁷⁴.
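The association rule can be written as a short Python check. The sketch assumes inclusive coordinates and a strand-aware promoter (upstream means lower coordinates on the ‘+’ strand and higher coordinates on the ‘-’ strand); neither detail is stated in the text, and the function name is hypothetical.

```python
def is_associated(cpg_pos, gene_start, gene_end, strand,
                  upstream=5000, downstream=1000):
    """True if a CpG lies in the gene body or in the promoter window.

    gene_start, gene_end : gene body coordinates with gene_start <= gene_end
    strand               : '+' or '-'
    """
    in_body = gene_start <= cpg_pos <= gene_end
    if strand == "+":
        tss = gene_start  # transcription starts at the low coordinate
        in_promoter = tss - upstream <= cpg_pos <= tss + downstream
    else:
        tss = gene_end    # transcription starts at the high coordinate
        in_promoter = tss - downstream <= cpg_pos <= tss + upstream
    return in_body or in_promoter
```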

We used DAVID⁷⁵,⁷⁶ to test for enrichment of our gene lists in biological functions. We used all default datasets.

Enhancer enrichment analysis

Enhancer locations were downloaded from EnhancerAtlas⁷⁷. For bone enhancers we used the osteoblasts datasets. For neuron enhancers we combined data of embryonic stem cell neurons and cerebellum neurons. Enrichment was evaluated using a hypergeometric test followed by Benjamini–Hochberg FDR procedure. EnhancerAtlas includes a comprehensive list of genes associated with each enhancer. We used DAVID to investigate functional enrichments for genes linked to enhancers that intersect with the detected positions. Further, we used MEME-ChIP⁷⁸ to examine which of the positions that overlap enhancers also overlap binding motifs, and then used TOMTOM⁷⁹ to identify transcription factors that bind these motifs.
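The statistical part of this analysis, a hypergeometric upper-tail test followed by Benjamini–Hochberg adjustment, can be sketched in plain Python. The function names are illustrative, and exact binomial coefficients are used for clarity; for genome-scale counts a dedicated statistics library would be the more practical choice.

```python
from math import comb


def hypergeom_sf(k, N, K, n):
    """P(X >= k) for X ~ Hypergeometric(N, K, n).

    N : number of tested CpG positions (population size)
    K : positions overlapping enhancers
    n : detected positions
    k : detected positions overlapping enhancers
    """
    total = comb(N, n)
    tail = sum(comb(K, x) * comb(N - K, n - x) for x in range(k, min(K, n) + 1))
    return tail / total


def benjamini_hochberg(pvals):
    """Benjamini-Hochberg adjusted P values, returned in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running = 1.0
    for rank in range(m, 0, -1):  # walk from the largest P value down
        i = order[rank - 1]
        running = min(running, pvals[i] * m / rank)  # enforce monotonicity
        adjusted[i] = running
    return adjusted
```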

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.

Data availability

All sequencing and methylation data generated for this study were deposited in the National Center for Biotechnology Information’s Gene Expression Omnibus under GEO accession number GSE276666.

Code availability

The MATLAB code can be downloaded from http://carmelab.huji.ac.il/software.html.

Change history

05 September 2025
A Correction to this paper has been published: https://doi.org/10.1038/s41559-025-02865-7

References

King, M. C. & Wilson, A. C. Evolution at two levels in humans and chimpanzees. Science 188, 107–116 (1975).
Romero, I. G., Ruvinsky, I. & Gilad, Y. Comparative studies of gene expression and the evolution of gene regulation. Nat. Rev. Genet. 13, 505–516 (2012).
Wray, G. A. The evolutionary significance of cis-regulatory mutations. Nat. Rev. Genet. 8, 206–216 (2007).
Carroll, S. B. Evo-Devo and an expanding evolutionary synthesis: a genetic theory of morphological evolution. Cell 134, 25–36 (2008).
Colbran, L. L. et al. Inferred divergent gene regulation in archaic hominins reveals potential phenotypic differences. Nat. Ecol. Evol. 3, 1598–1606 (2019).
Fordyce, S. L. et al. Deep sequencing of RNA from ancient maize kernels. PLoS ONE 8, e50961 (2013).
Pedersen, J. S. et al. Genome-wide nucleosome map and cytosine methylation levels of an ancient human genome. Genome Res. 24, 454–466 (2014).
Gokhman, D. et al. Reconstructing the DNA methylation maps of the Neandertal and the Denisovan. Science 344, 523–527 (2014).
Gokhman, D. et al. Differential DNA methylation of vocal and facial anatomy genes in modern humans. Nat. Commun. 11, 1189 (2020).
Jones, P. A. Functions of DNA methylation: islands, start sites, gene bodies and beyond. Nat. Rev. Genet. 13, 484–492 (2012).
Mathov, Y., Batyrev, D., Meshorer, E. & Carmel, L. Harnessing epigenetics to study human evolution. Curr. Opin. Genet. Dev. 62, 23–29 (2020).
Gokhman, D. et al. Reconstructing Denisovan anatomy using DNA methylation maps. Cell 179, 180–192.e10 (2019).
Pai, A. A., Bell, J. T., Marioni, J. C., Pritchard, J. K. & Gilad, Y. A genome-wide study of DNA methylation patterns and gene expression levels in multiple human and chimpanzee tissues. PLoS Genet. 7, e1001316 (2011).
Böck, J. et al. Cell type and species-specific patterns in neuronal and non-neuronal methylomes of human and chimpanzee cortices. Cereb. Cortex 28, 3724–3739 (2018).
Palkopoulou, E. et al. Complete genomes reveal signatures of demographic and genetic declines in the woolly mammoth. Curr. Biol. 25, 1395–1400 (2015).
Seguin-Orlando, A. et al. Pros and cons of methylation-based enrichment methods for ancient DNA. Sci. Rep. 5, 11826 (2015).
Gokhman, D., Meshorer, E. & Carmel, L. Epigenetics: it’s getting old. Past meets future in paleoepigenetics. Trends Ecol. Evol. 31, 290–300 (2016).
Morgan, H. D., Santos, F., Green, K., Dean, W. & Reik, W. Epigenetic reprogramming in mammals. Hum. Mol. Genet. 14, R47–R58 (2005).
Günther, T. et al. Population genomics of Mesolithic Scandinavia: investigating early postglacial migration routes and high-latitude adaptation. PLoS Biol. 16, e2003703 (2018).
Berto, S. et al. Accelerated evolution of oligodendrocytes in the human brain. Proc. Natl Acad. Sci. USA 116, 24334–24342 (2019).
Jeong, H. et al. Evolution of DNA methylation in the human brain. Nat. Commun. 12, 2021 (2021).
Zhang, W., Spector, T. D., Deloukas, P., Bell, J. T. & Engelhardt, B. E. Predicting genome-wide DNA methylation using methylation marks, genomic position, and DNA regulatory elements. Genome Biol. 16, 14 (2015).
Affinito, O. et al. Nucleotide distance influences co-methylation between nearby CpG sites. Genomics 112, 144–150 (2020).
Mathov, Y. et al. RoAM: computational reconstruction of ancient methylomes and identification of differentially methylated regions. Preprint at bioRxiv https://doi.org/10.1101/2024.08.08.607143 (2024).
Chen, S. et al. A novel DNA methylation signature associated with lymph node metastasis status in early gastric cancer. Clin. Epigenetics 14, 18 (2022).
Kim, J. et al. Identification of candidate genes associated with susceptibility to ovarian clear cell adenocarcinoma using cis-eQTL analysis. J. Clin. Med. 9, 1137 (2020).
Yamakage, Y. et al. A disintegrin and metalloproteinase with thrombospondin motifs 2 cleaves and inactivates Reelin in the postnatal cerebral cortex and hippocampus, but not in the cerebellum. Mol. Cell. Neurosci. 100, 103401 (2019).
Crespo-Facorro, B., Prieto, C. & Sainz, J. Schizophrenia gene expression profile reverted to normal levels by antipsychotics. Int. J. Neuropsychopharmacol. 18, pyu066 (2015).
Ruso-Julve, F. et al. Dopaminergic control of ADAMTS2 expression through cAMP/CREB and ERK: molecular effects of antipsychotics. Transl. Psychiatry 9, 306 (2019).
Gargano, M. A. et al. The human phenotype ontology in 2024: phenotypes around the world. Nucleic Acids Res. 52, D1333–D1346 (2024).
Yamanaka, T. et al. NF-Y inactivation causes atypical neurodegeneration characterized by ubiquitin and p62 accumulation and endoplasmic reticulum disorganization. Nat. Commun. 5, 3354 (2014).
Yamanaka, T. et al. Differential roles of NF-Y transcription factor in ER chaperone expression and neuronal maintenance in the CNS. Sci. Rep. 6, 34575 (2016).
Yamanaka, T. et al. Mutant Huntingtin reduces HSP70 expression through the sequestration of NF-Y transcription factor. EMBO J. 27, 827–839 (2008).
Zhao, P. P. et al. Novel function of PIWIL1 in neuronal polarization and migration via regulation of microtubule-associated proteins. Mol. Brain 8, 39 (2015).
Pao, G. M. et al. Role of BRCA1 in brain development. Proc. Natl Acad. Sci. USA 111, E1240–E1248 (2014).
Suberbielle, E. et al. DNA repair factor BRCA1 depletion occurs in Alzheimer brains and impairs cognitive function in mice. Nat. Commun. 6, 8897 (2015).
Mano, T. et al. Neuron-specific methylome analysis reveals epigenetic regulation and tau-related dysfunction of BRCA1 in Alzheimer’s disease. Proc. Natl Acad. Sci. USA 114, E9645–E9654 (2017).
Algouneh, A. et al. Dual BRCA1 and BRCA2 pathogenic variants in an adolescent with syndromic intellectual disability. Clin. Case Rep. 10, e6202 (2022).
Gassmann, K. et al. Species-specific differential AhR expression protects human neural progenitor cells against developmental neurotoxicity of PAHs. Environ. Health Perspect. 118, 1571–1577 (2010).
Ojo, E. S. & Tischkau, S. A. The role of AhR in the hallmarks of brain aging: friend and foe. Cells 10, 2729 (2021).
Xie, L. et al. OTUB1 attenuates neuronal apoptosis after intracerebral hemorrhage. Mol. Cell. Biochem. 422, 171–180 (2016).
Shen, J. et al. Deubiquitylating enzyme OTUB1 facilitates neuronal survival after intracerebral hemorrhage via inhibiting NF-κB-triggered apoptotic cascades. Mol. Neurobiol. 61, 1726–1736 (2024).
Wang, X. et al. OTUB1 inhibits CNS autoimmunity by preventing IFN-γ-induced hyperactivation of astrocytes. EMBO J. 38, e100947 (2019).
Xing, X., Xu, F., Wang, Y. & Liu, H. Role of the OTUB1/IRF7/NOX4 axis in oxidative stress injury and inflammatory responses in mice with Parkinson’s disease. Psychogeriatrics 23, 32–44 (2023).
Edvardson, S. et al. Joubert Syndrome 2 (JBTS2) in Ashkenazi Jews is associated with a TMEM216 mutation. Am. J. Hum. Genet. 86, 93–97 (2010).
Valente, E. M. et al. Mutations in TMEM216 perturb ciliogenesis and cause Joubert, Meckel and related syndromes. Nat. Genet. 42, 619–625 (2010).
Guo, J. et al. Developmental disruptions underlying brain abnormalities in ciliopathies. Nat. Commun. 6, 7857 (2015).
Chen, J. V. et al. Rootletin organizes the ciliary rootlet to achieve neuron sensory function in Drosophila. J. Cell Biol. 211, 435–453 (2015).
Tai, H. C., Besche, H., Goldberg, A. L. & Schuman, E. M. Characterization of the brain 26S proteasome and its interacting proteins. Front. Mol. Neurosci. 3, 1236 (2010).
Yi, J. J. & Ehlers, M. D. Emerging roles for ubiquitin and protein degradation in neuronal function. Pharmacol. Rev. 59, 14–39 (2007).
Ugun-Klusek, A. et al. Continued 26S proteasome dysfunction in mouse brain cortical neurons impairs autophagy and the Keap1-Nrf2 oxidative defence pathway. Cell Death Dis. 8, e2531 (2017).
Myeku, N. et al. Tau-driven 26S proteasome impairment and cognitive dysfunction can be prevented early in disease by activating cAMP-PKA signaling. Nat. Med. 22, 46–53 (2015).
Popesco, M. C. et al. Human lineage-specific amplification, selection, and neuronal expression of DUF1220 domains. Science 313, 1304–1307 (2006).
Zimmer, F. & Montgomery, S. H. Phylogenetic analysis supports a link between DUF1220 domain number and primate brain expansion. Genome Biol. Evol. 7, 2083–2088 (2015).
Dumas, L. J. et al. DUF1220-domain copy number implicated in human brain-size pathology and evolution. Am. J. Hum. Genet. 91, 444–454 (2012).
Davis, J. M. et al. DUF1220 copy number is linearly associated with increased cognitive function as measured by total IQ and mathematical aptitude scores. Hum. Genet. 134, 67 (2015).
Davis, J. M. et al. DUF1220 dosage is linearly associated with increasing severity of the three primary symptoms of autism. PLoS Genet. 10, e1004241 (2014).
Davis, J. M., Searles Quick, V. B. & Sikela, J. M. Replicated linear association between DUF1220 copy number and severity of social impairment in autism. Hum. Genet. 134, 569 (2015).
Davis, J. M., Heft, I., Scherer, S. W. & Sikela, J. M. A third linear association between Olduvai (DUF1220) copy number and severity of the classic symptoms of inherited autism. Am. J. Psychiatry 176, 643–650 (2019).
Searles Quick, V. B., Davis, J. M., Olincy, A. & Sikela, J. M. DUF1220 copy number is associated with schizophrenia risk and severity: implications for understanding autism and schizophrenia as related diseases. Transl. Psychiatry 5, e697 (2015).
Keeney, J. G. et al. DUF1220 protein domains drive proliferation in human neural stem cells and are associated with increased cortical volume in anthropoid primates. Brain Struct. Funct. 220, 3053–3060 (2014).
Fiddes, I. T. et al. Human-specific NOTCH2NL genes affect Notch signaling and cortical neurogenesis. Cell 173, 1356–1369.e22 (2018).
Florio, M. et al. Evolution and cell-type specificity of human-specific genes preferentially expressed in progenitors of fetal neocortex. eLife 7, e32332 (2018).
Fiddes, I. T., Pollen, A. A., Davis, J. M. & Sikela, J. M. Paired involvement of human-specific Olduvai domains and NOTCH2NL genes in human brain evolution. Hum. Genet. 138, 715–721 (2019).
Mendizabal, I. et al. Comparative methylome analyses identify epigenetic regulatory loci of human brain evolution. Mol. Biol. Evol. 33, 2947–2959 (2016).
Zeng, J. et al. Divergent whole-genome methylation maps of human and chimpanzee brains reveal epigenetic basis of human regulatory evolution. Am. J. Hum. Genet. 91, 455–465 (2012).
Schuirmann, D. J. A comparison of the two one-sided tests procedure and the power approach for assessing the equivalence of average bioavailability. J. Pharmacokinet. Biopharm. 15, 657–680 (1987).
Kumar, S., Stecher, G., Suleski, M. & Hedges, S. B. TimeTree: a resource for timelines, timetrees, and divergence times. Mol. Biol. Evol. 34, 1812–1819 (2017).
Pavlovic, B. J., Blake, L. E., Roux, J., Chavarria, C. & Gilad, Y. A comparative assessment of human and chimpanzee iPSC-derived cardiomyocytes with primary heart tissues. Sci. Rep. 8, 15312 (2018).
Blake, L. E. et al. A comparison of gene expression and DNA methylation patterns across tissues and species. Genome Res. 30, 250–262 (2020).
Housman, G., Quillen, E. E. & Stone, A. C. Intraspecific and interspecific investigations of skeletal DNA methylation and femur morphology in primates. Am. J. Phys. Anthropol. 173, 34–49 (2020).
Hinrichs, A. S. et al. The UCSC Genome Browser Database: update 2006. Nucleic Acids Res. 34, D590–D598 (2006).
Barouch, A., Mathov, Y., Meshorer, E., Yakir, B. & Carmel, L. Reconstructing DNA methylation maps of ancient populations. Nucleic Acids Res. 52, 1602–1612 (2024).
Karolchik, D. et al. The UCSC Table Browser data retrieval tool. Nucleic Acids Res. 32, D493–D496 (2004).
Huang, D. W., Sherman, B. T. & Lempicki, R. A. Bioinformatics enrichment tools: paths toward the comprehensive functional analysis of large gene lists. Nucleic Acids Res. 37, 1–13 (2009).
Huang, D. W., Sherman, B. T. & Lempicki, R. A. Systematic and integrative analysis of large gene lists using DAVID bioinformatics resources. Nat. Protoc. 4, 44–57 (2009).
Gao, T. & Qian, J. EnhancerAtlas 2.0: an updated resource with enhancer annotation in 586 tissue/cell types across nine species. Nucleic Acids Res. 48, D58–D64 (2020).
Machanick, P. & Bailey, T. L. MEME-ChIP: motif analysis of large DNA datasets. Bioinformatics 27, 1696–1697 (2011).
Gupta, S., Stamatoyannopoulos, J. A., Bailey, T. L. & Noble, W. S. Quantifying similarity between motifs. Genome Biol. 8, R24 (2007).

Acknowledgements

This publication was made possible through the support of a grant from the John Templeton Foundation (grant ID 61739 to L.C. and E.M.). The opinions expressed in this publication are those of the authors and do not necessarily reflect the views of the John Templeton Foundation. This study was also funded by the Israel Science Foundation (grant no. 2436/22 to L.C.) and by the Ministry of Innovation, Science & Technology (grant no. 1001584586 to L.C. and E.M.). T.M.-B. is supported by funding from the European Research Council under the European Union’s Horizon 2020 research and innovation programme (grant agreement no. 864203), PID2021-126004NB-100 (MICIIN/FEDER, UE), ‘Unidad de Excelencia María de Maeztu’, funded by the AEI (CEX2018-000792-M), NIH 1R01HG010898-01A1 and Secretaria d’Universitats i Recerca and CERCA Programme del Departament d’Economia i Coneixement de la Generalitat de Catalunya (GRC 2021 SGR 00177). L.C. is the Snyder Granadar chair in Genetics. E.M. is the Arthur Gutterman Family Chair for Stem Cell Research. We wish to thank Avigal Be’er for the design of Fig. 1.

Author information

Authors and Affiliations

Department of Genetics, The Alexander Silberman Institute of Life Sciences, The Hebrew University of Jerusalem, Jerusalem, Israel
Yoav Mathov, Malka Nissim-Rafinia, Chen Leibson, Nir Galun, Eran Meshorer & Liran Carmel
Edmond and Lily Safra Center for Brain Sciences (ELSC), The Hebrew University of Jerusalem, Jerusalem, Israel
Yoav Mathov & Eran Meshorer
Institute of Evolutionary Biology (UPF-CSIC), PRBB, Barcelona, Spain
Tomas Marques-Bonet
Catalan Institution of Research and Advanced Studies (ICREA), Barcelona, Spain
Tomas Marques-Bonet
CNAG-CRG, Centre for Genomic Regulation (CRG), Barcelona Institute of Science and Technology (BIST), Barcelona, Spain
Tomas Marques-Bonet
Institut Català de Paleontologia Miquel Crusafont, Universitat Autònoma de Barcelona, Barcelona, Spain
Tomas Marques-Bonet
Orthopedic Department, Hadassah – Hebrew University Medical Center, Jerusalem, Israel
Arye Kandel & Meir Liebergall

Authors

Yoav Mathov
Malka Nissim-Rafinia
Chen Leibson
Nir Galun
Tomas Marques-Bonet
Arye Kandel
Meir Liebergall
Eran Meshorer
Liran Carmel

Contributions

Y.M., E.M. and L.C. designed the research. Y.M. conducted the analysis. M.N.-R., C.L., T.M.-B., A.K. and M.L. contributed to the production of data. N.G. helped collect data. E.M. and L.C. supervised the study. Y.M., E.M. and L.C. wrote the manuscript.

Corresponding authors

Correspondence to Eran Meshorer or Liran Carmel.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature Ecology & Evolution thanks Alison Barton and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 Precision on training set as a function of the parameters ka and kp.

Precision of the algorithm in detecting a, outgroup-specific fundamental changes and b, reference-derived fundamental changes, for each value of the parameters ka and kp. Yellow, precision when running the algorithm on real data; blue, precision when running on permutations.

Extended Data Fig. 2 Precision versus number of detections.

Precision reaches a plateau at higher values of the parameters, whereas power decreases. a, Precision and the number of outgroup-specific changes detected as a function of the parameter Kp when the other parameters are fixed at their optimized values. b, A similar comparison for the parameter Ka while the rest of the parameters are fixed at their optimized values.

Extended Data Fig. 3 Precision on an independent dataset.

Validation of the algorithm by measuring precision on an independent dataset. To complement the results presented in the main text, here chimpanzees were selected as reference and modern humans as target.

Extended Data Fig. 4 Mapping of CpG positions in the rhesus genome to the human genome.

Percentage of CpG positions successfully mapped to a human CpG position from rhesus bulk brain samples. If a rhesus CpG position was originally mapped by liftOver to coordinate c, the figure shows the percentage of positions where the human reference genome has a CpG at coordinate c + shift.
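The quantity plotted in this figure can be illustrated with a minimal sketch. It assumes a toy sequence held in memory and 0-based coordinates; the function name `cpg_match_rate_by_shift`, the shift range and the toy inputs are hypothetical and are not taken from the study's code. For each shift, the sketch counts how often the reference sequence carries a CG dinucleotide at coordinate c + shift.

```python
def cpg_match_rate_by_shift(genome, lifted_coords, shifts=range(-3, 4)):
    """Return {shift: percentage of lifted coordinates c with 'CG' at c + shift}.

    genome        -- reference sequence as a string (toy stand-in for a chromosome)
    lifted_coords -- 0-based coordinates c produced by a liftOver-like mapping
    shifts        -- offsets to test around each lifted coordinate (assumed range)
    """
    genome = genome.upper()
    rates = {}
    for shift in shifts:
        hits = 0
        for c in lifted_coords:
            pos = c + shift
            # A CpG needs two bases, so pos must leave room for pos + 1.
            if 0 <= pos < len(genome) - 1 and genome[pos:pos + 2] == "CG":
                hits += 1
        rates[shift] = 100.0 * hits / len(lifted_coords) if lifted_coords else 0.0
    return rates


# Toy example: CG dinucleotides start at indices 2, 6 and 10.
toy_genome = "AACGTTCGAACGT"
toy_coords = [2, 6, 10]
rates = cpg_match_rate_by_shift(toy_genome, toy_coords)
best_shift = max(rates, key=rates.get)
```

In this toy case every lifted coordinate already lands on a CG, so the highest rate is at shift 0. With real coordinates, a peak at a non-zero shift would indicate a systematic offset between the coordinate conventions of the two datasets.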

Supplementary information

Reporting Summary

Supplementary Tables 1–9.

Table 1. Precision in predicting fundamental changes on validation datasets. ‘Smooth’ determines whether the DNA methylation signal was smoothed.
Table 2. List of DNA methylation samples used in this study. WGBS, whole-genome bisulfite sequencing; RRBS, reduced representation bisulfite sequencing.
Table 3. Optimized parameters for finding reference-derived and outgroup-specific DNA methylation changes.
Table 4. Optimized parameters on different tissue pairs. Rows, tissue all; columns, tissue partial.
Table 5. CpG positions associated with outgroup-specific (a) and reference-derived (b) fundamental DNA methylation changes when comparing bone to prefrontal neurons. The five rightmost columns report the average methylation level in the relevant samples.
Table 6. Genes with a large number of CpG positions associated with outgroup-specific (a) and reference-derived (b) fundamental changes. CGI, CpG island.
Table 7. DAVID output of enrichment analysis for genes that are targeted by enhancers intersecting with CpG positions associated with fundamental methylation changes. Count, number of genes associated with the term from our list; list-length, total number of detected genes; pop term, total number of human genes associated with this term; pop total, total number of human genes in the Gene Ontology database. Enrichment tests were done using DAVID^75,76, accounting for multiple comparisons.
Table 8. MEME-ChIP output for motifs overlapping fundamental changes linked to enhancers, and the transcription factors that bind these motifs.
Table 9. CpGs and genes associated with outgroup-specific (a) and reference-derived (b) fundamental DNA methylation changes using bulk prefrontal cortex as tissue partial. The five rightmost columns report the average methylation level in the relevant samples.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

About this article

Cite this article

Mathov, Y., Nissim-Rafinia, M., Leibson, C. et al. Inferring DNA methylation in non-skeletal tissues of ancient specimens. Nat Ecol Evol 9, 153–165 (2025). https://doi.org/10.1038/s41559-024-02571-w

Received: 07 May 2023
Accepted: 01 October 2024
Published: 20 November 2024
Issue date: January 2025
DOI: https://doi.org/10.1038/s41559-024-02571-w