Introduction

Neanderthals are the best-documented pre-modern humans in the fossil record in terms of morphology, genetics, behavior, and culture1,2,3,4,5,6. Recent molecular-clock-based analyses place their divergence—together with Denisovans—from the modern human lineage 765–550 ka3,7,8, or older based on morphological data9,10. The Neanderthal lineage differentiated soon thereafter, as testified by the genetic8,11 and morphological12 evidence from the Middle Pleistocene Sima de los Huesos humans8,11, previously considered to represent Homo heidelbergensis12,13,14 and now regarded as a basal population of the Neanderthal lineage8,11,15. While the genetic divergence time is now well established for the clade as a whole3,7,8, debate1,16,17,18,19 persists on the relationships between European Middle Pleistocene populations, Middle and Late Pleistocene Neanderthal populations, as well as the evolutionary processes that led to the evolution of a full “classic Neanderthal” (sensu ref. 20) morphology in late Neanderthals. This is due to the mosaic morphology of the Middle Pleistocene specimens, the so-called “muddle-in-the-Middle”, from which Neanderthals are argued to have evolved1,20,21.

African Homo erectus, sometimes regarded as a separate species (Homo ergaster)22, is widely considered to be ancestral to many later Homo species, including two highly debated taxa: Homo antecessor and H. heidelbergensis. Each of these taxa has previously been suggested as the ancestor of the Neanderthal and modern human clades13,16,17,23, and the phylogenetic position of these species has several implications for human evolution in the Middle Pleistocene (see ref. 24 and references therein for a review). While H. antecessor remains are limited to Europe, H. heidelbergensis has alternatively been regarded as a taxon present in both Africa and Europe or as one restricted to the European continent. If H. heidelbergensis is confined to Europe, this implies the presence of a distinct African species (namely, Homo rhodesiensis/bodoensis), which has been proposed as the direct ancestor of modern humans25. Despite extensive research devoted to solving Middle Pleistocene taxonomy and phylogeny24,25, most of these questions remain unresolved.

At present, two main hypotheses have been proposed to explain the onset of the derived late Neanderthal morphology. On one hand, the accretion model hypothesizes that the derived Late Pleistocene Neanderthal morphology evolved as a result of largely anagenetic processes, with traits accumulating slowly and continuously through time, likely as the result of complex processes of deme isolation and genetic drift driven by dramatic climate change1,20,26,27,28. On the other hand, the two-phase model (organismic model) hypothesizes that the derived late Neanderthal morphology developed through two important evolutionary processes. First, Middle Pleistocene European populations, potentially H. heidelbergensis, experienced allometric changes to the cranio-facial system driven by changes in body size. Second, evolutionary developmental biases, such as dramatic increases in encephalization or other local evolutionary processes reoriented the cranio-facial structure, which ultimately led to the speciation of Homo neanderthalensis29,30. While support for one or the other hypothesis has shifted through time, a scenario combining elements of both hypotheses appears likely30,31,32,33,34,35.

Paleodemographic approaches to Neanderthal evolution have argued that these populations were subjected to frequent fluctuations throughout the Middle to Late Pleistocene, due to highly variable climatic conditions and population migration1,28,36,37. Drastic climatic changes have likely had profound impacts on the genetic and morphological variability of the Neanderthal lineage, and it is now possible to make robust inferences based on ancient DNA (aDNA) about population dynamics of the clade in the last 100 kyr. Several studies have shown that Neanderthals would have had a small effective population size, suggesting a highly fragmented metapopulation36,38,39. The demographic history of Neanderthals also shows complex population dynamics, including long-term isolation of some Neanderthal groups40, genetic evidence of admixture events between different Neanderthal groups, at least two population turnovers during the Late Pleistocene2,6,39, and reduced genetic variation. Recent research has provided evidence for an even more complex history of admixture, in which gene flow (up to 5%) also came from early Homo sapiens41. While advances in DNA extraction techniques enable the retrieval of aDNA from increasingly old remains, as well as directly from sediments6,42,43, only a few aDNA sequences are known for sites older than 100 ka6,8,34,44, meaning that less information is available about the genetic diversity of the Neanderthal clade during the Middle Pleistocene. Current aDNA estimates suggest that bottlenecks occurred at different stages of their evolution, starting with one just before the split of Neanderthals from Denisovans, ~473 ka42 or even earlier (>700 ka)45. These studies have predicted that Middle Pleistocene European populations would have recovered soon thereafter and then gradually declined over time33,38,46.
However, while the genetic diversity of the Neanderthal clade is unknown for this span of time, the fossil record highlights the presence of marked morphological differences between European Middle Pleistocene individuals. In contrast, studies have shown that late Neanderthals are a fairly uniform group by the beginning of the Late Pleistocene, exhibiting less variation than modern humans both genetically7,47,48 and morphologically1,49. This reduced variation might be the result of a bottleneck still unidentified in the genetic record, which would have occurred sometime between MIS 8 (300 ka)50 and MIS 6 (191 ka)50, coinciding with the appearance of a nearly fully derived Neanderthal morphology in the fossil record29,36,51.

In the absence of aDNA evidence, the morphology of phylogenetically informative anatomical structures is the best source of data to investigate the evolutionary history of Pleistocene humans, including identifying potential bottlenecks related to the origin of Neanderthals. Although phenotypic diversity (i.e., disparity) results from the complex interaction of environmental and genetic factors, the disparity of a species (and of the populations it is made of) is largely influenced by its underlying genetic diversity52. Hence, greater phenotypic variation should be present before a drastic reduction of the genetic variation, such as before a bottleneck event52. Thus, a drastic decrease in disparity might represent strong evidence for the occurrence of a bottleneck, although evolutionary phenomena such as stabilizing selection and gene flow might also play a role in reducing the amount of phenotypic variation. It has been demonstrated that anatomical structures that evolved following a neutral pattern, as is the case of the bony labyrinth53, are best suited to track changes in the underlying genotype54,55,56. Hence, the study of the inner ear bony labyrinth has considerable potential. Housed in the petrosal portion of the temporal bone, the bony labyrinth comprises the semicircular canals and vestibule (enclosing the membranous structures responsible for detecting angular and linear accelerations of head movements), and the cochlea (housing the organ of hearing). Not only are petrosals abundant in the fossil record, but they are also often well-preserved, even in fragmentary skeletal remains57. Inner ear variation has also been shown to be useful for reconstructing phylogenetic relationships between hominin species as well as a marker of population history and dispersal in modern humans58,59,60,61,62.

Late Pleistocene Neanderthals (and possibly Denisovans63) are known for the possession of derived states in the morphology of their bony labyrinth relative to modern humans, which retained the ancestral condition shown by early Homo species in several features (Table 1). Among others, Neanderthal derived traits include small anterior and posterior canals relative to a larger lateral canal, and a clockwise rotation (when the left bony labyrinth is observed in lateral view) along the sagittal plane of the vertical canals causing the presence of a so-called high sagittal labyrinthine index—i.e., their lateral canal plane intersects the posterior one very superiorly (Fig. 1c–e)64,65,66. Vertical canal size reduction is currently deemed as the earliest character to have emerged during Neanderthal evolution, while a high sagittal labyrinthine index—possibly their most distinctive feature—evolved later in time and is not present in the earliest representatives of the clade, such as Sima de los Huesos humans from Spain67,68, as well as other European Middle Pleistocene individuals65,69. The bony labyrinth of a partial calvaria from Portugal (Aroeira 3) shows a more primitive morphology, in spite of its spatial and temporal proximity to the Sima de los Huesos humans. Indeed, it lacks both the derived canal proportions of Neanderthals and their distinctive superiorly placed lateral canal. The plesiomorphic morphology of Aroeira 3, matching the inferred primitive condition for late hominins (Table 1), instead suggests the likely coexistence of relatively isolated paleodemes even in close temporal and geographical proximity69,70. A landmark-based 3D geometric morphometric analysis of the Sima de los Huesos and Late Pleistocene Neanderthal bony labyrinth highlighted a remarkable variation in the latter group68. 
This result contrasts with the evidence coming from overall cranial morphology, showing that modern humans possess much greater shape variation relative to Late Pleistocene Neanderthals49. However, that study did not evaluate any population within the Neanderthal clade dated between MIS 10 (374 ka)50 and MIS 5 (123 ka)50. Therefore, an analysis of the bony labyrinth morphology from a wider chronological framework may shed light on the processes related to the transition between Sima de los Huesos humans and late Neanderthals, particularly regarding the phenotypic, and presumed underlying genetic, diversity of these populations.

Table 1 Summary of previously inferred polarity states for bony labyrinth variables in hominins
Fig. 1: Summary of individuals included in the analyses.

a The geographical and chronological range of the Neanderthal clade individuals and modern humans included in the analysis. The symbols correspond to the number of individuals from each site and are colored as follows: blue = Sima de los Huesos; purple = Krapina; green = late Neanderthals; orange = modern humans. Orange ellipses correspond to the approximated geographical range of the extant modern human assemblage. Age estimates of fossil specimens are provided in Table 1. Background satellite image was downloaded from © Mapbox, © OpenStreetMap, © Maxar and rescaled to fit the figure panel size. b–e Three-dimensional models of the left semicircular canal and vestibule of selected individuals for the four groups considered in the present study.

Here we investigate the evolution of the semicircular canals in the Neanderthal clade using diffeomorphometry (DMorph), a landmark-free approach that allows direct comparison of continuous surfaces and is readily sensitive to changes in canal volumetric proportions71. To this end, DMorph was applied to the semicircular canal shape of three samples that encapsulate the chronological and geographical distribution of the European and Western Asian Neanderthal clade: two relatively large and chronologically constrained samples (a Middle Pleistocene population from Sima de los Huesos and a late Middle Pleistocene population from Krapina) and a spatially and chronologically heterogeneous sample of late Neanderthal individuals. The latter was designed to maximize the morphological variation for Late Pleistocene Neanderthals. These groups were compared to an H. sapiens sample comprised of extant Europeans and two Upper Paleolithic modern humans (Cro-Magnon 1 and Laugerie-Basse).

We conceived four scenarios (Fig. 2) for the variation of the Neanderthal clade throughout its evolution, with a variable number of potential bottlenecks in phenotypic diversity:

Fig. 2: Scenarios of phenetic variation along the evolution of the Neanderthal clade.

a Scenario #1: no occurrence of bottleneck events; b Scenario #2: occurrence of an early bottleneck event between the Sima de los Huesos and Krapina populations; c Scenario #3: occurrence of a late bottleneck event after the Krapina population and before late Neanderthals; d Scenario #4: occurrence of both early and late bottleneck events. The horizontal line is a schematic representation of time with earlier occurring groups on the left and later occurring ones on the right. The thickness of the horizontal bars corresponds to the suggested phenetic variation. Colors are coded as follows: blue = Sima de los Huesos; purple = Krapina; green = late Neanderthals.

Scenario #1: No bottleneck events (Fig. 2a). Based on this scenario, we predict that there would not be a clear morphological break in the fossil record, either (1) after the estimated age of Sima de los Huesos and prior to the onset of a fully derived Neanderthal morphology or (2) between the estimated ages of Krapina and late Neanderthals. In addition, Sima de los Huesos, Krapina, and late Neanderthal samples should show similar levels of morphological variation (i.e., Sima de los Huesos population ≈ Krapina population ≈ late Neanderthals). This scenario is consistent with a fully anagenetic accretion model.

Scenario #2: Early bottleneck event (Fig. 2b). If a bottleneck occurred after the estimated age of Sima de los Huesos and prior to that of Krapina, we predict that the variation of the Sima de los Huesos humans should be greater than that of both the Krapina population and late Neanderthals (i.e., Sima de los Huesos population > Krapina population ≈ late Neanderthals). This scenario is consistent with the two-phase model as originally conceived29,30.

Scenario #3: Late bottleneck event (Fig. 2c). If a bottleneck occurred after the estimated age of Krapina and prior to that of late Neanderthals (Fig. 2c), we predict that the variation in the Sima de los Huesos and Krapina populations should be greater than in late Neanderthals (i.e., Sima de los Huesos population ≈ Krapina population > late Neanderthals). This scenario would represent a variant of the two-phase model, with a bottleneck event occurring later in time, within late Neanderthals.

Scenario #4: Early and late bottleneck events (Fig. 2d). If both early and late bottleneck events occurred, we predict a significant decline in variation both (1) after the estimated age of Sima de los Huesos and prior to that of Krapina and (2) between the estimated age of Krapina and that of late Neanderthals (i.e., Sima de los Huesos > Krapina > late Neanderthals). In this scenario, population dynamics (i.e., drift and/or extinction) would be the main driver of the Neanderthal clade evolution, and it would be compatible with an accretion model where Neanderthal features become fixed in the population through time, reaching their highest frequency in late Neanderthals.
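The four predictions above can be read as a small decision rule on the disparity of the three fossil groups. The sketch below is only an illustration of that logic: the function name `classify_scenario` and the relative tolerance `tol` (used to decide when two values count as approximately equal) are hypothetical and are not taken from the study, and an increase in disparity between two groups is simply treated as "no decline".

```python
def classify_scenario(sh, krapina, late, tol=0.10):
    """Map three disparity values onto Scenarios #1 to #4.

    sh, krapina, late: disparity of the Sima de los Huesos, Krapina,
    and late Neanderthal samples (any non-negative metric).
    tol: hypothetical relative tolerance; a decline smaller than this
    fraction of the larger value is treated as "approximately equal".
    """
    def declines(earlier, later):
        # True only when the later group is lower by more than the tolerance.
        return (earlier - later) > tol * max(earlier, later)

    early_drop = declines(sh, krapina)   # between Sima de los Huesos and Krapina
    late_drop = declines(krapina, late)  # between Krapina and late Neanderthals

    if early_drop and late_drop:
        return 4  # early and late bottlenecks
    if early_drop:
        return 2  # early bottleneck only
    if late_drop:
        return 3  # late bottleneck only
    return 1      # no bottleneck


# Toy values, not data from the study.
print(classify_scenario(1.0, 1.0, 0.5))  # 3
```

In practice the "approximately equal" decision would rest on a statistical test between bootstrapped distributions rather than on a fixed tolerance.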

These scenarios are put to the test by comparing the disparity of semicircular canal and vestibule shape, as captured by our DMorph approach, among the considered groups. Disparity is a measure of morphological diversity and describes the shape (or trait) space occupancy (i.e., the size, density, and interindividual distance) of groups of individuals, as well as (more frequently) species or clades72. The disparity of a group can be measured by means of metrics that allow testing a diverse array of ecological, taxonomical, and evolutionary hypotheses at a given moment or through time73. Although disparity analyses are most frequently used in macroevolutionary studies74, the amount of morphological diversity in anatomical areas considered to have likely evolved following a neutral pattern has been used to shed light on population structure and dispersal scenarios49,75.
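As a concrete illustration of what a disparity metric measures, the sketch below computes two commonly used quantities from a matrix of ordination scores (rows are individuals, columns are axes): the sum of per-axis variances, which reflects the size of the occupied space, and the mean pairwise Euclidean distance, which reflects how loosely individuals cluster. These definitions are assumptions chosen for illustration and may not match the exact metrics used in the analyses reported here; the function names and toy data are hypothetical.

```python
import numpy as np


def sum_of_variances(scores):
    """Sum of the sample variances of each axis (size of occupied space)."""
    scores = np.asarray(scores, dtype=float)
    return float(np.var(scores, axis=0, ddof=1).sum())


def mean_pairwise_distance(scores):
    """Mean Euclidean distance over all distinct pairs of individuals."""
    scores = np.asarray(scores, dtype=float)
    diffs = scores[:, None, :] - scores[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=-1))
    upper = np.triu_indices(len(scores), k=1)  # each pair counted once
    return float(dists[upper].mean())


# Toy group: four individuals at the corners of a 2 x 2 square.
square = np.array([[0.0, 0.0], [2.0, 0.0], [0.0, 2.0], [2.0, 2.0]])
print(sum_of_variances(square))
print(mean_pairwise_distance(square))
```

A group that scores higher on both quantities occupies a larger region of shape space and is less tightly clustered.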

Here, we use a similar approach and measure the morphological disparity in the semicircular canal and vestibule of Neanderthal clade groups to use it as a proxy for their underlying genetic diversity. Results highlight a high morphological disparity in the Middle Pleistocene and show the presence of a phenotypic bottleneck within the Neanderthal lineage after the estimated age of Krapina. No reductions are instead identified at earlier stages, as postulated based on an organismic model of evolution. Differences in vestibular system shape also hint at the possible presence of local evolution in the analyzed Middle Pleistocene paleodemes.

Results

DMorph shape analysis results are shown as scatterplots of the first four principal components of a principal component analysis (PCA) performed on the deformation fields (Fig. 3; see Supplementary Data 1 for an interactive 3D plot of PC1–3). The first principal component (PC1; 32.3% of variance) separates modern humans from the Neanderthal clade specimens, with minimal overlap among confidence intervals (Fig. 3a; Supplementary Fig. 1a). This is due to the smaller and thicker semicircular canals and common crus in the Sima de los Huesos, Krapina, and late Neanderthal samples. PC1 also captures changes in the relative position of the lateral and posterior canals (the lateral canal is positioned more superiorly relative to the posterior canal in the Neanderthal clade), in accordance with the results of previous analyses65,66. Posterior canal size and shape also distinguish modern humans from the Neanderthal clade individuals along PC1, the posterior canal being rounder and smaller in the latter. PC2 (8.2% of variance) is influenced by subtler shape changes in the position of the lateral canal insertion on the common crus, as well as in the morphology of all three canals (Supplementary Fig. 1b). Although considerable overlap is observed between modern humans and the Neanderthal clade groups (Fig. 3a; Supplementary Data 1), PC2 shows minimal overlap between the ranges (greater overlap when confidence intervals are considered) of late Neanderthals and Sima de los Huesos (on negative and intermediate values, respectively), as well as Krapina (positive scores) samples. This is due to the thicker lateral canal insertion on the common crus, smaller anterior and more medio-laterally compressed anterior and lateral canals, as well as a more laterally projecting posterior canal in Krapina specimens relative to late Neanderthals.
Along PC3 (7.6% of variance), the group ellipses broadly overlap, although several Sima de los Huesos specimens are separated from those of Krapina and late Neanderthals on positive scores (Fig. 3b) due to the possession of a smaller, lower-positioned lateral canal with a squared outline, larger ampullary portions of the anterior and lateral canals, and an oval-shaped posterior canal with a superiorly projecting superior portion (Supplementary Fig. 1c). Furthermore, in Sima de los Huesos humans the anterior and lateral canals are larger and smaller, respectively, than in late Neanderthals (Supplementary Fig. 1c). PC4 (7.5% of variance) is mostly driven by changes in lateral canal shape, size, and position, as well as anterior canal shape (Supplementary Fig. 1d). All groups appear highly variable in these aspects, with modern humans showing the highest variation in lateral canal size.
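For readers less familiar with how the percentages of variance quoted for each PC arise, the following minimal sketch runs a PCA through a singular value decomposition of a centered data matrix. It assumes, purely for illustration, that each individual's deformation field has been flattened into one row of a matrix `X`; the random data, the matrix dimensions, and the function name `pca` are hypothetical stand-ins and not the study's pipeline.

```python
import numpy as np


def pca(X):
    """PCA of an (n_individuals, n_features) matrix via SVD.

    Returns the PC scores and the fraction of total variance
    explained by each component, in decreasing order.
    """
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)                 # center each feature
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = U * s                          # coordinates of individuals on the PCs
    variances = s ** 2                      # proportional to per-PC variance
    return scores, variances / variances.sum()


# Random stand-in for 10 flattened deformation fields of 30 values each.
rng = np.random.default_rng(0)
X = rng.normal(size=(10, 30))
scores, ratio = pca(X)
print(np.round(100 * ratio[:4], 1))  # percent of variance on PC1 to PC4
```

Plotting pairs of columns of `scores` (for example the first against the second) gives bivariate scatterplots of the kind described in this section.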

Fig. 3: Bivariate scatterplots of a principal component analysis performed on the deformation-fields (raw shape data) obtained from the diffeomorphometric analysis of semicircular canals.

a PC2 vs. PC1; b PC4 vs. PC3. The variance explained by each PC is given within parentheses. 95% confidence ellipses, symbols, and c–f group average shapes (in posterolateral and posteromedial views) are color-coded based on group membership: blue = Sima de los Huesos; purple = Krapina; green = late Neanderthals; orange = modern humans. Source data for this figure can be reproduced using the code and files provided in Supplementary Code 1 and Supplementary Data 3.

The four groups are well separated, and the ellipsoids barely touch one another when PC1–3 (46.7% of variance) are considered (Supplementary Data 1). In particular, there is a clear separation between modern humans and the Neanderthal clade groups along PC1 (driven by canal and common crus thickness), whereas the Neanderthal clade populations are separated along PC2–3 due to differences in the lateral canal position and posterior canal shape (Fig. 3b).

Modern humans show tighter clustering than the Neanderthal clade groups as a whole in the multidimensional shape space (PC1-4; Fig. 3), thus occupying a smaller hypervolume of the morphospace with their convex hull (modern humans = 42.84; Neanderthal clade = 237.05). This is an expected result as the latter covers a broader chronological and geographical range, whereas modern humans, besides two individuals from the Late Pleistocene (Cro-Magnon 1 and Laugerie-Basse), belong to a recent European-only population. Within the Neanderthal clade, late Neanderthals show the smallest variation (hypervolume = 5.17) despite differences in age and geographical provenance. Krapina individuals are more variable along these PCs (hypervolume = 7.85), whereas Sima de los Huesos humans are even more diverse (hypervolume = 19.29).

The disparity metrics based on shape space (PC1-19; 86.7% of variance), which are insensitive to sample size and thus allow the inspection of small samples, show significant differences between each of the considered groups (Fig. 4a–c; Table 2, p < 0.001). Late Neanderthals represent the least variable group for all metrics (Fig. 4a–c), even after group rarefaction (Supplementary Table 1) and regardless of the CT scan type or voxel size (Supplementary Fig. 3a–c). Modern humans and Sima de los Huesos humans show intermediate phenotypic variation, while the Krapina population is most variable both in terms of group dispersion size and looser clustering (Fig. 4a–c). Thus, we observe an increase in phenotypic variation between Sima de los Huesos and Krapina populations, followed by a variation drop between the Krapina and late Neanderthal samples.
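The sketch below illustrates how bootstrapping and rarefaction can be combined to compare disparity between groups of unequal size. It resamples individuals with replacement, optionally drawing only as many as the smallest group contains, and records the sum of variances for each resample. This is an assumption-laden illustration rather than the procedure of Supplementary Code 1: the function name, the choice of metric, the synthetic groups, and the 2000 resamples (far fewer than the 20,000 reported for Fig. 4) are all hypothetical.

```python
import numpy as np


def bootstrap_disparity(scores, n_boot=2000, size=None, seed=0):
    """Bootstrap distribution of the sum of variances for one group.

    scores: (n_individuals, n_axes) array of PC scores.
    size:   individuals drawn per resample; set it to the smallest
            group size to rarefy. Defaults to the group's own size.
    """
    scores = np.asarray(scores, dtype=float)
    n = len(scores)
    size = n if size is None else size
    if size < 2:
        raise ValueError("at least two individuals are needed per resample")
    rng = np.random.default_rng(seed)
    out = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, size=size)   # sample with replacement
        out[b] = np.var(scores[idx], axis=0, ddof=1).sum()
    return out


# Synthetic groups: a small, tightly clustered one and a larger, dispersed one.
rng = np.random.default_rng(1)
tight = rng.normal(scale=0.5, size=(10, 5))
spread = rng.normal(scale=2.0, size=(25, 5))

smallest = min(len(tight), len(spread))       # rarefaction size
bd_tight = bootstrap_disparity(tight, size=smallest)
bd_spread = bootstrap_disparity(spread, size=smallest)
print(bd_tight.mean() < bd_spread.mean())
```

Because both groups are resampled at the same size, a difference between the two bootstrap distributions is less likely to be a by-product of unequal sample sizes.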

Fig. 4: Shape and size differences.

Frequency plots of morphological variance (a) and disparity (b) for each group in the semicircular canal shape space. Bootstrapped group distributions (20,000 bootstraps) were computed based on PC1-19 scores (86.7% of variance) to account for the small number of available specimens for fossil groups. The solid lines represent density curves of the frequencies for morphological variance and disparity values between all possible pairs of individuals within each bootstrapped group. The vertical axis represents the frequency of a given disparity or variance value obtained from the bootstrapping of the sample. Vertical dashed lines correspond to group averages for morphological variance and disparity. c Line plot of pair-wise Euclidean distances for the bootstrapped group distribution (20,000 bootstraps). The solid line corresponds to the second and third quartile; dashed lines represent first and fourth quartiles; black dot shows median of the distribution. d Natural logarithm (ln) of the centroid size of semicircular canals for the individuals included in the analysis. Classic Neanderthals display the largest size variation together with modern humans. Color coding as in Figs. 1 and 3. Source data for this figure can be reproduced using the code and files provided in Supplementary Code 1 and Supplementary Data 3.

Table 2 Results of morphological variance, disparity, and pair-wise Euclidean distance analyses

The observed shape variation might be a consequence of allometric effects, since it is known that shape variance is partly influenced by semicircular canal and vestibule size59. However, ordinary least squares regressions of PC scores against centroid size natural logarithm were not significant. Furthermore, the groups with the least (late Neanderthals) and most (Krapina individuals) variation show the largest and smallest ranges of values for semicircular canal and vestibule size, respectively (Fig. 4d), suggesting that overall size does not affect morphological variation (see Supplementary Fig. 3d for assessment of voxel size influence on 3D mesh centroid size). Although centroid size ranges are very variable among the groups, their averages do not differ significantly when inspected with a one-way analysis of variance, a result that is somewhat unexpected based on previous findings76. Sexual differences might also play a role in the increased observed variation of Krapina specimens, as some bony labyrinth variables might be informative of biological sex (ref. 77 but see ref. 78), and larger sexual differences relative to modern Europeans were reported for the temporal bone morphology of the Krapina sample79. However, no clear, sex-related patterning is observable in our data (although information is not available for several individuals; Supplementary Fig. 2), thus excluding sex as the driver of the increased morphological variance of the Krapina population.
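An allometry check of the kind described above can be sketched as a simple regression of one PC's scores on the natural logarithm of centroid size. The example fits an ordinary least squares line and, as a swapped-in way to obtain a p-value without extra dependencies, uses a permutation test on R² rather than the parametric test that a standard OLS routine would report. The function name and the synthetic data are hypothetical.

```python
import numpy as np


def ols_permutation_test(x, y, n_perm=2000, seed=0):
    """OLS fit of y on x with a permutation p-value for R squared.

    Returns (slope, intercept, r_squared, p_value).
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    slope, intercept = np.polyfit(x, y, 1)          # degree-1 least squares fit
    r2 = np.corrcoef(x, y)[0, 1] ** 2
    rng = np.random.default_rng(seed)
    perm_r2 = np.empty(n_perm)
    for i in range(n_perm):
        # Shuffling y breaks any real association with x.
        perm_r2[i] = np.corrcoef(x, rng.permutation(y))[0, 1] ** 2
    p_value = (np.sum(perm_r2 >= r2) + 1) / (n_perm + 1)
    return slope, intercept, r2, p_value


# Synthetic example: PC scores generated independently of size.
rng = np.random.default_rng(2)
ln_centroid_size = rng.normal(loc=4.0, scale=0.05, size=30)
pc_scores = rng.normal(size=30)
slope, intercept, r2, p = ols_permutation_test(ln_centroid_size, pc_scores)
print(round(r2, 3), round(p, 3))
```

A large p-value for every retained PC would be in line with the reported absence of a significant size effect on shape.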

Discussion

The analysis of the Neanderthal semicircular canal shape highlighted the presence of a newly described feature (thicker and more voluminous canals and common crus) differentiating this lineage from modern humans. Variation in the volume of the canal lumina has been shown to differentiate great apes and humans from other primates58,59, yet functional interpretations of this feature are not straightforward. A greater semicircular duct lumen size would imply a reduced response steadiness to external angular stimuli (i.e., abrupt changes in position and/or posture)80,81, which might be related to differences in the kinematic properties of the Neanderthal clade head-and-neck complex65. However, the thickness of the semicircular ducts and ampullae (the sensory organs for angular accelerations) is poorly predicted by semicircular canal lumen size, as the membranous ducts occupy a variable percentage of the bony canal lumen and volume82,83. Rather, the thicker canals of Neanderthals might be attributable to non-adaptive processes such as genetic drift or pleiotropy, as has been shown to be the case for enamel thickness and taurodontism84,85,86.

The distinctiveness within the Neanderthal clade of Sima de los Huesos, Krapina, and late Neanderthal samples in our study corroborates that semicircular canal morphology represents a valuable proxy to analyze affinities in closely related taxa62,64,65,87 and even populations within species61. DMorph methods have revealed additional group-specific differences beyond those previously identified based on linear measurement-67,88,89,90 and 3DGM landmark-based approaches59,63,68. The highlighted group differences are especially relevant for Krapina individuals, which are customarily considered as showing clear similarities with late Neanderthals and, as such, belonging to the same species20,66,91. Our analysis has highlighted that they differ from late Neanderthals in showing a thicker common crus and a more posterolaterally projecting posterior canal shape, in addition to a distinctive temporal bone morphology92. The separation of the Sima de los Huesos sample from Neanderthals (Krapina + late Neanderthals) is also consistent with the clear genetic7,8,11 and morphological differences shown by the Neanderthal clade groups in several anatomical regions1,12,14, which in some cases led to the allocation of these specimens to two separate species (i.e., H. heidelbergensis and H. neanderthalensis)12,13,14. Our results further point to the acquisition, after the Sima de los Huesos population, of a late Neanderthal-like relative positioning of the lateral and posterior canals (i.e., a high sagittal labyrinthine index), which is instead more frequently recorded in the individuals from Krapina65,66,67. It is noteworthy that labyrinth size does not play a role in the observed shape differences in the later Homo taxa included in our study, in contrast to what was previously reported for more taxonomically diverse samples59,62,93.

The present study has revealed a clear pattern of phenotypic diversity through time in the Neanderthal clade. Considering that bony labyrinth61 and semicircular canal shape59 likely reflect underlying genetic variation and population history, our results further strengthen the argument of a strong role played by drastic fluctuations in population size in the evolution of Neanderthals. Previous analyses suggested that the Neanderthal lineage underwent a severe population reduction 500–600 ka, just before the Denisovan-Neanderthal split45. The similarities in the morphological variance and disparity between Sima de los Huesos (an assemblage which likely represents a single Middle Pleistocene paleodeme14,94) and European modern human samples suggest that the loss of variation caused by this bottleneck event was partly recovered as early as ~430 ka95, at least at the phenotype level. Morphological disparity would have then peaked by the end of the Middle Pleistocene, as demonstrated by the greater diversity of Krapina individuals. While our results are in accordance with the population size recovery at ~400 ka suggested by molecular studies33,45, the remarkable phenotypic disparity reported for individuals that lived during MIS 5 is in contrast with the constant and gradual decline of the genetic variation and population size, previously inferred for the Neanderthal clade33,38,46. Similarly, the disparity pattern observed within the Neanderthal lineage demonstrates that Krapina individuals were more variable than Sima de los Huesos humans, thus contradicting our Scenario #2 that a bottleneck event occurred early in the evolution of the Neanderthal clade, between the Sima de los Huesos and Krapina populations.

Identifying the causes of the increased phenotypic diversity of the Krapina sample is not a trivial task. It might be the result of character fixation caused by genetic drift, which has been suggested as an important driver for morphological variation increase53,75,96. As an alternative, it might be caused by a complex population history for Neanderthals where, despite the overall small size of the groups, admixture events among paleodemes might have been common75, possibly favored by the milder climatic circumstances97. The more favorable climatic conditions might themselves be a source of increased morphological variation, as it has been shown that stable environmental conditions favor the enlargement of niches and provide enhanced ecological opportunities, ultimately leading to greater phenotypic diversity98. Recent aDNA analyses highlighted multiple gene flow events from the earliest H. sapiens immigrants in Europe into the Neanderthals’ genetic pool as early as 200 ka41,99. Thus, we must consider whether the greater morphological variation of the Krapina assemblage could be the result of hybridization. Indeed, previous studies have demonstrated that hybridization is a major source of phenotypic variation100,101, and there is increasing evidence suggesting frequent admixture during the Middle Pleistocene24,102.

In addition to discounting Scenario #2, our results also show a lack of evidence for an early-occurring bottleneck event and thus warrant the rejection of Scenario #4, which would imply the occurrence of both an early and late bottleneck along the Neanderthal clade evolution. Instead, morphological variation remained high throughout MIS 6 within the Neanderthal clade, which is not compatible with the occurrence of an early bottleneck event prior to the onset of a full Neanderthal morphology29,36,51. Nevertheless, given our low sample coverage between MIS 11 (424 ka)50 and MIS 5, we cannot rule out the possibility of a temporary decline in population size and genetic diversity within the Neanderthal clade, which could have then rapidly recovered before Krapina. This instead must be tested with the inclusion of additional Middle Pleistocene individuals (e.g., Steinheim, Petralona, Aroeira 3), although the overall limited number of specimens for this time range might partially hinder a proper assessment.

Our study suggests that there was a late bottleneck event in the evolution of the Neanderthal clade, as demonstrated by the significantly lower morphological variation in late Neanderthals relative to the Krapina and (less so) Sima de los Huesos populations. This is further evidenced by the fact that late Neanderthals show significantly lower variation even when compared to modern humans. These findings provide strong support for Scenario #3, in which an abrupt and marked reduction in population size, and in the associated phenotypic and, likely, genetic diversity, occurred after the currently accepted date for the Krapina sample (130–120 ka103). The evidence for a late reduction in genetic and phenotypic variation also eliminates Scenario #1, which posited the absence of any bottleneck event in the evolution of the Neanderthal clade. The low diversity recovered for late Neanderthals, but not for the Sima de los Huesos or Krapina populations, thus suggests that only the latest phase of the evolution of the Neanderthal clade was characterized by lower levels of morphological and genetic heterogeneity, contrary to previous assertions1,31,44,47,48,49,104.

Our results are instead consistent with paleoenvironmental104, paleoecological105, and molecular6,34,39 data, which testify to at least two population turnovers and reductions in genetic variation for Western European Neanderthals in the last 100 kyr: during MIS 5 (ref. 6) and later in the Late Pleistocene (~48 ka or between 77–54 ka)39. On one hand, the population turnover accompanied by mtDNA diversity loss documented for Southwestern Europe between 130–105 ka6 would be compatible with the timing of the reduction in phenotypic variation revealed by our results. This would imply the occurrence of a bottleneck event at the boundary between the Middle and Late Pleistocene, immediately after the age of Krapina102. This would also slightly predate the estimated age for the last common ancestor of all later European Neanderthal populations (~97 ka)2,44. On the other hand, the younger estimates for the reduction in genetic variation provided by Dalén et al. (~48 ka)39 postdate most of the specimens included in the late Neanderthal sample for the current analysis and would therefore imply a later event. However, the lower range of their estimates (77–54 ka) for the disappearance of most Western European populations of Neanderthals (but see refs. 40,106) and the decline in Neanderthal mtDNA variation could be reconciled with the evidence presented here. Additional insights into population dynamics and their role in shaping the distribution of morphological diversity observed here could be obtained by analyzing the variation of western Asian Neanderthals (Shanidar, Obi-Rakhmat, and Teshik Tash, among others), which were unfortunately not available for the present analysis.

The identification of a phenotypic bottleneck later than 130–120 ka has consequences for our understanding of Neanderthal evolution. Together with the evidence accumulating from genetic analyses6,7,33,34,39,42,107, our results highlight that neither accretion nor organismic models are entirely capable of describing the complex processes that shaped the variation observed in the Middle and Late Pleistocene fossil record of Neanderthals. In particular, the drop in phenotypic variation does not support a linear model of evolution and cannot be reconciled well with the progressive accumulation of Neanderthal-like features. Instead, the presence of distinctive and primitive features (the posterior canal shape and the modern human-like lateral canal position, respectively) in early members of the Neanderthal lineage (i.e., Sima de los Huesos) suggests mosaic evolution, at least in the bony labyrinth, with some features that might have evolved locally in paleodemes69,108,109. On the other hand, even though the morphological differences identified among the Sima de los Huesos, Krapina, and late Neanderthal samples could be reconciled within an organismic model of Neanderthal evolution, the rebuttal of our first prediction (i.e., the presence of a diversity drop between the Sima de los Huesos and Krapina populations) does not support the presence of a speciation event during MIS 8 (ref. 29) or MIS 6 (ref. 50). The late bottleneck we observe might nonetheless be interpreted as a cladogenetic event for H. neanderthalensis, conceived as a species restricted to post-Krapina Neanderthals. In this case, the Krapina population would either form part of the relict variation of the metapopulation from which late Neanderthals evolved, or represent a distinct and locally diversified lineage within the Neanderthal clade.
However, the clear morphological affinities between Krapina and late Neanderthal individuals in many anatomical regions throughout the skeleton strongly suggest that they belong to the same species110,111,112.

In sum, the constant and high phenotypic variation we observed until the end of the Middle Pleistocene is best interpreted in view of a large and structured Neanderthal metapopulation45, although the effective population size might have been smaller33,38,46. Our results show the existence of a bottleneck within the Neanderthal lineage after the time of Krapina. This reduction in morphological and genetic diversity might have been caused by known population turnovers6 and niche shrinkage106. However, we find no evidence of a similar bottleneck during MIS 8–6, as previously proposed under an organismic model of evolution. Hence, the evidence provided by semicircular canal shape variation suggests the need to reconsider the scenario for the origin and evolution of the Neanderthal clade and highlights the role played by population structure. It also suggests that paleodemes might have started to locally evolve and diverge more frequently than previously understood, posing the possibility that some of them (i.e., Krapina individuals) might not have a direct phyletic relationship with later Neanderthal populations.

Methods

No relevant ethical regulations were required for the present study.

Studied sample and comparative material

The analyzed sample (Fig. 1; see Supplementary Data 2 for scanning parameters) comprises medical (CT) and micro-computed tomography (µCT) scans of the Sima de los Huesos humans (n = 13, ~430 ka95), later Middle Pleistocene Neanderthals from the Central European site of Krapina (n = 10, 130–120 ka103), as well as late Neanderthals (n = 9, 64–40.8 ka113,114,115,116,117,118,119) from Eurasian Late Pleistocene sites, thus covering most of the geographic and temporal range of the Neanderthal clade (Table 3). The studied sample included adult males and females, as well as juveniles (Supplementary Data 2), and was compared to a collection of CT and µCT scans (see Supplementary Data 2 for scanning parameters) of extant H. sapiens from France, Spain, and Switzerland (n = 18; previously published in refs. 59,60,69,120), as well as of two Upper Paleolithic H. sapiens individuals from France (Cro-Magnon 1 (ref. 121) and Laugerie-Basse (ref. 122)). Sex was available for only a few of the extant humans and is reported as the sex assigned at birth, as recorded in the collection’s metadata (Supplementary Data 2).

Table 3 Geographical and chronological details for the analyzed fossil sample

Segmentation and digital data preparation

The slice stacks of the specimens were segmented in Avizo 7.1 (Visualization Sciences Group, Mérignac) using a semiautomated threshold-based approach123 (supplemented by manual corrections in the case of partially filled canals of fossil specimens) to generate the 3D models (.PLY) of the left inner ear cavity (when not available, the right side was segmented and mirrored prior to surface alignment, 48.1% of the cases; Supplementary Data 2). The vestibular system was then separated from the cochlea using a protocol described in ref. 61, which uses landmarks placed along the maximum curvature at the junction between these structures as a reference for a straight clipping plane. The models were decimated in Avizo to 30000 ± 100 triangles and then processed in Geomagic Studio v.2014.3.0 (3D Systems, Rock Hill, USA) to remove non-manifold faces. As a prerequisite for DMorph analysis, we first roughly prealigned the meshes by manually superimposing them to ensure biological correspondence. We then systematically and automatically refined the alignment using the Avizo module Align Surface with the rigid + uniform option (see Supplementary Fig. 4 for error tests related to alignment and segmentation procedures). This module minimizes the distances between the triangles composing the meshes using an iterative process that translates, rotates, and scales the 3D object in a similar way to Procrustes superimposition. The aligned .PLY files were then converted to .VTK format in the open-source software ParaView v.5.6.0 (Kitware).

Diffeomorphometric shape analysis

We applied a DMorph geometric morphometrics approach71,124 to the previously aligned .VTK models to inspect patterns of shape variation occurring in the sample. This approach has proven very effective in capturing taxonomically and phylogenetically relevant differences in the canal morphology of extinct primate species58,59,60. Relative to landmark-based methods87,125, DMorph has the advantage of being readily sensitive to changes in the volumetric proportions of the canals. At the same time, it successfully captures shape differences in the relative size and orientation of the canals, as well as nuances in the local morphology of each canal59. By allowing direct comparison between continuous surfaces, DMorph represents a landmark-free approach, thus reducing user-related uncertainties at the time of landmark identification and placement. Diffeomorphometric shape analysis relies instead on the amount of deformation (mathematically modeled as smooth and invertible functions termed diffeomorphisms) of each specimen from the average of a considered sample71. From an operational viewpoint, DMorph first estimates the average object configuration based on the analyzed sample (here composed of the 3D models of the semicircular canals) and attaches an automatically optimized number of control points to the most variable portions of the considered objects. The mean shape is then iteratively deformed towards each of the specimens to match their shape. Finally, a set of momenta (also called deformation fields) is computed to parametrize the displacement of the control points from the average to the target objects and compiled into an atlas of momenta for all the specimens in the sample124.
The DMorph analysis was performed with the software Deformetrica v.4.3 (http://www.deformetrica.org/)124 on the DeepFry computer cluster (LOEWE Centre for Translational Biodiversity Genomics, Frankfurt am Main, Germany) using the estimate command to compute a DeterministicAtlas based on our sample and generate an atlas of momenta. This was then imported into RStudio v.2023.9.0.463 (ref. 126) for R v.4.3.1 (ref. 127) using the RToolsForDeformetrica v.0.1 package128. The raw shape data (i.e., the deformation fields) were investigated with a principal components analysis (PCA) performed with the prcomp function of the stats R package127. Given that DMorph analysis does not inherently allow computation of centroid size, we approximated it by computing the centroid size of each of the decimated meshes (prior to their alignment) using the cSize function of the Morpho v.2.11 R package129.
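As an illustration of the size proxy described above, centroid size is conventionally defined as the square root of the summed squared distances of all points from their centroid. The following is a minimal Python sketch of that definition applied to mesh vertices. It is not the Morpho cSize implementation used in the study, and the function name and toy coordinates are hypothetical.

```python
import math


def centroid_size(vertices):
    """Square root of the summed squared distances of vertices from their centroid.

    `vertices` is a sequence of equal-length coordinate tuples (hypothetical input;
    in practice these would be the vertices of a decimated mesh).
    """
    n = len(vertices)
    if n == 0:
        raise ValueError("at least one vertex is required")
    dim = len(vertices[0])
    # Centroid: mean coordinate along each axis.
    centroid = [sum(v[k] for v in vertices) / n for k in range(dim)]
    # Sum squared deviations over all vertices and all axes.
    total = sum((v[k] - centroid[k]) ** 2 for v in vertices for k in range(dim))
    return math.sqrt(total)


# Toy example (not real data): the four corners of a unit square.
square = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0), (0.0, 1.0, 0.0)]
print(centroid_size(square))
```

Because the measure depends only on distances from the centroid, it is unchanged by translation and rotation and scales linearly with object size.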

Disparity analysis

Besides analyzing the relationship among groups in the morphospace based on shape affinities, in the present study we considered how the groups (and the individuals constituting them) occupy the multidimensional space. We thus performed a morphological disparity analysis to assess how packed or dispersed individuals are in shape space. Disparity analyses are used to test research questions in paleontology, ecology, evo-devo, taxonomy, and macroevolutionary analyses73. In paleoanthropology, disparity analyses have been used to test dispersal hypotheses for modern humans43,55, assess the relationship between genetic and morphological variation75,130, as well as to track radiation events related to the diversification of platyrrhines131.

Disparity can be measured through a number of metrics, which capture different aspects of point dispersion such as size, density, and position72. For the present study, we decided to inspect the size of the space occupied by each of the analyzed groups (within-group disparity), used as a proxy for the range of phenotypic variation, as well as the density of their occupation (i.e., how tightly the points cluster together in the shape space), which provides information about the phenetic closeness of the individuals composing the inspected groups. We first computed the hypervolume of the convex hulls occupied by each of the four groups in PC1–4 (55.6% of variance) using the convhull.volume function of the dispRity v.1.7.0 (ref. 132) package for R. We also computed two additional metrics to assess group dispersion size: Euclidean distance from group centroid and sum of variances (i.e., the sum of the variances of each dimension of the analyzed matrix). Both metrics are less affected by gaps in the observed range than metrics that directly sum ranges in the multidimensional space (e.g., convex hulls, ellipsoids), which may lead to an overestimation of range size; the sum of variances is also the least sensitive to changes in group size133. Pairwise distances between individuals of the same group were instead used to assess the clustering of specimens within the group distribution. These three morphological diversity metrics were computed based on the first 19 PCs (those each accounting for more than 1% of variance and together for 86.7% of total variance)134 of a PCA performed on the whole sample. We relied on the dispRity function of the dispRity package, using bootstrapped group distributions (20000 bootstraps) to account for small group size.
Significant differences in group disparity (with and without rarefaction; see Supplementary Table 1) were tested using the test.dispRity function of the same package, with pairwise comparisons and Bonferroni adjustment.
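To make the three dispersion metrics described above concrete, the sketch below computes the sum of variances, the mean Euclidean distance from the group centroid, and the mean pairwise distance for one group of PC scores, and bootstraps a metric by resampling individuals with replacement. It is an illustrative, standard-library Python approximation under stated assumptions, not the dispRity code used in the study. All function names and the toy scores are hypothetical, and summarizing the distances by their mean is an assumption of this sketch.

```python
import math
import random
from itertools import combinations


def euclidean(a, b):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def centroid(points):
    """Mean coordinate along each dimension."""
    n = len(points)
    return [sum(p[k] for p in points) / n for k in range(len(points[0]))]


def sum_of_variances(points):
    """Sum of the per-dimension sample variances (n - 1 denominator)."""
    n = len(points)
    c = centroid(points)
    return sum(
        sum((p[k] - c[k]) ** 2 for p in points) / (n - 1) for k in range(len(c))
    )


def mean_distance_from_centroid(points):
    """Average Euclidean distance of individuals from their group centroid."""
    c = centroid(points)
    return sum(euclidean(p, c) for p in points) / len(points)


def mean_pairwise_distance(points):
    """Average Euclidean distance over all pairs of individuals in the group."""
    pairs = list(combinations(points, 2))
    return sum(euclidean(a, b) for a, b in pairs) / len(pairs)


def bootstrap_metric(points, metric, n_boot=1000, seed=0):
    """Recompute `metric` on groups resampled with replacement (same group size)."""
    rng = random.Random(seed)
    n = len(points)
    return [
        metric([points[rng.randrange(n)] for _ in range(n)]) for _ in range(n_boot)
    ]


# Toy PC scores for one hypothetical group (not real data).
group = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0)]
boot = sorted(bootstrap_metric(group, sum_of_variances, n_boot=1000, seed=42))
print("observed sum of variances:", sum_of_variances(group))
print("bootstrap median:", boot[len(boot) // 2])
```

Comparing the bootstrapped distributions of two groups, rather than single observed values, is what allows small and unequal group sizes to be taken into account before any pairwise test is applied.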

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.