Abstract
The discovery of Shimao city (around 2300–1800 bce1), a premier state-level Neolithic fortified settlement in Shaanxi, China2, has played an important role in helping us understand the emergence of socially stratified urban societies. However, key questions remain regarding how ancestry and kinship shaped the hierarchy of this class-based society characterized by human sacrifice. The origin of the founding populations of Shimao and other Loess Plateau settlements, and their interactions within the broader ancestral landscape, have yet to be determined. Here, by sequencing 144 ancient genomes from Shimao city and its satellites, we present pedigrees among tomb owners spanning up to four generations. These findings reveal a predominantly patrilineal descent structure across Shimao communities, and possibly sex-specific sacrificial rituals. We also characterize the population history, revealing that Shimao culture-related populations originated mostly from a Yangshao culture-related population present at least 1,000 years earlier, and that the lasting inflow of Yumin-related populations from Inner Mongolia did not interrupt regional genetic continuity. Broader genetic influence from southern mainland ancestry over Shimao culture-related populations supports evidence of rice farming expanding further north than previously expected. Together, these results uncover fine details of the regional peopling and social structure of early state establishment.
Main
Shimao, bordering the northern Loess Plateau and Ordos desert, is among the largest prehistoric settlements discovered in China. The stone-walled site encompasses roughly 4 km² and can be divided into outer and inner enclosures, showing features typical of state-level societies: craft production, large fortifications and high social stratification with abundant forms of human sacrifice2,3,4,5,6. With a strict hierarchical polity and unique culture of human sacrifice (more than 80 human skulls were found buried under its East Gate2), Shimao can serve as an excellent illustration of the roles of family and ancestry in the structuring of political and social relationships in early state-level human societies. Two cemeteries were found at Shimao: one attributed to the ruling class at the city centre, Huangchengtai, and another to the elite class southwards at Hanjiagedan within the inner enclosure. The East Gate (Dongmen) in the outer enclosure contains mass burial pits of sacrificed victims. The burials at Shimao culture cemeteries are organized into four to five categories, corresponding to classes from high to low status residents7, which together feature a strict hierarchy within Shimao society8,9. The layout shows signs of urban planning and clear social stratification. Previous efforts to explain the emergence of hierarchically organized societies in East Asia analysed a range of dispersed archaeological sites10,11,12, or focused on the early dynasties such as Xia and Shang, or other large Late Neolithic settlements such as Taosi10 or Liangzhu13, which have been considered early forms of regional states14,15. The extensive archaeological record of this large urban settlement has further expanded our knowledge of early state-level communities. Two ideas have attempted to explain the origins of Shimao city and its diffused cultural aspects.
One proposes that Shimao was the cosmopolitan centre of a vast trade network, yielding bronze knives reminiscent of those in eastern steppe cultures, jade blades and alligator skins, possibly from coastal northern East Asian or the Yangtze River cultures2,3,6, and pottery types similar to the Central Plains Longshan culture. Another interprets Shimao as a separate regional cultural centre, possibly originating locally, boasting some of the earliest mural paintings and mouth harps in China. This view is based in part on differences in construction techniques and cultural assemblages and argues that later similarities in the region may have been due to Shimao’s growing influence10,16.
Extensive sampling and the large-scale recovery of DNA from many individuals and burial sites make it possible to rebuild large family trees, providing an unprecedented opportunity to describe past mating and burial practices of ancient cultures17,18,19. Studies of megalithic elite tombs or massive family burials frequently illustrate patrilineal and patrilocal kinship systems19,20,21, although this is not always the case, as occasional instances have shown matrilineal or combined patterns22,23. Trans-regional studies supply further knowledge not only for social coherence but also record individual and familial mobility within or between large ancient communities24,25. So far, these studies have mostly focused on the regions of Mesoamerica or West and Central Eurasia. More recently, these genetic studies have begun to explore kinship in a Neolithic settlement of East Asia26. However, there is still no comprehensive genetic analysis comparable in scope to the large, organized settlements with social hierarchies of East Asian prehistoric cultures. Flourishing in a corridor between farming and nomadic communities, Shimao culture, represented by Shimao city and its contemporary satellite sites (for example, Muzhuzhuliang, Shengedaliang, Xinhua and Zhaishan), played a crucial role in establishing the model of large settlements at the very beginning of Chinese civilization, opening a unique genetic window into the population history and early social structures of their inhabitants.
Previous surveys of uniparental markers of Shimao and its surrounding sites have depicted a diversity of mitochondrial haplotypes, contrasting with relatively fewer Y haplotypes27,28. A deeper sampling of nuclear genomes from those large prehistoric settlements would allow a more comprehensive investigation into the genetic and societal history of the inhabitants. To illuminate the population origin and kinship practice of Shimao society, we have undertaken a dense genomic sampling from the Middle to Late Neolithic and Bronze Age of nine sites showing many prehistoric cultures (Yangshao, Shimao and Taosi culture) and covering the area on the Loess Plateau from the Ordos Desert to the lower Yellow River of Shaanxi and Shanxi provinces. We generated genome-wide data for 169 ancient individuals out of 207 human remains tested from seven archaeological sites of Shaanxi and two sites of Shanxi Province, China (142 out of 169 samples overlapped with the previous mitogenomic study27, and two were genetically identical with three previously reported Shengedaliang samples29; Fig. 1, Supplementary Table 1 and Supplementary Figs. 1–5). In total, radiocarbon dates were collected from 32 individuals representing each genetic cluster of retained individuals from nine sites (Fig. 1 and Supplementary Table 1). After excluding 24 genetically identical individuals (Supplementary Tables 1 and 2), 13 individuals with low numbers of single-nucleotide polymorphisms (SNPs) and one individual with a high mapping mismatch rate to the reference genome (Methods and Supplementary Tables 1 and 3), we carried out the population analysis on 144 unrelated individuals and kinship analysis on another 25 individuals having first-degree or second-degree kinships in total. DNA libraries were enriched for 1.2 million SNPs30, resulting in SNP counts ranging from 29,604 to 976,271 with an average captured SNP coverage of 2.74 times (Supplementary Table 1).
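The sample accounting in the paragraph above can be checked with a few lines of arithmetic. The sketch below is only one reading of the text: it assumes the three exclusion groups are removed from the 207 tested remains and that the 144 and 25 individuals together make up the retained set. The variable names are illustrative, not taken from the study.

```python
# Counts quoted in the text; the grouping into a dict is an illustrative assumption.
tested_remains = 207
excluded = {
    "genetically_identical": 24,
    "low_snp_count": 13,
    "high_mapping_mismatch": 1,
}

# Individuals remaining after all exclusions.
retained = tested_remains - sum(excluded.values())

# Split reported for the downstream analyses.
population_analysis = 144   # unrelated individuals
kinship_only = 25           # first- or second-degree relatives

print(f"retained after exclusions: {retained}")
print(f"population + kinship sets: {population_analysis + kinship_only}")
```

Under this reading both totals come to 169, matching the number of individuals with genome-wide data.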
a, Geographic locations of newly sampled archaeological sites (filled, coloured symbols) from the Loess Plateau of Northern Shaanxi Province to Southern Shanxi Province, China. Purple symbols represent the geographical sites of previously published ancient samples from northern, western and central East Asia. b, Temporal distribution of new samples with direct radiocarbon dates from the Middle Neolithic (MN) to Late Neolithic (LN) and the Bronze Age (BA). Colours and symbols correspond with those in the geographic map. c, PCA projecting ancient samples (coloured) onto the genomic variation observed in Eurasian present-day humans (grey circles). Here cEA refers to the cline of central East Asians and Deep/wEA refers to the cline of Deep Asians/West East Asians. Yumin-related outliers are encircled by a dashed line and marked as Yumin cline. d, PCA analysis projecting ancient samples (coloured) onto the genomic variation observed in East Asian present-day humans. The sEA and Yumin clines are shown by dashed circles. The symbols are the same as c. Image in a reproduced with permission from ref. 46, Wiley-Blackwell, created with WorldClim (https://www.worldclim.org/) and Natural Earth (https://www.naturalearthdata.com) data.
Genetic make-up of Shimao populations
Efforts to explain the origins of the Shimao population have centred on cultural commonalities with populations from the neighbouring Central Plain and nearby northeastern populations in the Ordos region2,6, and to more distant cultural features from northeastern China, for example, the Amur River basin or coastal regions31,32. We investigated the genetic formation of various populations showing Shimao culture from the Loess Plateau through their genetic connections with a large panel of published Eurasian populations. First, we found that 4,200- to 3,800-year-old populations (upper range given for all carbon dates) attributed to Shimao culture (roughly 2300–1800 bce1) from Shimao city and its surrounding sites (Muzhuzhuliang, Shengedaliang, Xinhua and Zhaishan, together referred to hereafter as Shimao_4k) were closely related to the earlier population, Yangshao culture-associated roughly 4,800-year-old populations (Miaoliang and Wuzhuangguoliang, referred to together as preShimao_5k) in Northern Shaanxi province. Both Shimao_4k and preShimao_5k clustered with northern East Asian (nEA) ancestries (for example, YR_MN, YR_LN, WLR_LN and Miaozigou_MN) from outside Shaanxi province, supported by principal component analysis (PCA) and admixture analysis when K = 3 (where ‘K’ represents the number of ancestral source components; Fig. 1, Extended Data Fig. 2 and Supplementary Fig. 11). In outgroup-f3 analysis and D statistics, comparing Shimao culture-related populations with various ancient Eurasian populations, including nEA ancestries from the Yellow River basin, represented by YR_MN and YR_LN, and those further away such as Early Neolithic Shandong (Xiaogao and Bianbian), the Amur River basin (AR19K and DevilsCave_N) and the West Liao River basin (WLR_MN and WLR_LN), Shimao_4k populations had the highest overall affinity with the preShimao_5k compared with other nEA ancestries (Fig. 1c, Extended Data Fig. 1, Supplementary Tables 4, 9 and 10 and Supplementary Fig. 10). 
In addition, a maximum likelihood phylogeny with admixture (m = 2) confirmed clear genetic continuity between Shimao_4k and preShimao_5k populations (Fig. 2 and Supplementary Fig. 6). Genetic continuity of the Shaanxi populations was also validated using qpGraph, in which Shimao_4k could be modelled as deriving from a single source (100%), preShimao_5k (here represented by the better covered Wuzhuangguoliang; Fig. 2 and Supplementary Figs. 7–9).
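To make the outgroup-f3 comparison used above more concrete, the following minimal sketch ranks candidate source populations by shared drift with a target. It is not the authors' pipeline: the function name `outgroup_f3`, the population labels and all allele frequencies are invented for illustration, and the simplified estimator omits the finite-sample bias correction applied by tools such as AdmixTools.

```python
def outgroup_f3(outgroup, pop_a, pop_b):
    """Simplified outgroup f3: mean over SNPs of (o - a) * (o - b).

    Inputs are per-SNP allele frequencies in [0, 1]. Larger values mean
    pop_a and pop_b share more genetic drift since splitting from the outgroup.
    """
    terms = [(o - a) * (o - b) for o, a, b in zip(outgroup, pop_a, pop_b)]
    return sum(terms) / len(terms)


# Toy allele frequencies (hypothetical, four SNPs only).
outgroup = [0.0, 1.0, 0.0, 1.0]
target = [0.8, 0.2, 0.6, 0.4]
candidates = {
    "close_source": [0.7, 0.3, 0.6, 0.4],
    "distant_source": [0.2, 0.8, 0.1, 0.9],
}

# Rank candidate sources from highest to lowest affinity with the target.
scores = {name: outgroup_f3(outgroup, target, freqs)
          for name, freqs in candidates.items()}
ranked = sorted(scores, key=scores.get, reverse=True)
for name in ranked:
    print(name, round(scores[name], 3))
```

In the toy data the candidate whose frequencies track the target most closely ranks first, which is the logic behind the statement that Shimao_4k had the highest affinity with preShimao_5k.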
a, qpAdm analysis showing the ancestry proportions of Yumin and Wuzhuangguoliang for Middle to Late Neolithic Shaanxi populations. Colours represent the different ancestral sources of Yumin (green) and Wuzhuangguoliang (yellow). The measure of centres of the error bars is presented as the mean value of Yumin ancestry proportion ±1 standard error for the estimated admixture proportions by qpAdm using the block-Jackknife analysis. b, Admixture graphs built by the qpGraph module in AdmixTools, the selected admixture graph is built on a base graph containing the central African Mbuti as an outgroup, the early western Eurasian Ust’-Ishim and early Asian Tianyuan, sEastAsia_EN, YR_MN, preShimao_5k (Wuzhuangguoliang), Shimao_4k (Shimao_HJGD1), Yumin, preShimao_5k_o (Wuzhuangguoliang_o1) and Shimao_4k_o (Xinhua_o) added interactively. c, Constrained graph with two admixture events in 100 algorithm iterations. The log-likelihood (LL) score is 31.19. Mbuti, Ust’-Ishim, sEastAsian_EN (represented by Qihe2 and Liangdao2) and Tianyuan are constrained as a non-admixed population. For a more detailed fitting graph, see Supplementary Figs. 8 and 9. d, Treemix analysis setting two migration branches (m = 2). The range of bootstrap values (n = 1,000) is marked on the tree node in different colours and shapes, and the individuals included in the group of preShimao_5k, preShimao_5k_o, Shimao_4k, Taosi_4k and Shimao_4k_o are described in Supplementary Table 1.
To further clarify whether extra ancestries could be included along with the preceding Yangshao ancestry to the population of Shimao, a broad f4 analysis was performed (Supplementary Table 4 and Supplementary Fig. 10). Notably, several individuals within the Shimao culture-related populations of both Shimao city and its satellite sites (denoted as the Shimao southern East Asian (sEA) cline in Fig. 1) differed from the Late Neolithic Longshan population represented by YR_LN29 in showing diverse affinities of southern East Asian ancestry (represented by indigenous Ami population of Taiwan and the Xitoucun population of Fujian), evidenced by D statistics (Supplementary Table 5). qpAdm modelling indicates these Shimao sEA outliers harbour predominantly 70–90% Yangshao culture-related ancestry (represented by Wuzhuangguoliang) with a further 10–30% southern ancestries, which can be represented by 22–31% of southern mainland ancestry (Xitoucun) or 7–20% southeast coastal ancestries, represented by an Iron Age indigenous Hanben33 or Ami populations in Taiwan (Supplementary Table 6), suggesting influences from southern ancestry during the Late Neolithic expansion of rice farming had extended further north than the Central Plain, in line with a recent finding34 (see Supplementary Note 2 for further discussion; Supplementary Tables 5 and 6). 
Admixture modelling using qpAdm for Shimao and its contemporaneous related populations (that is, Muzhuzhuliang, Shengedaliang, Xinhua and Zhaishan) of the Late Neolithic shows an extremely high contribution from the 4,800–4,600-year-old ancestry represented by Wuzhuangguoliang (9 out of 18 populations, listed and highlighted in grey in Supplementary Table 4, have a single ancestry source and 5 have more than 80% Wuzhuangguoliang ancestry; Supplementary Table 6), further supporting the hypothesis that the Shimao people mostly originated from a Yangshao culture-related population that was established in the region at least 1,000 years before. To further understand the ancestry sources of the earlier Wuzhuangguoliang population, we applied a simulation method (Methods). Our results indicated a mixed ancestry source for the Wuzhuangguoliang population (Methods and Supplementary Figs. 12–15), distinguished from Yellow River farming ancestries. We further explored the genetic relationship between Shimao_4k populations and the contemporary populations attributed to the Taosi culture (Taosi and Zhoujiazhuang, together referred to as Taosi_4k) located further south in Shanxi Province; the results indicate a close connection between Shimao and Taosi culture-related populations (see Supplementary Note 3 for further discussion).
Persistent Yumin-related presence
Agro-pastoralist societies in the Ordos region have frequently transitioned between herding and farming lifestyles from the Middle to Late Neolithic35. Located in the transitional corridor, Shimao showed steppe-related features with the introduction of herding animals36 and the presence of chiselled stone faces found at Shimao. To explore whether neighbouring steppe-culture populations genetically influenced Shimao populations and, if so, the timing and extent, we looked at the genetic connections between preShimao_5k, Shimao_4k, ancient western and eastern steppe populations37,38,39,40 (Afanasievo37, Yamnaya_EMBA38 and Shamanka39), West and Central Eurasians41,42, and other East Asians, including the nearby northern East Asian ancestry, Yumin32, represented by an 8,000-year-old individual from the Yumin site in Inner Mongolia, who inhabited the Inner Mongolian steppe and was absent from northern East Asia throughout the Neolithic and Bronze Age periods. We found the predominant preShimao_5k populations had little to no evidence of admixture with ancestries outside East Asia (Supplementary Table 8). When compared with the various nEA ancestries, we observed some outlier individuals from the Middle Neolithic Wuzhuangguoliang site having ancestries different from those predominant in the remaining populations. Two genetic outliers among the Wuzhuangguoliang population (Wuzhuangguoliang_o1; 4,831–4,585 calibrated years before present (Cal. BP); Supplementary Table 1) clustered with the Yumin population in the PCA (Fig. 1). The Treemix analysis also showed these Wuzhuangguoliang outliers (preShimao_5k_o) clustering with the Yumin branch (Fig. 2 and Supplementary Fig. 6). 
To further investigate the genetic make-up of these two Yumin-related outliers, we applied distal admixture modelling (Methods), which supported a 2-source admixture of 50.2 ± 11.5% Yumin-related ancestry and 49.8 ± 11.5% 4,832–4,820-year-old predominant Yangshao ancestry represented by Wuzhuangguoliang (Supplementary Table 6).
Looking at whether Yumin-related ancestry had a continuing influence on Shimao culture-related populations 1,000 years later, we observed an incidence of increasing Yumin-related ancestry lasting to the Late Neolithic but without obscuring the local genetic continuity of the previous 1,000 years. PCA and f3 analysis detected six genetic outliers (4,148–3,390 Cal. BP; Xinhua_o, Shimao_HCT_o, Shimao_DM_o1 and Shimao_DM_o2, and two belonging to Muzhuzhuliang_o) among the Shimao culture-related populations clustering with or close to Yumin (Fig. 1 and Extended Data Fig. 1). Those later outliers (Xinhua_o, Shimao_DM_o1 and Muzhuzhuliang_o) shared more alleles with Yumin than with Shimao or other nEA ancestries, as shown by the following D statistics: D(Yumin, Shimao/nEA; Shaanxi outliers, Mbuti) > 0 (−0.3 < Z < 9.2) and D(Shaanxi outliers, Yumin; Shimao/nEA, Mbuti) roughly 0 (−2.9 < Z < 2.9) (Supplementary Tables 9 and 11). Treemix analysis also showed that these Late Neolithic outliers (Shaanxi_4k_o) act as sister clades with the Yumin branch (Fig. 2 and Supplementary Fig. 6). Only one Late Neolithic outlier in Shimao (Shimao_HCT_o) was admixed with roughly 28–31% Yumin-related and roughly 69–72% Yangshao ancestry (ancestry proportion ranges are based on qpAdm models presented in Fig. 2 and Supplementary Table 6). Despite a time span of more than 4,500 years between Yumin and the most recent dated outlier in Dongmen of Shimao city (3,390–3,253 Cal. BP; Shimao_DM_o2), we found no evidence of admixture in the other five Late Neolithic outliers (Xinhua_o, Shimao_DM_o1, Shimao_DM_o2 and two from Muzhuzhuliang_o). This is evident by qpAdm analysis of distal or proximal modelling in which these five Late Neolithic outliers in Shaanxi province are best modelled as having a single source of ancestry related to Yumin (100%; Fig. 2 and Supplementary Table 6), and further supported by D statistics (Fig. 2 and Supplementary Tables 9 and 11). 
Together, these results indicate long-term interaction through coexistence and occasional admixture between ancient Shaanxi inhabitants and Yumin-related populations, and even an increase of Yumin-related influence from the Middle to Late Neolithic, in line with the discovery of the increasing incidence of herd animal exploitation36. It is unclear whether these interactions were related to trade, the maintenance of an agro-pastoralist lifestyle, perhaps in response to seasonal climate fluctuations, or other causes, but they were not substantial enough to interrupt the genetic continuity of the local ancestry36.
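The D statistics quoted in this section, such as D(Yumin, Shimao/nEA; Shaanxi outliers, Mbuti), follow the standard four-population form, with a Z score obtained from a block jackknife. The sketch below shows that calculation on invented allele frequencies. The function names `d_statistic` and `jackknife_z` are hypothetical, and the equal-sized contiguous blocks are a simplification of the weighted block jackknife over genomic blocks that AdmixTools applies.

```python
from math import sqrt


def d_statistic(w, x, y, z):
    """D(W, X; Y, Z) from per-SNP allele frequencies.

    Positive values indicate excess allele sharing between W and Y
    (or X and Z); values near zero are consistent with W and X forming
    a clade relative to Y and Z.
    """
    num = sum((a - b) * (c - d) for a, b, c, d in zip(w, x, y, z))
    den = sum((a + b - 2 * a * b) * (c + d - 2 * c * d)
              for a, b, c, d in zip(w, x, y, z))
    return num / den


def jackknife_z(w, x, y, z, n_blocks):
    """Delete-one-block jackknife with equal-sized blocks: returns (D, SE, Z)."""
    n = len(w)
    size = n // n_blocks
    full = d_statistic(w, x, y, z)
    estimates = []
    for b in range(n_blocks):
        start = b * size
        stop = n if b == n_blocks - 1 else start + size
        keep = [i for i in range(n) if not start <= i < stop]
        subset = [[pop[i] for i in keep] for pop in (w, x, y, z)]
        estimates.append(d_statistic(*subset))
    mean = sum(estimates) / n_blocks
    var = (n_blocks - 1) / n_blocks * sum((e - mean) ** 2 for e in estimates)
    se = sqrt(var)
    return full, se, full / se


# Toy frequencies (hypothetical): W tracks Y, while X and Z are uninformative.
w = [0.9, 0.1, 0.8, 0.2]
x = [0.5, 0.5, 0.5, 0.5]
y = [0.9, 0.1, 0.8, 0.2]
z = [0.5, 0.5, 0.5, 0.5]

d_value, std_err, z_score = jackknife_z(w, x, y, z, n_blocks=2)
print(f"D = {d_value:.3f}, SE = {std_err:.3f}, Z = {z_score:.2f}")
```

Swapping W and X flips the sign of D, which is why the paired tests in the text (a positive D in one configuration and a D near zero in the other) together support the outliers being closer to Yumin than to Shimao or other nEA ancestries.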
Sex-specific sacrifice at Shimao
The diversity of the sacrificial traditions of Shimao culture indicates a high degree of social stratification and strict hierarchy2. Sacrificial traditions at Shimao and its surrounding sites consisted of two forms: mass burials that may have served public ritual purposes, as found in Shimao Dongmen or on a raised area potentially containing a palace at Huangchengtai (Supplementary Figs. 1 and 2), and sacrifice accompanying high-status burials, where the sacrificed victims would be entombed with the tomb owners, as found at the cemeteries in Shimao and Zhaishan sites (Figs. 3 and 4 and Supplementary Figs. 3–5). To explore whether we could detect a demographic bias of the victims selected for sacrifice, we investigated the site of Dongmen (East Gate) at Shimao. In contrast to previous archaeological reports that identified these sacrifices as female-biased on morphological criteria, our results showed no evidence of female bias among the sacrificial victims in Dongmen, with 9 out of 10 victims being men (female and male here refer to sex assigned at birth). Three of these male individuals were previously identified in these reports as female by morphology. The archaeological context, beneath the foundation of the gate2,5, suggested that these sacrifices were probably connected to a construction ritual of the walls or gate, a custom observed at later sites in China2. To further understand these findings, we explored the genetic composition and kinship relationships of these sacrificed individuals in comparison to the dominant populations of Shimao. Our analysis identified two genetic outliers at Dongmen who possessed Yumin-related ancestry32, including a sacrificed victim from the pit and an individual from a late tomb (Shimao_DM_o1 and Shimao_DM_o2; Figs. 1 and 2 and Supplementary Table 1), who were buried alongside inhabitants with predominantly Wuzhuangguoliang ancestry (Supplementary Table 6).
No pairwise kinships or shared identity by descent (IBD) segments were detected between these outliers and others within or across the sites (Fig. 4, Extended Data Figs. 3 and 4 and Supplementary Figs. 16–21). Except for these two Yumin-related sacrificed individuals, no differences in ancestry were detected between those selected for sacrifice at Dongmen and the elite class of tomb owners at interior Shimao sites.
Grave locations and reconstructed pedigree at the Zhaishan site. The connections inferred from trustworthy IBD sharing are marked in pink, representing sample pairs with coverage above 1× for both samples. The IBD edges bar was added based on the maximum IBD length when IBD 12 cM.
a, Map of sites within Shimao city including Dongmen (DM), Huangchengtai (HCT), Hanjiagedan (HJGD), Houyangwan (HYW) and Mahuangliang (MHL); the symbols represent the inferred social organizations in each Shimao site. Sacrificed victims from the high-level graves are marked as bold golden rectangles (men) or circles (women) of Dongmen, Huangchengtai and Hanjiagedan sites, and individuals from the same grave are marked with the same colour. b, Grave locations in the cemetery of Huangchengtai (the grave level at this site is higher than those at Hanjiagedan), and the kinships between residents. The high-level graves at Huangchengtai typically feature one to three sacrificed individuals alongside a niche containing burial goods. Tomb owners and individuals with unknown identity but who have kin connections with the sacrificed individuals are also plotted. c, Grave locations in the cemetery of Hanjiagedan within Shimao city and the reconstructed pedigree spanning at least three generations between tomb owners. Haplotypes of the mitochondrial and Y chromosomes are marked as circles and rectangles in different colours. Here the light blue dotted line represents one possible case of a matrilineal pedigree among several possibilities. d, Sampled individuals in the grave from Houyangwan, the East Gate (Dongmen) of Shimao city (Shimao_DM_o2 of M2: a later resident grave, different from other sacrificed people) and the Mahuangliang site. The mitochondrial haplotypes of b and d are simplified into lineages from A to Z. All second-degree kinships are marked with a dashed line in golden yellow. Connections inferred from trustable or low-confidence IBD sharing are marked in pink or green, representing sample pairs with both coverage above 1× or with coverage below 1× for either sample, respectively. The IBD edges bar was added based on the maximum IBD length when IBD 12 cM. Map in a reproduced with permission from Shaanxi Academy of Archaeology7. 
‘Rob hole’ denotes an illegal looting tunnel; a specimen recovered from such a feature is thereby deprived of its original burial context.
Although the limited sample size reduces our statistical power to detect significant sex biases, we observe a marked contrast between the predominantly male sacrifices at Dongmen in Shimao city and the predominantly female sacrifices associated with elite burials at several Shimao cultural sites, including Hanjiagedan and Huangchengtai within Shimao city (Fig. 4), as well as secondary settlements such as Zhaishan (Fig. 3). Among these, Hanjiagedan, located south of the inner enclosure, served as a noble cemetery. Nearly all sampled sacrificed individuals there were female (six out of seven) and unrelated to the tomb owners. Another high-level cemetery in the city centre, Huangchengtai of Shimao, potentially where the ruling class resided, also showed predominantly female sacrifices (14 out of 19). However, unlike other sites, second-degree kinships were observed among the sacrificed victims (Fig. 4, Supplementary Table 3 and Supplementary Fig. 19), indicating that families or communities may have been selected for burial sacrifices by the ruling elite. A small burial ground, similar to the mass burial at Dongmen, was unearthed near the palace area of Huangchengtai. All individuals buried there were women (three out of three) and none showed detectable familial ties to individuals from nearby communities (Fig. 4, Extended Data Fig. 3 and Supplementary Table 3). The identity of these female sacrifices could be extrapolated from the handcrafted products excavated alongside them, suggesting that the craftsmen who mastered the core production technology were concentrated in the upper elite residential quarters. In concordance with practices at Shimao, its secondary settlement, Zhaishan, also featured only female sacrifices (two out of two), with no close kinships (first-degree to second-degree kin) observed between the sacrificed individuals and their tomb owners (Fig. 3 and Supplementary Fig. 18).
These patterns of mostly female sacrifices contrast starkly with Dongmen, where most of the sampled victims of decapitation and mass burial were men. The sacrificial practices observed in the cemeteries of Shimao city and Zhaishan may represent ancestor veneration, in which women were sacrificed to honour elite nobles or rulers. The divergent traditions seen in these sacrificial customs suggest a complex, hierarchical social system in Shimao culture, which was not observed in previous ancient genomics studies. However, our analysis is based on a limited number of well-preserved remains, which may not fully represent the entire population of sacrificed individuals. The small sample size provides little statistical power, which limits the interpretation of the bias ratios.
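The caution about statistical power can be illustrated with an exact binomial test of each reported sex count against an even sex ratio. This is a sketch for the reader rather than an analysis from the study: the function name `binom_two_sided_p` is hypothetical, the 50:50 null is an assumption, and the counts are simply those quoted in the text.

```python
from math import comb


def binom_two_sided_p(k, n):
    """Exact two-sided binomial p-value for k of n under a 50:50 sex ratio.

    Because the null distribution is symmetric, the two-sided value is
    twice the probability of the more extreme tail, capped at 1.
    """
    tail_start = max(k, n - k)
    upper_tail = sum(comb(n, i) for i in range(tail_start, n + 1)) / 2 ** n
    return min(1.0, 2 * upper_tail)


# (majority-sex count, sampled victims) as reported for each context.
counts = {
    "Dongmen (male)": (9, 10),
    "Hanjiagedan (female)": (6, 7),
    "Huangchengtai cemetery (female)": (14, 19),
    "Huangchengtai burial ground (female)": (3, 3),
    "Zhaishan (female)": (2, 2),
}

for site, (k, n) in counts.items():
    print(f"{site}: {k}/{n}, p = {binom_two_sided_p(k, n):.3f}")
```

With samples this small, even a context in which every sampled victim is female (three of three, or two of two) cannot reach conventional significance on its own, which fits the authors' statement that the bias ratios should be interpreted carefully.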
We then looked for signs of consanguinity in Middle to Late Neolithic Shaanxi communities and found three individuals whose parents were probably first-degree or second-degree relatives (Extended Data Fig. 5 and Supplementary Table 14). At Zhaishan, a sacrificed woman (C6213) showed extensive long runs of homozygosity (ROH) (roughly 400 centimorgans (cM); Extended Data Fig. 5a), consistent with her being the offspring of a mating between second-degree relatives (Extended Data Fig. 5). However, distinct from the high-status consanguineous offspring reported in Neolithic Ireland whose parentage was considered to have had high social sanction43, we found that this sacrificed woman (Fig. 3) shared only distant kinships (third-degree to fifth-degree kin) with two lineage tomb owners (individuals on the pedigrees denoted in Fig. 3). Close-kin mating was not observed among other elites or commoners with available pedigree or ROH information, suggesting such unions may have been avoided or less common in higher-status lineages, although larger sample sizes are needed to confirm this pattern.
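The link between roughly 400 cM of long ROH and second-degree parental relatedness can be sketched with a back-of-the-envelope calculation: the expected autozygous fraction of the genome equals the offspring's inbreeding coefficient, which halves with each additional degree of relatedness between the parents. The autosomal map length of 3,500 cM used below is an assumed round figure, not a value from the study, and the function names are illustrative.

```python
# Assumed approximate sex-averaged autosomal genetic map length (cM).
AUTOSOMAL_MAP_CM = 3500.0


def inbreeding_coefficient(parental_degree):
    """Expected inbreeding coefficient F of a child whose parents are
    related at the given degree (1 = first-degree, 2 = second-degree, ...)."""
    return 0.5 ** (parental_degree + 1)


def expected_roh_cm(parental_degree, map_cm=AUTOSOMAL_MAP_CM):
    """Expected total autozygous length in cM, i.e. F times the map length."""
    return inbreeding_coefficient(parental_degree) * map_cm


for degree in (1, 2, 3):
    f = inbreeding_coefficient(degree)
    print(f"parents related at degree {degree}: "
          f"F = {f:.4f}, expected ROH = {expected_roh_cm(degree):.1f} cM")
```

Under these assumptions, second-degree parents give an expectation close to the observed value, whereas first-degree parents would give about twice as much and third-degree parents about half. Real ROH totals vary widely around the expectation, so this is a plausibility check rather than a classifier.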
Dominant patrilineal descent structure
In the absence of more advanced political systems, researchers have traditionally regarded family relationships as a means of maintaining elite status and perpetuating power. To investigate the family ties among the tomb owners of Shimao culture, we sampled individuals from low-level to high-level graves of the Middle to Late Neolithic large communities and uncovered a web of relatedness ranging from two to four generations. In total across all sites, we identified with high confidence 25 pairs related within the second degree, and 31 pairs whose IBD sharing indicated a possible third-degree to fifth-degree kinship (Figs. 3 and 4, Extended Data Fig. 3, Supplementary Tables 3 and 12 and Supplementary Figs. 18–21). We then extended pedigrees up to four generations among tomb owners in low-level to high-level graves at Shimao city and Zhaishan, finding that the largest pedigrees at both sites were established by a high-status man. Their male offspring also appeared to have had high social status with the right to wealth inheritance (for example, burial goods and offerings of sacrifice). This is indicative of a predominant patrilineal descent structure both at Zhaishan and Hanjiagedan, although we cannot exclude one possible matrilineal case at Hanjiagedan (Fig. 4). We determined the uniparental haplotypes of all higher-status individuals, both lineage and non-lineage members (as shown on or off the pedigrees in Fig. 4), and found that the male tomb owners nearly universally carried the same paternal haplogroup (O3a2c), including the lineage men of Zhaishan (Fig. 3) and Hanjiagedan (Fig. 4); the exception was one non-lineage man at Huangchengtai, who had a different Y haplotype (C2e2). This contrasts with the diverse maternal haplogroups of ten female tomb owners observed in these three cemeteries (Figs. 3 and 4).
Likewise, human remains in the contemporaneous settlements attributed to the Shimao culture (Muzhuzhuliang, Shengedaliang, Xinhua and Zhaishan) also demonstrated a diversity of mitochondrial haplotypes but relatively limited paternal haplotype structures among the residents (Supplementary Figs. 18 and 19), showing a patrilineal descent structure in which group membership primarily derives from the father’s lineage.
The practice of female exogamy can be useful for maintaining genetic diversity and reducing the incidence of close-kin mating and has been identified in several Neolithic communities in West Eurasia18,19,44. To see whether these practices played a role in maintaining the developed social hierarchy system at Shimao, we checked all the lineage and non-lineage female individuals along with their male relatives. The constructed pedigree at Hanjiagedan showed the second-generation males’ female partners came from different biological families, a circumstance that is also found at Zhaishan, although it is not clear whether these partnerships occurred serially or were polygamous (Fig. 3). We observed no close biological relatives—such as daughters, parents or siblings—of these female tomb owners at the site from high to low-level graves at Zhaishan and Hanjiagedan cemeteries, which may suggest that they were not descended from local families but rather originated outside the community (Figs. 3 and 4). Whether these are instances of female exogamy practices is unclear due to the influence of incomplete sampling. A better understanding of the Shimao culture’s mating customs would require a broader sampling of tomb owners.
Tracking burial goods as indicators of status can help infer patterns of wealth transmission and patrilineal influence in Shimao’s potentially hierarchical society. We found two non-lineage female tomb owners at Huangchengtai, a presumed ruling-class cemetery, who had high social status as evidenced by their rich burial goods and sacrifice offerings. This is comparable to the five male tomb owners at the Hanjiagedan and Zhaishan cemeteries, suggesting that in Shimao culture, high social status and associated wealth may not have been restricted to men, and that women could also have held political power. Because these female tomb owners do not belong to a discernible family lineage, or their direct relatives were not recovered, it is challenging to determine whether their wealth was inherited from parents or husbands, or accumulated independently. Overall, the pedigrees of both Hanjiagedan and Zhaishan point to a core role of families in Shimao communities. We used the pedigree information to determine whether the spatial arrangement of tombs could reflect familial ties. Notably, the geographical proximity and orientation of tombs showed no strong correlation with first-degree or second-degree kinship among tomb owners, suggesting that blood relations were not a major factor in grave placement (Figs. 3 and 4c). At Zhaishan cemetery, however, father–adult son tombs were spatially closer than those of fathers and adult daughters (Fig. 3), which is consistent with a patrilineal and potentially patrilocal system.
Discussion
The extensive and high-resolution dataset from well-preserved settlements of Shaanxi and Shanxi provinces offered us a genetic window into the human migration, interaction and kinship practice of these distinctive and important past societies in prehistoric China. The populations of Shimao society within the Ordos loop were found to have originated from a single ancestral source, corresponding to a regional genetic continuity from the Middle to Late Neolithic, which shows consistency with the hypothesis from archaeologists that Shimao city was founded by the agro-pastoralist elites of the Loess Plateau and Ordos region2. This region acted as an interaction corridor between farming-associated and herding-associated ancestries, separating the large settlements attributed to Shimao culture from those in the Central Plain. In addition, we detected the presence of an inland northern East Asian ancestry from the Inner Mongolian steppe, Yumin, before and during the occupation period of Shimao. The lasting presence of Yumin ancestry suggests a regular and lengthy interaction with periodic genetic inflow from the Yumin populations of northern China without disrupting the genetic continuity of the dominant local Shimao ancestry. Together, our results reveal new insights into the lasting coexistence and interactions of Yumin ancestry from Inner Mongolia with the Yangshao and Shimao culture-related ancestry in northern Shaanxi, in line with the progressive transition from exclusive farming to integrated agro-pastoral subsistence across this region36. In addition, our findings showed a broader genetic contribution from southern mainland Xitoucun ancestry and southeast coastal indigenous ancestries, represented by Taiwan-Hanben or Ami, extending over a long distance from Fujian or Taiwan to the Shanxi and Shaanxi populations. This aligns with evidence of rice farming expanding further north with a broader population contact34. 
Nevertheless, it remains unclear whether these genetic affinities originated directly from southern coastal or mainland populations, or were mediated through Yangzi River Longshan populations. Further sampling is needed to resolve this question.
Given the genetic continuity with populations inhabiting the same region 1,000 years previously, apart from the Yumin-related introgression, the people of Shimao showed little to no admixture with outside groups that shared some cultural similarities, such as populations to the west occupying the western Eurasian steppe and Northern and Central Asia, or coastal Shandong to the east. This suggests that the anthropomorphic stone carvings, specialized knives and artefacts such as jade blades and alligator bone plates present at Shimao were most likely sourced from these regions through expansive trade networks without genetic exchange. Despite the distance between the two, the inhabitants of Taosi, the contemporaneous large settlement comparable to Shimao, and of a nearby settlement, Zhoujiazhuang, share close ancestry with pre-Shimao populations from the northern Ordos plain. This is not at odds with proposals based on archaeological data that a more complicated relationship involving both trade and pillage may have existed between the two large communities2. We have shown that Yangshao culture-related populations living at least 5,000 years ago at Wuzhuangguoliang are ancestral to the Shimao and Taosi regions of cultural influence, with limited interactions with Yumin-related populations to the north. These results resolve important questions about the origins of the Shimao city builders and the relationship between Shimao and Taosi, but further work is needed to define these relationships more precisely and to clarify the role that both ancestral and familial lineages may have played in the early societies of the Shaanxi–Shanxi region.
Along with the settlement patterns of the emerging agro-pastoral society at Shimao, the rich quantity of burials attributed to different social classes allowed us to examine the three dimensions of social structure based on kinship practices: lineage descent, marriage patterns and residential rules45. Through genomic sampling of low-level to high-level graves and sacrificial burial pits, we further clarified the social organization and kinship patterns of the hierarchy-driven Shimao society. Within these burials from elite to common people, our results support a predominantly patrilineal organization along with apparent male-specific and female-specific sacrifice customs, together shaping a hierarchically structured Shimao society. Unlike the extensive pedigrees recovered so far from the family burials from the West to Central Eurasia17,20, many first-degree and second-degree kinships and IBD pairs among the diverse burial practices allowed us to reconstruct several extended pedigrees spanning from both high-level and low-level graves. On the larger scale, we show that Yangshao culture-attributed pre-Shimao and Shimao culture populations maintained healthy population diversity with little close-kin mating and a large effective population size for more than 1,000 years. Furthermore, we coanalysed the spatial distance and the genealogies among individuals between all the sites located in the Loess Plateau (Extended Data Fig. 3). We found no close kinship practices or shared IBD blocks with high confidence within Shimao cultural communities or between the southward Taosi cultural communities in Shanxi Province, implying restrained movement and mating patterns between families belonging to different cultures, or those far from the dominant local communities.
Within the Shimao communities, no direct familial linkages were detected between social elites and sacrificed individuals, suggesting the presence of constrained mating practices and social boundaries. However, given the observed kinship connection between the elite and a lower-status individual, these boundaries were probably permeable to a certain extent. Further data from individuals of intermediate status will be essential to more comprehensively evaluate the social stratifications of Shimao society. Instead, the lack of kinship between the elite tomb owners and the sacrificed people in the tombs may suggest that graves were centred around social elites and their families, and the sacrifice rite was administered according to social status. Whereas the uniparental genetic data indicate a patrilineal kinship system, the presence of high-status female individuals suggests that gender roles did not strictly limit access to elevated social positions in the Shimao communities. In summary, our analyses highlight the utility of extensive genomic sampling to reveal detailed patterns of prehistoric social organization. Tracking how burial goods such as weapons, pottery, as well as animal and human sacrifices, follow the pedigrees has also clarified patterns of wealth inheritance and class differentiation within an early East Asian political power centre and its satellite settlements. Situating these findings within a broader cultural and symbolic framework enriches our understanding of the ancient ritualistic practices and social dynamics of Shimao.
Methods
Ethics and inclusion statement
Permission to access the ancient DNA in the human remains in this study was granted by the archaeological teams that led the excavations, from Shaanxi Academy of Archaeology, Institute of Archaeology (Chinese Academy of Social Sciences) and Archaeology Institute of National Museum of China. The Institutional Review Board of the Chinese Academy of Sciences, Institute of Vertebrate Paleontology and Paleoanthropology provided further monitoring and permission for the sampling of ancient humans in this research. All the work was done in collaboration with local archaeologists, who were named co-authors for their contributions to the collection of material and archaeological information, such as on-site photographs, classification of high-level tombs and/or discussions that contributed to the associations derived from the archaeological research cited in this study. All wet laboratory work and data analysis were performed with equipment from the Molecular Paleontology Laboratory, Institute of Vertebrate Paleontology and Paleoanthropology, Chinese Academy of Sciences.
Ancient DNA experiments and sequencing
We sampled and sequenced 207 human remains from Shaanxi and Shanxi provinces, China, among which 169 individuals were analysed in this study (Supplementary Table 1). All extraction, sequencing and data processing of ancient human samples were carried out in dedicated laboratories at the Institute of Vertebrate Paleontology and Paleoanthropology, Chinese Academy of Sciences, in Beijing. Following standard protocols47, DNA was extracted for each sample from less than 100 mg of bone powder, obtained through drilling. We prepared double-stranded libraries (denoted ‘DS’ in Supplementary Table 1) for 134 samples using a library protocol with partial uracil-DNA glycosylase (UDG) treatment (denoted ‘half UDG’)48,49 (Supplementary Table 1). For 35 samples, we prepared single-stranded libraries (denoted ‘SS’, Supplementary Table 1) with full UDG treatment (4 samples; denoted ‘UDG’, Supplementary Table 1) or no UDG treatment (31 samples; denoted ‘No’ in Supplementary Table 1). To collect enough DNA for capture, libraries were amplified for 35 cycles using the AccuPrime Pfx polymerase. We then evaluated the amount of DNA extracted per sample using a Thermo Scientific NanoDrop 2000 spectrophotometer. We applied a capture strategy to both mitochondrial and nuclear DNA. For mitochondrial DNA (mtDNA), we used oligonucleotide probes synthesized from the complete human mitochondrial genome50; for nuclear DNA, we applied oligonucleotide probes targeting 1.2 million SNPs (the ‘1,240k’ SNP panel). After enrichment, sequencing was performed on an Illumina MiSeq sequencing platform to generate 2 × 76 base pairs (bp) paired-end reads for the mtDNA and an Illumina HiSeq 4000 sequencing platform to generate 2 × 100 bp and 2 × 150 bp paired-end reads.
Read alignment and variant calling
We used leeHom51 to trim adaptors and merge paired-end reads into a single sequence (minimum overlap of 11 bp), keeping only merged reads with a length of at least 30 bp. Reads were aligned with BWA (v.0.5.10)52 using the bam2bam command with default parameters, except for samples with no UDG treatment, for which we used the parameters -n 0.01, -l 16500 and -o 2. We aligned the mtDNA reads to the revised Cambridge Reference Sequence53 and the nuclear DNA reads to the human reference genome hg19 (ref. 54). Duplicate reads with the same orientation, start and end positions were removed, and reads with a minimum mapping quality score of 30 were kept for analysis. The frequency of terminal C-to-T misincorporations was used to validate ancient DNA sequences, and contamination rates were estimated on the basis of two approaches. For all the individuals, we applied ContamMix55 to compare mtDNA fragments between the new consensus mitochondrial genomes with the present-day sequences50. To minimize the impact of damaged bases, we ignored the first and last five positions of the fragments during estimation. We treated the libraries as contaminated if the estimated contamination rate was greater than 5%. Contamination rates for men were also estimated using ANGSD56, leveraging the fact that men have one copy of the X chromosome, and verified using HapCon57, to improve the performance of low-coverage data. To keep enough individuals for further analysis, for 12 individuals with contamination above 5% (‘b’ annotated in the column of SNP number, Supplementary Table 1), we restricted our analysis to only damaged fragments with ancient DNA characteristics. The damaged fragments were obtained by pmdtools v.0.60 (ref. 58) with the --customterminus parameter, keeping fragments with at least one C → T substitution in the first three positions at each end. 
To eliminate the potential bias caused by the terminal deaminated cytosines, we masked 2 bp at the end of mapped reads for all the double-strand libraries with half UDG treatment, and 5 bp at the end of the reads were masked for all the single-strand libraries with no UDG treatment. To generate pseudo-haploid genotypes, heterozygote SNPs were randomly sampled to determine a single allele for the individual. During genotyping, the first and last 5 or 2 positions of the fragments were ignored for non-UDG-treated and UDG-treated libraries, respectively, and 13 poorly covered samples (with fewer than 27,000 SNPs) were removed.
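The end masking and pseudo-haploid sampling described above can be illustrated with a minimal Python sketch. The function names (`mask_read_ends`, `pseudo_haploid_call`) and the simple string and list inputs are hypothetical and chosen only for illustration; this is not the pipeline used in the study.

```python
import random


def mask_read_ends(bases, n_mask):
    """Replace the first and last n_mask bases of a read with 'N'.

    n_mask would be 2 for half-UDG double-stranded libraries and 5 for
    non-UDG single-stranded libraries, following the text.
    """
    if n_mask <= 0:
        return bases
    if len(bases) <= 2 * n_mask:
        # The whole read falls inside the masked ends.
        return "N" * len(bases)
    return "N" * n_mask + bases[n_mask:-n_mask] + "N" * n_mask


def pseudo_haploid_call(observed_alleles, rng):
    """Pick one observed allele at random to represent the individual.

    observed_alleles: alleles seen in the reads covering one SNP
    (hypothetical input format). Returns None when the site is uncovered.
    """
    if not observed_alleles:
        return None
    return rng.choice(observed_alleles)


if __name__ == "__main__":
    rng = random.Random(42)  # fixed seed so the sketch is reproducible
    print(mask_read_ends("ACGTACGTAC", 2))
    print(pseudo_haploid_call(["A", "G", "A"], rng))
```

Sampling a single allele per site avoids calling diploid genotypes from low-coverage data, at the cost of discarding heterozygosity information.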
Uniparental haplogroup identification
Mitochondrial sequences for each individual were mapped to the revised Cambridge Reference Sequence59. We only kept reads of a minimum of 30 bp in length and with a minimum mapping quality of 30. Haplogroups for each individual were called using HaploGrep2 (ref. 60) based on PhyloTree Build v.17 (ref. 59). We also confirmed all the haplogroups using the phylogenetic tree constructed with mtphyl v.5.003 and found that two individuals with an R# haplogroup (R + 16189)27 were assigned to a subclade of haplogroup B4c1a because of the 9-bp deletion (8281–8289). In comparison, four individuals with B haplogroups were assigned to the ancestral haplogroup with the best fit (that is, B4a, B4b1 and B4c1b). For the male individuals, Y-chromosome haplogroups were determined by identifying the assigned position in the phylogenetic tree on the basis of the International Society of Genetic Genealogy dataset version 9.77 (www.isogg.org/tree). In cases in which the most derived allele upstream of the Y-chromosome was a C to T or G to A substitution, indicative of possible deamination, at least two derived alleles were required to assign the Y-chromosome haplogroup. Otherwise, the tested individual was assigned to the ancestral haplogroup. When the subclade of the haplogroup assignment could not be determined, the individual was assigned to the most recent ancestral haplogroup that they best fit (for example, No).
Population structure analysis
We conducted a principal component analysis (PCA) with smartpca in the EIGENSOFT package61. To calculate the principal components, we used 82 present-day populations from the Affymetrix Human Origins dataset62. We merged newly sequenced and published ancient individuals with the Human Origins dataset and projected them using the following program settings: ‘lsqproject: YES’, ‘numoutlieriter: 0’ and ‘shrinkmode: YES’. Newly sequenced or previously published ancient individuals were projected onto the principal components calculated based on present-day Eurasians (Fig. 1c) or only the East Asians (Fig. 1d). We estimated individual ancestries by model-based maximum likelihood clustering using ADMIXTURE63. We used 44 of the 82 populations used in the PCA, along with 10 present-day Han and Tibetan populations from ref. 64. Before the admixture analysis, we pruned SNPs in high linkage disequilibrium (r2 > 0.4) using PLINK (v.1.90)65 with the parameters ‘--indep-pairwise 200 25 0.4’, leaving 597,573 SNPs. ADMIXTURE analysis was conducted with K from 2 to 10. For each K, we ran the analyses ten times with different seeds to estimate the cross-validation error, and the best K was determined according to the lowest cross-validation error.
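As a small illustration of the final selection step, the following Python sketch picks the K with the lowest cross-validation error across replicate runs. Averaging the replicate errors for each K is an assumption made here; the text does not state how the ten runs were summarized. The function name `best_k` and the dictionary input are hypothetical.

```python
from statistics import mean


def best_k(cv_errors):
    """Return the K whose replicate runs have the lowest mean CV error.

    cv_errors: dict mapping K (int) to a list of cross-validation errors,
    one per replicate run (hypothetical input format).
    Averaging over replicates is an assumption of this sketch.
    """
    if not cv_errors:
        raise ValueError("no cross-validation errors supplied")
    mean_cv = {k: mean(errors) for k, errors in cv_errors.items()}
    return min(mean_cv, key=mean_cv.get)
```

In practice the per-run errors would be read from the ADMIXTURE log files produced for each K and seed.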
Relatedness analyses
Kinship patterns among the samples from Shaanxi and Shanxi provinces were analysed using READ v.2 (ref. 66) to determine the degree of kinship, and confirmed by lcMLkin67. Further connections of the pedigrees or individuals were investigated using ancIBD68. For Huangchengtai samples, we additionally applied a third approach, the hidden Markov model-based KIN69, to test all kinships within the second degree. After filtering 22 genetically identical individuals, we found 25 kinship pairs with high confidence (Supplementary Table 3), consisting of 8 first-degree pairs (1 full-sibling pair and 7 parent–offspring pairs) and 17 second-degree pairs (Supplementary Table 3). For the READ analysis, a genome-wide approach was applied to calculate a single value (P0) across all sites without splitting the genome into windows, and the average P0 was then normalized by the median of all average pairwise P0 across all samples. To estimate standard errors, a block-jackknife approach was applied with blocks of 50 megabase pairs (Mbp). For the differentiation between parent–offspring and siblings, a different window size of 20 Mbp was used. We ran separate analyses for three groups: all Middle Neolithic Shaanxi samples, all Late Neolithic Shaanxi samples and all Shanxi samples. For subsequent genetic analyses, we used unrelated individuals, that is, those with no first-degree or second-degree kinship as estimated by READ.
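The normalization step described for READ can be sketched as follows. This is a simplified illustration under stated assumptions, not the READ v.2 implementation: P0 is taken here as the proportion of non-matching pseudo-haploid alleles over sites called in both individuals, genotypes are hypothetical lists with `None` for missing calls, and the block jackknife is omitted.

```python
from itertools import combinations
from statistics import median


def pairwise_p0(geno_a, geno_b):
    """Proportion of non-matching alleles over sites called in both samples."""
    shared = [(a, b) for a, b in zip(geno_a, geno_b)
              if a is not None and b is not None]
    if not shared:
        return None
    return sum(a != b for a, b in shared) / len(shared)


def normalized_p0(genotypes):
    """Normalize each pair's P0 by the median P0 across all pairs.

    genotypes: dict mapping sample name to a list of pseudo-haploid alleles
    (hypothetical input format). The median is used as a proxy for the
    value expected between unrelated individuals.
    """
    raw = {}
    for s1, s2 in combinations(sorted(genotypes), 2):
        p0 = pairwise_p0(genotypes[s1], genotypes[s2])
        if p0 is not None:
            raw[(s1, s2)] = p0
    if not raw:
        raise ValueError("no comparable pairs")
    med = median(raw.values())
    if med == 0:
        raise ValueError("median P0 is zero; cannot normalize")
    return {pair: p0 / med for pair, p0 in raw.items()}
```

Pairs with a normalized value well below 1 share more alleles than the typical pair and are candidates for close kinship; the classification cut-offs themselves are left to the tool.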
We also applied lcMLkin, a genotype-likelihood-based method for determining kinship that accounts for the inaccuracy of genotype calling when sequence coverage is low. lcMLkin outputs the estimated probability of two diploid individuals sharing zero (k0), one (k1) or two (k2) alleles that are identical by descent (IBD) and calculates the combined kinship coefficient by the equation: r = k1/2 + k2. The kinship categories (for example, identical twins or self, parent–offspring, full siblings, second-degree and unrelated) were determined by comparing with the theoretical expectation for k0, k1 and k2 (ref. 70). The method requires a set of SNPs with minor allele frequency higher than 5% that are not in linkage disequilibrium with each other. SNPs with allele frequency lower than 5% among all present-day East Asians from the Simons Genome Diversity Panel dataset71 were removed, and the resulting data were pruned for linkage disequilibrium using PLINK, with the parameter ‘--indep-pairwise 200 25 0.5’, which resulted in 135,642 SNPs available for the downstream analysis. We called genotype likelihoods at these SNP sites for our ancient individuals using the script SNPbam2vcf.py available with lcMLkin and estimated their biological relatedness using lcMLkin (Supplementary Table 3). When two samples from the same burial site were identified as identical twins or the same individual, the sample with the lower coverage was removed from analysis. For samples with a first-degree or second-degree familial relationship, we only retained genome-wide data for the individual who had the higher coverage and no first-degree or second-degree relationships with the remaining individuals in the same site for further downstream genetic analysis. This resulted in 25 samples being excluded from the population analyses.
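The relationship between the IBD-sharing probabilities and the kinship coefficient can be made concrete with a short Python sketch. The expected (k0, k1, k2) values are the standard theoretical ones for each category; assigning a pair to the nearest category by squared distance is an assumption made here for illustration and is not necessarily how the comparison was performed in the study.

```python
# Theoretical IBD-sharing probabilities (k0, k1, k2) for each category.
EXPECTED_K = {
    "identical twins or self": (0.0, 0.0, 1.0),
    "parent-offspring": (0.0, 1.0, 0.0),
    "full siblings": (0.25, 0.5, 0.25),
    "second-degree": (0.5, 0.5, 0.0),
    "unrelated": (1.0, 0.0, 0.0),
}


def kinship_coefficient(k1, k2):
    """Combined coefficient r = k1/2 + k2, as given in the text."""
    return k1 / 2 + k2


def nearest_category(k0, k1, k2):
    """Return the category whose expected (k0, k1, k2) is closest.

    Uses squared Euclidean distance; this decision rule is an
    illustrative assumption.
    """
    observed = (k0, k1, k2)
    return min(
        EXPECTED_K,
        key=lambda cat: sum((o - e) ** 2
                            for o, e in zip(observed, EXPECTED_K[cat])),
    )
```

Parent–offspring and full-sibling pairs both give r = 0.5, which is why the individual k values, and not r alone, are needed to tell them apart.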
For the shared IBD analysis, genomes from individuals with coverage above 0.5× were imputed by GLIMPSE2 (refs. 72,73) setting parameters for quality control in phasing as --mapq 30 and --baseq 30. The 1000 Genomes Phase 3 dataset was used as a reference panel and was downloaded at http://ftp.1000genomes.ebi.ac.uk/vol1/ftp/release/20130502/. To obtain the IBD sharing information, imputed files were merged by chromosomes and analysed by the software ancIBD68 with suggested parameters, including genotype posterior probabilities higher than 0.99, gap maximum at 0.0075 and IBD blocks with more than 220 SNPs. The results of IBD sharing between pairs of individuals are recorded in Supplementary Table 12. We focused on IBD connections between individual pairs both having coverages above 1×. We also provided the results for IBD connections when the coverage of either of the individuals in a pair is between 0.5 and 1×, but denoted them as ‘low confidence’. To confirm the within-second-degree kinship relationships estimated by READ, lcMLkin and KIN (Supplementary Table 3 and Supplementary Figs. 16 and 17a), we counted and plotted the sum and number of IBD segments longer than 12 cM, which could be applied to infer kinship relationships68. We also depicted IBD segments that were longer than 8 cM (Supplementary Figs. 17b–d) in karyotype plots for these individuals who shared within-second-degree kinships.
To further infer distant kinships beyond the second degree (denoted as third-degree to fifth-degree kinships in Fig. 3, Extended Data Fig. 3 and Supplementary Figs. 18–21), we counted and plotted the length distribution and the karyotype of shared IBD segments longer than 8 cM. The kinship relations were determined based on the goodness of fit between the observed and the expected curves of various kinship categories, denoted in the top right of each karyotype plot (Supplementary Figs. 17–21). Individual pairs within each site with IBD sharing shown in figures were denoted as possibly having third-degree to fifth-degree kinship relationships (Figs. 3 and 4 and Supplementary Figs. 17–21). Individual pairs across sites sharing single IBD segments of at least 25 cM in length were also counted (Extended Data Fig. 3). We also compared the sum and number of IBD segments longer than 8 cM or 12 cM between individuals with different social status (that is, tomb owners and sacrificed victims) at different sites (Supplementary Table 13 and Extended Data Fig. 3). We only included individuals with an average SNP panel coverage of at least 1×. Only one individual was considered for pairs identified as genetically identical.
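The summary statistics used in the IBD comparisons (sum and number of segments above a length threshold, and the presence of a single long segment) can be sketched in Python as below. The function names and the plain list of segment lengths in cM are hypothetical; in practice the lengths would come from the ancIBD output for a pair of individuals.

```python
def summarize_ibd(segment_lengths_cm, min_length_cm):
    """Return (total length, count) of IBD segments longer than a threshold.

    segment_lengths_cm: lengths in cM of the IBD segments shared by one
    pair of individuals (hypothetical input format).
    """
    kept = [length for length in segment_lengths_cm if length > min_length_cm]
    return sum(kept), len(kept)


def shares_long_segment(segment_lengths_cm, min_length_cm=25.0):
    """True if the pair shares at least one segment of min_length_cm or more."""
    return any(length >= min_length_cm for length in segment_lengths_cm)
```

Calling `summarize_ibd` with thresholds of 8 and 12 gives the two summaries described in the text for the same pair.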
ROH
Close-kin mating and kinship-based mating systems can be reflected in runs of homozygosity (ROH), which are long stretches of homozygous segments along the genome of an individual. For 171 individuals with SNP counts of at least 29,000, we applied hapROH74, which detects ROH in low-coverage ancient DNA data using haplotype information from a modern phased reference panel. We detected ROH in four length classes (4–8 cM, 8–12 cM, 12–20 cM and more than 20 cM). Results for individuals with SNP counts of more than 400,000 were recorded as ‘high confidence’, whereas the others were recorded as ‘low confidence’ (Supplementary Table 14). Observed and expected ROH blocks above 4 cM were plotted using hapROH.
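The binning of ROH into the four length classes and the confidence labelling can be sketched as follows. This is an illustration only and does not call hapROH; the function names are hypothetical, and treating each class as a half-open interval (lower bound included, upper bound excluded) is an assumption, because the text does not specify how boundary values are handled.

```python
# Length classes in cM: (label, lower bound inclusive, upper bound exclusive).
ROH_CLASSES = [
    ("4-8 cM", 4.0, 8.0),
    ("8-12 cM", 8.0, 12.0),
    ("12-20 cM", 12.0, 20.0),
    (">20 cM", 20.0, float("inf")),
]


def sum_roh_by_class(roh_lengths_cm):
    """Sum ROH lengths (cM) within each length class for one individual."""
    totals = {label: 0.0 for label, _, _ in ROH_CLASSES}
    for length in roh_lengths_cm:
        for label, low, high in ROH_CLASSES:
            if low <= length < high:
                totals[label] += length
                break  # segments shorter than 4 cM match no class
    return totals


def confidence_label(n_snps, threshold=400_000):
    """Label results by SNP count, following the cut-off in the text."""
    return "high confidence" if n_snps > threshold else "low confidence"
```

A large total in the longest classes would point to recent close-kin unions, whereas many short ROH are more typical of a small effective population size.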
Genetic clustering among new samples
Samples were grouped using a combination of D statistics and PCA. We calculated the D statistics in form D(Sample 1, Sample 2; Population, Mbuti) for each pair of samples. Here, we took Mbuti as the outgroup. Sample 1 or sample 2 are two unrelated individuals from each archaeological site. The 45 individuals and populations (both ancient and present-day) making up the population in the above statistic are grouped as follows:
aNorthEA(14): AR_EN, Bianbian, DevilsCave_N, Chokhopani, HMMH_MN, Kolyma, Mebrak, Mongolia_N_East, Okunevo_EMBA, Shamanka_EN, WLR_MN, WLR_LN, WLR_BA and Yumin.
aSouthEA(7): Liangdao1, Liangdao2, Longlin, Man_Bac, Nui_Nap, Qihe and Qihe3.
DeepAsia(9): G1, Jomon, Malta1, Onge, Papuan, Tianyuan, USR1, Vanuatu and Yana.
aWest/SouthAsia(15): Afanasievo, Anatolia_N, Botai_CA, Ganj_Dareh_N, IndusPeriphery, Hajji_Firuz_C, Harappan, Iran_N, Karelia, Kotias, Kostenki14, Shahr_I_Sokhta_BA3, Ust-Ishim, Vestonice16 and Yamnaya_Kalmykia.SG.
If the samples can be grouped in the same genetic cluster, we predict that D ≈ 0 (|Z score| < 3) for most populations. Samples that deviated from this expectation within groups, or were found to be outliers in the PCA, were separated from the main group. Results were only considered from sample pairs having at least 25,000 overlapping SNPs. We summarize the matrix for the count of significant pairwise D statistics for each archaeological site in Supplementary Table 2. The detailed genetic grouping is listed in Supplementary Table 1 and the PCA clustering is presented in Fig. 1. The 18 populations defined in the text are denoted and highlighted in grey in Supplementary Table 4.
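The counting of significant pairwise D statistics can be sketched in Python as below. The thresholds (|Z| of 3 and at least 25,000 overlapping SNPs) follow the text; the function name and the input format, a list of (Z score, number of overlapping SNPs) tuples for one sample pair across the comparison populations, are hypothetical.

```python
def count_significant(results, z_threshold=3.0, min_snps=25_000):
    """Count significant D statistics for one pair of samples.

    results: list of (z_score, n_overlapping_snps) tuples, one per
    comparison population (hypothetical input format).
    Returns (number significant, number of usable tests).
    """
    usable = [z for z, n_snps in results if n_snps >= min_snps]
    n_significant = sum(1 for z in usable if abs(z) >= z_threshold)
    return n_significant, len(usable)
```

A pair with no or very few significant tests behaves as a single genetic cluster with respect to the comparison populations.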
Outgroup f3 and D statistics
We calculated f3 statistics using qp3Pop (v.412) with the form f3(Population X, Population Y; Mbuti), measuring the shared genetic drift between all combinations of populations relative to the outgroup. We used the present-day central African population, Mbuti, as the outgroup and compared newly sequenced and previously published ancient populations within or outside East Asia (Extended Data Fig. 1). The higher the f3 statistic, the more genetic drift (that is, the greater the genetic similarity) two populations share relative to Mbuti. We calculated D statistics using qpDstat (v.712) with the form D(Population X, Population Y; Population Z, Mbuti), measuring allele sharing between all combinations of grouped new populations and a diverse array of previously published ancient and present-day populations (Supplementary Tables 4, 5 and 7–11). A negative D statistic means that population Y shares more alleles with population Z than population X does. A positive D statistic means that population X shares more alleles with population Z than population Y does. Both the outgroup f3 and D statistics were calculated using AdmixTools75.
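The sign convention of the D statistic can be illustrated with a minimal Python sketch based on the standard allele-frequency estimator. This is a simplified illustration, not the qpDstat implementation: it takes hypothetical per-SNP allele frequencies as plain lists, ignores sample-size corrections and does not compute the block-jackknife standard error or Z score.

```python
def d_statistic(p_x, p_y, p_z, p_o):
    """Estimate D(X, Y; Z, O) from per-SNP allele frequencies.

    p_x, p_y, p_z, p_o: equal-length lists of allele frequencies in
    populations X, Y, Z and the outgroup O (hypothetical input format).
    Positive values indicate excess allele sharing between X and Z;
    negative values indicate excess sharing between Y and Z.
    """
    numerator = 0.0
    denominator = 0.0
    for x, y, z, o in zip(p_x, p_y, p_z, p_o):
        numerator += (x - y) * (z - o)
        denominator += (x + y - 2 * x * y) * (z + o - 2 * z * o)
    if denominator == 0:
        raise ValueError("no informative sites")
    return numerator / denominator
```

Swapping the first two populations flips the sign of the statistic, which matches the interpretation of positive and negative values given above.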
Admixture modelling with qpAdm
The ancestry proportions of ancient populations were estimated using qpAdm (v.634) in the AdmixTools package, modelling one, two or three different sources. Distal and proximal modelling were used to model the ancestry of target populations; the two modelling types differ in the relative age of the source populations. Distal modelling considers older source populations with larger genetic distance (Yumin, sEastAsia_EN, Xitoucun, Man_Bac, Coastal_nEastAsia_EN, Xingyi_EN, AR14K, YR_MN, Wuzhuangguoliang and WLR_MN) and proximal modelling looks at younger source populations with closer genetic distances (Yumin, Wuzhuangguoliang and YR_MN), which was applied for the outlier modelling (sEastAsia_EN = Qihe2 and Liangdao2; Coastal_nEastAsia_EN = Bianbian, Boshan and Xiaogao)32. Both model types used the same set of reference populations, which include Mota, Ust-Ishim, Kostenki14, Iran_N, IndusPeriphery, LBK_EN, Motala12, Kotias, AR33K, Yana and Karelia (IndusPeriphery = Shahr_I_Sokhta_BA2 and Shahr_I_Sokhta_BA3 (5–4 ka) from Shahr-i-Sokhta in Iran, and Gonur Depe (Gonur2_BA) (roughly 4 ka) from the Bactria-Margiana Archaeological Complex in Turkmenistan)32,76. For the subgroups (Shimao_HJGD3 and Zhoujiazhuang3) potentially having connections with Western or Central Asian populations, we considered extra sources (Anatolia_N, Afanasievo, AYTH, Saidu_Sharif_H, Kashkarchi_BA, Zevakinskiy_LBA, IndusPeriphery, Bustan_BA, Botai_CA, Satsurblia, Gonur1_BA, Dzharkutan1_BA, Gogdara_IA, Loebanr_IA, Shahr_I_Sokhta_BA1, LaBrana1, Levant_N, Stuttgart and Loschbour) along with the source populations in the distal model described above, and with the same reference populations but excluding Iran_N and IndusPeriphery. A ‘rotating’ scheme77 of source and reference populations was used for distal and proximal modelling. Wuzhuangguoliang and the extra sources were used only as source populations.
As the genetic make-up of the proximal sources is too close to compete for the most optimal model, on the basis of the determined sources (Wuzhuangguoliang or further sources) from distal modelling, we fixed them as the source populations. The other parameters we adopted for qpAdm modelling are: ‘allsnps: YES’, ‘details: YES’ and ‘summary: YES’. The model was deemed plausible if the tail probability of rank0 is above 0.05 and the estimated admixture proportions are between 0 and 1. A valid model with more source populations was considered only when fewer sources were rejected, and results with the lower number of sources were marked as the high-confidence model in Supplementary Table 6.
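The model acceptance criteria described above can be expressed as a short Python sketch. It only filters results that qpAdm has already produced; the function names and the dictionary layout of a model (keys `sources`, `p_value` and `proportions`) are hypothetical, and treating the bounds 0 and 1 as inclusive is an assumption.

```python
def is_plausible(p_value, proportions, alpha=0.05):
    """A model is plausible if the rank0 tail probability exceeds alpha
    and every estimated admixture proportion lies between 0 and 1."""
    return p_value > alpha and all(0.0 <= w <= 1.0 for w in proportions)


def high_confidence_models(models):
    """Keep plausible models that use the fewest source populations.

    models: list of dicts with keys 'sources' (tuple of names),
    'p_value' (float) and 'proportions' (tuple of floats); this layout
    is a hypothetical convention for the sketch.
    """
    plausible = [m for m in models
                 if is_plausible(m["p_value"], m["proportions"])]
    if not plausible:
        return []
    fewest = min(len(m["sources"]) for m in plausible)
    return [m for m in plausible if len(m["sources"]) == fewest]
```

Models with more sources are therefore returned only when every model with fewer sources has been rejected, mirroring the rule stated in the text.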
We observed two-way admixture models for Shimao_HJGD3, including 87–93% Yellow River Yangshao ancestry (represented by Wuzhuangguoliang or YR_MN) with an extra 7–13% ancestry component from diverse non-East Asian ancestries, represented mainly by Iranian farmer-related and Western Hunter-gatherer-related ancestry. Zhoujiazhuang3 could still be modelled as having one source from either Coastal_nEastAsia_EN or Wuzhuangguoliang (Supplementary Table 6). To get a broader source for the Wuzhuangguoliang population, we also conducted further qpAdm testing by swapping AR14K with the later ARpost9K population, and we found that ARpost9K could better represent the AR14K and Yumin ancestry components for Wuzhuangguoliang ancestry.
Modelling of Wuzhuangguoliang ancestry
To further understand the composition of ancestry sources for Wuzhuangguoliang, we simulated the admixed populations with 20 replicates to better represent the potential ancestries with admixture proportions from 0% to 100% in increments of 1% following the method implemented in ref. 78. The measurement is based on f4 statistics in the form of f4(Wuzhuangguoliang, simulated mixture populations; nEA/outEA/steppe/AR-related, Mbuti) with a two-sourced mixture of Early Neolithic Shandong including Bianbian, Boshan and Xiaogao subpopulations with YR farming ancestry (YR_MN), or a three-sourced mixture by adding the third group with Yumin or Amur River ancestry (AR14K and ARpost9K) in comparison to nEA (Miaozigou_MN, WLR_MN and AR-related such as AR14K, AR19K, ARpost9K and DevilsCave_N), outside East Asian (outEA = Shamanka_EN, Lokomotiv_EN and Loschbour), steppe-related (Mongolia_N_East, Mongolia_N_North and Yumin) and Tibetans (Shannan2k and Zongri5.1k). The ancestral component selection is mainly based on the valid models from the qpAdm analysis. For the proportional simulation of the ternary simulated population, we first use the admixed population from YR and Coastal_nEA_EN/ARpost9K as the first admixed component, varying proportionally from 0% to 100%, and then add the second component of ARpost9K, Yumin or WLR_MN proportionally from 0% to 100%. The ranges of proportion thresholds that marked a plausible range of the second or third admixed ancestry are estimated by qpAdm modelling (Supplementary Table 6) with the source proportion ± standard error rate. The past models of admixed ancestries of Wuzhuangguoliang could be explained by Wuzhuangguoliang harbouring an ancestry that contains more affinity with Neolithic Shandong, Amur River, West Liao River or Tibetan ancestries. 
However, none of the specific nEA groups served as a good proxy for this unknown ancestry, because adding them as the second or third source was still insufficient to model Wuzhuangguoliang ancestry (Supplementary Figs. 12–15).
Demographic modelling with qpGraph
We applied the qpGraph and findGraphs functions from the AdmixTools and AdmixTools2 packages79, respectively. First, we followed a general approach for modelling admixture graphs using qpGraph (v.6065) in the AdmixTools package, which started with a basic and well-understood tree (including the central African Mbuti as an outgroup, the early western Eurasian Ust-Ishim and early East Asian Tianyuan). We then added extra populations (sEastAsia_EN, YR_MN, Wuzhuangguoliang, Shimao_HJGD1, Yumin, Wuzhuangguoliang_o and Xinhua_o) one at a time in their best-fitting positions iteratively32. An optimum tree model could be constructed based on the observed f statistics (f2, f3 and f4 for all possible pairs of populations). We required a |Z| score of less than 3 between the observed and expected values (determined by the block jackknife) to accept the model. A small number (0.0001) was added to the diagonal entries of the estimated covariance matrix of the f statistics (Q matrix) to stabilize the matrix inversion. The qpGraph program was run with the following recommended parameters80: ‘outpop: Mbuti, blgsize: 0.05, lsqmode: YES, diag: 0.0001, hires: YES, initmix: 1000, precision: 0.0001, zthresh: 0, terse: NO, useallsnps: NO’. To avoid missing unexplored graph models, we first carried out fully automated graph exploration using findGraphs, allowing 0 to 8 admixture events to occur in 100 algorithm iterations (Stop_gen = 100). Next, we constrained deeply diverged populations (Mbuti, Tianyuan, sEastAsia_EN, Ust-Ishim), assuming they are non-admixed, with 100 algorithm iterations and admixture events set from 0 to 3. Here, sEastAsia_EN includes Qihe2 and Liangdao2, and all the denoted populations are similar to those in the admixture graph produced by qpGraph in AdmixTools. Graphs of best-fitting models are shown in Fig. 2 and Supplementary Figs. 8 and 9.
Treemix analysis
The phylogenetic relationships among populations were estimated using Treemix v.1.13 (ref. 81). The dataset comprised the newly sequenced cohort, divided into five groups based on their genetic and spatiotemporal characteristics (preShimao_5k, Shimao_4k, Taosi_4k, Shimao_4k_o and preShimao_5k_o; denoted in Supplementary Table 1), and previously published cohorts including DevilsCave_N, WLR_MN, WLR_LN, Coastal_nEastAsia_EN, Yumin, AR19K, sEastAsia_LN, sEastAsia_EN, Tianyuan, Yana and Ust-Ishim. The maximum likelihood tree was rooted by Mbuti (-root Mbuti) and linkage disequilibrium was compensated for by grouping sites in blocks of 500 SNPs (-k 500). A round of global rearrangements and no sample size correction were used with the parameters ‘-global -noss’, allowing 0 to 7 migration events (-m 0–7). For m = 2, we ran 1,000 bootstraps for the maximum likelihood tree (-bootstrap), then assessed the 1,000 bootstrapped trees in phylip, using the consense command82 to count the number of times a particular branch or cluster of populations appeared in the maximum likelihood tree with migration. Results are shown in Fig. 2. The inferred maximum likelihood trees with 0 to 7 migration events and the corresponding residuals (Supplementary Fig. 6) were visualized with an R script from Treemix v.1.13.
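The Treemix settings above correspond to one run per migration count, plus repeated bootstrap runs at m = 2. The following Python sketch only assembles the command lines and does not execute them. The input file name, output prefixes and helper names are hypothetical, and the flags are those quoted in the text together with Treemix's standard -i (input) and -o (output prefix) options.

```python
def treemix_command(input_path, out_prefix, migrations, bootstrap=False):
    """Build one Treemix command as an argument list.

    Flags mirror the settings described in the text: Mbuti root, blocks of
    500 SNPs, global rearrangements and no sample size correction.
    """
    cmd = [
        "treemix",
        "-i", input_path,
        "-o", out_prefix,
        "-root", "Mbuti",
        "-k", "500",
        "-global",
        "-noss",
        "-m", str(migrations),
    ]
    if bootstrap:
        cmd.append("-bootstrap")
    return cmd


def migration_sweep(input_path, out_dir="treemix_out"):
    """One command per migration count, m = 0 to 7 (hypothetical out_dir)."""
    return [
        treemix_command(input_path, f"{out_dir}/m{m}", m)
        for m in range(0, 8)
    ]


def bootstrap_replicates(input_path, replicates=1000, migrations=2,
                         out_dir="treemix_boot"):
    """Commands for the bootstrap replicates at a fixed migration count."""
    return [
        treemix_command(input_path, f"{out_dir}/m{migrations}_rep{r}",
                        migrations, bootstrap=True)
        for r in range(1, replicates + 1)
    ]


if __name__ == "__main__":
    # "shimao.treemix.frq.gz" is a placeholder input name, not from the paper.
    for cmd in migration_sweep("shimao.treemix.frq.gz"):
        print(" ".join(cmd))
```

Each argument list could be passed to `subprocess.run`; the resulting bootstrap trees would then be summarized separately, for example with the consense program mentioned in the text.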
Reporting summary
Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.
Data availability
The raw sequencing reads, aligned BAM files and genotypes for the newly sequenced individuals are available through the Genome Sequence Archive83 in the National Genomics Data Center84 at https://ngdc.cncb.ac.cn/gsa-human/ (accession no. PRJCA028402), and the pseudo-haploid genotype data (Eigenstrat format) are available through OMIX, from the China National Center for Bioinformation/Beijing Institute of Genomics, Chinese Academy of Sciences at https://ngdc.cncb.ac.cn/omix/ (accession no. PRJCA028402). Mitochondrial DNA sequences (FASTA format) have been deposited in GenBase85 in the National Genomics Data Center84, Beijing Institute of Genomics, Chinese Academy of Sciences/China National Center for Bioinformation, at https://ngdc.cncb.ac.cn/genbase (accession no. PRJCA028402), or were published in ref. 27 and are available at https://bigd.big.ac.cn/gwh/ (accession no. PRJCA009290). All software used is available online with open access. Human reference genome hg19 is available through the National Center for Biotechnology Information under accession number PRJNA31257. The previously reported ancient DNA datasets used in this study are available through the Allen Ancient DNA Resource v.62.0 at https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/FFIDCW. The worldwide base maps of land, ocean and rivers are available through WorldClim2 (ref. 46) and Natural Earth at https://www.naturalearthdata.com/.
References
Sun, Z. et al. Defining the Shimao Culture: an integrated analysis of nomenclature, territorial scope, and chronological framework. Archaeology 8, 101–108 (2020).
Sun, Z. et al. The first Neolithic urban center on China’s north Loess Plateau: the rise and fall of Shimao. Archaeol. Res. Asia 14, 33–45 (2018).
Jaang, L., Sun, Z., Shao, J. & Li, M. When peripheries were centres: a preliminary study of the Shimao-centred polity in the loess highland, China. Antiquity 92, 1008–1022 (2018).
Rawson, J. Shimao and Erlitou: new perspectives on the origins of the bronze industry in central China. Antiquity 91, e5 (2017).
Guo, Q. & Sun, Z. The East Gate of Shimao: an architectural interpretation. Archaeol. Res. Asia 14, 61–70 (2018).
Owlett, T. E., Hu, S., Sun, Z. & Shao, J. Food between the country and the city: the politics of food production at Shimao and Zhaimaoliang in the Ordos Region, northern China. Archaeol. Res. Asia 14, 46–60 (2018).
Sun, Z. et al. The Shimao site in Shenmu County, Shaanxi. Archaeology 7, 15–24 (2013).
Han, Q. Critical issues in the Huangchengtai Cemetery of the Shimao Site. Archaeol. Cult. Relics 11, 69–77 (2024).
Pei, X. A preliminary study of burial practices in the Shimao Culture. Archaeol. Cult. Relics 2, 78–85 (2022).
He, N. Taosi: an archaeological example of urbanization as a political center in prehistoric China. Archaeol. Res. Asia 14, 20–32 (2018).
Zhao, Y. The Liangzhu: an exemplar of ancient Chinese civilization. Cult. Relics South. China 1, 69–76 (2018).
Zhang, H. The secular kingship of the Early Erlitou State. J. Natl Mus. Chin. Hist. 11, 15–28 (2023).
Xiao, J., Shang, Z., Zhang, Z., Xiao, S. & Jia, X. A preliminary study on the mechanism of the Liangzhu culture’s migration across the Yangtze river. Front. Earth Sci. 11, 1121469 (2023).
Liu, L. State emergence in early China. Annu. Rev. Anthropol. 38, 217–232 (2009).
Liu, L. & Chen, X. Archaeology of China: From the Late Paleolithic to the Early Bronze Age (Cambridge Univ. Press, 2012).
Shao, J. A comparative study of Shimao and Taosi. Archaeology 5, 65–77 (2020).
Gnecchi-Ruscone, G. A. et al. Network of large pedigrees reveals social practices of Avar communities. Nature 629, 376–383 (2024).
Villalba-Mouco, V. et al. Kinship practices in the early state El Argar society from Bronze Age Iberia. Sci. Rep. 12, 22415 (2022).
Blöcher, J. et al. Descent, marriage, and residence practices of a 3,800-year-old pastoral community in Central Eurasia. Proc. Natl Acad. Sci. USA 120, e2303574120 (2023).
Rivollat, M. et al. Extensive pedigrees reveal the social organization of a Neolithic community. Nature 620, 600–606 (2023).
Mittnik, A. et al. Kinship-based social inequality in Bronze Age Europe. Science 366, 731–734 (2019).
Kennett, D. J. et al. Archaeogenomic evidence reveals prehistoric matrilineal dynasty. Nat. Commun. 8, 14115 (2017).
Žegarac, A. et al. Ancient genomes provide insights into family structure and the heredity of social status in the early Bronze Age of southeastern Europe. Sci. Rep. 11, 10072 (2021).
Gretzinger, J. et al. Evidence for dynastic succession among early Celtic elites in Central Europe. Nat. Hum. Behav. 8, 1467–1480 (2024).
Seersholm, F. V. et al. Repeated plague infections across six generations of Neolithic farmers. Nature 632, 114–121 (2024).
Wang, J. et al. Ancient DNA reveals a two-clanned matrilineal community in Neolithic China. Nature 643, 1304–1311 (2025).
Xue, J. et al. Ancient mitogenomes reveal the origins and genetic structure of the Neolithic Shimao population in northern China. Front. Genet. 13, 909267 (2022).
Wang, M. et al. Multiple human population movements and cultural dispersal events shaped the landscape of Chinese paternal heritage. Mol. Biol. Evol. 41, msae122 (2024).
Ning, C. et al. Ancient genomes from northern China suggest links between subsistence changes and human migration. Nat. Commun. 11, 2700 (2020).
Fu, Q. et al. An early modern human from Romania with a recent Neanderthal ancestor. Nature 524, 216–219 (2015).
Mao, X. W. et al. The deep population history of northern East Asia from the Late Pleistocene to the Holocene. Cell 184, 3256–3266 (2021).
Yang, M. A. et al. Ancient DNA indicates human population shifts and admixture in northern and southern China. Science 369, 282–288 (2020).
Wang, C. C. et al. Genomic insights into the formation of human populations in East Asia. Nature 591, 413–419 (2021).
Zou, Y. et al. Ancient genomes from the Yellow River Bend reveal long-distance population interactions between the Central Plains, Steppe, and southern China. Cell Rep. 44, 116034 (2025).
Miller, N. et al. The integration of plant and animal remains in the study of the agropastoral economy at Gordion, Turkey from food and fuel to farms and flocks. Curr. Anthropol. 50, 915–924 (2009).
Owlett, T. Finding greener pastures: the local development of Agro-pastoralism in the Ordos Region, North China. J. Indo-Pac. Archaeol. 40, 42–53 (2016).
Allentoft, M. E. et al. Population genomics of Bronze Age Eurasia. Nature 522, 167–172 (2015).
Haak, W. et al. Massive migration from the steppe was a source for Indo-European languages in Europe. Nature 522, 207–211 (2015).
Moussa, N. M. et al. Insights into Lake Baikal’s ancient populations based on genetic evidence from the Early Neolithic Shamanka II and Early Bronze Age Kurma XI cemeteries. Archaeol. Res. Asia 25, 100238 (2021).
Jeong, C. et al. A dynamic 6,000-year genetic history of Eurasia’s Eastern Steppe. Cell 183, 890–904 (2020).
Mathieson, I. et al. Genome-wide patterns of selection in 230 ancient Eurasians. Nature 528, 499–503 (2015).
de Barros Damgaard, P. et al. The first horse herders and the impact of early Bronze Age steppe expansions into Asia. Science 360, eaar7711 (2018).
Cassidy, L. M. et al. A dynastic elite in monumental Neolithic society. Nature 582, 384–388 (2020).
Knipper, C. et al. Female exogamy and gene pool diversification at the transition from the Final Neolithic to the Early Bronze Age in central Europe. Proc. Natl Acad. Sci. USA 114, 10083–10088 (2017).
Ensor, B. E. The Not Very Patrilocal European Neolithic: Strontium, aDNA, and Archaeological Kinship Analyses (Archaeopress, 2021).
Fick, S. E. & Hijmans, R. J. WorldClim 2: new 1 km spatial resolution climate surfaces for global land areas. Int. J. Climatol. 37, 4302–4315 (2017).
Dabney, J. et al. Complete mitochondrial genome sequence of a Middle Pleistocene cave bear reconstructed from ultrashort DNA fragments. Proc. Natl Acad. Sci. USA 110, 15758–15763 (2013).
Rohland, N., Harney, E., Mallick, S., Nordenfelt, S. & Reich, D. Partial uracil-DNA-glycosylase treatment for screening of ancient DNA. Philos. Trans. R. Soc. B Biol. Sci. 370, 11 (2015).
Meyer, M. & Kircher, M. Illumina sequencing library preparation for highly multiplexed target capture and sequencing. Cold Spring Harbor Protoc. 2010, pdb.prot5448 (2010).
Fu, Q. et al. DNA analysis of an early modern human from Tianyuan Cave, China. Proc. Natl Acad. Sci. USA 110, 2223–2227 (2013).
Renaud, G., Stenzel, U. & Kelso, J. leeHom: adaptor trimming and merging for Illumina sequencing reads. Nucleic Acids Res. 42, e141 (2014).
Li, H. & Durbin, R. Fast and accurate short read alignment with Burrows–Wheeler transform. Bioinformatics 25, 1754–1760 (2009).
Andrews, R. M. et al. Reanalysis and revision of the Cambridge reference sequence for human mitochondrial DNA. Nat. Genet. 23, 147–147 (1999).
Church, D. M. et al. Modernizing reference genome assemblies. PLoS Biol. 9, e1001091 (2011).
Fu, Q. et al. A revised timescale for human evolution based on ancient mitochondrial genomes. Curr. Biol. 23, 553–559 (2013).
Korneliussen, T. S., Albrechtsen, A. & Nielsen, R. ANGSD: analysis of next generation sequencing data. BMC Bioinf. 15, 356 (2014).
Huang, Y. & Ringbauer, H. hapCon: estimating contamination of ancient genomes by copying from reference haplotypes. Bioinformatics 38, 3768–3777 (2022).
Skoglund, P. et al. Separating endogenous ancient DNA from modern day contamination in a Siberian Neandertal. Proc. Natl Acad. Sci. USA 111, 2229–2234 (2014).
van Oven, M. & Kayser, M. Updated comprehensive phylogenetic tree of global human mitochondrial DNA variation. Hum. Mutat. 30, 386–394 (2009).
Weissensteiner, H. et al. HaploGrep 2: mitochondrial haplogroup classification in the era of high-throughput sequencing. Nucleic Acids Res. 44, 58–63 (2016).
Patterson, N., Price, A. L. & Reich, D. Population structure and eigenanalysis. PLoS Genet. 2, 2074–2093 (2006).
Lazaridis, I. et al. Ancient human genomes suggest three ancestral populations for present-day Europeans. Nature 513, 409–413 (2014).
Alexander, D. H., Novembre, J. & Lange, K. Fast model-based estimation of ancestry in unrelated individuals. Genome Res. 19, 1655–1664 (2009).
Lu, D. S. et al. Ancestral origins and genetic history of Tibetan highlanders. Am. J. Hum. Genet. 99, 580–594 (2016).
Purcell, S. et al. PLINK: a tool set for whole-genome association and population-based linkage analyses. Am. J. Hum. Genet. 81, 559–575 (2007).
Alaçamlı, E. et al. READv2: advanced and user-friendly detection of biological relatedness in archaeogenomics. Genome Biol. 25, 216 (2024).
Lipatov, M., Sanjeev, K., Patro, R. & Veeramah, K. R. Maximum likelihood estimation of biological relatedness from low coverage sequencing data. Preprint at bioRxiv https://doi.org/10.1101/023374 (2015).
Ringbauer, H. et al. Accurate detection of identity-by-descent segments in human ancient DNA. Nat. Genet. 56, 143–151 (2024).
Popli, D., Peyrégne, S. & Peter, B. M. KIN: a method to infer relatedness from low-coverage ancient DNA. Genome Biol. 24, 10 (2023).
Blouin, M. S. DNA-based methods for pedigree reconstruction and kinship analysis in natural populations. Trends Ecol. Evol. 18, 503–511 (2003).
Mallick, S. et al. The Simons Genome Diversity Project: 300 genomes from 142 diverse populations. Nature 538, 201–206 (2016).
Rubinacci, S., Hofmeister, R. J., Sousa da Mota, B. & Delaneau, O. Imputation of low-coverage sequencing data from 150,119 UK Biobank genomes. Nat. Genet. 55, 1088–1090 (2023).
Rubinacci, S., Ribeiro, D. M., Hofmeister, R. J. & Delaneau, O. Efficient phasing and imputation of low-coverage sequencing data using large reference panels. Nat. Genet. 53, 120–126 (2021).
Ringbauer, H., Novembre, J. & Steinrücken, M. Parental relatedness through time revealed by runs of homozygosity in ancient DNA. Nat. Commun. 12, 5425 (2021).
Patterson, N. et al. Ancient admixture in human history. Genetics 192, 1065–1093 (2012).
Narasimhan, V. M. et al. The formation of human populations in South and Central Asia. Science 365, eaat7487 (2019).
Harney, É., Patterson, N., Reich, D. & Wakeley, J. Assessing the performance of qpAdm: a statistical tool for studying population admixture. Genetics 217, iyaa045 (2021).
Van de Loosdrecht, M. et al. Pleistocene North African genomes link near Eastern and sub-Saharan African human populations. Science 360, 548–552 (2018).
Maier, R. et al. On the limits of fitting complex models of population history to F-statistics. eLife 12, e85492 (2023).
Lipson, M. Applying f4-statistics and admixture graphs: theory and examples. Mol. Ecol. Resour. 20, 1658–1667 (2020).
Pickrell, J. K. & Pritchard, J. K. Inference of population splits and mixtures from genome-wide allele frequency data. PLoS Genet. 8, 17 (2012).
Baum, B. R. PHYLIP: phylogeny inference package. Version 3.2. Joel Felsenstein. Q. Rev. Biol. 64, 539–541 (1989).
Chen, T. et al. The Genome Sequence Archive family: toward explosive data growth and diverse data types. Genom. Proteom. Bioinform. 19, 578–583 (2021).
CNCB-NGDC Members and Partners. Database resources of the National Genomics Data Center, China National Center for Bioinformation in 2024. Nucleic Acids Res. 52, 18–32 (2024).
Bu, C. et al. GenBase: a nucleotide sequence database. Genom. Proteom. Bioinform. 22, qzae047 (2024).
Acknowledgements
We thank WorldClim2 and Natural Earth for providing the map data used in this study. We thank the archaeological teams from Shaanxi, the Archaeology Institute of the National Museum of China and the Archaeology Institute of the Chinese Academy of Social Sciences. This work was supported by the Chinese Academy of Sciences (grant YSBR-019), the National Natural Science Foundation of China (grant 41925009), National Key R&D Program of China (grants 2023YFF0905700 and 2020YFC1521601) and Beijing Nova Program (grant Z211100002121040).
Author information
Contributions
Q.F. designed and supervised the research project. Z.C., Q.F., X.M., H.S., F.B. and X.F. processed or analysed the data. Z.C. and H.S. did the data visualization. J.D.G., T.L. and J.X. did the data investigation. Z.S., X.P., W.P., Q.H., W.W., L.C., X.D., N.H., X.G., N.D., L.Z. and J.S. prepared the archaeological samples and materials. X.W. performed the carbon dating. Q.F., P.C., Q.D. and F.L. performed or supervised wet laboratory work. Z.C. and Q.F. wrote the paper. Z.C., E.A.B., J.D.G., Q.F., Z.S., X.D., Q.H., X.P., N.H. and J.S. discussed and revised the manuscript. Z.C., H.S., J.X., J.D.G. and W.W. wrote and prepared the supplementary materials. All authors discussed, critically revised and approved the final version of the paper.
Ethics declarations
Competing interests
The authors declare no competing interests.
Peer review
Peer review information
Nature thanks Yinqiu Cui, Harald Ringbauer and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Extended data figures and tables
Extended Data Fig. 1 Genomic clustering of ancient Eurasian populations based on outgroup-f3 statistics.
Populations with red font are new samples from this study.
Extended Data Fig. 2 Admixture plot with worldwide modern populations.
Admixture plots for K = 2–10 with 10 replicates. The color bar on the left indicates the genetic populations, which are listed on the right in different colors.
Extended Data Fig. 3 Kinship practices and the IBD sharing between Shimao and Taosi culture-related populations.
a, Diagram of kinship connections inferred by READ and single IBD sharing across sites. b, Correlations between geo-distance and the kinship relations (here, Normalized P0 is used as the proxy) across sites. Only shared long IBD segments were plotted, defined as a single IBD segment of more than 25 cM between individuals from the Shimao_HCT and Shimao_HYW/Shimao_HJGD sites, both with average coverage >1x. If the average coverage of either individual was lower than 1x, as for the individual pairs (n = 2) of Zhoujiazhuang and Shimao_HJGD, we denoted those connections as “low confidence”. These should be considered less reliable given the high false-positive rate shown for lower coverage data. Other individual pairs with IBD sharing that potentially indicates third- to fifth-degree kinship are shown in Supplementary Table 12. Map in a reproduced with permission from Shaanxi Academy of Archaeology7.
Extended Data Fig. 4 IBD sharing among groups with low to high status.
Shown is the proportion of individual pairs (coverage of both individuals more than 1x) that shared at least one IBD segment larger than 12 cM (a) or 8 cM (b) in each site and across different social statuses (c and d). Individuals without clear status information were grouped as not recorded (“NR”). No significant differences in IBD sharing were found between individuals of different social status (tomb owners or sacrificed victims).
Extended Data Fig. 5 Runs of homozygosity (ROH) length distribution and karyotype plot for three individuals.
a, ROHs for all high-coverage individuals (SNPs > 400,000) that exhibited ROHs > 4 cM, at each site. Individuals are sorted by site, and colored bars within the histogram represent the sum of ROHs in different length categories. b, ROH length distribution and karyotype of three individuals with relatively longer ROHs, used to further infer the possible relationships between their parents.
Supplementary information
Supplementary Information
Supplementary Notes 1–3, Supplementary Figs. 1–21 and Supplementary References.
Supplementary Tables
Supplementary Tables 1–14.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Chen, Z., Gardner, J.D., Sun, Z. et al. Ancient DNA from Shimao city records kinship practices in Neolithic China. Nature 648, 659–667 (2025). https://doi.org/10.1038/s41586-025-09799-x
DOI: https://doi.org/10.1038/s41586-025-09799-x






