Nature Communications
A catalogue of early diverged contemporary human genome variation reveals distinct Khoe-San populations
  • Article
  • Open access
  • Published: 10 February 2026

  • Weerachai Jaratlerdsiri 1,2 (ORCID: orcid.org/0000-0001-9100-1807)
  • Pamela X. Y. Soh 1 (ORCID: orcid.org/0000-0002-8485-6556)
  • Tingting Gong 1,3 (ORCID: orcid.org/0000-0001-5907-2445)
  • Jue Jiang 1 (ORCID: orcid.org/0000-0003-0920-8310)
  • Zolani Simayi 4
  • Desiree C. Petersen 5 (ORCID: orcid.org/0000-0002-0817-2574)
  • Errol Holland 6
  • Eva K. F. Chan 7 (ORCID: orcid.org/0000-0002-6104-3763)
  • Kathrine E. Theron 4
  • Wilfrid H. G. Haacke 8
  • Hagen E. A. Förtsch 9
  • M. S. Riana Bornman 10,11 (ORCID: orcid.org/0000-0003-3975-2333)
  • David M. Thomas 12
  • Jeffrey Mphahlele 13,14
  • Vanessa M. Hayes 1,9,15,16 (ORCID: orcid.org/0000-0002-4524-7280)

Nature Communications (2026)

We are providing an unedited version of this manuscript to give early access to its findings. Before final publication, the manuscript will undergo further editing. Please note there may be errors present which affect the content, and all legal disclaimers apply.

Subjects

  • Genetic variation
  • Genetics research

Abstract

Creating a catalogue of early diverged genome variation is critical to determine the true extent of human diversity and its associated medical impact. Generating deep whole-genome data for 150 Khoe-San (12 groups, 1 unclassified) and 40 regionally comparative Southern Africans (3 groups), we identify ~30 million small-to-large variants, including over 1.3 million unknown single nucleotide variants. Among peoples who share traditionally forager lifestyles and click-speaking languages, we identify San and Damara as separate phylogenetic lineages, contributing two admixture waves to Nama. While San represent modern humans’ deep divergence (~115 thousand years ago), Damara divergence is recent, with both showing high effective population sizes between 45 and 150 thousand years ago. Developing an assembly-based test, we report 1,376 genes under positive selection (dN/dS = 19.46), of which 479 are significantly associated with forager peoples and have therefore maintained ancestral alleles that differ from derived genetic variation observed in non-African biomedical resources.
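
For readers unfamiliar with the dN/dS statistic quoted above, the following is a minimal, generic sketch of how such a ratio can be computed from counts of synonymous and nonsynonymous sites and differences, using a Nei-Gojobori-style proportion with a Jukes-Cantor correction. It is not the authors' assembly-based test, which is described in the paper and its code release; the function names and the example counts are illustrative assumptions.

```python
import math


def jukes_cantor(p: float) -> float:
    """Jukes-Cantor correction: convert a proportion of differing sites
    into an estimated number of substitutions per site."""
    if not 0.0 <= p < 0.75:
        raise ValueError("proportion of differences must be in [0, 0.75)")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def dn_ds(nonsyn_diffs: float, nonsyn_sites: float,
          syn_diffs: float, syn_sites: float) -> float:
    """Return dN/dS for one gene (hypothetical helper, not from the paper).

    dN = corrected nonsynonymous differences per nonsynonymous site
    dS = corrected synonymous differences per synonymous site
    A ratio above 1 is commonly read as a signal of positive selection.
    """
    d_n = jukes_cantor(nonsyn_diffs / nonsyn_sites)
    d_s = jukes_cantor(syn_diffs / syn_sites)
    if d_s == 0.0:
        raise ZeroDivisionError("dS is zero; dN/dS is undefined")
    return d_n / d_s


if __name__ == "__main__":
    # Made-up counts for illustration only.
    ratio = dn_ds(nonsyn_diffs=30, nonsyn_sites=1000,
                  syn_diffs=10, syn_sites=1000)
    print(f"dN/dS = {ratio:.2f}")
```

Genes with no synonymous differences have an undefined ratio, which is why the sketch raises an error rather than returning infinity; real pipelines usually filter or regularise such cases.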

Data availability

Raw sequencing data, alignments, germline variant calls (small variants, short tandem repeats and mobile element insertions) and derived datasets are available for general research use, for browsing and download, through the European Genome-phenome Archive (EGA) [https://ega-archive.org], subject to the KSGP Data Access Committee (DAC) EGAC50000000798 policy and approval [https://dac.ega-archive.org/EGAC50000000798/requests], under KSGP accession number EGAS50000001408. Genomic data for South African participants have previously been deposited at the EGA under accession number EGAD00001009067. The public release of SGDP data is available through the EBI European Nucleotide Archive under accession numbers PRJEB9586 and ERP010710. The Altai Neanderthal genome can be downloaded online [http://cdna.eva.mpg.de/neandertal/altai/AltaiNeandertal/]. Source data are provided with this paper. Access to KSGP and SAPCS sequencing data may be requested via the KSGP or SAPCS Data Access Committees (DACs), respectively, and will be made available to researchers with appropriate feasibility and corresponding ethics approvals to ensure the safeguarding of patient genomic information (contact V.M.H. directly). Both DACs include community representation, with all studies directly communicated to community representative partners.
Restrictions include (i) no transfer to third parties is allowed, (ii) applications will be declined if the researchers cannot adequately articulate their research question at application or the question is deemed culturally inappropriate, (iii) a report of the results of the research must be provided to the respective DACs prior to publication (or when requested), (iv) written DAC approval is required for publication of the final draft, (v) the KSGP or SAPCS community leaders must be acknowledged in publications and presentations, (vi) researchers cannot utilise the data for commercial purposes or any other purposes not approved by the DAC, and (vii) no approval will be given that excludes other researchers from accessing the data. Data currently being used for capacity building in under-resourced studies across Sub-Saharan Africa will be given priority and at times may be granted time-limited exclusive rights for no more than a two-year period.

Code availability

The core computational pipelines used in this study for read alignment, quality control and variant calling are described in the Supplementary Information. Analysis code for the assembly-based genome analysis of positive selection is available on GitHub (https://github.com/wjaratlerdsiri/aGATK) or via Code Ocean (ref. 96).

References

  1. Mallick, S. et al. The Simons Genome Diversity Project: 300 genomes from 142 diverse populations. Nature 538, 201–206 (2016).

  2. 1000-Genomes-Project-Consortium. et al. A global reference for human genetic variation. Nature 526, 68–74 (2015).

  3. Choudhury, A. et al. High-depth African genomes inform human migration and health. Nature 586, 741–748 (2020).

  4. Jaratlerdsiri, W. et al. African-specific molecular taxonomy of prostate cancer. Nature 609, 552–559 (2022).

  5. Soh, P. X. Y. S. & Hayes, V. M. Common genetic variants associated with prostate cancer risk: the need for African Inclusion. Eur. Urol. 84, 22–24 (2023).

  6. Sengupta, D. et al. Genetic substructure and complex demographic history of South African Bantu speakers. Nat. Commun. 12, 2080 (2021).

  7. Bergström, A. et al. Insights into human genetic variation and population history from 929 diverse genomes. Science 367, eaay5012 (2020).

  8. Fatumo, S. et al. Promoting the genomic revolution in Africa through the Nigerian 100K Genome Project. Nat. Genet. 54, 531–536 (2022).

  9. Schuster, S. C. et al. Complete Khoisan and Bantu genomes from Southern Africa. Nature 463, 943–947 (2010).

  10. Schlebusch, C. M. et al. Khoe-San genomes reveal unique variation and confirm the deepest population divergence in Homo sapiens. Mol. Biol. Evol. 37, 2944–2954 (2020).

  11. Güldemann, T. & Fehn, A.-M. Beyond ‘Khoisan’: Historical Relations in the Kalahari Basin (John Benjamins Publishing Company, Amsterdam, 2014).

  12. Wilfred, H. Khoekhoegowab (Nama/Damara). in The Social and Political History of Southern Africa’s Languages (eds. Kamusella, T. & Ndhlovu, F.) 133–158 (Palgrave Macmillan, London, 2018).

  13. Fehn, A. M., Amorim, B. & Rocha, J. The linguistic and genetic landscape of southern Africa. J. Anthropol. Sci. 100, 243–265 (2022).

  14. Smith, A., Malherbe, C., Guenther, M. & Berens, P. The Bushmen of Southern Africa. A Foraging Society in Transition (David Philip Publishers, South Africa, 2004).

  15. Barnard, A. B. Anthropology and the Bushman (Routledge, New York, 2007).

  16. Koot, S. & Walter, V. B. Ju|’hoansi Lodging in a Namibian conservancy: CBNRM, tourism and increasing domination. Conserv. Soc. 15, 136–146 (2017).

  17. Hayes, V. M. Indigenous genomics. Science 332, 639 (2011).

  18. Haacke, W. H. G. The social and political history of Southern Africa’s languages. in Khoekhoegowab (Nama/Damara) (eds. Kamusella, T. & Ndhlovu, F.) (Palgrave Macmillan, London, 2018).

  19. Sullivan, S. & Ganuses, W. S. Understanding Damara / ‡Nūkhoen and ||Ubun indigeneity and marginalisation in Namibia (Land, environment and development project, Legal Assistance Centre, Windhoek, Republic of Namibia, 2020).

  20. Kinahan, J. The rock art of ǀUi-ǁAis (Twyfelfontein) Namibia’s first World Heritage Site (Namib Desert Archaeological Survey, Windhoek, Namibia, 2007).

  21. Lander, F. & Russell, T. The archaeological evidence for the appearance of pastoralism and farming in southern Africa. PLoS ONE 13, e0198941 (2018).

  22. Ragsdale, A. P. et al. A weakly structured stem for human origins in Africa. Nature 617, 755–763 (2023).

  23. Fan, S. et al. Whole-genome sequencing reveals a complex African population demographic history and signatures of local adaptation. Cell 186, 923–939 (2023).

  24. Soodyall, H. & Jenkins, T. Mitochondrial DNA polymorphisms in Negroid populations from Namibia: new light on the origins of the Dama, Herero and Ambo. Ann. Hum. Biol. 20, 477–485 (1993).

  25. Güldemann, T. & Stoneking, M. A historical appraisal of clicks: A linguistic and genetic population perspective. Annu. Rev. Anthropol. 37, 93–109 (2008).

  26. Barbieri, C. et al. Migration and interaction in a contact zone: mtDNA variation among Bantu-speakers in Southern Africa. PLoS ONE 9, e99117 (2014).

  27. Oliveira, S. et al. Matriclans shape populations: Insights from the Angolan Namib Desert into the maternal genetic history of southern Africa. Am. J. Phys. Anthropol. 165, 518–535 (2018).

  28. Grollemund, R. et al. Bantu expansion shows that habitat alters the route and pace of human dispersals. Proc. Natl. Acad. Sci. USA 112, 13296–13301 (2015).

  29. Hammond-Tooke, W. D. Southern Bantu origins: light from kinship terminology. South Afr. Humanit. 16, 71–78 (2004).

  30. Koile, E., Greenhill, S. J., Blasi, D. E., Bouckaert, R. & Gray, R. D. Phylogeographic analysis of the Bantu language expansion supports a rainforest route. Proc. Natl. Acad. Sci. USA 119, e2112853119 (2022).

  31. Choudhury, A., Sengupta, D., Ramsay, M. & Schlebusch, C. Bantu-speaker migration and admixture in southern Africa. Hum. Mol. Genet. 30, R56–R63 (2021).

  32. Skoglund, P. et al. Reconstructing prehistoric African population structure. Cell 171, 59–71.e21 (2017).

  33. Patin, E. et al. Dispersals and genetic adaptation of Bantu-speaking populations in Africa and North America. Science 356, 543–546 (2017).

  34. Gymrek, M., Golan, D., Rosset, S. & Erlich, Y. lobSTR: A short tandem repeat profiler for personal genomes. Genome Res. 22, 1154–1162 (2012).

  35. Gardner, E. J. et al. The mobile element locator tool (MELT): population-scale mobile element discovery and biology. Genome Res. 27, 1916–1929 (2017).

  36. Taliun, D. et al. Sequencing of 53,831 diverse genomes from the NHLBI TOPMed Program. Nature 590, 290–299 (2021).

  37. Ceballos, F. C., Joshi, P. K., Clark, D. W., Ramsay, M. & Wilson, J. F. Runs of homozygosity: windows into population history and trait architecture. Nat. Rev. Genet. 19, 220–234 (2018).

  38. Bennett, E. A. et al. Active Alu retrotransposons in the human genome. Genome Res. 18, 1875–1883 (2008).

  39. Fan, S. et al. African evolutionary history inferred from whole genome sequence data of 44 indigenous African populations. Genome Biol. 20, 204 (2019).

  40. Chan, E. K. F. et al. Human origins in a southern African palaeo-wetland and first migrations. Nature 575, 185–189 (2019).

  41. Raj, A., Stephens, M. & Pritchard, J. K. fastSTRUCTURE: variational inference of population structure in large SNP data sets. Genetics 197, 573–589 (2014).

  42. Pritchard, J. K., Stephens, M. & Donnelly, P. Inference of population structure using multilocus genotype data. Genetics 155, 945–959 (2000).

  43. Byrska-Bishop, M. et al. High-coverage whole-genome sequencing of the expanded 1000 Genomes Project cohort including 602 trios. Cell 185, 3426–3440 (2022).

  44. Karczewski, K. J. et al. The mutational constraint spectrum quantified from variation in 141,456 humans. Nature 581, 434–443 (2020).

  45. Schlebusch, C. M. et al. Southern African ancient genomes estimate modern human divergence to 350,000 to 260,000 years ago. Science 358, 652–655 (2017).

  46. Lipson, M. et al. Ancient West African foragers in the context of African population history. Nature 577, 665–670 (2020).

  47. Schlebusch, C. M. et al. Genomic variation in seven Khoe-San groups reveals adaptation and complex African history. Science 338, 374–379 (2012).

  48. Schlebusch, C. M., Prins, F., Lombard, M., Jakobsson, M. & Soodyall, H. The disappearing San of southeastern Africa and their genetic affinities. Hum. Genet. 135, 1365–1373 (2016).

  49. May, A. et al. Genetic diversity in black South Africans from Soweto. BMC Genom. 14, 644 (2013).

  50. Aberer, A. J., Krompass, D. & Stamatakis, A. Pruning rogue taxa improves phylogenetic accuracy: an efficient algorithm and webservice. Syst. Biol. 62, 162–166 (2013).

  51. Prüfer, K. et al. The complete genome sequence of a Neanderthal from the Altai Mountains. Nature 505, 43–49 (2014).

  52. Patterson, N. et al. Ancient admixture in human history. Genetics 192, 1065–1093 (2012).

  53. Giliomee, H. B. & Mbenga, B. K. Nuwe geskiedenis van Suid-Afrika (Tafelberg, 2007).

  54. Choin, J. et al. Genomic insights into population history and biological adaptation in Oceania. Nature 592, 583–589 (2021).

  55. Malaspinas, A. S. et al. A genomic history of Aboriginal Australia. Nature 538, 207–214 (2016).

  56. Speidel, L., Forest, M., Shi, S. & Myers, S. R. A method for genome-wide genealogy estimation for thousands of samples. Nat. Genet. 51, 1321–1329 (2019).

  57. Browning, S. R. et al. Ancestry-specific recent effective population size in the Americas. PLoS Genet. 14, e1007385 (2018).

  58. Sabeti, P. C. et al. Genome-wide detection and characterization of positive selection in human populations. Nature 449, 913–918 (2007).

  59. Hughes, A. L. & Nei, M. Pattern of nucleotide substitution at major histocompatibility complex class I loci reveals overdominant selection. Nature 335, 167–170 (1988).

  60. Wilson, D. J. & McVean, G. Estimating diversifying selection and functional constraint in the presence of recombination. Genetics 172, 1411–1425 (2006).

  61. Kim, U. K. et al. Positional cloning of the human quantitative trait locus underlying taste sensitivity to phenylthiocarbamide. Science 299, 1221–1225 (2003).

  62. Petersen, D. C. et al. Complex patterns of genomic admixture within southern Africa. PLoS Genet. 9, e1003309 (2013).

  63. Sabbagh, A., Darlu, P., Crouau-Roy, B. & Poloni, E. S. Arylamine N-acetyltransferase 2 (NAT2) genetic diversity and traditional subsistence: a worldwide population survey. PLoS ONE 6, e18507 (2011).

  64. Lamason, R. L. et al. SLC24A5, a putative cation exchanger, affects pigmentation in zebrafish and humans. Science 310, 1782–1786 (2005).

  65. Engelken, J. et al. Extreme population differences in the human zinc transporter ZIP4 (SLC39A4) are explained by positive selection in Sub-Saharan Africa. PLoS Genet. 10, e1004128 (2014).

  66. Campbell, M. C. & Tishkoff, S. A. African genetic diversity: implications for human demographic history, modern human origins, and complex disease mapping. Annu. Rev. Genomics Hum. Genet. 9, 403–433 (2008).

  67. Yi, X. et al. Sequencing of fifty human exomes reveals adaptation to high altitude. Science 329, 75–78 (2010).

  68. Lek, M. et al. Analysis of protein-coding genetic variation in 60,706 humans. Nature 536, 285–291 (2016).

  69. Lacaze, P. et al. The Medical Genome Reference Bank: a whole-genome data resource of 4000 healthy elderly individuals. Rationale and cohort design. Eur. J. Hum. Genet. 27, 308–316 (2019).

  70. Flicek, P. et al. Ensembl 2014. Nucleic Acids Res. 42, D749–D755 (2014).

  71. Li, H. & Durbin, R. Inference of human population history from individual whole-genome sequences. Nature 475, 493–496 (2011).

  72. Hauser, A. S. et al. Pharmacogenomics of GPCR Drug Targets. Cell 172, 41–54 (2018).

  73. Whirl-Carrillo, M. et al. Pharmacogenomics knowledge for personalized medicine. Clin. Pharmacol. Ther. 92, 414–417 (2012).

  74. Welter, D. et al. The NHGRI GWAS Catalog, a curated resource of SNP-trait associations. Nucleic Acids Res. 42, D1001–D1006 (2014).

  75. Gamazon, E. R. et al. Using an atlas of gene regulation across 44 human tissues to inform complex disease- and trait-associated variation. Nat. Genet. 50, 956–967 (2018).

  76. Weis, W. I. & Kobilka, B. K. The molecular basis of G protein-coupled receptor activation. Annu. Rev. Biochem. 87, 897–919 (2018).

  77. Conti, D. V. et al. Trans-ancestry genome-wide association meta-analysis of prostate cancer identifies new susceptibility loci and informs genetic risk prediction. Nat. Genet. 53, 65–75 (2021).

  78. Barrett, R. D. & Hoekstra, H. E. Molecular spandrels: tests of adaptation at the genetic level. Nat. Rev. Genet. 12, 767–780 (2011).

  79. Akey, J. M. Constructing genomic maps of positive selection in humans: where do we go from here? Genome Res. 19, 711–722 (2009).

  80. Szpak, M., Xue, Y., Ayub, Q. & Tyler-Smith, C. How well do we understand the basis of classic selective sweeps in humans? FEBS Lett. 593, 1431–1448 (2019).

  81. Vitti, J. J., Grossman, S. R. & Sabeti, P. C. Detecting natural selection in genomic data. Annu. Rev. Genet. 47, 97–120 (2013).

  82. Nurk, S. et al. The complete sequence of a human genome. Science 376, 44–53 (2022).

  83. Li, H. & Durbin, R. Fast and accurate short read alignment with Burrows-Wheeler Transform. Bioinformatics 25, 1754–1760 (2009).

  84. Van der Auwera, G. A. et al. From FastQ data to high confidence variant calls: the Genome Analysis Toolkit best practices pipeline. Curr. Protoc. Bioinform. 11, 11.10.1–33 (2013).

  85. Alexander, D. H., Novembre, J. & Lange, K. Fast model-based estimation of ancestry in unrelated individuals. Genome Res. 19, 1655–1664 (2009).

  86. Chang, C. C. et al. Second-generation PLINK: rising to the challenge of larger and richer datasets. Gigascience 4, 7 (2015).

  87. Behr, A. A., Liu, K. Z., Liu-Fang, G., Nakka, P. & Ramachandran, S. pong: fast analysis and visualization of latent clusters in population genetic data. Bioinformatics 32, 2817–2823 (2016).

  88. Diaz-Papkovich, A., Anderson-Trocmé, L., Ben-Eghan, C. & Gravel, S. UMAP reveals cryptic population structure and phenotype heterogeneity in large genomic cohorts. PLoS Genet. 15, e1008432 (2019).

  89. Stamatakis, A. RAxML version 8: a tool for phylogenetic analysis and post-analysis of large phylogenies. Bioinformatics 30, 1312–1313 (2014).

  90. Maier, R. et al. On the limits of fitting complex models of population history to f-statistics. Elife 12, e85492 (2023).

  91. Wangkumhang, P., Greenfield, M. & Hellenthal, G. An efficient method to identify, date, and describe admixture events using haplotype information. Genome Res. 32, 1553–1564 (2022).

  92. Delaneau, O., Marchini, J. & 1000-Genomes-Project-Consortium. Integrating sequence and array data to create an improved 1000 Genomes Project haplotype reference panel. Nat. Commun. 5, 3934 (2014).

  93. Schiffels, S. & Durbin, R. Inferring human population size and separation history from multiple genome sequences. Nat. Genet. 46, 919–925 (2014).

  94. Kamm, J., Terhorst, J., Durbin, R. & Song, Y. S. Efficiently inferring the demographic history of many populations with allele count data. J. Am. Stat. Assoc. 115, 1472–1487 (2020).

  95. Wang, K., Li, M. & Hakonarson, H. ANNOVAR: functional annotation of genetic variants from high-throughput sequencing data. Nucleic Acids Res. 38, e164 (2010).

  96. Jaratlerdsiri, W. A catalogue of early diverged contemporary human genome variation reveals distinct Khoe-San populations, Code Ocean https://doi.org/10.24433/CO.6181495.v1 (2025).

Acknowledgements

The work presented was supported by a donation provided by the University of Limpopo in South Africa (to V.M.H. and J.M.), the Garvan Institute of Medical Research Foundation and Medical Genome Reference Bank (MGRB) in Australia (to D.M.T.), and by an Australian Medical Research Future Fund (MRFF) Genomics Health Futures Mission Grant (2025/MRF2045394 to V.M.H., W.J., D.M.T. and P.X.Y.S.), while partially supported through the U.S.A. Congressionally Directed Medical Research Programmes (CDMRP) Prostate Cancer Research Programme (PCRP) Health Equity Research and Outcomes Improvement Consortium (HEROIC) Award (PC210168 and PC230673, HEROIC Prostate Cancer Precision Health (PCaPH) Africa1K to V.M.H., M.S.R.B., Peter Ngugi from the University of Nairobi, Kenya and Gail Prins from the University of Illinois at Chicago, U.S.A.). J.J. is supported by a U.S.A. Prostate Cancer Foundation (PCF) PhD Scholarship as part of a Challenge award (23CHAL18, to V.M.H.) and V.M.H. is supported by the Petre Foundation through the University of Sydney Foundation. We acknowledge the use of the National Computational Infrastructure (NCI), which is supported by the Australian Government, and accessed through the National Computational Merit Allocation Scheme (V.M.H., E.K.F.C. and W.J.), the Intersect Computational Merit Allocation Scheme (V.M.H.), Intersect Australia Limited and the Sydney Informatics Hub, Core Research Facility, and we acknowledge the staff at the Garvan Institute of Medical Research’s Kinghorn Centre for Clinical Genomics (KCCG) core facility for genome sequencing. We thank the study participants and their representative communities who contributed to this study; without their contribution and continued engagement, this research would not be possible. We are indebted to the many local Namibians who have aided during community engagement, providing critical logistical, historical, cultural and linguistic insights, specifically E. Adams, A.A. Collins, R. Friederich, B. G/aq’o, N. /kun, J. /kunta, H. Mische, F. Naque, D. Naque, H. Oosthuizen, E. Oosthuizen, A. Oosthuysen, E. Oosthuysen, D. Roux, J. Sinvula, C. Swau, T. Tauros, T. Tsebe and R. Wilkinson, while we are grateful to C.P. Bennett from Evolving Picture in Sydney (https://evolvingpicture.com/) for providing community recording. We further acknowledge and fondly remember the late Archbishop Emeritus Desmond Tutu (South Africa), who remained an advocate and key participant of the Ubuntu Project; the late Chief Seth M. Kooitjie (Namibia), past Chairperson of the Nama Traditional Leaders Association, for his blessing and critical support; and Professors Philip A. Venter (University of Limpopo, South Africa) and Christopher F. Heyns (University of Stellenbosch, South Africa) for their foundational work in establishing ethical frameworks within South Africa and Namibia, respectively. More recently, we are grateful to Professor Lamech Mwapagha, Namibia University of Science & Technology (NUST), for taking on the responsibility of KSGP DAC Chair.

Author information

Authors and Affiliations

  1. Ancestry and Health Genomics Laboratory, Charles Perkins Centre, School of Medical Sciences, Faculty of Medicine and Health, University of Sydney, Camperdown, New South Wales, Australia

    Weerachai Jaratlerdsiri, Pamela X. Y. Soh, Tingting Gong, Jue Jiang & Vanessa M. Hayes

  2. Computational Genomics Group, Charles Perkins Centre, School of Medical Sciences, Faculty of Medicine and Health, University of Sydney, Camperdown, New South Wales, Australia

    Weerachai Jaratlerdsiri

  3. Human Phenome Institute, Fudan University, Shanghai, China

    Tingting Gong

  4. Faculty of Health Sciences, University of Limpopo, Mankweng, South Africa

    Zolani Simayi & Kathrine E. Theron

  5. South African Medical Research Council Centre for Tuberculosis Research, Division of Molecular Biology and Human Genetics, Faculty of Medicine and Health Sciences, Stellenbosch University, Cape Town, South Africa

    Desiree C. Petersen

  6. Formerly from Faculty of Medicine, Sefako Makgatho Health Sciences University, Pretoria, South Africa

    Errol Holland

  7. Statewide Genomics, NSW Health Pathology, Newcastle, New South Wales, Australia

    Eva K. F. Chan

  8. Formerly from the Department of Language and Literature Studies, University of Namibia, Windhoek, Namibia

    Wilfrid H. G. Haacke

  9. Windhoek Central Hospital, University of Namibia, Windhoek Khomas, Namibia

    Hagen E. A. Förtsch & Vanessa M. Hayes

  10. School of Health Systems and Public Health, University of Pretoria, Pretoria, Gauteng, South Africa

    M. S. Riana Bornman

  11. Department of Biological Sciences, Faculty of Science, Engineering and Agriculture, University of Venda, Thohoyandou, South Africa

    M. S. Riana Bornman

  12. Centre for Molecular Oncology, School of Biomedical Sciences, University of New South Wales Sydney, Randwick, New South Wales, Australia

    David M. Thomas

  13. Department of Virology, National Health Laboratory Service and Sefako Makgatho Health Sciences University, Pretoria, South Africa

    Jeffrey Mphahlele

  14. Office of the Deputy Vice Chancellor and Innovation, North-West University, Potchefstroom, South Africa

    Jeffrey Mphahlele

  15. Manchester Cancer Research Centre, University of Manchester, Manchester, United Kingdom

    Vanessa M. Hayes

  16. Norwich Medical School, University of East Anglia, Norwich, United Kingdom

    Vanessa M. Hayes

Contributions

V.M.H. designed the experiments. Community engagement, recruitment, and government and ethics approvals were performed by V.M.H., Z.S., E.H., K.T., H.A.E.F., M.S.R.B. and J.M., with V.M.H. personally performing all remote community recruitment at the border of Namibia (see Supplementary Data). Z.S., D.C.P., E.K.F.C., K.T. and V.M.H. performed initial genetic screening for participant inclusion, W.H.G.H. provided Khoe-San linguistic expertise, and D.M.T. provided access to and interpretation of the Medical Genome Reference Bank (MGRB). W.J. performed all the bioinformatic analyses and designed the positive selection workflow and code, with additional support from T.G. and J.J., while P.X.Y.S. performed population substructure analyses. W.J. developed the pipelines and performed high-performance computational variant calling, with further complex variant annotation supported by T.G. and J.J. Both W.J. and V.M.H. performed data interpretation and wrote the manuscript, with further critical culturally relevant interpretation provided by Z.S., D.C.P., E.H., H.A.E.F., M.S.R.B. and J.M. V.M.H., J.M., M.S.R.B. and D.M.T. acquired the funding. W.J. and P.X.Y.S. generated the figures, while all authors contributed to the final editing and approval of the manuscript.

Corresponding authors

Correspondence to Weerachai Jaratlerdsiri or Vanessa M. Hayes.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature Communications thanks the anonymous reviewers for their contribution to the peer review of this work. A peer review file is available.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Information

Description of Additional Supplementary Files

Supplementary Data 1–11

Reporting Summary

Peer Review file

Source data

Source Data

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

About this article

Cite this article

Jaratlerdsiri, W., Soh, P.X.Y., Gong, T. et al. A catalogue of early diverged contemporary human genome variation reveals distinct Khoe-San populations. Nat Commun (2026). https://doi.org/10.1038/s41467-026-69269-4

  • Received: 27 February 2024

  • Accepted: 28 January 2026

  • Published: 10 February 2026

  • DOI: https://doi.org/10.1038/s41467-026-69269-4

Nature Communications (Nat Commun)

ISSN 2041-1723 (online)

© 2026 Springer Nature Limited