Pan-UK Biobank genome-wide association analyses enhance discovery and resolution of ancestry-enriched effects

Abstract

Large biobanks, such as the UK Biobank (UKB), enable massive phenome by genome-wide association studies that elucidate genetic etiology of complex traits. However, people from diverse genetic ancestry groups are often excluded from association analyses due to concerns about population structure introducing false positive associations. Here we generate mixed model associations and meta-analyses across genetic ancestry groups, inclusive of a larger fraction of the UK Biobank than previous efforts, to produce freely available summary statistics for 7,266 traits. We build a quality control and analysis framework informed by genetic architecture. Overall, we identify 14,676 significant loci (P < 5 × 10⁻⁸) in the meta-analysis that were not found in the EUR genetic ancestry group alone, including new associations, for example between CAMK2D and triglycerides. We also highlight associations from ancestry-enriched variation, including a known pleiotropic missense variant in G6PD associated with several biomarker traits. We release these results publicly alongside frequently asked questions that describe caveats for interpretation of results, enhancing available resources for interpretation of risk variants across diverse populations.

Fig. 1: Genetic ancestry in the Pan-UKB.
Fig. 2: Pan-UKB GWAS resource facilitates multiancestry multitrait analyses.
Fig. 3: Heritability informs robustness of GWAS across ancestry–trait pairs.
Fig. 4: UKB-wide analysis improves genetic discovery.
Fig. 5: Differences in allele frequencies across ancestries yield new genetic discoveries.
Fig. 6: Meta-analysis identifies pleiotropic signals from non-European populations.

Data availability

All data are available at https://pan.ukbb.broadinstitute.org/, as well as on the AWS Open Data program (https://aws.amazon.com/marketplace/pp/prodview-2efssfw2ezyq6). Sample metadata is available in the UKB showcase under https://biobank.ndph.ox.ac.uk/ukb/dset.cgi?id=2442.

Code availability

All analysis code is available via GitHub at https://github.com/atgu/ukbb_pan_ancestry and via Zenodo at https://doi.org/10.5281/zenodo.15420125 (ref. 73).

References

  1. Abul-Husn, N. S. & Kenny, E. E. Personalized medicine and the power of electronic health records. Cell 177, 58–69 (2019).

  2. Zhou, W. et al. Global Biobank meta-analysis initiative: powering genetic discovery across human disease. Cell Genom. 2, 100192 (2022).

  3. Buniello, A. et al. The NHGRI-EBI GWAS Catalog of published genome-wide association studies, targeted arrays and summary statistics 2019. Nucleic Acids Res. 47, D1005–D1012 (2019).

  4. Sirugo, G., Williams, S. M. & Tishkoff, S. A. The missing diversity in human genetic studies. Cell 177, 26–31 (2019).

  5. Martin, A. R. et al. Clinical use of current polygenic risk scores may exacerbate health disparities. Nat. Genet. 51, 584–591 (2019).

  6. Morales, J. et al. A standardized framework for representation of ancestry data in genomics studies, with application to the NHGRI-EBI GWAS Catalog. Genome Biol. 19, 21 (2018).

  7. Sinnott-Armstrong, N. et al. Genetics of 35 blood and urine biomarkers in the UK Biobank. Nat. Genet. 53, 185–194 (2021).

  8. Lam, M. et al. Comparative genetic architectures of schizophrenia in East Asian and European populations. Nat. Genet. 51, 1670–1678 (2019).

  9. Chen, J. et al. The trans-ancestral genomic architecture of glycemic traits. Nat. Genet. 53, 840–860 (2021).

  10. Hou, K. et al. Causal effects on complex traits are similar for common variants across segments of different continental ancestries within admixed individuals. Nat. Genet. 55, 549–558 (2023).

  11. Hu, S. et al. Fine-scale population structure and widespread conservation of genetic effect sizes between human groups across traits. Nat. Genet. 57, 379–389 (2025).

  12. SIGMA Type 2 Diabetes Consortium et al. Association of a low-frequency variant in HNF1A with type 2 diabetes in a Latino population. JAMA 311, 2305–2314 (2014).

  13. Cohen, J. et al. Low LDL cholesterol in individuals of African descent resulting from frequent nonsense mutations in PCSK9. Nat. Genet. 37, 161–165 (2005).

  14. Liu, Z. et al. Genetic architecture of the inflammatory bowel diseases across East Asian and European ancestries. Nat. Genet. 55, 796–806 (2023).

  15. Miller, L. H., Mason, S. J., Clyde, D. F. & McGinniss, M. H. The resistance factor to Plasmodium vivax in blacks. The Duffy-blood-group genotype, FyFy. N. Engl. J. Med. 295, 302–304 (1976).

  16. Genovese, G. et al. Association of trypanolytic ApoL1 variants with kidney disease in African Americans. Science 329, 841–845 (2010).

  17. Ross, M. J. New insights into APOL1 and kidney disease in African children and Brazilians living with end-stage kidney disease. Kidney Int. Rep. 4, 908–910 (2019).

  18. Genovese, G., Friedman, D. J. & Pollak, M. R. APOL1 variants and kidney disease in people of recent African ancestry. Nat. Rev. Nephrol. 9, 240–244 (2013).

  19. Mägi, R. et al. Trans-ethnic meta-regression of genome-wide association studies accounting for ancestry increases power for discovery and improves fine-mapping resolution. Hum. Mol. Genet. 26, 3639–3650 (2017).

  20. Asimit, J. L., Hatzikotoulas, K., McCarthy, M., Morris, A. P. & Zeggini, E. Trans-ethnic study design approaches for fine-mapping. Eur. J. Hum. Genet. 24, 1330–1336 (2016).

  21. Mahajan, A. et al. Multi-ancestry genetic study of type 2 diabetes highlights the power of diverse populations for discovery and translation. Nat. Genet. 54, 560–572 (2022).

  22. Huang, H. et al. Fine-mapping inflammatory bowel disease loci to single-variant resolution. Nature 547, 173–178 (2017).

  23. Schaid, D. J., Chen, W. & Larson, N. B. From genome-wide associations to candidate causal variants by statistical fine-mapping. Nat. Rev. Genet. 19, 491–504 (2018).

  24. Graff, M. et al. Discovery and fine-mapping of height loci via high-density imputation of GWASs in individuals of African ancestry. Am. J. Hum. Genet. 108, 564–582 (2021).

  25. Luo, Y. et al. A high-resolution HLA reference panel capturing global population diversity enables multi-ethnic fine-mapping in HIV host response. Nat. Genet. 53, 1504–1516 (2021).

  26. Polygenic Risk Score Task Force of the International Common Disease Alliance. Responsible use of polygenic risk scores in the clinic: potential benefits, risks and gaps. Nat. Med. 27, 1876–1884 (2021).

  27. Martin, A. R. et al. Human demographic history impacts genetic risk prediction across diverse populations. Am. J. Hum. Genet. 100, 635–649 (2017).

  28. Scutari, M., Mackay, I. & Balding, D. Using genetic distance to infer the accuracy of genomic prediction. PLoS Genet. 12, e1006288 (2016).

  29. Wang, Y. et al. Theoretical and empirical quantification of the accuracy of polygenic scores in ancestry divergent populations. Nat. Commun. 11, 3865 (2020).

  30. Ding, Y. et al. Polygenic scoring accuracy varies across the genetic ancestry continuum. Nature 618, 774–781 (2023).

  31. Conti, D. V. et al. Trans-ancestry genome-wide association meta-analysis of prostate cancer identifies new susceptibility loci and informs genetic risk prediction. Nat. Genet. 53, 65–75 (2021).

  32. Bigdeli, T. B. et al. Contributions of common genetic variants to risk of schizophrenia among individuals of African and Latino ancestry. Mol. Psychiatry 25, 2455–2467 (2020).

  33. Bycroft, C. et al. The UK Biobank resource with deep phenotyping and genomic data. Nature 562, 203–209 (2018).

  34. National Academies of Sciences, Engineering, and Medicine. Using Population Descriptors in Genetics and Genomics Research: a New Framework for an Evolving Field (National Academies Press, 2023).

  35. Ben-Eghan, C. et al. Don’t ignore genetic data from minority populations. Nature 585, 184–186 (2020).

  36. 1000 Genomes Project Consortium et al. A global reference for human genetic variation. Nature 526, 68–74 (2015).

  37. Li, J. Z. et al. Worldwide human relationships inferred from genome-wide patterns of variation. Science 319, 1100–1104 (2008).

  38. Mathieson, I. & Scally, A. What is ancestry? PLoS Genet. 16, e1008624 (2020).

  39. Lewis, A. C. F. et al. Getting genetic ancestry right for science and society. Science 376, 250–252 (2022).

  40. Zhou, W. et al. Efficiently controlling for case-control imbalance and sample relatedness in large-scale genetic association studies. Nat. Genet. 50, 1335–1341 (2018).

  41. Schizophrenia Psychiatric Genome-Wide Association Study (GWAS) Consortium. Genome-wide association study identifies five new schizophrenia loci. Nat. Genet. 43, 969–976 (2011).

  42. Denny, J. C. et al. Systematic comparison of phenome-wide association study of electronic medical record data and genome-wide association study data. Nat. Biotechnol. 31, 1102–1110 (2013).

  43. COVID-19 Host Genetics Initiative. Mapping the human genetic architecture of COVID-19. Nature 600, 472–477 (2021).

  44. Howrigan, D. Details and considerations of the UK Biobank GWAS. https://www.nealelab.is/blog/2017/9/11/details-and-considerations-of-the-uk-biobank-gwas (2017).

  45. Bulik-Sullivan, B. et al. An atlas of genetic correlations across human diseases and traits. Nat. Genet. 47, 1236–1241 (2015).

  46. Finucane, H. K. et al. Partitioning heritability by functional annotation using genome-wide association summary statistics. Nat. Genet. 47, 1228–1235 (2015).

  47. Pazokitoroudi, A. et al. Efficient variance components analysis across millions of genomes. Nat. Commun. 11, 4020 (2020).

  48. Ghoussaini, M. et al. Open Targets Genetics: systematic identification of trait-associated genes using large-scale genetics and functional genomics. Nucleic Acids Res. 49, D1311–D1320 (2020).

  49. Solovieff, N., Cotsapas, C., Lee, P. H., Purcell, S. M. & Smoller, J. W. Pleiotropy in complex traits: challenges and strategies. Nat. Rev. Genet. 14, 483–495 (2013).

  50. Sun, L., Wang, Z., Lu, T., Manolio, T. A. & Paterson, A. D. eXclusionarY: 10 years later, where are the sex chromosomes in GWASs? Am. J. Hum. Genet. 110, 903–912 (2023).

  51. Rasooly, D. et al. Genome-wide association analysis and Mendelian randomization proteomics identify drug targets for heart failure. Nat. Commun. 14, 3826 (2023).

  52. Gage, P. J., Suh, H. & Camper, S. A. Dosage requirement of Pitx2 for development of multiple organs. Development 126, 4643–4651 (1999).

  53. Tümer, Z. & Bach-Holm, D. Axenfeld-Rieger syndrome and spectrum of PITX2 and FOXC1 mutations. Eur. J. Hum. Genet. 17, 1527–1539 (2009).

  54. Berry, F. B. et al. Functional interactions between FOXC1 and PITX2 underlie the sensitivity to FOXC1 gene dose in Axenfeld–Rieger syndrome and anterior segment dysgenesis. Hum. Mol. Genet. 15, 905–919 (2006).

  55. Gibson, G. Population genetics and GWAS: a primer. PLoS Biol. 16, e2005485 (2018).

  56. Martin, A. R., Daly, M. J., Robinson, E. B., Hyman, S. E. & Neale, B. M. Predicting polygenic risk of psychiatric disorders. Biol. Psychiatry 86, 97–109 (2019).

  57. Liu, D. J. & Leal, S. M. Estimating genetic effects and quantifying missing heritability explained by identified rare-variant associations. Am. J. Hum. Genet. 91, 585–596 (2012).

  58. Sarnowski, C. et al. Impact of rare and common genetic variants on diabetes diagnosis by hemoglobin A1c in multi-ancestry cohorts: the Trans-Omics for Precision Medicine Program. Am. J. Hum. Genet. 105, 706–718 (2019).

  59. Kanai, M. et al. Meta-analysis fine-mapping is often miscalibrated at single-variant resolution. Cell Genom. 2, 100210 (2022).

  60. Atkinson, E. G. et al. Tractor uses local ancestry to enable the inclusion of admixed individuals in GWAS and to boost power. Nat. Genet. 53, 195–204 (2021).

  61. Zhou, W. et al. Global Biobank Meta-analysis Initiative: powering genetic discovery across human diseases. Cell Genom. 2, 100192 (2022).

  62. Kurki, M. I. et al. FinnGen provides genetic insights from a well-phenotyped isolated population. Nature 613, 508–518 (2023).

  63. Breeyear, J. H. et al. Adaptive selection at G6PD and disparities in diabetes complications. Nat. Med. 30, 2480–2488 (2024).

  64. All of Us Research Program Genomics Investigators. Genomic data in the All of Us research program. Nature 627, 340–346 (2024).

  65. Panagiotou, O. A., Willer, C. J., Hirschhorn, J. N. & Ioannidis, J. P. A. The power of meta-analysis in genome-wide association studies. Annu. Rev. Genomics Hum. Genet. 14, 441–465 (2013).

  66. Lin, D. Y. & Zeng, D. Meta-analysis of genome-wide association studies: no efficiency gain in using individual participant data. Genet. Epidemiol. 34, 60–66 (2010).

  67. Balding, D. J. A tutorial on statistical methods for population association studies. Nat. Rev. Genet. 7, 781–791 (2006).

  68. Witherspoon, D. J. et al. Genetic similarities within and between human populations. Genetics 176, 351–359 (2007).

  69. Henn, B. M., Cavalli-Sforza, L. L. & Feldman, M. W. The great human expansion. Proc. Natl Acad. Sci. USA 109, 17758–17764 (2012).

  70. Bamshad, M., Wooding, S., Salisbury, B. A. & Stephens, J. C. Deconstructing the relationship between genetics and race. Nat. Rev. Genet. 5, 598–609 (2004).

  71. Meyer, M. N. et al. Wrestling with social and behavioral genomics: risks, potential benefits, and ethical responsibility. Hastings Cent. Rep. 53, S2–S49 (2023).

  72. Purcell, S. et al. PLINK: a tool set for whole-genome association and population-based linkage analyses. Am. J. Hum. Genet. 81, 559–575 (2007).

  73. Karczewski, K. et al. atgu/ukbb_pan_ancestry: figure release v.1.0. Zenodo https://doi.org/10.5281/zenodo.15420124 (2025).

  74. Zhang, X. et al. Whole genome sequencing analysis of body mass index identifies novel African ancestry-specific risk allele. Preprint at medRxiv https://doi.org/10.1101/2023.08.21.23293271 (2023).

Acknowledgements

We thank P. Kraft and J.-A. Dias for helpful discussions. This work was supported by the Novo Nordisk Foundation (NNF21SA0072102; K.J.K., B.M.N.), NIH grants R37MH107649 (B.M.N.), R00MH117229 (A.R.M.), K01MH121659 (E.G.A.), F31HL167378 (K.T.) and F30AG074507 (R.G.) and BroadIgnite funding (A.R.M.).

Author information

Contributions

K.J.K. developed pipelines and performed association analysis, variant and association QC, and summary statistics analyses. K.J.K., R.G., M.K., W.L., N.N.S. and A.R.M. generated figures. R.G. performed heritability analysis and QC. M.K. created LD matrices and LD scores, and performed locus definition analysis, meta-analysis and fine-mapping. W.L. performed additional association analyses. K.T. and N.B. performed LD analyses including clumping. Y.W. performed polygenicity analyses. R.K.W. performed sample QC and advised on association analyses. P.T., S.C., E.G.A. and A.R.M. wrote the FAQs. N.N.S. and E.G.A. performed Tractor analyses. D.S.P. and E.G.A. performed phenotype curation, processing and QC. J.I.G., T.P., J.C., D.K. and C.S. built the Hail infrastructure that enabled association analysis. G.S. curated prescription data. M.S. developed the website. N.C. performed initial comparisons of meta- and mega-analysis. S.B., C.C. and C.M.C. provided data and project management. W.Z. aided in development of association methods. A.R.M., H.K.F., M.J.D., B.M.N. and E.G.A. provided oversight and direction of the project. A.R.M. performed ancestry assignment and pruning analysis. A.R.M., E.G.A., B.M.N. and K.J.K. conceived the study. K.J.K., R.G., M.K., E.G.A. and A.R.M. wrote the manuscript. All authors reviewed and approved the manuscript.

Corresponding author

Correspondence to Alicia R. Martin.

Ethics declarations

Competing interests

K.J.K. is a consultant for Tome Biosciences, AlloDx and Vor Biosciences, and a member of the scientific advisory board of Nurture Genomics. M.J.D. is a founder of Maze Therapeutics. B.M.N. is a member of the scientific advisory board at Deep Genomics and Neumora. The other authors declare no competing interests.

Peer review

Peer review information

Nature Genetics thanks Paul Auer and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Peer reviewer reports are available.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 Global and subcontinental PCA.

a, Global PCA projection of UKBB into PCs 1-2 defined by HGDP and 1000 Genomes Project reference panel, which are shown in colored dots on top of UKBB in black. b, Global PCA density plot of UKBB points only, excluding reference panel. c, Map of HGDP, 1000 Genomes Project, and AGVP reference used to define AFR PC space. d, PCs 1-2 within AFR, reference panel colored, UKBB in grey. Inset shows density of UKBB samples assigned to AFR using a random forest. In c and d, colors and shapes are consistent across panels. e, Map of HGDP and 1000 Genomes Project reference used to define CSA PC space. f, PCs 1-2 within CSA, reference panel colored, UKBB in grey. Inset shows density of UKBB samples assigned to CSA using a random forest. In e and f, colors and shapes are consistent across panels. g, Map of HGDP and 1000 Genomes Project reference used to define EAS PC space. h, PCs 1-2 within EAS, reference panel colored, UKBB in grey. Inset shows density of UKBB samples assigned to EAS using a random forest. In g and h, colors and shapes are consistent across panels. i, Map of HGDP and 1000 Genomes Project reference used to define EUR PC space. j, PCs 1-2 within EUR, reference panel colored, UKBB in grey. Inset shows density of UKBB samples assigned to EUR using a random forest. In i and j, colors and shapes are consistent across panels. k, Map of HGDP and 1000 Genomes Project reference used to define MID PC space. l, PCs 1-2 within MID, reference panel colored, UKBB in grey. Inset shows density of UKBB samples assigned to MID using a random forest. In k and l, colors and shapes are consistent across panels.

Extended Data Fig. 2 Heritability informs the robustness of GWAS across ancestry-trait pairs.

a, Heritability estimates are generally concordant in the EUR genetic ancestry group across 64 pilot phenotypes (Supplementary Table 9) and two statistical methods. RHE-mc uses a randomized multi-component version of classical Haseman-Elston regression with a genetic relatedness matrix (ref. 53), whereas S-LDSC uses GWAS summary statistics (ref. 52). For binary phenotypes, heritability estimates are reported on the liability scale. All pilot phenotypes are shown, except for sepsis, which had negative heritability estimates by both methods. The dotted line shows y = x, while the dashed line is a fitted linear regression (slope = 0.87, intercept = 0.05, P = 7 × 10⁻¹³). Error bars indicate one standard error. b, Across the same non-EUR ancestry-trait pairs, heritability estimates from RHE-mc have higher z-scores than those from S-LDSC owing to smaller standard errors. The dashed line marks z = 4, which was used as a QC filter. c, As in Fig. 2b, without filtering to phenotypes passing QC, but instead only filtering to EUR z > 4 and defined heritability in both genetic ancestry groups. Dotted line shows y = x and dashed line shows York regression fit (n phenotypes = 147, slope = 0.66, intercept = 0.17, P < 10⁻¹⁰⁰). Points indicate the point estimate of heritability, and error bars indicate one standard error.
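The two quantities this caption relies on, the heritability z score used as a QC filter and the liability-scale conversion for binary phenotypes, can be sketched in a few lines of Python. This is a minimal illustration and not the authors' pipeline: the function names are hypothetical, the conversion is the widely used observed-to-liability transformation, and the prevalence values passed in are assumptions that the caller must supply.

```python
from statistics import NormalDist

Z_THRESHOLD = 4.0  # QC cut-off quoted in the caption


def heritability_z(h2: float, se: float) -> float:
    """z score of a heritability estimate: point estimate over its standard error."""
    return h2 / se


def passes_qc(h2: float, se: float, threshold: float = Z_THRESHOLD) -> bool:
    """True when the estimate meets the z-score inclusion criterion."""
    return heritability_z(h2, se) >= threshold


def observed_to_liability(h2_obs: float, sample_prev: float, pop_prev: float) -> float:
    """Convert observed-scale heritability of a binary trait to the liability scale.

    sample_prev is the case fraction in the analysed sample (P);
    pop_prev is the assumed population prevalence (K).
    """
    std_normal = NormalDist()
    threshold = std_normal.inv_cdf(1.0 - pop_prev)  # liability threshold
    density = std_normal.pdf(threshold)             # normal density at the threshold
    numerator = pop_prev ** 2 * (1.0 - pop_prev) ** 2
    denominator = density ** 2 * sample_prev * (1.0 - sample_prev)
    return h2_obs * numerator / denominator


if __name__ == "__main__":
    # Illustrative numbers only, not taken from the study.
    print(passes_qc(0.20, 0.04))
    print(observed_to_liability(0.10, sample_prev=0.5, pop_prev=0.5))
```

With equal sample and population prevalence the conversion factor reduces to K(1 − K) divided by the squared normal density at the threshold, which is why rare binary traits can show a liability-scale estimate far above the observed-scale one.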

Extended Data Fig. 3 Heritability summaries across trait types and genetic ancestry groups.

a, The confidence metrics (heritability z score) across traits (columns) and ancestry groups (rows) are shown for the final heritability metrics used (S-LDSC for EUR, otherwise RHE-mc). Dashed line indicates inclusion criteria (z ≥ 4). b, The mean observed heritability (h²) is plotted by ancestry group and trait type. For ancestry groups with smaller sample sizes, heritabilities are likely inflated due to a combination of residual stratification and winner’s curse, as only significantly heritable phenotypes in each ancestry group are shown. Error bars are standard deviations of the distribution of the heritability point estimates.

Extended Data Fig. 4 Improved identification of associations by EFO category.

Number (left) and percentage (right) of known and novel variants identified in this study compared to the GWAS catalog across EFO categories.

Extended Data Fig. 5 GWAS hits near haploinsufficient genes.

a, The percentage of novel associations by gene category. 66% of haploinsufficient genes have a novel significant hit nearby, compared to 34% of all genes. b, LocusZoom plot of a 1-Mb region around rs1379871 (purple diamond; DMD), for whole body fat mass (P = 1.84 × 10⁻⁴¹; n = 431,792). The −log₁₀(P-value) is plotted along chromosomal position, with neighboring variants colored by sample-size weighted LD (with lead SNP) for ancestries included in meta-analysis (gray: LD not defined for at least one ancestry group). This variant has recently been identified in a larger study of BMI (ref. 74).

Extended Data Fig. 6 Comparison of meta-analysis and EUR summary statistics.

a, As in Fig. 3c, the P-value in EUR is plotted compared to the P-value in the meta-analysis, as a density plot to indicate the relative number of points in each region of the plot. Three quadrants are highlighted for significant in meta-analysis only (green), both meta-analysis and EUR (purple), and EUR-only (blue). b, Summaries and meta-data of the variants in each of these three quadrants are shown. Heterogeneous is defined as Cochran’s Q P < 0.01, low INFO score is defined as INFO < 0.9, and low quality is defined as failing quality filters from gnomAD or allele frequency significantly differing between gnomAD and Pan-UKB in at least one ancestry group (see Supplementary Note, QC of summary statistics). Common is defined as frequency > 1%.
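The heterogeneity flag in this caption (Cochran’s Q P < 0.01) can be illustrated with a small Python sketch. It assumes a fixed-effect, inverse-variance-weighted meta-analysis of per-ancestry effect sizes, which is a common choice but is not confirmed by this section as the exact procedure used; the function names are hypothetical, and the chi-square tail probability is computed with the standard library rather than a statistics package.

```python
import math


def chi2_sf(x: float, df: int) -> float:
    """Upper-tail probability of a chi-square variable with integer df."""
    if x <= 0:
        return 1.0
    y = x / 2.0
    # Start from the closed forms for df = 2 or df = 1, then step up by the
    # incomplete-gamma recurrence Q(a + 1, y) = Q(a, y) + y^a e^-y / Gamma(a + 1).
    if df % 2 == 0:
        a, q = 1.0, math.exp(-y)
    else:
        a, q = 0.5, math.erfc(math.sqrt(y))
    while a < df / 2.0:
        q += y ** a * math.exp(-y) / math.gamma(a + 1.0)
        a += 1.0
    return q


def ivw_meta(betas, ses, het_threshold=0.01):
    """Fixed-effect inverse-variance meta-analysis with Cochran's Q."""
    weights = [1.0 / se ** 2 for se in ses]
    total = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total
    se = math.sqrt(1.0 / total)
    p_value = math.erfc(abs(beta / se) / math.sqrt(2.0))  # two-sided normal P
    q_stat = sum(w * (b - beta) ** 2 for w, b in zip(weights, betas))
    df = len(betas) - 1
    p_het = chi2_sf(q_stat, df) if df > 0 else float("nan")
    return {
        "beta": beta,
        "se": se,
        "p": p_value,
        "q": q_stat,
        "p_het": p_het,
        "heterogeneous": df > 0 and p_het < het_threshold,
    }


if __name__ == "__main__":
    # Illustrative per-ancestry estimates, not values from the study.
    print(ivw_meta([0.10, 0.12, 0.08], [0.02, 0.05, 0.06]))
```

Because each group is weighted by the inverse of its variance, a large group dominates the pooled estimate, while Q measures how far the individual estimates scatter around it.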

Extended Data Fig. 7 Fine-mapping of the G6PD locus.

a,b, Fine-mapping results for rs1050828 (G6PD) in AFR (a) and meta-analysis (b). a, AFR fine-mapping results highlight the missense variant (rs1050828) in a credible set, with a second independent signal for some phenotypes. b, Meta-analysis fine-mapped results show instability as the major signal at rs1050828 is discovered in a group with a relatively small sample size, which results in a small contribution to the LD panel and thus, poor performance in fine-mapping. Detailed results are shown in Supplementary Table 12.

Extended Data Fig. 8 Manhattan and QQ plot comparison for SAIGE and Tractor GWAS for mean corpuscular hemoglobin concentration.

a,b, Original GWAS performed using SAIGE for AFR. c-f, Among AFR individuals, Tractor GWAS results are shown for AFR haplotypes (c,d) and EUR haplotypes (e,f). QQ plots in b, d and f include bands indicating the confidence bounds based on the normal distribution.

Extended Data Table 1 Comparison of λ1000 and λGC for five phenotypes across three association study paradigms
Extended Data Table 2 Number of significant associations for height in AFR and CSA

Supplementary information

Supplementary Information

Supplementary Note, Figs. 1–34 and Tables 1–12.

Reporting Summary

Peer Review File

Supplementary Data 1

Assigned genetic ancestry labels correlate with the country of birth or known migration events. The number of people by genetic ancestry and country of birth (non-UK) are shown.

Supplementary Data 2

Summary of all phenotypes in Pan-UKB. Phenotypes are keyed by five keys: trait type, phenocode, pheno_sex, coding and modifier. Where available, description and coding_description are provided from the UKB showcase. For each ancestry group, we include the number of cases, heritability estimates (observed, liability, standard errors and z scores), whether the phenotype passes QC, and lambda GC. We provide QC flags, whether the phenotype is in the maximal independent set, and filename information, including a download link for the phenotype-specific file and tabix index on Amazon S3 and md5 checksums for each.
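Since every phenotype is identified by the same five fields, a lookup table over the manifest can be keyed on that tuple. The Python sketch below is illustrative only: the underscore spelling `trait_type`, the example rows and the helper name `index_manifest` are assumptions, not names confirmed by the released files.

```python
from typing import Dict, Iterable, NamedTuple


class PhenotypeKey(NamedTuple):
    """The five fields that together identify one phenotype."""
    trait_type: str
    phenocode: str
    pheno_sex: str
    coding: str
    modifier: str


def index_manifest(rows: Iterable[Dict[str, str]]) -> Dict[PhenotypeKey, Dict[str, str]]:
    """Index manifest rows by their five-field key, rejecting duplicate keys."""
    index: Dict[PhenotypeKey, Dict[str, str]] = {}
    for row in rows:
        # Missing fields are treated as empty strings.
        key = PhenotypeKey(*(row.get(field, "") for field in PhenotypeKey._fields))
        if key in index:
            raise ValueError(f"duplicate phenotype key: {key}")
        index[key] = row
    return index


# Hypothetical rows for illustration; real manifest values may differ.
rows = [
    {"trait_type": "continuous", "phenocode": "50", "pheno_sex": "both_sexes",
     "coding": "", "modifier": "irnt", "description": "example trait A"},
    {"trait_type": "continuous", "phenocode": "50", "pheno_sex": "both_sexes",
     "coding": "", "modifier": "raw", "description": "example trait A, untransformed"},
]
index = index_manifest(rows)
```

Keying on all five fields rather than on the phenocode alone keeps variants of the same measurement, such as the two hypothetical modifiers above, as separate entries.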

Supplementary Data 3

Summary of all heritability metrics. Phenotypes are keyed as in Supplementary Data 2. For each ancestry group, we provide heritability estimates (observed, liability, standard errors and z scores) for LDSC and S-LDSC, and for ancestry groups other than EUR, also RHE-mc, as well as details of QC flags.

Supplementary Data 4

Pairwise genetic correlations. Genetic correlations (rg) from S-LDSC are computed for pairs of 528 phenotypes (phenotype_code_1 and phenotype_code_2) using summary statistics from EUR.

Supplementary Data 5

Pairwise phenotypic correlations. Covariates were regressed out from each of the 452 high-quality phenotypes, and pairwise correlations (entry) were computed for each pair of phenotypes (residuals), i (with phenotype identifier in i_data) and j (identifier in j_data). The correlation for all phenotypes is available at gs://ukb-diverse-pops-public/misc/pairwise/pairwise_correlations_regressed.txt.bgz.
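The procedure described here, regressing covariates out of each phenotype and then correlating the residuals, can be sketched in plain Python. For brevity this sketch assumes a single covariate and ordinary least squares with an intercept; the actual covariate set and software are not specified in this section, and the function names are hypothetical.

```python
import math


def residualize(y, x):
    """Residuals of y after least-squares regression on one covariate x (with intercept)."""
    n = len(y)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sxx
    return [yi - mean_y - slope * (xi - mean_x) for xi, yi in zip(x, y)]


def pearson(a, b):
    """Pearson correlation between two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b))
    var_a = sum((ai - mean_a) ** 2 for ai in a)
    var_b = sum((bi - mean_b) ** 2 for bi in b)
    return cov / math.sqrt(var_a * var_b)


if __name__ == "__main__":
    # Toy data: both phenotypes share a signal but depend on the covariate
    # in opposite directions, so the raw correlation is misleading.
    covariate = [1, 2, 3, 4, 5, 6]
    shared = [1, -1, 1, -1, 1, -1]
    pheno_i = [2 * c + s for c, s in zip(covariate, shared)]
    pheno_j = [-3 * c + s for c, s in zip(covariate, shared)]
    print(pearson(pheno_i, pheno_j))
    print(pearson(residualize(pheno_i, covariate), residualize(pheno_j, covariate)))
```

In the toy data the raw correlation is negative because the covariate pushes the two phenotypes in opposite directions, whereas the residual correlation recovers the shared signal.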

Supplementary Data 6

Polygenicity estimates. Polygenicity estimates (mean and s.d.) from SBayesS for 451 phenotypes, along with convergence criteria (R_GelmanRubin).

Supplementary Data 7

Summary statistics for key loci across GWAS methods. SAIGE AFR and SAIGE EUR refer to the SAIGE analyses performed on the AFR and EUR genetically inferred ancestry groups of UKB. Tractor AFR and Tractor EUR indicate the Tractor GWAS conducted on the AFR or EUR haplotype tracts, respectively, within the AFR group. Variants are filtered as described above in Tractor GWAS analysis.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Karczewski, K.J., Gupta, R., Kanai, M. et al. Pan-UK Biobank genome-wide association analyses enhance discovery and resolution of ancestry-enriched effects. Nat Genet 57, 2408–2417 (2025). https://doi.org/10.1038/s41588-025-02335-7

  • DOI: https://doi.org/10.1038/s41588-025-02335-7
