Abstract
Despite dozens of tools for identifying mutational signatures in cancer samples, there is no established metric for quantifying whether signature exposures differ significantly between two heterogeneous groups of samples. We demonstrate that a signature-agnostic metric, the aggregate mutation spectrum distance permutation method (AMSD), can rigorously determine whether mutational exposures differ between groups, a hypothesis that signature analysis does not directly address. First, we reanalyze a study of carcinogen exposure in mice, determining that eleven of twenty tested carcinogens produce significant mutation spectrum shifts. Only three of these carcinogens were previously reported to induce distinct mutational signatures, suggesting that many carcinogens perturb mutagenesis by altering the composition of endogenous signatures. Next, we interrogate whether patient ancestry has a measurable impact on human tumor mutation spectra, finding significant ancestry-associated differences across ten cancer types. Some of these differences have been previously reported, such as elevated SBS4 in African lung adenocarcinomas, while others, to our knowledge, have not, such as elevated SBS17a/b in European esophageal carcinomas. These examples suggest that AMSD is a robust tool for detecting differences among groups of tumors or other mutated samples, complementing descriptive signature deconvolution and enabling the discovery of environmental and genetic influences on mutagenesis.
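To make the idea of a permutation test on aggregate mutation spectra concrete, the following is a minimal Python sketch. It is not the mutspecdist implementation: the function names, the six-category toy counts, the choice of cosine distance, and the pooling of counts by summation are all assumptions made for illustration.

```python
import math
import random


def aggregate_spectrum(samples):
    """Sum per-sample mutation counts by category and normalize to fractions.

    Assumption: groups are aggregated by summing raw counts; averaging
    per-sample fractions would be an equally plausible choice.
    """
    n_categories = len(samples[0])
    totals = [sum(sample[i] for sample in samples) for i in range(n_categories)]
    grand_total = sum(totals)
    return [t / grand_total for t in totals]


def cosine_distance(p, q):
    """Return 1 minus the cosine similarity of two spectra (assumed metric)."""
    dot = sum(a * b for a, b in zip(p, q))
    norm = math.sqrt(sum(a * a for a in p)) * math.sqrt(sum(b * b for b in q))
    return 1.0 - dot / norm


def amsd_permutation_test(group_a, group_b, n_perm=1000, seed=0):
    """Compare the observed aggregate distance with a label-shuffled null."""
    rng = random.Random(seed)
    observed = cosine_distance(aggregate_spectrum(group_a),
                               aggregate_spectrum(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    n_at_least_observed = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # reassign samples to groups at random
        null_distance = cosine_distance(aggregate_spectrum(pooled[:n_a]),
                                        aggregate_spectrum(pooled[n_a:]))
        if null_distance >= observed:
            n_at_least_observed += 1
    # Add-one correction keeps the estimated p-value strictly above zero.
    p_value = (n_at_least_observed + 1) / (n_perm + 1)
    return observed, p_value


# Hypothetical toy counts per sample for six substitution classes
# (C>A, C>G, C>T, T>A, T>C, T>G); these numbers are made up.
group_a = [
    [12, 4, 60, 8, 11, 5],
    [10, 6, 55, 9, 14, 6],
    [11, 5, 62, 7, 10, 5],
    [9, 5, 58, 8, 13, 7],
    [13, 4, 57, 10, 12, 4],
]
group_b = [
    [58, 5, 12, 8, 12, 5],
    [61, 4, 10, 9, 11, 5],
    [55, 6, 13, 7, 14, 5],
    [60, 5, 11, 8, 10, 6],
    [57, 6, 9, 10, 13, 5],
]

observed, p_value = amsd_permutation_test(group_a, group_b, n_perm=1000, seed=1)
print(f"observed distance = {observed:.3f}, permutation p = {p_value:.4f}")
```

Because the null distribution is built by shuffling group labels among the same samples, the test makes no reference to any signature catalogue, which is the sense in which such a comparison is signature-agnostic.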
Data availability
All analyses in this study use publicly available datasets, and figures and results can be reproduced using the code available at https://github.com/sfhart33/AMSD_cancer_mutation_spectra. Preprocessed mutation spectra are included in the repository, while raw data can be accessed from:
• Mouse carcinogen exposure: https://github.com/team113sanger/mouse-mutatation-signatures/blob/master/starting_data/snvs.rds
• Asbestos exposure: https://github.com/IARCbioinfo/MESOMICS_data/tree/main/phenotypic_map/MESOMICS
• TCGA ancestry metadata: https://gdc.cancer.gov/about-data/publications/CCG-AIM-2020
• TCGA somatic mutations: https://gdc.cancer.gov/about-data/publications/mc3-2017
The original implementation of AMSD as a method for identifying mutator alleles is also available on GitHub: https://github.com/quinlan-lab/proj-mutator-mapping.
Code availability
The Aggregate Mutation Spectrum Distance permutation test is implemented as the R package “mutspecdist”, available at https://github.com/sfhart33/mutspecdist. The analysis code used to reproduce the figures and results of this study is available at https://github.com/sfhart33/AMSD_cancer_mutation_spectra.
Acknowledgements
We thank Harris and Feder lab members for figure feedback, Sayre Coombs for graphic design feedback, and Tom Sasani for developing the original implementation of AMSD and for manuscript feedback. This work was made possible by funding from NIH training grant T32-HG000035 supporting S.F.M.H., Worldwide Cancer Research grant 24-0106 to N.A., NIH grant 1DP2CA280623-01 to A.F.F., NIH NIGMS grant 2R35M133428-06 to K.H., and the Allen Discovery Center for Cell Lineage Tracing. Where authors are identified as personnel of the International Agency for Research on Cancer/World Health Organization, the authors alone are responsible for the views expressed in this article and they do not necessarily represent the decisions, policy or views of the International Agency for Research on Cancer/World Health Organization. The results published here are in part based on data generated by the TCGA Research Network (https://www.cancer.gov/tcga) and by the Rare Cancers Genomics initiative (www.rarecancersgenomics.com).
Author information
Contributions
S.F.M.H., A.F.F., and K.H. contributed to study conceptualization and design. S.F.M.H. performed the data analysis. S.F.M.H., N.A., A.F.F., and K.H. interpreted the results. S.F.M.H. wrote the original draft of the manuscript. S.F.M.H., N.A., A.F.F., and K.H. contributed to review and editing of the manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Peer review
Peer review information
Communications Biology thanks Laura Torrens for their contribution to the peer review of this work. Primary Handling Editor: Mengtan Xing. A peer review file is available.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Hart, S.F.M., Alcala, N., Feder, A.F. et al. A signature-agnostic test for differences between tumor mutation spectra reveals carcinogen and ancestry effects. Commun Biol (2026). https://doi.org/10.1038/s42003-026-09652-5


