  • Resource

The Curated Cancer Cell Atlas provides a comprehensive characterization of tumors at single-cell resolution

Abstract

Recent years have seen a rapid proliferation of single-cell cancer studies, yet most of these studies profiled few tumors, limiting their statistical power. Combining data and results across studies holds great promise but also involves various challenges. We recently began to address these challenges by curating a large collection of cancer single-cell RNA-sequencing datasets, leveraging it for systematic analyses of tumor heterogeneity. Here we greatly extend this repository to 124 datasets for over 40 cancer types, together comprising 2,836 samples, with improved data annotations, visualizations and exploration. Using this vast cohort, we generate an updated map of recurrent expression programs in malignant cells and systematically quantify context-dependent gene expression and cell-cycle patterns across cell types and cancer types. These data, annotations and analysis results are all freely available for exploration and download through the Curated Cancer Cell Atlas, a central community resource that opens new avenues in cancer research.

Fig. 1: Overview of 3CA and data summary features.
Fig. 2: Data query features on the 3CA website.
Fig. 3: Updated MPs in malignant cells.
Fig. 4: Context dependency of gene expression.
Fig. 5: Quantification and comparison of cell-cycle patterns.

Data availability

This study used only external datasets and did not involve the generation of new data. All published single-cell datasets are available on the 3CA website (https://www.weizmann.ac.il/sites/3CA/), with the exception of one dataset (ref. 47), for which permission for sharing through 3CA was not granted; this dataset is available through the European Genome–Phenome Archive under accession number EGAS00001002543. Additional unpublished datasets used will be added to the 3CA website when possible. TCGA data were obtained online (http://gdac.broadinstitute.org/). Data for reproducing the figures in this article are provided in the Supplementary Information. Source data are provided with this paper.

Code availability

Source code for all analyses in this study and for generating the figures available on the 3CA website is available on GitHub (https://github.com/tiroshlab/3ca).

References

  1. Marusyk, A., Janiszewska, M. & Polyak, K. Intratumor heterogeneity: the Rosetta stone of therapy resistance. Cancer Cell 37, 471–484 (2020).

  2. Tirosh, I. et al. Single-cell RNA-seq supports a developmental hierarchy in human oligodendroglioma. Nature 539, 309–313 (2016).

  3. Puram, S. V. et al. Single-cell transcriptomic analysis of primary and metastatic tumor ecosystems in head and neck cancer. Cell 171, 1611–1624 (2017).

  4. Elyada, E. et al. Cross-species single-cell analysis of pancreatic ductal adenocarcinoma reveals antigen-presenting cancer-associated fibroblasts. Cancer Discov. 9, 1102–1123 (2019).

  5. Vegliante, R., Pastushenko, I. & Blanpain, C. Deciphering functional tumor states at single-cell resolution. EMBO J. 41, e109221 (2022).

  6. Aran, D. Single-cell RNA sequencing for studying human cancers. Annu. Rev. Biomed. Data Sci. 6, 1–22 (2023).

  7. Gavish, A. et al. Hallmarks of transcriptional intratumour heterogeneity across a thousand tumours. Nature 618, 598–606 (2023).

  8. Han, Y. et al. TISCH2: expanded datasets and new tools for single-cell transcriptome analyses of the tumor microenvironment. Nucleic Acids Res. 51, D1425–D1431 (2023).

  9. Patel, A. P. et al. Single-cell RNA-seq highlights intratumoral heterogeneity in primary glioblastoma. Science 344, 1396–1401 (2014).

  10. Zaghlool, A. et al. Characterization of the nuclear and cytosolic transcriptomes in human brain tissue reveals new insights into the subcellular distribution of RNA transcripts. Sci. Rep. 11, 4076 (2021).

  11. Zheng, L. et al. Pan-cancer single-cell landscape of tumor-infiltrating T cells. Science 374, abe6474 (2021).

  12. Tang, F. et al. A pan-cancer single-cell panorama of human natural killer cells. Cell 186, 4235–4251 (2023).

  13. Cheng, S. et al. A pan-cancer single-cell transcriptional atlas of tumor infiltrating myeloid cells. Cell 184, 792–809 (2021).

  14. Buechler, M. B. et al. Cross-tissue organization of the fibroblast lineage. Nature 593, 575–579 (2021).

  15. Whitfield, M. L. et al. Identification of genes periodically expressed in the human cell cycle and their expression in tumors. Mol. Biol. Cell 13, 1977–2000 (2002).

  16. Cohen, H. T. & McGovern, F. J. Renal-cell carcinoma. N. Engl. J. Med. 353, 2477–2490 (2005).

  17. Puram, S. V. et al. Cellular states are coupled to genomic and viral heterogeneity in HPV-related oropharyngeal carcinoma. Nat. Genet. 55, 640–650 (2023).

  18. Lawrence, M. S. et al. Discovery and saturation analysis of cancer genes across 21 tumour types. Nature 505, 495–501 (2014).

  19. Graham, S. V. The human papillomavirus replication cycle, and its links to cancer progression: a comprehensive review. Clin. Sci. 131, 2201–2221 (2017).

  20. CZI Cell Science Program et al. CZ CELL×GENE Discover: a single-cell data platform for scalable exploration, analysis and modeling of aggregated data. Nucleic Acids Res. 53, D886–D900 (2025).

  21. Camps, J. et al. Meta-analysis of human cancer single-cell RNA-seq datasets using the IMMUcan database. Cancer Res. 83, 363–373 (2023).

  22. Zeng, J. et al. CancerSCEM: a database of single-cell expression map across various human cancers. Nucleic Acids Res. 50, D1147–D1155 (2022).

  23. Yuan, H. et al. CancerSEA: a cancer single-cell state atlas. Nucleic Acids Res. 47, D900–D908 (2019).

  24. Franzén, O., Gan, L.-M. & Björkegren, J. L. M. PanglaoDB: a web server for exploration of mouse and human single-cell RNA sequencing data. Database 2019, baz046 (2019).

  25. Xu, C. et al. Probabilistic harmonization and annotation of single-cell transcriptomics data with deep generative models. Mol. Syst. Biol. 17, e9620 (2021).

  26. Korsunsky, I. et al. Fast, sensitive and accurate integration of single-cell data with Harmony. Nat. Methods 16, 1289–1296 (2019).

  27. Stuart, T. et al. Comprehensive integration of single-cell data. Cell 177, 1888–1902 (2019).

  28. Luecken, M. D. et al. Benchmarking atlas-level data integration in single-cell genomics. Nat. Methods 19, 41–50 (2022).

  29. Wang, L. et al. A single-cell atlas of glioblastoma evolution under therapy reveals cell-intrinsic and cell-extrinsic therapeutic targets. Nat. Cancer 3, 1534–1552 (2022).

  30. Hwang, W. L. et al. Single-nucleus and spatial transcriptome profiling of pancreatic cancer identifies multicellular dynamics associated with neoadjuvant treatment. Nat. Genet. 54, 1178–1191 (2022).

  31. Griffiths, J. I. et al. Serial single-cell genomics reveals convergent subclonal evolution of resistance as patients with early-stage breast cancer progress on endocrine plus CDK4/6 therapy. Nat. Cancer 2, 658–671 (2021).

  32. Nassiri, F. et al. A clinically applicable integrative molecular classification of meningiomas. Nature 597, 119–125 (2021).

  33. Jansky, S. et al. Single-cell transcriptomic analyses provide insights into the developmental origins of neuroblastoma. Nat. Genet. 53, 683–693 (2021).

  34. Kim, C. et al. Chemoresistance evolution in triple-negative breast cancer delineated by single-cell sequencing. Cell 173, 879–893 (2018).

  35. 10x Genomics. Chromium Fixed RNA Profiling Reagent Kits for Multiplexed Samples. Report No. CG000527 (10x Genomics, 2023).

  36. Chung, H. et al. SnFFPE-seq: towards scalable single nucleus RNA-seq of formalin-fixed paraffin-embedded (FFPE) tissue. Preprint at bioRxiv https://doi.org/10.1101/2022.08.25.505257 (2022).

  37. Wang, T. et al. snPATHO-seq, a versatile FFPE single-nucleus RNA sequencing method to unlock pathology archives. Commun. Biol. 7, 1340 (2024).

  38. Xu, Z. et al. High-throughput single nucleus total RNA sequencing of formalin-fixed paraffin-embedded tissues by snRandom-seq. Nat. Commun. 14, 2734 (2023).

  39. Phan, H. V. et al. High-throughput RNA sequencing of paraformaldehyde-fixed single cells. Nat. Commun. 12, 5636 (2021).

  40. Baglama, J. & Reichel, L. Augmented implicitly restarted Lanczos bidiagonalization methods. SIAM J. Sci. Comput. 27, 19–42 (2005).

  41. Biermann, J. et al. Dissecting the treatment-naive ecosystem of human melanoma brain metastasis. Cell 185, 2591–2608 (2022).

  42. Nath, A. et al. Evolution of core archetypal phenotypes in progressive high grade serous ovarian cancer. Nat. Commun. 12, 3039 (2021).

  43. Wang, L. et al. The phenotypes of proliferating glioblastoma cells reside on a single axis of variation. Cancer Discov. 9, 1708–1719 (2019).

  44. Neftel, C. et al. An integrative model of cellular states, plasticity, and genetics for glioblastoma. Cell 178, 835–849 (2019).

  45. Kinker, G. S. et al. Pan-cancer single-cell RNA-seq identifies recurring programs of cellular heterogeneity. Nat. Genet. 52, 1208–1218 (2020).

  46. Li, B. & Dewey, C. N. RSEM: accurate transcript quantification from RNA-seq data with or without a reference genome. BMC Bioinformatics 12, 323 (2011).

  47. Chan-Seng-Yue, M. et al. Transcription phenotypes of pancreatic cancer are driven by genomic events during tumor evolution. Nat. Genet. 52, 231–240 (2020).

  48. Raghavan, S. et al. Microenvironment drives cell state, plasticity, and drug response in pancreatic cancer. Cell 184, 6119–6137 (2021).

  49. Kürten, C. H. L. et al. Investigating immune and non-immune cell interactions in head and neck tumors by single-cell RNA sequencing. Nat. Commun. 12, 7338 (2021).

  50. Pal, B. et al. A single‐cell RNA expression atlas of normal, preneoplastic and tumorigenic states in the human breast. EMBO J. 40, e107333 (2021).

  51. Pelka, K. et al. Spatially organized multicellular immune hubs in human colorectal cancer. Cell 184, 4734–4752 (2021).

  52. Venteicher, A. S. et al. Decoupling genetics, lineages, and microenvironment in IDH-mutant gliomas by single-cell RNA-seq. Science 355, eaai8478 (2017).

  53. Best, D. J. & Roberts, D. E. Algorithm AS 89: the upper tail probabilities of Spearman’s rho. J. R. Stat. Soc. C Appl. Stat. 24, 377–379 (1975).

Acknowledgements

We thank all members of the I.T. lab for helpful discussions and comments. We thank the Weizmann Information Technology Internet and Mobile Development Team for their work on the 3CA website. This work was supported by the Israeli Council for Higher Education through the Weizmann Data Science Research Center and by the Israel Science Foundation. I.T. is the incumbent of the Dr. Celia Zwillenberg-Fridman and Dr. Lutz Zwillenberg Career Development Chair and is supported by the Zuckerman Science, Technology, Engineering and Mathematics Leadership Program, the Mexican Friends New Generation and the Benoziyo Endowment Fund. R.H. is funded by the Walter Benjamin Program from the German Research Foundation.

Author information

Contributions

M.T. and I.T. conceptualized and designed the study, interpreted the results and wrote the paper. M.T. performed the majority of computational analyses, generated the data visualizations for the 3CA website and managed the hosting of data on the website for public download. A.G. performed the analysis defining MPs in malignant cells and quantifying the relative contributions of scRNA-seq and snRNA-seq profiles. C.B. and M.T. performed the analysis of cancer type specificity of cell types. R.T., R.H. and M.M. curated datasets for 3CA. S.V.P. contributed scRNA-seq data for head and neck tumors. A.G., R.T., R.H., M.M. and S.V.P. reviewed the paper and provided feedback. The study was cosupervised by M.T. and I.T.

Corresponding authors

Correspondence to Michael Tyler or Itay Tirosh.

Ethics declarations

Competing interests

I.T. is an advisory board member of Immunitas Therapeutics. The other authors declare no competing interests.

Peer review

Peer review information

Nature Cancer thanks Francisco Martínez Jiménez and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 Overview of 3CA cell annotation and data summary features.

a. Dot plot showing the average expression level (colour) and percentage expressing cells (point size) of a selection of cell type marker genes (rows) in each cell type (columns) in the Raghavan et al. dataset (ref. 48). b. Heatmap showing inferred copy number alteration (CNA) values (colour, quantified as log2 ratio, with blue indicating depletion and red amplification) at each chromosomal position (columns) for a representative subset of cells (rows) in the Raghavan et al. dataset (ref. 48), with colour bar (left) showing the sample each cell belongs to. c. UMAP plot of all cells (points) in the Raghavan et al. dataset (ref. 48), coloured by sample, with the same colours as in b.

Extended Data Fig. 2 Cancer-type-dependent gene expression patterns.

a. Scatter plots per cancer type showing sensitivity (y axis) and specificity (x axis) of genes (points) to malignant cells, relative to other cell types, within each cancer type. Selected genes with unusually high sensitivity or specificity are labelled. b. Scatter plots per cancer type showing sensitivity (y axis) and specificity (x axis) of genes (points) to malignant cells in each cancer type, relative to malignant cells in other cancer types. Selected genes with unusually high sensitivity or specificity are labelled.

Extended Data Fig. 3 Interactive web tool for exploring gene specificity and sensitivity in different contexts.

Screenshot of 3CA web page enabling exploration of gene specificity and sensitivity values in sortable table format. Tabs distinguish analyses of global cell type markers, cancer-type-specific malignant cell markers and genes with highly variable expression between cancer types in malignant cells. In each tab, the user can select a cell type or cancer type and view the specificity and sensitivity values in a table, with the option to sort by either value or by a ‘Combined’ score reflecting overall marker performance. The corresponding summary scatter plot from Fig. 4b or Extended Data Fig. 2 is shown alongside.

Extended Data Fig. 4 Number of cancer-type-specific genes per cell type.

a. Scatter plot to illustrate the definition of cancer-type-specific gene expression at different thresholds. Points correspond to genes whose mean expression in malignant cells is higher in HCC than in all other cancer types. A point's y axis value denotes the average expression of this gene in malignant cells in HCC, while its x axis value corresponds to the maximum of its mean expression levels in malignant cells across all non-HCC cancer types. Each dashed line denotes a choice of threshold, whereby the number of genes whose expression in malignant cells is specific to HCC is defined as the number of points above this dashed line. b. Line plot showing the median number of cancer-type-specific genes (y axis, median across cancer types) for each cell type (x axis) for different choices of threshold (colour). Cell types are ordered by their average y values. c. Boxplots showing, for each choice of threshold (panels), the log-transformed number of cancer-type-specific genes (y axis) per cancer type (points/colour) for each cell type (x axis). Cell types are ordered as in b. Each point in b corresponds to the median of points for the corresponding box in c, after reversing the log transformation. Boxes indicate the median and 1st and 3rd quartiles, while the upper and lower whiskers extend to the maximal and minimal values, respectively, that lie no further than 1.5 times the interquartile range from the 3rd and 1st quartiles, respectively. Groups (cell types) consist of n = 20, 18, 21, 15, 17, 10, 11, 22 data points (in order from left to right), corresponding to differences in average expression levels across biologically distinct samples.
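
The threshold-based count described for panel a can be illustrated with a minimal Python sketch. It assumes that each dashed line has the form y = x + t on a log-expression scale, so that a gene counts as specific when its mean expression in the focal cancer type exceeds its maximum mean expression elsewhere by at least t. This line form, the function name `count_specific_genes`, the cancer type labels other than HCC and all toy values are illustrative assumptions, not the authors' code or data.

```python
def count_specific_genes(mean_expr, focal, threshold):
    """Count genes specific to the `focal` cancer type at a given threshold.

    mean_expr maps gene -> {cancer_type: mean expression in malignant cells}.
    A gene is counted when (y - x) >= threshold, where y is its mean in the
    focal cancer type and x is the maximum of its means in all other types.
    The additive form of the threshold is an assumption of this sketch.
    """
    count = 0
    for by_type in mean_expr.values():
        if focal not in by_type:
            continue
        others = [v for ct, v in by_type.items() if ct != focal]
        if others and by_type[focal] - max(others) >= threshold:
            count += 1
    return count


# Hypothetical toy values (not taken from the atlas).
toy = {
    "gene_A": {"HCC": 8.0, "type_2": 1.0, "type_3": 0.5},
    "gene_B": {"HCC": 5.0, "type_2": 3.5, "type_3": 0.2},
    "gene_C": {"HCC": 7.0, "type_2": 7.2, "type_3": 6.9},
}
counts = {t: count_specific_genes(toy, "HCC", t) for t in (1, 2, 4)}
print(counts)
```

Stricter thresholds can only keep or reduce the count, which is why each threshold in panel b gives its own curve.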

Extended Data Fig. 5 Cancer type and patient specificity of cell type expression profiles.

a. Scheme illustrating the definition of cancer type and patient specificity in terms of pairwise correlations of pseudobulk profiles. b. Box plot showing the patient specificity (y axis) of each cell type (x axis) in each cancer type (points/colour). Brackets indicate significant differences and are labelled with p values (0.0091 and 0.017), which were computed by pairwise paired, two-sided t tests and adjusted to FDR < 0.05 (all p values are provided in the Source Data). Unmarked pairwise differences are not significant. Boxes indicate the median and 1st and 3rd quartiles, while the upper and lower whiskers extend to the maximal and minimal values, respectively, that lie no further than 1.5 times the interquartile range from the 3rd and 1st quartiles, respectively. Groups (cell types) consist of n = 15, 11, 11, 11, 10, 8, 16 data points (in order from left to right), corresponding to averages across pairwise correlations between biologically distinct samples.
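
The box and whisker convention used in these legends (whiskers reach the most extreme values within 1.5 times the interquartile range of the quartiles) can be sketched in Python as follows. The quartile interpolation method is an assumption here, since the legend does not state which one was used; the function name `box_plot_stats` and the input values are hypothetical.

```python
import statistics


def box_plot_stats(values):
    """Return quartiles, whisker ends and outliers for one box.

    Quartiles use linear interpolation between data points
    (method="inclusive"); this choice is an assumption of the sketch.
    """
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    inside = [v for v in values if lower_fence <= v <= upper_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        # Whiskers end at the most extreme observed values inside the fences.
        "lower_whisker": min(inside),
        "upper_whisker": max(inside),
        "outliers": sorted(v for v in values if v < lower_fence or v > upper_fence),
    }


stats = box_plot_stats([1, 2, 3, 4, 5, 6, 7, 8, 30])
print(stats)
```

In this toy input the value 30 lies beyond the upper fence, so the upper whisker stops at 8 and 30 is reported as an outlier.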

Extended Data Fig. 6 Potential technical confounders of cell cycle estimates.

a. Scatter plots showing cell cycle proportion and phase bias from bespoke (y axis) and consensus (x axis) G1/S and G2/M gene signatures. Each point corresponds to one cell type in one dataset. Red lines denote y = x, and r denotes Pearson correlation. b. Bar plot showing average percentage of cycling cells in 10x datasets (y axis) in each cell type (x axis; n = 53, 51, 33, 9, 58, 26, 17, 42, 29, 44, 19, 45, in order from left to right). Error bars denote standard error. c. Heatmap showing Spearman correlation between cell types (colour) of percentages of cycling cells in 10x datasets. Significant correlations are labelled with p values (2.0 × 10⁻², 8.1 × 10⁻⁵, 2.2 × 10⁻², 5.6 × 10⁻³, 2.0 × 10⁻², 2.0 × 10⁻², 3.9 × 10⁻⁶, 5.1 × 10⁻⁴, 1.6 × 10⁻³, 5.9 × 10⁻³, 6.1 × 10⁻⁶, 5.9 × 10⁻⁶, 1.3 × 10⁻⁴, 1.1 × 10⁻⁴, 7.1 × 10⁻³, 8.6 × 10⁻³, 1.2 × 10⁻³, 5.9 × 10⁻⁶, 4.8 × 10⁻², 1.5 × 10⁻², 4.2 × 10⁻², 9.3 × 10⁻⁴, 2.0 × 10⁻², 1.1 × 10⁻²³, 5.9 × 10⁻³, in order from left to right and top to bottom), which were computed by two-tailed test of zero correlation via algorithm AS 89 (ref. 53), and adjusted to FDR < 0.05 (all p values are provided in the Source Data). d. Bar plot showing percentage of cycling malignant cells (y axis) in each 10x dataset (bars), grouped by cancer type (x axis). Crosses denote the average for each cancer type, weighted by number of samples containing at least 10 malignant cells. Bar colour categorises studies by number of such samples, and values above the plot denote the total number of such samples per cancer type. e. Bar plot showing phase bias (y axis) of malignant cells in each 10x dataset (bars), grouped by cancer type (x axis). Crosses, bar colour and number of samples per cancer type are as in d. f. Scatter plots showing, for each cell type and sequencing platform, percentage of cycling cells (y axis) against number of detected genes (x axis) in each dataset (points). Regression lines and Pearson correlation were computed with and without outliers (red and blue respectively, or purple in cases with no outliers). Average correlations and p values (computed by two-sided t test, without adjustment) are shown at the top. g-i. Scatter plots as in a showing: percentage of cycling cells against number of captured cells; phase bias against number of detected genes; phase bias against number of captured cells.
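
The crosses in panels d and e are sample-weighted averages per cancer type. A minimal Python sketch of that weighting follows, assuming each dataset supplies a single percentage of cycling malignant cells and a list of malignant cell counts per sample. The record layout, the function name `weighted_cycling_by_cancer_type` and the toy numbers are hypothetical and are not taken from the 3CA code.

```python
from collections import defaultdict


def weighted_cycling_by_cancer_type(datasets, min_cells=10):
    """Average percentage of cycling cells per cancer type.

    Each dataset is weighted by its number of samples that contain at
    least `min_cells` malignant cells. Returns, per cancer type, a tuple
    (weighted average, total number of qualifying samples).
    """
    weighted_sum = defaultdict(float)
    total_weight = defaultdict(int)
    for d in datasets:
        weight = sum(1 for n in d["malignant_cells_per_sample"] if n >= min_cells)
        if weight == 0:
            continue  # the dataset contributes no qualifying samples
        weighted_sum[d["cancer_type"]] += weight * d["pct_cycling"]
        total_weight[d["cancer_type"]] += weight
    return {
        ct: (weighted_sum[ct] / total_weight[ct], total_weight[ct])
        for ct in total_weight
    }


# Hypothetical toy datasets.
toy_datasets = [
    {"cancer_type": "A", "pct_cycling": 10.0,
     "malignant_cells_per_sample": [50, 12, 3]},
    {"cancer_type": "A", "pct_cycling": 20.0,
     "malignant_cells_per_sample": [100, 40, 25, 11, 10, 30, 9]},
]
summary = weighted_cycling_by_cancer_type(toy_datasets)
print(summary)
```

With these toy inputs the first dataset has 2 qualifying samples and the second has 6, so the larger study pulls the average toward its own value.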

Extended Data Fig. 7 Correlation of cell cycle between cell types, per cancer type.

Bar plots, for each pair of cell types, showing the Spearman correlation (across samples) of proportion of cycling cells between those cell types in each cancer type. Significant correlations are labelled with p values (2.1 × 10⁻³, 2.7 × 10⁻², 2.8 × 10⁻⁴, 6.1 × 10⁻³, 2.7 × 10⁻², 1.7 × 10⁻², 1.2 × 10⁻², 2.1 × 10⁻⁵, 2.9 × 10⁻², 6.9 × 10⁻⁷, 3.2 × 10⁻⁴, 4.4 × 10⁻⁵, 4.9 × 10⁻³, 5.2 × 10⁻³, 2.7 × 10⁻⁴, 7.9 × 10⁻⁴, 4.1 × 10⁻⁵, 4.1 × 10⁻⁵, 6.5 × 10⁻³, 4.4 × 10⁻⁵, 7.8 × 10⁻³, 0, 3.5 × 10⁻², 4.3 × 10⁻⁷, 2.1 × 10⁻², in order from left to right and top to bottom), which were computed by two-tailed test of zero correlation via algorithm AS 89 (ref. 53) and adjusted to FDR < 0.05 (all p values are provided in the Source Data).
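
The legend states that p values were adjusted to FDR < 0.05 but does not name the adjustment procedure. As an illustration only, the sketch below applies the Benjamini-Hochberg step-up adjustment, which is a common choice and is an assumption here; the function name and the input p values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p values in the input order."""
    m = len(p_values)
    # Indices of the p values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


raw = [0.01, 0.04, 0.03, 0.20]
adjusted = benjamini_hochberg(raw)
significant = [q < 0.05 for q in adjusted]
print(adjusted, significant)
```

In this toy input only the smallest p value stays below 0.05 after adjustment, even though three of the four raw values are below 0.05.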

Extended Data Fig. 8 Proliferation rates across cancer types for non-malignant cell types.

a-j. Bar plots, for each of the most common non-malignant cell types, showing the percentage of cycling cells of that type (y axis) in each study (bars), grouped by cancer type (x axis), with crosses denoting the average y value for each cancer type, weighted by the number of samples in each study which contain at least 10 cells of that type. Bar colour categorises studies by number of such samples, and values above the plot denote the total number of such samples in each cancer type.

Extended Data Fig. 9 Phase bias patterns across cancer types.

a. Bar plot showing the phase bias (y axis, quantified by the relative fraction of cycling cells in G1/S versus G2/M) of T cells in each study (bars), grouped by cancer type (x axis), with crosses denoting the average y value for each cancer type, weighted by the number of samples in each study which contain at least 10 T cells. Bar colour categorises studies by number of such samples, and values above the plot denote the total number of such samples in each cancer type. Low and high y values indicate bias toward G2/M and G1/S, respectively. b. Bar plot as in a for macrophages. c. Heatmap showing the weighted average of the phase bias (colour, defined as for the crosses in a) per cancer type (x axis) and cell type (y axis). Grey squares indicate insufficient data.

Extended Data Fig. 10 Genomic associations of phase bias.

a. Box plot showing the phase bias scores (y axis, defined as the difference between scores for G1/S and G2/M gene signatures) of TCGA tumour samples (points; n = 334, 530, 52, 284, 161, 127, 73, 37, 241, 238, 191, 197, 20, 220, 101, 189, 189, 18, 56, 177, 106, 45, 139, 80, 309, 85, 385, 87, 86, 47, in order from left to right), grouped by cancer type (x axis) and coloured by TP53 mutation status. Brackets are labelled with p values (6.1 × 10⁻³³, 2.2 × 10⁻⁸, 8.3 × 10⁻⁹, 7.6 × 10⁻⁵, 3.5 × 10⁻⁶, 5.4 × 10⁻³, 0.039, 0.019, 0.17, 0.019, 0.17, 0.23, 0.19, 0.80, 0.76, in order from left to right), which were computed by two-sided t test and adjusted to FDR < 0.05. Low and high y values indicate bias toward G2/M and G1/S, respectively. Boxes indicate the median and 1st and 3rd quartiles, while the upper and lower whiskers extend to the maximal and minimal values, respectively, that lie no further than 1.5 times the interquartile range from the 3rd and 1st quartiles, respectively. b. Dot plot showing, for a selection of genes commonly mutated in cancer (x axis), the difference in average phase bias score between mutant and wild-type tumours (point colour; phase bias score defined as in a) and the statistical significance of this difference (point size, computed as in a, before adjustment) in each cancer type (y axis). c. Box plot as in a for RB1 mutations (n = 33, 439, 30, 449, 19, 271, 73, 315, 16, 224, 27, 106; p = 0.0065, 0.049, 0.088, 0.22, 0.75, 0.75; both in order from left to right).
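
The phase bias score in panel a is defined as the difference between the G1/S and G2/M signature scores, and panel b compares its average between mutant and wild-type tumours. The Python sketch below illustrates both steps under the assumption that a signature score is the mean expression of the signature genes; the legend does not specify the scoring method. The gene lists are short placeholders rather than the signatures in Supplementary Data 6, and all expression values are invented.

```python
from statistics import mean


def phase_bias_score(expr, g1s_genes, g2m_genes):
    """G1/S signature score minus G2/M signature score for one tumour.

    Each signature score is taken as the mean expression of its genes,
    which is an assumption of this sketch. Positive values indicate bias
    toward G1/S and negative values bias toward G2/M.
    """
    g1s = mean(expr[g] for g in g1s_genes)
    g2m = mean(expr[g] for g in g2m_genes)
    return g1s - g2m


def mutant_minus_wildtype(scores, is_mutant):
    """Difference in average phase bias score (mutant minus wild type)."""
    mutant = [s for s, m in zip(scores, is_mutant) if m]
    wildtype = [s for s, m in zip(scores, is_mutant) if not m]
    return mean(mutant) - mean(wildtype)


# Placeholder signatures and invented expression values.
G1S = ["MCM2", "PCNA"]
G2M = ["TOP2A", "CCNB1"]
tumours = [
    {"MCM2": 4.0, "PCNA": 6.0, "TOP2A": 3.0, "CCNB1": 5.0},
    {"MCM2": 2.0, "PCNA": 4.0, "TOP2A": 5.0, "CCNB1": 7.0},
    {"MCM2": 5.0, "PCNA": 5.0, "TOP2A": 2.0, "CCNB1": 4.0},
]
tp53_mutant = [False, True, False]

scores = [phase_bias_score(t, G1S, G2M) for t in tumours]
delta = mutant_minus_wildtype(scores, tp53_mutant)
print(scores, delta)
```

A negative difference in this toy example means the mutant tumour is shifted toward G2/M relative to the wild-type tumours.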

Supplementary information

Reporting Summary

Supplementary Data 1

Cell type marker genes used to verify the assignment of nonmalignant cell types during data processing.

Supplementary Data 2

Top 50 signature genes of each NMF program and the MP it contributes to (Fig. 3a,c). NMF programs in the table are in the same order as in Fig. 3a.

Supplementary Data 3

Top 50 signature genes of each MP, ranked by the number of datasets whose NMF programs contributed each gene to this MP (Fig. 3a,b,d,f).

Supplementary Data 4

Mean expression of each gene in each cell type and cancer type (Fig. 4b–e and Extended Data Figs. 2 and 4).

Supplementary Data 5

Proportion of expressing cells for each gene in each cell type and cancer type (Fig. 4c–e).

Supplementary Data 6

G1/S and G2/M gene signatures, both consensus and dataset specific, used to assign cells to cell-cycle phases. An entry of 1 represents membership of a gene to the corresponding signature.
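
A membership table of this kind, where an entry of 1 marks a gene as belonging to a signature, can be converted into gene lists as sketched below in Python. The comma-separated layout, the column names and the listed genes are hypothetical stand-ins, since the actual file format and contents of Supplementary Data 6 are not described here.

```python
import csv
import io


def signatures_from_membership(table_text, gene_column="gene"):
    """Convert a gene-by-signature 0/1 table into {signature: [genes]}."""
    reader = csv.DictReader(io.StringIO(table_text))
    signature_columns = [c for c in reader.fieldnames if c != gene_column]
    signatures = {c: [] for c in signature_columns}
    for row in reader:
        for c in signature_columns:
            # An entry of 1 represents membership of the gene in signature c.
            if row[c].strip() == "1":
                signatures[c].append(row[gene_column])
    return signatures


# Hypothetical table contents.
toy_table = """gene,consensus_G1S,consensus_G2M
MCM2,1,0
TOP2A,0,1
PCNA,1,0
ACTB,0,0
"""
signatures = signatures_from_membership(toy_table)
print(signatures)
```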

Supplementary Data 7

Phase bias estimates and presence of mutations in common cancer genes in TCGA RNA-seq data (Fig. 5g and Extended Data Fig. 10).

Source data

Source Data Fig. 1

Statistical source data.

Source Data Fig. 2

Statistical source data.

Source Data Fig. 3

Statistical source data.

Source Data Fig. 4

Statistical source data.

Source Data Fig. 5

Statistical source data.

Source Data Extended Data Fig. 1

Statistical source data.

Source Data Extended Data Fig. 2

Statistical source data.

Source Data Extended Data Fig. 4

Statistical source data.

Source Data Extended Data Fig. 5

Statistical source data.

Source Data Extended Data Fig. 6

Statistical source data.

Source Data Extended Data Fig. 7

Statistical source data.

Source Data Extended Data Fig. 8

Statistical source data.

Source Data Extended Data Fig. 9

Statistical source data.

Source Data Extended Data Fig. 10

Statistical source data.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Tyler, M., Gavish, A., Barbolin, C. et al. The Curated Cancer Cell Atlas provides a comprehensive characterization of tumors at single-cell resolution. Nat Cancer 6, 1088–1101 (2025). https://doi.org/10.1038/s43018-025-00957-8

  • DOI: https://doi.org/10.1038/s43018-025-00957-8
