
  • Technical Report
  • Published:

PRET is a few-shot system for pan-cancer recognition without example training

Abstract

Pathological examination is the cornerstone of cancer diagnosis, affecting millions of people worldwide each year. Given the global shortage of pathologists, artificial intelligence (AI) has rapidly emerged to automate the diagnostic process. However, conventional AI models require substantial labeled data for each disease, posing major challenges to scalability and practicality. We therefore introduce PRET (pan-cancer recognition without example training), a few-shot system that achieves flexible, scalable and effective cancer recognition across diverse organs, hospitals and tasks without training. Evaluated on 23 international benchmarks comprising 4,484 whole-slide images, our method outperforms existing approaches across 20 tasks, achieving over 97% area under the curve on 15 benchmarks with a maximum improvement of 36.76%. Notably, PRET delivers clinical-grade diagnostic performance in lymph node metastasis detection using only eight slide examples, outperforming 11 pathologists. By offering a flexible and cost-effective solution for pan-cancer recognition, PRET paves the way for accessible and equitable AI-based pathology systems, particularly benefiting minority populations and underserved regions.


Fig. 1: Overview.
Fig. 2: Performances of cancer screening and subtyping.
Fig. 3: Performance in tumor segmentation.
Fig. 4: Clinical-grade performances on lymph node metastasis detection with strong generalization.
Fig. 5: Generalization performances on minority populations, underserved regions and external hospitals.

Data availability

The in-house datasets from GDPH and QPCH are publicly available (https://huggingface.co/datasets/yili7eli/PRET/tree/main), including ESCC, PTC, CRC, GC, LC, BC, lymphoma, NSCLC-HQ and PTC-QP. The visual prompts for both the in-house and open datasets have also been released, together with data lists to reproduce the data splits. The public datasets CAMELYON16 (ref. 28) and CAMELYON17 (ref. 30) are available online (https://camelyon16.grand-challenge.org/), and CAMELYON16-C (ref. 31) was generated by applying scanning corruptions, with code available from GitHub (https://github.com/superjamessyx/robustness_benchmark). TCGA datasets, including NSCLC, RCC, ESCA and SARC, can be found at the National Institutes of Health Genomic Data Commons (https://portal.gdc.cancer.gov/). Source data are provided with this paper.
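For convenience, the released in-house datasets and visual prompts can also be fetched programmatically. The snippet below is a minimal sketch, assuming the huggingface_hub package is installed and that the repository keeps the layout shown at the URL above; the local path handling is purely illustrative.

```python
# Minimal sketch: download the released PRET datasets and visual prompts
# from the Hugging Face Hub. Assumes `pip install huggingface_hub`.
from huggingface_hub import snapshot_download

local_dir = snapshot_download(
    repo_id="yili7eli/PRET",  # repository named in the data availability statement
    repo_type="dataset",      # the release is hosted as a dataset repository
)
print(f"Datasets downloaded to: {local_dir}")
```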

Code availability

All model weights and Python packages used in this work are available online. Our code is publicly available on GitHub (https://github.com/xmed-lab/PRET), with detailed instructions, comments and evaluation scripts.

References

  1. Ferlay, J. et al. Cancer statistics for the year 2020: an overview. Int. J. Cancer 149, 778–789 (2021).

  2. Benediktsson, H., Whitelaw, J. & Roy, I. Pathology services in developing countries: a challenge. Arch. Pathol. Lab. Med. 131, 1636–1639 (2007).

  3. Metter, D. M., Colgan, T. J., Leung, S. T., Timmons, C. F. & Park, J. Y. Trends in the US and Canadian pathologist workforces from 2007 to 2017. JAMA Netw. Open 2, e194337 (2019).

  4. Märkl, B., Füzesi, L., Huss, R., Bauer, S. & Schaller, T. Number of pathologists in Germany: comparison with European countries, USA, and Canada. Virchows Arch. 478, 335–341 (2021).

  5. Bera, K., Schalper, K. A., Rimm, D. L., Velcheti, V. & Madabhushi, A. Artificial intelligence in digital pathology—new tools for diagnosis and precision oncology. Nat. Rev. Clin. Oncol. 16, 703–715 (2019).

  6. Song, A. H. et al. Artificial intelligence for digital and computational pathology. Nat. Rev. Bioeng. 1, 930–949 (2023).

  7. Campanella, G. et al. Clinical-grade computational pathology using weakly supervised deep learning on whole slide images. Nat. Med. 25, 1301–1309 (2019).

  8. Huang, S.-C. et al. Deep neural network trained on gigapixel images improves lymph node metastasis detection in clinical settings. Nat. Commun. 13, 3347 (2022).

  9. Kundra, R. et al. OncoTree: a cancer classification system for precision oncology. JCO Clin. Cancer Inform. 5, 221–230 (2021).

  10. Lu, M. Y. et al. Data-efficient and weakly supervised computational pathology on whole-slide images. Nat. Biomed. Eng. 5, 555–570 (2021).

  11. Lu, M. Y. et al. A visual-language foundation model for computational pathology. Nat. Med. 30, 863–874 (2024).

  12. Wang, X. et al. A pathology foundation model for cancer diagnosis and prognosis prediction. Nature 634, 970–978 (2024).

  13. Xu, H. et al. A whole-slide foundation model for digital pathology from real-world data. Nature 630, 181–188 (2024).

  14. Arslan, S. et al. A systematic pan-cancer study on deep learning-based prediction of multi-omic biomarkers from routine pathology images. Commun. Med. 4, 48 (2024).

  15. Huang, Z., Bianchi, F., Yuksekgonul, M., Montine, T. J. & Zou, J. A visual–language foundation model for pathology image analysis using medical Twitter. Nat. Med. 29, 2307–2316 (2023).

  16. Chen, R. J. et al. Towards a general-purpose foundation model for computational pathology. Nat. Med. 30, 850–862 (2024).

  17. Kang, M., Song, H., Park, S., Yoo, D. & Pereira, S. Benchmarking self-supervised learning on diverse pathology datasets. In Proc. IEEE/CVF Conference on Computer Vision and Pattern Recognition (eds Brown, M. S. et al.) 3344–3354 (IEEE, 2023).

  18. Ilse, M., Tomczak, J. & Welling, M. Attention-based deep multiple instance learning. In Proc. 35th International Conference on Machine Learning (eds Dy, J. & Krause, A.) 2127–2136 (PMLR, 2018).

  19. Chen, R. J. et al. Scaling vision transformers to gigapixel images via hierarchical self-supervised learning. In Proc. IEEE/CVF Conference on Computer Vision and Pattern Recognition (eds Chellappa, R. et al.) 16144–16155 (IEEE, 2022).

  20. Snell, J., Swersky, K. & Zemel, R. Prototypical networks for few-shot learning. In Proc. 31st International Conference on Neural Information Processing Systems (eds von Luxburg, U. et al.) 4080–4090 (ACM, 2017).

  21. Wang, Y., Chao, W.-L., Weinberger, K. Q. & van der Maaten, L. SimpleShot: revisiting nearest-neighbor classification for few-shot learning. Preprint at https://arxiv.org/abs/1911.04623 (2019).

  22. Brown, T. et al. Language models are few-shot learners. In Proc. 34th International Conference on Neural Information Processing Systems (eds Larochelle, H. et al.) 1877–1901 (Curran Associates, Inc., 2020).

  23. Wang, X., Wang, W., Cao, Y., Shen, C. & Huang, T. Images speak in images: a generalist painter for in-context visual learning. In Proc. IEEE/CVF Conference on Computer Vision and Pattern Recognition (eds Brown, M. S. et al.) 6830–6839 (IEEE, 2023).

  24. Zhang, J., Wang, B., Li, L., Nakashima, Y. & Nagahara, H. Instruct me more! Random prompting for visual in-context learning. In Proc. IEEE/CVF Winter Conference on Applications of Computer Vision (eds Souvenir, R. et al.) 2585–2594 (IEEE, 2024).

  25. Sheng, D. et al. Towards more unified in-context visual understanding. In Proc. IEEE/CVF Conference on Computer Vision and Pattern Recognition (eds Camps, O. et al.) 13362–13372 (IEEE, 2024).

  26. Zhao, H. et al. MMICL: empowering vision-language model with multi-modal in-context learning. In Proc. 12th International Conference on Learning Representations (ed. Kim, B.) (ICLR, 2024).

  27. Ferber, D. et al. In-context learning enables multimodal large language models to classify cancer pathology images. Nat. Commun. 15, 10104 (2024).

  28. Bejnordi, B. E. et al. Diagnostic assessment of deep learning algorithms for detection of lymph node metastases in women with breast cancer. JAMA 318, 2199–2210 (2017).

  29. Qu, L. et al. The rise of AI language pathologists: exploring two-level prompt learning for few-shot weakly-supervised whole slide image classification. In Proc. 37th International Conference on Neural Information Processing Systems (eds Oh, A. et al.) 67551–67564 (Curran Associates, Inc., 2023).

  30. Litjens, G. et al. 1399 H&E-stained sentinel lymph node sections of breast cancer patients: the CAMELYON dataset. Gigascience 7, giy065 (2018).

  31. Zhang, Y. et al. Benchmarking the robustness of deep neural networks to common corruptions in digital pathology. In Proc. 25th International Conference on Medical Image Computing and Computer-Assisted Intervention (eds Wang, L. et al.) 242–252 (Springer, 2022).

  32. He, K., Zhang, X., Ren, S. & Sun, J. Deep residual learning for image recognition. In Proc. IEEE Conference on Computer Vision and Pattern Recognition (eds Tuytelaars, T. et al.) 770–778 (IEEE, 2016).

  33. Deng, J. et al. ImageNet: a large-scale hierarchical image database. In Proc. IEEE Conference on Computer Vision and Pattern Recognition (eds Huttenlocher, D. et al.) 248–255 (IEEE, 2009).

  34. Vorontsov, E. et al. A foundation model for clinical-grade computational pathology and rare cancers detection. Nat. Med. 30, 2924–2935 (2024).

  35. Ma, J. et al. A generalizable pathology foundation model using a unified knowledge distillation pretraining framework. Nat. Biomed. Eng. https://doi.org/10.1038/s41551-025-01488-4 (2025).

  36. Ding, T. et al. A multimodal whole-slide foundation model for pathology. Nat. Med. 31, 3749–3761 (2025).

  37. Shao, Z. et al. TransMIL: transformer based correlated multiple instance learning for whole slide image classification. In Proc. 35th International Conference on Neural Information Processing Systems (eds Ranzato, M. et al.) 2136–2147 (Curran Associates, Inc., 2021).

  38. Li, H. et al. Task-specific fine-tuning via variational information bottleneck for weakly-supervised pathology whole slide image classification. In Proc. IEEE/CVF Conference on Computer Vision and Pattern Recognition (eds Brown, M. S. et al.) 7454–7463 (IEEE, 2023).

  39. Tang, W. et al. Multiple instance learning framework with masked hard instance mining for whole slide image classification. In Proc. IEEE/CVF Conference on Computer Vision and Pattern Recognition (eds Brown, M. S. et al.) 4078–4087 (IEEE, 2023).

  40. Chen, Y. et al. dMIL-Transformer: multiple instance learning via integrating morphological and spatial information for lymph node metastasis classification. IEEE J. Biomed. Health Inform. 27, 4433–4443 (2023).

  41. Xiang, J. et al. A vision-language foundation model for precision oncology. Nature 638, 769–778 (2025).

  42. Coudray, N. et al. Classification and mutation prediction from non-small cell lung cancer histopathology images using deep learning. Nat. Med. 24, 1559–1567 (2018).

  43. Liang, J. et al. Deep learning supported discovery of biomarkers for clinical prognosis of liver cancer. Nat. Mach. Intell. 5, 408–420 (2023).

  44. Zhou, Y., Li, X., Wang, Q. & Shen, J. Visual in-context learning for large vision-language models. In Findings of the Association for Computational Linguistics (eds Ku, L.-W. et al.) 15890–15902 (ACL, 2024).

  45. Li, C. et al. LLaVA-med: training a large language-and-vision assistant for biomedicine in one day. In Proc. 37th International Conference on Neural Information Processing Systems (eds Oh, A. et al.) 28541–28564 (Curran Associates, Inc., 2023).

  46. Li, Y. et al. Few-shot lymph node metastasis classification meets high performance on whole slide images via the informative non-parametric classifier. In Proc. 25th International Conference on Medical Image Computing and Computer-Assisted Intervention (eds Linguraru, M. G. et al.) 109–119 (Springer, 2024).

  47. Caron, M. et al. Emerging properties in self-supervised vision transformers. In Proc. IEEE/CVF International Conference on Computer Vision (eds Berg, T. et al.) 9650–9660 (IEEE, 2021).

  48. Dosovitskiy, A. et al. An image is worth 16x16 words: transformers for image recognition at scale. In Proc. 9th International Conference on Learning Representations (ed. Mohamed, S.) (ICLR, 2021).

  49. Xu, Y. et al. A multimodal knowledge-enhanced whole-slide pathology foundation model. Nat. Commun. 16, 11406 (2025).

  50. Wang, X. et al. Transformer-based unsupervised contrastive learning for histopathological image classification. Med. Image Anal. 81, 102559 (2022).

  51. Oquab, M. et al. DINOv2: learning robust visual features without supervision. Trans. Mach. Learn. Res. https://openreview.net/forum?id=a68SUt6zFt (2024).

  52. Radford, A. et al. Learning transferable visual models from natural language supervision. In Proc. International Conference on Machine Learning (eds Meila, M. & Zhang, T.) 8748–8763 (PMLR, 2021).

  53. Ding, J. et al. LongNet: scaling transformers to 1,000,000,000 tokens. In Proc. 10th International Conference on Learning Representations (eds Hofmann, K. & Rush, A.) (ICLR, 2023).

  54. Komura, D. et al. Universal encoding of pan-cancer histology by deep texture representations. Cell Rep. 38, 110424 (2022).

  55. Riasatian, A. et al. Fine-tuning and training of DenseNet for histopathology image representation using TCGA diagnostic slides. Med. Image Anal. 70, 102032 (2021).

  56. Lu, M. Y. et al. AI-based pathology predicts origins for cancers of unknown primary. Nature 594, 106–110 (2021).

Acknowledgements

Y.L., T.X., Qixiang Zhang and X.L. received support from the Research Grants Council (RGC) of the Hong Kong Special Administrative Region (project nos. R6005-24, AoE/E-601/24-N and T45-401/22-N), the Hong Kong Joint Research Scheme of the National Natural Science Foundation of China/RGC (project no. N_HKUST654/24), the Hong Kong Innovation and Technology Fund under Project PRP/041/22FX and the National Natural Science Foundation of China (grant no. 62306254). T.X. also received support from the Hong Kong PhD Fellowship Scheme. Z.N. and Qingling Zhang are supported by grants from the Key R&D Program Projects in Guangdong Province (2021B0101420005 to Qingling Zhang), the National Natural Science Foundation of China (grant no. 82173033 to Qingling Zhang), the High-level Hospital Construction Project (DFJHBF202108 and YKY-KF202204 to X.-W.B. and Qingling Zhang) and the Guangdong Provincial Key Laboratory of AI in Medical Image Analysis and Application (2022B1212010011 to Z.L.). K.-H.Y. is supported in part by the National Institute of General Medical Sciences (grant no. R35GM142879), the National Heart, Lung and Blood Institute (grant no. R01HL174679), the Department of Defense Peer-Reviewed Cancer Research Program Career Development Award (HT9425-231-0523), the American Cancer Society Research Scholar Grant (RSG-24-1253761-01-ESED) and the Harvard Medical School Dean’s Innovation Award. K.Z. is supported by the National Natural Science Foundation of China (W2431057), the Macau Science and Technology Development Fund, Macau (0007/2020/AFJ, 0070/2020/A2 and 0003/2021/AKP) and Guangzhou National Laboratory (YW-SLJC0201). We would like to thank Q. Shao from the Hong Kong University of Science and Technology for his valuable suggestions.

Author information

Authors and Affiliations

Contributions

X.L. and Qingling Zhang planned the study. Y.L. designed the PRET methods guided by X.L. and conducted the experiments. Z.N. and Qingling Zhang collected the in-house data required for this study and organized the data labeling. T.X. and Qixiang Zhang implemented some baseline methods. Y.L., X.L., Qingling Zhang and K.-H.Y. contributed to the experimental design. The in-house WSIs were collected by Z.N. and Qingling Zhang, with support from M.Y. and X.-W.B. Data labeling was performed by Z.N., Z.L., F.F., B.Z., X.Q., L.S., J.Q., L.X., C.F., T.Q. and Q.W., who also covered the labeling costs. X.L. supervised the study. Y.L. wrote the paper, with contributions from Z.N., X.L. and K.Z. All authors discussed the results and contributed to the final manuscript.

Corresponding authors

Correspondence to Kang Zhang, Qingling Zhang or Xiaomeng Li.

Ethics declarations

Competing interests

Qingling Zhang, Z.N., X.L. and Y.L. are inventors on a pending Chinese patent application (application no. 202510130037.4, ‘A pan-cancer AI pathology diagnosis system using few examples without training’; applicant: GDPH, Guangdong Academy of Medical Sciences) related to the PRET algorithm described in this paper. K.-H.Y. is an inventor on US patent 10,832,406. This patent is assigned to Harvard University and is not directly related to this paper. The other authors declare no competing interests.

Peer review

Peer review information

Nature Cancer thanks Jana Lipkova, Sheng Wang and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 Method details of PRET and comparison with other settings.

a. PRET contains six major components: the extractor, tagger, miner, classifier, aggregator and post-processor. b. Linear probing methods require task-specific parameter fine-tuning; the fine-tuned parameters reside in the multiple instance learning (MIL) module and the linear classifier. Both the training and test examples are embedded into slide-level global features. c. The KNN classifier likewise uses slide-level features for each example slide, without local patch-level features; the MIL module reduces to global pooling, and the linear classifier is replaced by a KNN classifier. d. MI-Prototype20 and MI-SimpleShot16 apply a prototypical classifier, where the prototypes are the mean features of the examples. Note that prototypes are dataset-level features whose number equals the number of classes. They are then compared with patch-level features from the test slide via similarity, followed by a top-K operation (see the sketch below). e. The proposed PRET preserves the rich information of patch-level features from both example and test slides, with novel modules to exploit the visual in-context. These modules comprise a visual in-context tagger to process visual prompts, a miner to discover discriminative test patches, an informative in-context classifier to classify patches, and an attention aggregator to obtain the slide-level prediction. f. Comparison of methods across multiple aspects. PRET is unique in exploiting local information in the example slides and in its effective prompt design and utilization, with further advantages in being training-free, supporting pan-cancer recognition and segmentation, achieving high performance, and using local information in the test slide.
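To make the prototype-plus-top-K baseline described in panel d concrete, here is a minimal NumPy sketch. It is a hedged reconstruction from the caption, not the authors' released code; the cosine-similarity choice, the mean-over-top-K scoring rule and all variable names are illustrative assumptions.

```python
import numpy as np

def prototype_topk_scores(example_feats, example_labels, test_patch_feats, k=20):
    """Prototypical slide scoring in the spirit of MI-Prototype/MI-SimpleShot.

    example_feats:    (n_examples, d) example features
    example_labels:   (n_examples,) integer class labels
    test_patch_feats: (n_patches, d) patch-level features of the test slide
    Returns the class list and one score per class (mean of top-k similarities).
    """
    def l2norm(x):
        # Normalize so that dot products become cosine similarities.
        return x / np.linalg.norm(x, axis=-1, keepdims=True)

    classes = np.unique(example_labels)
    # One prototype per class: the mean feature of that class's examples.
    prototypes = l2norm(np.stack(
        [example_feats[example_labels == c].mean(axis=0) for c in classes]))
    patches = l2norm(test_patch_feats)

    sims = patches @ prototypes.T            # (n_patches, n_classes)
    topk = np.sort(sims, axis=0)[-k:]        # top-k patch similarities per class
    return classes, topk.mean(axis=0)        # slide-level class scores
```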

Extended Data Fig. 2 Dataset characteristics and labeling details.

a. Statistics of patient age for the proposed datasets (n = 906). b. Statistics of patient sex for the proposed datasets (n = 906). c. Density map of tumor size (n = 565). The x-axis indicates the number of tumor patches per slide, as a proxy for tumor size, with frequency on the y-axis. d. Examples of slides and visual prompts. The slide label is a single number, whereas the other visual prompts are boxes or outlines, as shown in these examples.

Source data

Extended Data Fig. 3 Pan-cancer recognition performances with multiple visual prompts.

The proposed PRET surpasses the baselines on nearly all datasets and prompt types in terms of the AUC metric. The gray error bars indicate the standard deviation, and the gray dots show the individual results of n = 5 independent experiments. The p value indicates the significance of PRET's improvement over the best baseline under a two-sided Wilcoxon test; the median value across all repeated experiments and shot settings is reported.
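The significance test above is a standard paired comparison. The following sketch shows how such a two-sided Wilcoxon signed-rank test could be run with SciPy; the AUC arrays are hypothetical placeholders, not values from the paper.

```python
# Hedged sketch: paired two-sided Wilcoxon signed-rank test comparing
# per-run AUCs of two methods. The numbers below are made-up placeholders.
import numpy as np
from scipy.stats import wilcoxon

pret_auc = np.array([0.981, 0.975, 0.990, 0.968, 0.984])      # PRET runs
baseline_auc = np.array([0.942, 0.951, 0.960, 0.930, 0.955])  # best baseline runs

stat, p_value = wilcoxon(pret_auc, baseline_auc)  # two-sided by default
print(f"Wilcoxon statistic = {stat}, p = {p_value:.4f}")
```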

Source data

Extended Data Fig. 4 Visualization and case study for cancer screening and subtyping.

Unlike the segmentation task, the classification tasks do not require accurate pixel-level localization; instead, the score maps should be distinguishable among classes. The score maps of positive and negative examples are much more distinguishable for the proposed PRET than for the other methods, so PRET produces fewer false positives. These cases include ESCC for cancer screening and NSCLC for cancer subtyping, involving multiple visual prompt types, where L indicates slide label, and B and R indicate bounding box and rough mask, respectively.

Extended Data Fig. 5 Experiment results about foundation models, hyperparameters, and segmentation.

a. PRET is well suited to foundation models, showing much larger performance gains with them than with the non-foundation model (ResNet-5032 pretrained on ImageNet). Results are shown for 1-4 shots. The error bars show the standard deviation across n = 5 independent experiments, and the gray dots show the individual results for different data splits. b. Hyperparameter grid search on the NSCLC validation set with slide labels at 8-shot. The results fluctuate within 2.5%, demonstrating the model's robustness. Among these hyperparameters, the uncertain range factor te in Supplementary Algorithm 1 for the local visual in-context tagger (LVIT) and the discriminative instance miner (DIM) is relatively influential, as it controls the quantity of visual in-context. {v1, v2, v3, v4, v5} are {0, 0.02, 0.04, 0.06, 0.08} for the uncertain range factor of the LVIT on the examples, {0.1, 0.15, 0.2, 0.25, 0.3} for that of the DIM on the test slides, and {20, 30, 40, 50, 60} for the top k of the informative in-context classifier (IIC). They are set to {0.86, 0.87, 0.88, 0.89, 0.9}, {1000, 2000, 3000, 4000, 5000} and {1, 5, 10, 20, 30} for the related threshold tr, the number of high-score instances n and the softmax temperature τ in the attention aggregator (AA), respectively (see the sketch below). c. Results for different data splits and hyperparameter groups. Owing to this robustness, default hyperparameters without grid search, as well as those searched on the RCC dataset, also perform well, with performance close to that of the searched hyperparameter group. Different data splits reflect varied example quality, to which the method is less robust than to hyperparameters; we therefore report average results for better stability. d. The proposed PRET surpasses the baselines on nearly all datasets and prompt types in terms of the Dice metric. The gray error bars indicate the standard deviation across n = 5 independent experiments, and the gray dots show the individual results for different data splits. The p value indicates the significance of PRET's improvement over the best baseline under a two-sided Wilcoxon test; the median value across all repeated experiments and shot settings is reported.
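As a rough illustration of how the attention aggregator's hyperparameters named in panel b (threshold tr, high-score instance count n and softmax temperature τ) might interact, the sketch below combines them in one plausible way. It is an assumption-laden reconstruction from the caption alone, not the released PRET implementation; the exact selection and weighting rules may differ.

```python
import numpy as np

def attention_aggregate(patch_scores, tr=0.88, n=3000, tau=10.0):
    """Hypothetical attention aggregator (AA) sketch.

    patch_scores: (n_patches,) per-patch positive-class scores in [0, 1]
    tr:  threshold that keeps only confident patches
    n:   maximum number of high-score instances to retain
    tau: softmax temperature weighting the retained instances
    Returns a slide-level score in [0, 1].
    """
    kept = np.sort(patch_scores)[::-1][:n]  # top-n patch scores, descending
    kept = kept[kept >= tr]                 # drop low-confidence patches
    if kept.size == 0:
        return float(patch_scores.max())    # fall back to the best patch
    weights = np.exp(tau * kept)            # temperature-scaled softmax
    weights /= weights.sum()
    return float(np.dot(weights, kept))     # attention-weighted slide score
```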

Source data

Extended Data Fig. 6 Visualization and comparison of tumor segmentation task based on weak visual prompts.

Qualitative illustrations comparing PRET with MI-Prototype and MI-SimpleShot on the PTC and BC datasets using 8 examples. The tumor region is colored yellow. Note that the blue areas in the first “BC-L” row are stains on the slide from a blue marker pen.

Extended Data Fig. 7 Visualization of in-context tagger.

Using weak visual prompts, our method successfully tags the example instances that closely match the manually labeled regions. The in-context tagger module labels the visual in-context of the examples; it does not perform weak segmentation of the test slides. Different visual prompts support varied conditions and give pathologists more choices. The bounding box and rough mask produce fewer responses outside the manual mask, with better score maps, whereas the slide label produces more positive regions. The dataset used is NSCLC (subtyping of LUAD and LUSC) in the 8-shot setting.

Extended Data Fig. 8 PRET is scalable and achieves comparable performance to many-shot methods with much less data.

a. PRET is scalable when more examples are applied. We increased the example size (to 16 and 32 shots) and observed continued growth in performance. These results indicate that PRET is scalable and maintains its advantages over the baseline methods (p < 0.001). The datasets involved are RCC, NSCLC and lymphoma under the slide-label prompt, which supports more examples. b. PRET achieves performance comparable to many-shot fine-tuned methods using much less data (for example, 8 shots versus 128 shots or the full data). The many-shot methods are mainstream fine-tuned multiple instance learning methods, including ABMIL18, TransMIL37, CLAM-SB10 and CLAM-MB10. PRET uses only 8 examples per class (8-shot) without training, whereas the many-shot methods are trained with much more data, at the 128-shot and full data scales. The prompt type corresponds to Fig. 2b, and the calculation of the p value follows the principle above. PRET achieves higher results than the many-shot methods at the same data scale on all three benchmarks. Moreover, our performance is higher than that of the many-shot methods at the full data scale on the CAMELYON1628 lymph node metastasis detection task (p < 0.001), while achieving comparable results on the remaining benchmarks (p < 0.001). a, b. The gray error bars indicate the standard deviation across n = 5 independent experiments, and the gray dots show the individual results. The p value indicates the significance of PRET's improvement over the best baseline under a two-sided Wilcoxon test; the median value across all repeated experiments and shot settings is reported.

Source data

Supplementary information

Supplementary Information

Supplementary Algorithms, Baselines and Tables 1–10.

Reporting Summary

Source data

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Li, Y., Ning, Z., Xiang, T. et al. PRET is a few-shot system for pan-cancer recognition without example training. Nat Cancer (2026). https://doi.org/10.1038/s43018-026-01141-2
