Abstract
Since 2022, artificial intelligence (AI) methods have progressed far beyond their established capabilities of data classification and prediction. Large language models (LLMs) can perform logical reasoning, enabling them to plan and orchestrate complex workflows. By combining this planning ability with the capacity to act upon their environment, LLMs can function as agents: (semi-)autonomous systems capable of sensing, learning and acting upon their environments. As such, they can interact with external knowledge sources or external software and can execute sequences of tasks with minimal or no human input. In cancer research and oncology, evidence for the capabilities of AI agents is rapidly emerging. From autonomously optimizing drug design and development to proposing therapeutic strategies for clinical cases, AI agents can handle complex, multistep problems that were not addressable by previous generations of AI systems. Despite these rapid developments, many translational and clinical cancer researchers still lack clarity regarding the precise capabilities, limitations and ethical or regulatory frameworks associated with AI agents. Here we provide a primer on AI agents for cancer researchers and oncologists. We illustrate how this technology is set apart from, and goes beyond, traditional AI systems. We discuss existing and emerging applications in cancer research and address real-world challenges from the perspective of academic, clinical and industrial research.
Acknowledgements
J.N.K. is supported by the German Cancer Aid (DECADE, 70115166), the German Federal Ministry of Education and Research (PEARL, 01KD2104C; CAMINO, 01EO2101; TRANSFORM LIVER, 031L0312A; TANGERINE, 01KT2302 through ERA-NET TRANSCAN; Come2Data, 16DKZ2044A; DEEP-HCC, 031L0315A), the German Academic Exchange Service (SECAI, 57616814), the European Union’s Horizon Europe and innovation programme (ODELIA, 101057091; GENIAL, 101096312), the European Research Council (ERC; NADIR, 101114631), the National Institutes of Health (EPICO, R01 CA263318) and the National Institute for Health and Care Research (NIHR, NIHR203331) Leeds Biomedical Research Centre. The views expressed are those of the author(s) and not necessarily those of the NHS, the NIHR or the Department of Health and Social Care. This work was funded by the European Union. Views and opinions expressed are however those of the author(s) only and do not necessarily reflect those of the European Union. Neither the European Union nor the granting authority can be held responsible for them.
Author information
Contributions
D.T., L.C.A., F.M. and J.N.K. researched data for the article. All authors contributed substantially to discussion of the content. D.T., S.A., J.Z., L.C.A. and J.N.K. wrote the article. All authors reviewed and/or edited the manuscript before submission.
Ethics declarations
Competing interests
J.N.K. declares consulting services for Bioptimus, France; Panakeia, UK; AstraZeneca, UK; and MultiplexDx, Slovakia. Furthermore, he holds shares in StratifAI, Germany; Synagen, Germany; and Ignition Lab, Germany; has received an institutional research grant by GSK and AstraZeneca; and has received honoraria by AstraZeneca, Bayer, Daiichi Sankyo, Eisai, Janssen, Merck, MSD, BMS, Roche, Pfizer and Fresenius. D.T. received honoraria for lectures by Bayer, GE, Roche, AstraZeneca and Philips and holds shares in StratifAI GmbH, Germany, and in Synagen GmbH, Germany. F.M. is a scientific adviser for and holds shares in Modella AI and is an adviser for Danaher. S.A. is an employee of Alphabet and may own stock as part of the standard compensation package. J.Z. and L.C.A. declare no competing interests.
Peer review
Peer review information
Nature Reviews Cancer thanks Anant Madabhushi, Wayne Zhao and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Glossary
- Chain-of-thought reasoning: A prompting technique that encourages language models to generate intermediate reasoning steps before arriving at a final answer, improving performance on complex tasks.
- Contraindications: Clinical conditions or factors that make a particular treatment or procedure inadvisable because of potential harm to the patient.
- Deep learning: A subset of machine learning that uses artificial neural networks with multiple layers to learn hierarchical representations of data.
- Differential diagnoses: A systematic process of distinguishing between diseases or conditions that share similar clinical features to identify the most likely diagnosis.
- Edge case: An unusual or extreme scenario that occurs at the boundaries of normal operating conditions, often revealing limitations in system performance.
- Hyperparameters: Configuration settings defined before model training that control the learning process, such as learning rate, batch size and network architecture choices.
- Large language model (LLM): A type of artificial intelligence model trained on vast amounts of text data to understand and generate human language, capable of performing diverse language tasks without task-specific training.
- Multi-turn conversation: A dialogue consisting of multiple exchanges between a user and an AI system, in which context from previous turns informs subsequent responses.
- Natural language processing (NLP): A field of artificial intelligence focused on enabling computers to understand, interpret and generate human language.
- Parsing documents: The computational process of analysing and extracting structured information from unstructured or semi-structured text documents.
- Precompiled reports: Standardized documents generated in advance or from templates, typically containing structured clinical or research data ready for review.
- Reinforcement learning: A machine learning paradigm in which an agent learns to make decisions by receiving feedback in the form of rewards or penalties based on its actions.
- Token: The basic unit of text processed by a language model, which may represent a word, subword or character depending on the tokenization scheme.
- Transformer architecture: A neural network design that uses self-attention mechanisms to process sequential data in parallel, forming the foundation of modern LLMs.
- Vision language model: An AI model capable of processing and relating both visual information (such as images) and textual data within a unified framework.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Truhn, D., Azizi, S., Zou, J. et al. Artificial intelligence agents in cancer research and oncology. Nat Rev Cancer 26, 256–269 (2026). https://doi.org/10.1038/s41568-025-00900-0