APOLLO11: a bio-data-driven model for clinical and translational research in lung cancer

Prelaj, Arsela; Provenzano, Leonardo; Miskovic, Vanja; Ganzinelli, Monica; Mazzeo, Laura; Gemelli, Maria; Silvestri, Cecilia; Spagnoletti, Andrea; Romanò, Rebecca; Brambilla, Marta; Occhipinti, Mario; Beninato, Teresa; Ambrosini, Paolo; Sottotetti, Elisa; Favali, Margherita; Zec, Aleksandra; Ferrarin, Alberto; Corrao, Giulia; Prina, Marco Meazza; Ruggirello, Margherita; Marino, Moreno Bruno; Dumitrascu, Andra Diana; Di Mauro, Rosa Maria; Giani, Claudia; Cavalli, Chiara; Serino, Roberta; Catania, Chiara; Panzardi, Antonella; Metro, Giulio; Bennati, Chiara; Ferrara, Roberto; Macerelli, Marianna; Servetto, Alberto; Cona, Maria Silvia; La Verde, Nicla; Toschi, Luca; Baili, Paolo; Corso, Federica; Zito, Emanuela; Cinieri, Saverio; Berardi, Rossana; Scoazec, Giovanni; Inno, Alessandro; Gori, Stefania; Pisconti, Salvatore; Buzzacchino, Federica; Brighenti, Matteo; Biello, Federica; Tartarone, Alfredo; Pruneri, Giancarlo; Belfiore, Antonino; Agnelli, Luca; Guidi, Alessandro; Invernizzi, Luca; Salmistraro, Noemi; Filippi, Andrea Riccardo; Solli, Piergiorgio; Galli, Giulia; Lorenzini, Daniele; Pizzutilo, Elio Gregory; De Braud, Filippo; Pedrocchi, Alessandra; Trovò, Francesco; Genova, Carlo; Corte, Carminia Maria Della; Viscardi, Giuseppe; Garassino, Marina Chiara; Cortellini, Alessio; Mingo, Emanuele; Russano, Marco; Signorelli, Diego; Proto, Claudia; Vingiani, Andrea; Sangaletti, Sabina; Lo Russo, Giuseppe

doi:10.1038/s41698-026-01295-3


Article
Open access
Published: 29 January 2026


Arsela Prelaj¹^na1,
Leonardo Provenzano^1,2^na1,
Vanja Miskovic^1,2^na1,
Monica Ganzinelli¹^na1,
Laura Mazzeo^1,2,
Maria Gemelli³,
Cecilia Silvestri^1,4,
Andrea Spagnoletti¹,
Rebecca Romanò^4,5,
Marta Brambilla¹,
Mario Occhipinti¹,
Teresa Beninato¹,
Paolo Ambrosini^1,4,
Elisa Sottotetti¹,
Margherita Favali²,
Aleksandra Zec^1,2,
Alberto Ferrarin¹,
Giulia Corrao¹,
Marco Meazza Prina¹,
Margherita Ruggirello⁶,
Moreno Bruno Marino⁶,
Andra Diana Dumitrascu¹,
Rosa Maria Di Mauro¹,
Claudia Giani^1,4,
Chiara Cavalli^1,4,
Roberta Serino^1,4,
Chiara Catania⁷,
Antonella Panzardi⁷,
Giulio Metro⁸,
Chiara Bennati⁹,
Roberto Ferrara¹⁰,
Marianna Macerelli¹¹,
Alberto Servetto¹²,
Maria Silvia Cona¹³,
Nicla La Verde¹³,
Luca Toschi¹⁴,
Paolo Baili¹⁵,
Federica Corso¹,
Emanuela Zito¹⁶,
Saverio Cinieri¹⁷,
Rossana Berardi¹⁸,
Giovanni Scoazec¹⁹,
Alessandro Inno²⁰,
Stefania Gori²⁰,
Salvatore Pisconti²¹,
Federica Buzzacchino²¹,
Matteo Brighenti²²,
Federica Biello²³,
Alfredo Tartarone²⁴,
Giancarlo Pruneri^4,25,
Antonino Belfiore²⁵,
Luca Agnelli²⁵,
Alessandro Guidi²⁵,
Luca Invernizzi¹,
Noemi Salmistraro⁵,
Andrea Riccardo Filippi²⁶,
Piergiorgio Solli²⁷,
Giulia Galli²⁸,
Daniele Lorenzini^4,25,
Elio Gregory Pizzutilo⁵,
Filippo De Braud^1,4,
Alessandra Pedrocchi²,
Francesco Trovò²,
Carlo Genova^29,30,
Carminia Maria Della Corte³¹,
Giuseppe Viscardi³²,
Marina Chiara Garassino³³,
Alessio Cortellini^34,35,36,
Emanuele Mingo³⁶,
Marco Russano³⁶,
Diego Signorelli⁵,
Claudia Proto¹^na2,
Andrea Vingiani^4,26^na2,
Sabina Sangaletti³⁷^na2,
Giuseppe Lo Russo¹^na2 &
the APOLLO11 study group

npj Precision Oncology volume 10, Article number: 96 (2026)

Abstract

Identifying predictive and resistance biomarkers remains one of the most relevant unmet needs in clinical cancer research. Artificial Intelligence (AI) represents a powerful tool to develop predictive algorithms tailored to individual patients. Thanks to its ability to process large quantities of heterogeneous, patient-level information, the AI-based approach is progressively fostering the growth of a data-driven paradigm to complement traditional, hypothesis-driven clinical research. However, the development of reliable AI models requires access to large, high-quality, and continuously updated datasets. Despite this necessity, no infrastructure currently exists to enable federated, multi-omic, standardized, prospective, and large-scale collection and analysis of real-world clinical and biological data in the context of lung cancer. We established the APOLLO11 consortium, a distributed, nationwide, updated Italian lung cancer network designed to build a decentralized, long-term, population-based, real-world data repository and a multilevel biobank, locally stored and centrally annotated. This strategy seeks to lay the foundation for the clinical implementation of data-driven research, ultimately advancing precision oncology.

Introduction

With the advent of several innovative therapies, such as immunotherapy (IO), targeted therapies, and other next-generation treatments, the identification of predictive biomarkers has become the main goal of clinical and translational research in advanced lung cancer^1,2,3,4. Indeed, the approval of different IO-based therapeutics and targeted treatments has radically changed the treatment landscape for patients with advanced Non-Small Cell Lung Cancer (aNSCLC) and advanced Small Cell Lung Cancer (aSCLC), significantly prolonging overall survival (OS) and inducing long-term remission in a proportion of patients with metastatic disease^5,6,7,8,9,10,11,12,13,14. However, around half of patients do not benefit from these novel therapies, either because of primary refractoriness or because of secondary resistance, which occurs after an initial benefit^11,14,15,16. Current biomarkers are inadequate to guide treatment decisions in the context of novel therapeutics. For example, the role of Programmed Death Ligand 1 (PD-L1), as evaluated by immunohistochemistry (IHC) on tumor specimens, in predicting IO benefit remains poorly defined, and no other biomarkers are currently used to tailor IO-based treatments (e.g., IO alone vs IO in combination with chemotherapy)^15,17,18,19.

Conventional statistical methods are widely used in oncology to find associations between patient characteristics and outcomes, and to test whether data provide sufficient support for a specific hypothesis. However, conventional approaches have limited capacity to comprehensively evaluate the complexity of cancer biology and to integrate the vast, heterogeneous, multi-modal data available from oncological patients^20,21. In addition, technological advances have led to an unprecedented acceleration of drug discovery, so that the oncology field changes continuously and dramatically, even within short timeframes. For this reason, hypothesis-driven research, which relies on ad hoc studies that test only one or a few hypotheses at a time, cannot keep pace with recent advances in oncology^22,23. Artificial Intelligence (AI) frameworks, which synthesize and correlate information from different data sources, are a potentially highly efficient instrument for constructing algorithms that strengthen individual patient prediction. AI also allows the extraction of information from unstructured data, such as medical images, digitized slides, and mobile app monitoring, which can be associated with prognosis or treatment benefit and can therefore be adopted as potential novel biomarkers. Because AI can handle large quantities of single-patient information at the same time, the data-driven paradigm is gaining ground in parallel with traditional hypothesis-driven clinical research^24,25. This approach allows multiple clinical or scientific questions to be tested simultaneously using existing data, providing timely insights into the open questions that constantly arise from clinical practice or translational research.

The increasing availability of real-world data (RWD) and the application of AI are enabling the generation of novel hypotheses and accelerating translational insights. However, to achieve this goal, a large amount of high-quality, updated, and multisource data is mandatory to appropriately train AI models and reach an accuracy sufficient for application in clinical practice^17,18,20,26. To overcome the lack of adequate data for the analyses needed to address unmet clinical needs, we established the APOLLO11 consortium, a distributed, nationwide, continuously updated Italian lung cancer network. It encompasses the development of a decentralized, long-term national database collecting RWD, hosted locally at each center, and a “multilevel” biobank, locally stored and centrally annotated. The multilevel structure is designed to maximize contributions to biological sample collection, including from centers with more limited facilities. APOLLO11 aims to create a large platform for the collection of data from multiple sources, to accelerate the conduct of academic research, and to generate knowledge to answer new questions arising from clinical practice. This strategy aims to become the foundation for the clinical implementation of data-driven research²⁷.

APOLLO11 will address several clinically relevant, unsolved scientific questions²⁷. The first scientific objective of this project is to develop a multi-omic algorithm predictive of IO efficacy in aNSCLC patients. To achieve this aim, multi-modal data were collected from aNSCLC patients treated with IO-based therapy across centers participating in the consortium. Machine Learning (ML) and Deep Learning (DL) models will be used to generate and synthesize biomarkers, with the goal of achieving the highest predictive performance from multi-modal data. With the help of eXplainable AI (XAI) methodologies and fairness auditing, trustworthy, human-readable models will be generated, leading to the creation of a responsible AI-based tool. This tool aims to support individualized treatment decisions about the use of IO in aNSCLC, optimizing treatment outcomes while minimizing undue toxicity.

In this manuscript, we describe the rationale, structure, and early implementation of the APOLLO11 nationwide infrastructure, with the aim of presenting the framework underpinning the collection, harmonization, and integration of clinical, radiomic, and multi-omic data in lung cancer.

Results

To keep pace with the drug discovery process, the APOLLO11 consortium aims to collect updated, large-scale data to identify and combine different types of biomarkers (clinical, radiologic, genetic, molecular, and immunological) across different lung cancer histological and biological entities and stages. We expect to identify markers predicting response to therapy at baseline, markers associated with primary and secondary resistance mechanisms, markers capable of predicting relapse, and markers associated with treatment toxicity.

APOLLO11 aims to become a network model for data collection and analysis in the data-driven research era, applicable to virtually all fields of oncology and to different geographical and territorial contexts^24,25. Pursuing this objective requires several steps, from network building to the generation and application of results. The APOLLO11 workflow as a model for data-driven research is shown in Fig. 1.

**Fig. 1: APOLLO11 workflow for data-driven research.**

Network establishment

Firstly, we established an Italian network of clinical centers with expertise in the treatment of lung cancer. The oncological network in Italy, as is often the case in the field, is organized according to a “hub and spokes” model, in which local “spokes” centers generally take charge of patients and refer more complex clinical situations, or treatment within clinical trials, to the “hub” centers. The “hub” centers, on the other hand, specialize in testing new drugs or new therapeutic approaches, or in the treatment of rarer clinical conditions. This type of healthcare organization has fragmented the collection of clinical data and biological materials, which often remain unused when they are collected at “spokes” centers. As a consequence, RWD and biological material collected in non-research institutions cannot be used for research purposes, thus excluding “spokes” from academic research^28,29.

The APOLLO11 project is breaking down these logistical and administrative barriers, enabling even smaller centers to contribute to research activity and, at the same time, channeling a large amount of data to the “hub” centers, making it available for scientific purposes. For this project, centers are identified through a careful selection of public, academic, or private hospitals that meet specific criteria. In particular, the aspects considered crucial for inclusion in the network are: the presence of clinical oncologists with the experience and motivation to conduct academic research; software able to support electronic health records (EHRs) in which patient data and images can be unambiguously traced; one or more experienced staff members dedicated to data collection and data management, and biologists dedicated to the handling of biological samples; and experience in the management of clinical trials, including academic ones. Moreover, to guarantee nationwide, reproducible data collection at the population-based level, an effort is made to include centers evenly distributed across the national territory, including rural ones, such as those in the island territories. This will allow the inclusion of patients with diverse demographics, varying access to healthcare, and different social circumstances and health-related behaviors. In return, the “spokes” centers are guaranteed the opportunity to actively participate in data collection and analytic processes, as well as access to training and support resources provided by the APOLLO11 network. Additionally, participating centers benefit from increased visibility and recognition within the oncology community, as well as the opportunity to contribute to cutting-edge research and advancements in cancer care.

Infrastructure

Each participating center is collecting data using the secure web-based REDCap platform³⁰. APOLLO11 data collection encompasses both retrospective and prospective observational phases. During the retrospective phase, all participating centers are contributing to the collection of RWD on lung cancer patients already treated with innovative systemic therapies. The RWD collected include demographic, epidemiological, blood-based (e.g., blood cell counts, biochemistry), and tumor-related data (e.g., biological evaluations performed on tumor specimens as per clinical practice), as well as treatment details, survival outcomes, radiological response, and toxicity data. In the prospective phase, the study enrolls lung cancer patients who are candidates for an innovative systemic therapy, including newly diagnosed patients not previously exposed to such treatments. For the purposes of the study, innovative therapy is defined as any medical treatment that has been registered in Italy since the year 2010. Patient enrollment at each center started from the date of local committee approval, whereas individual data collection began when patients provided informed consent for data handling. RWD will be pseudonymized and entered into the local REDCap database, which is continuously updated. Data sharing between participating centers and the coordinating center is governed by a dedicated Data Management Plan to ensure confidentiality and compliance with regulatory requirements³¹. Additional informed consent from patients will be required only for the collection, sharing, or analysis of data for study objectives unrelated to the use of innovative anticancer treatment in lung cancer³².
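As an illustration of the pseudonymization step, the following minimal Python sketch derives a stable pseudonym from a local patient identifier using a keyed hash. The use of HMAC-SHA256, the key, and the field names are assumptions made for this example; the source does not specify which pseudonymization scheme APOLLO11 adopts.

```python
import hashlib
import hmac


def pseudonymize(patient_id: str, center_key: bytes, length: int = 16) -> str:
    """Return a stable pseudonym for patient_id using a keyed hash (HMAC-SHA256)."""
    digest = hmac.new(center_key, patient_id.encode("utf-8"), hashlib.sha256)
    # Truncate the hex digest to keep the code short enough for a database field
    return digest.hexdigest()[:length]


# Hypothetical secret that would stay at the enrolling center
CENTER_KEY = b"example-secret-kept-at-the-center"

# Hypothetical record; field names are illustrative only
record = {"patient_id": "MRN-000123", "histology": "NSCLC", "stage": "IV"}

# The shared copy carries the pseudonym instead of the local identifier
shared = {**record, "patient_id": pseudonymize(record["patient_id"], CENTER_KEY)}
```

Because the same identifier and key always yield the same pseudonym, longitudinal updates for a patient can be linked locally, while re-identification would require the key held by the center.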

To harmonize data collection across participating centers, a data dictionary for the biological and medical terms included in the database has been discussed, shared with centers, and incorporated into the REDCap electronic Case Report Form (eCRF)^33,34. This data dictionary will be periodically updated, based on the most recent available literature, adding new therapeutics or knowledge advances in lung cancer whenever they reach clinical practice. The adoption of a data dictionary provides a common language among clinical centers and avoids ambiguous “free text” fields, which are also poorly handled by AI-based algorithms.
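The idea of a shared data dictionary replacing free text can be sketched as a simple controlled-vocabulary check. The field names and allowed values below are hypothetical and are not taken from the APOLLO11 eCRF.

```python
from typing import Dict, List, Set

# Hypothetical dictionary: each field maps to its set of allowed coded values
DATA_DICTIONARY: Dict[str, Set[str]] = {
    "histology": {"NSCLC", "SCLC"},
    "stage": {"I", "II", "III", "IV"},
    "treatment_class": {"immunotherapy", "targeted_therapy", "chemo_immunotherapy"},
}


def validate_record(record: Dict[str, str]) -> List[str]:
    """Return a list of problems; an empty list means the record conforms."""
    errors: List[str] = []
    for field, allowed in DATA_DICTIONARY.items():
        value = record.get(field)
        if value is None:
            errors.append(f"{field}: missing")
        elif value not in allowed:
            errors.append(f"{field}: '{value}' not in dictionary")
    return errors


ok = validate_record(
    {"histology": "NSCLC", "stage": "IV", "treatment_class": "immunotherapy"}
)
bad = validate_record(
    {"histology": "NSCLC", "stage": "4", "treatment_class": "immunotherapy"}
)
```

In this sketch a free-text entry such as "4" for stage is rejected rather than silently stored, which is the kind of ambiguity a common dictionary is meant to prevent.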

Besides RWD collection, medical images are being collected as an additional source of information to potentially include in the multi-omic predictive models, to further boost their performance^18,35,36,37. Both retrospectively and prospectively enrolled patients who underwent computed tomography (CT), Magnetic Resonance Imaging (MRI), and/or 2-fluorodeoxyglucose positron emission tomography (FDG-PET) scans according to standard-of-care clinical practice will be included in these analyses. Digitized slides of diagnostic biopsies are also collected for AI-based analysis of whole slide images (WSIs). Radiological imaging will be collected at the baseline of each innovative treatment, at the first radiological evaluation, and at radiological progression. One coordinating center will arrange data collection and assess data quality. Scans from individuals enrolled in participating centers will be de-identified per GDPR standards and encrypted before transmission to the server for radiomics analysis. The identification of the volume of interest (VOI) for radiomics analysis in the APOLLO11 study follows a two-step approach: first, a fully automated segmentation methodology (i.e., nnUNet) is adopted to standardize the acquisition of the VOI³⁸; then, for a subset of images, dedicated radiologists with experience in radiomics also semi-automatically delineate the 3D target tumor volumes for each patient, to enhance the reproducibility of extracted features³⁹. The generated radiomics signature will then be validated and integrated with RWD-based models. Similar procedures will be applied for the collection, digitization, and sharing of pathology slides⁴⁰.

With the aim of comprehensively characterizing the tumor biology of enrolled patients, and ultimately improving the performance of the predictive tool trained on RWD and medical imaging data, the APOLLO11 project aims to establish a national multilevel biobank for lung cancer patients, standardizing sample collection procedures across centers while ensuring ethical and legal compliance. Biological samples include archival tumor tissues, collected for diagnostic purposes at the start of treatment or for re-characterization after treatment failure, as well as whole blood, plasma, Peripheral Blood Mononuclear Cells (PBMC), urine, saliva, and feces samples, which are being collected at specific treatment intervals from prospective patients starting a new innovative therapy. Detailed information on the immune profiling and spatial transcriptomic analytic process is reported in the “Methods” section. The samples collected are stored on site, while their annotation will be continuously tracked in a dedicated section of the eCRF. Biological sample collection is encouraged but not mandatory for center participation and patient enrollment in the APOLLO11 study. This approach promotes comprehensive multi-omic data collection, while recognizing that not all institutions may have the logistical or technical capabilities to support biospecimen handling and storage.

These samples will be shared with the center performing the translational analyses when a scientific project requires them, and residual material will be promptly shipped back to the center where the patient was enrolled, for clinical use or future investigation. The analyses performed on each sample will be those foreseen by the scientific proposal.

Implementation

Patients who are candidates to participate in the APOLLO11 study will be enrolled at the individual center level. The master protocol details specific criteria for patient enrollment, such as a diagnosis of NSCLC or SCLC confirmed on a histological or cytological specimen, and past or present treatment with at least one innovative systemic treatment. These broad enrollment criteria allow the inclusion of patient subgroups such as older, frail, and other under-represented populations, across all lung cancer stages (I–IV), providing evidence on special categories not addressed by the available literature. Specifically, the term “innovative therapies” covers the vast majority of treatments, with the exception of standard cytotoxic agents, so that all clinical scenarios not yet sufficiently investigated in oncological research can be studied.

Upon confirmation of eligibility, patients are required to provide informed consent. Specifically, patients will be asked to provide a single, one-time consent for the collection of personal data and biological samples. This single consent avoids repeated and redundant re-consenting each time a patient starts a new innovative therapy or new analyses are conducted in the context of the APOLLO11 study.

Each time patients enrolled in the APOLLO11 study initiate a new innovative therapy, biological materials, including whole blood, plasma, stool, and urine, are collected and stored at the center, and the availability of biological material is annotated in REDCap alongside the clinical data³⁰. Available histological samples and medical images (i.e., radiological, digital pathology) are annotated as well.

Data sharing and federated learning

In the era of data-driven research, sharing large amounts of data is essential. However, this process raises several critical issues, including the need for high-capacity servers to store data centrally and ethical concerns about data privacy. These issues are particularly relevant when the collection includes unstructured data such as radiological scans and digitized slides.

Federated learning offers a transformative approach to multi-omic analysis in multicenter studies, enabling the integration of diverse datasets without the need to centralize raw data^41,42,43,44. It allows each center to train AI models locally and to share only the model updates with the central server, ensuring that sensitive data remain local. This approach to modeling in multicenter studies significantly reduces the risk of data breaches and of non-compliance with stringent regulatory requirements for sensitive data transfer, such as GDPR⁴⁵. To take advantage of these benefits of federated and swarm learning, ad hoc software based on open source platforms has been tested in the APOLLO11 consortium for sharing RWD, radiological images, and digital pathology slides between centers. The collected biological samples, by contrast, will be physically shared. To ensure proper functioning, the software will first be implemented in three Italian centers (the Sponsor institution, a cancer center, and a general hospital) and will subsequently be deployed in all centers of the consortium. The software is designed to be easy to use: it does not require in-depth technical or informatics knowledge, and it allows data to be loaded directly into the platform from the center, without the need for manual encoding for local training.
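The principle that only model updates leave each center can be sketched with federated averaging (FedAvg) on a toy linear model. This is a minimal illustration under stated assumptions: the model, learning rate, number of rounds, and data are invented for the example, and it is not the software tested in the consortium.

```python
from typing import List, Tuple

Weights = Tuple[float, float]        # (slope, intercept) of a toy linear model
Dataset = List[Tuple[float, float]]  # (x, y) pairs that never leave the center


def local_update(weights: Weights, data: Dataset,
                 lr: float = 0.05, epochs: int = 20) -> Weights:
    """Run a few gradient-descent steps on one center's local data."""
    w, b = weights
    n = len(data)
    for _ in range(epochs):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


def federated_average(updates: List[Tuple[Weights, int]]) -> Weights:
    """Server step: average local weights, weighted by local sample size."""
    total = sum(n for _, n in updates)
    w = sum(weights[0] * n for weights, n in updates) / total
    b = sum(weights[1] * n for weights, n in updates) / total
    return w, b


def train(centers: List[Dataset], rounds: int = 50) -> Weights:
    global_weights: Weights = (0.0, 0.0)
    for _ in range(rounds):
        # Each center returns only its updated weights and its sample count
        updates = [(local_update(global_weights, data), len(data))
                   for data in centers]
        global_weights = federated_average(updates)
    return global_weights


# Hypothetical data held at three centers, all generated from y = 2x + 1
centers = [
    [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)],
    [(3.0, 7.0), (4.0, 9.0)],
    [(-1.0, -1.0), (0.5, 2.0), (1.5, 4.0), (2.5, 6.0)],
]
slope, intercept = train(centers)
```

In this toy setting the global model approaches the slope and intercept shared by all three datasets, although the server only ever sees weights and sample counts. Real deployments add secure communication and must handle much larger models and non-identically distributed data.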

Resource usability

Data collected within the consortium will be used to answer various scientific questions that will be proposed on the basis of unmet clinical needs directly arising from clinical practice, or from new preclinical or translational evidence. On the basis of the specific scientific question, the most up-to-date version of the APOLLO11 dataset, including the subpopulation of interest, could be shared after an appropriate query is proposed.

Proposals for scientific queries, which have to be submitted in an anonymized way, may be made either by centers already belonging to the APOLLO11 consortium or by external institutions.

Internal centers are encouraged to advance new scientific proposals based on their professional experience and expertise. To support this, “General Assembly” meetings will be held regularly to provide updates on ongoing projects and to discuss new proposals. The General Assembly is a consortium body consisting of one representative from each center, usually the Principal Investigator (PI) or his/her delegate. Proposals for new scientific queries may also come from external centers, whose suitability will be assessed by the Steering Committee.

Another ad hoc decision-making body, the “Steering Committee”, receives and evaluates each scientific proposal and expresses an independent judgment on it. The members of the Steering Committee are elected by the General Assembly through a vote held every three years. The Steering Committee will evaluate projects by expressing an overall judgment on each proposal, based on 4 criteria: (1) relevance of the unmet clinical need in cancer care; (2) urgency of the unmet need in clinical practice; (3) originality of the project based on the available literature; (4) project feasibility, taking into account already available data and required additional resources.

Once the project is approved by the Steering Committee, the implementation phase begins with data extraction, including the annotation of biological samples locally stored in the biobanks at different centers.

At that point, the process differs for proposals coming from internal and external centers. In the case of proposals from centers already taking part in the APOLLO11 consortium, data and materials are centralized at the centers or services designated to conduct the analyses. All data are shared in a pseudonymized way, adopting a unique code for all the sub-analyses. When proposals arise from centers outside the consortium, synthetic data will be generated from the identified dataset containing the population of interest, in order to minimize the privacy risk. Synthetic data have already been shown to provide sufficient quality for statistical and AI-based analyses, while maximizing compliance with the GDPR. In the context of the APOLLO11 data workflow, the use of synthetic data, which will be generated through AI-based methodologies (e.g., variational autoencoders or generative adversarial networks), will minimize data flow outside the consortium while still allowing queries from external centers to be addressed. The data usability flow for the APOLLO11 study is summarized in Fig. 2.

**Fig. 2: Data usability for the APOLLO11 study.**

Data analysis methodology

Analyses from blood samples include the identification of Single-Nucleotide Polymorphisms (SNPs) through germline sequencing of whole blood samples; microRNA (miRNA), circulating free DNA (cfDNA), lipidomic profile, and cytokine analyses of plasma samples; and immune profiling and single-cell transcriptomic analyses of viable PBMCs (Fig. 1). Analyses of available archival samples will be conducted on tumor DNA through comprehensive next-generation sequencing (NGS) using extended panels, such as the Oncomine Comprehensive Assay (~500 genes), and, for a subset of tumor samples, through whole-genome sequencing (WGS), depending on the specific clinical or research question. Analyses of RNA will be performed through bulk transcriptomic profiling with standard library preparation protocols. Proteomic and metabolomic analyses will be performed to characterize amino acid and metabolite profiles, respectively, providing an integrated multi-omic view of tumor biology. Finally, microbiota analyses will be conducted on available stool and saliva samples. The choice of the center where each type of analysis is conducted is crucial to ensure high-quality results. For this reason, the data and/or biological samples will be centralized in a designated facility with the greatest expertise and experience in analyzing that specific type of data. A single omics layer is unlikely to yield comprehensive predictions for an individual patient. Therefore, to optimize prediction performance and thus enhance applicability, it is crucial to fuse the signatures generated from single data modalities into a comprehensive final predictive model. Multi-modal integration of different omics allows the comprehensive analysis and combination of various types of input to uncover complex interactions between different biological and molecular layers.

Given the real-world, multicenter nature of APOLLO11, some heterogeneity and missingness across data sources are unavoidable. Depending on the extent and pattern of missing data, different strategies will be applied, including variable exclusion, restriction to complete cases, imputation, or multi-modal integration approaches tolerant to missing modalities. The final multi-omic model will be tested on an independent validation cohort of patients from a consortium center that did not contribute training data. This approach allows the generation of a validated model at the conclusion of each scientific objective, making it potentially ready for clinical implementation (Fig. 3).
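The choice among the missing-data strategies listed above can be sketched as a rule based on the fraction of missing values per variable. The thresholds, variable names, and the use of median imputation are hypothetical choices for this example; the source does not state which cut-offs or imputation methods will be used.

```python
from statistics import median
from typing import Dict, List, Optional

Row = Dict[str, Optional[float]]


def missing_fraction(rows: List[Row], variable: str) -> float:
    """Share of rows in which the variable is absent or None."""
    return sum(1 for r in rows if r.get(variable) is None) / len(rows)


def choose_strategy(fraction: float,
                    drop_above: float = 0.5,       # hypothetical threshold
                    impute_above: float = 0.05) -> str:  # hypothetical threshold
    """Map a missing fraction to one of the strategies named in the text."""
    if fraction > drop_above:
        return "exclude_variable"
    if fraction > impute_above:
        return "impute"
    if fraction > 0:
        return "complete_cases"
    return "use_as_is"


def median_impute(rows: List[Row], variable: str) -> List[Row]:
    """Replace missing values with the median of the observed ones."""
    observed = [r[variable] for r in rows if r.get(variable) is not None]
    fill = median(observed)
    return [{**r, variable: fill if r.get(variable) is None else r[variable]}
            for r in rows]


# Hypothetical patient rows with two illustrative variables
rows: List[Row] = [
    {"nlr": 2.0, "pdl1": 50.0},
    {"nlr": None, "pdl1": 1.0},
    {"nlr": 4.0, "pdl1": None},
    {"nlr": 6.0, "pdl1": None},
    {"nlr": 3.0, "pdl1": None},
]
plan = {v: choose_strategy(missing_fraction(rows, v)) for v in ("nlr", "pdl1")}
imputed = median_impute(rows, "nlr")
```

A fraction-based rule ignores the pattern of missingness (for example, whether values are missing at random), which the text names as a second factor; a real analysis plan would need to consider both.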

**Fig. 3: APOLLO11 metadata at the data cut-off of 1st February 2025.**

Explainability

Despite the potential of AI to revolutionize cancer care, the adoption of ML/DL-based technologies in clinical practice is often hindered by the “black box” nature of many AI models. In fact, the complexity of models generated, which involve DL and other advanced ML techniques, can make their decision-making processes opaque. This challenge is especially pertinent in cancer research, where the integration of diverse sources of data requires a high degree of transparency and interpretability to understand the reasoning behind a model’s predictions to make it trustworthy to clinicians, patients, and other stakeholders. In this field, the appropriate use of XAI becomes crucial⁴⁶.

The systematic inclusion of XAI frameworks in the APOLLO11 scientific projects, including SHapley Additive exPlanations (SHAP)-based feature attribution and visual interpretability plots, will contribute to scientific rigor by enabling the validation of AI models against established biological and clinical knowledge. These approaches will be applied once predictive models become available, enabling clinicians and researchers to understand the relative contribution of clinical, imaging, and multi-omic variables to individual predictions. For example, if a model predicts a poor prognosis based on certain radiomic features, XAI techniques can help identify which specific features are driving this prediction and how they correlate with known risk factors or biomarkers. This helps not only to verify the accuracy of the model but also to uncover new insights into the disease, potentially leading to novel therapeutic targets or diagnostic markers^46,47.
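To make feature attribution concrete, the sketch below computes exact Shapley values for a toy three-feature risk function by enumerating all feature subsets. It does not use the SHAP library, which approximates these values for real models; the risk function, feature names, and baseline are invented for illustration.

```python
from itertools import combinations
from math import factorial
from typing import Callable, Dict, Set


def shapley_values(predict: Callable[[Dict[str, float]], float],
                   instance: Dict[str, float],
                   baseline: Dict[str, float]) -> Dict[str, float]:
    """Exact Shapley attribution of predict(instance) - predict(baseline)."""
    features = list(instance)
    n = len(features)

    def value(subset: Set[str]) -> float:
        # Features in the subset take the patient's values; the rest stay at baseline
        x = {f: (instance[f] if f in subset else baseline[f]) for f in features}
        return predict(x)

    phi: Dict[str, float] = {}
    for f in features:
        others = [g for g in features if g != f]
        total = 0.0
        for k in range(len(others) + 1):
            # Shapley weight for coalitions of size k
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = set(subset)
                total += weight * (value(s | {f}) - value(s))
        phi[f] = total
    return phi


def toy_risk(x: Dict[str, float]) -> float:
    """Hypothetical risk score with one interaction term."""
    return 0.3 * x["clinical"] + 0.5 * x["radiomic"] + 0.2 * x["clinical"] * x["omic"]


instance = {"clinical": 1.0, "radiomic": 2.0, "omic": 1.0}
baseline = {"clinical": 0.0, "radiomic": 0.0, "omic": 0.0}
phi = shapley_values(toy_risk, instance, baseline)
```

In this example the attributions sum to the difference between the prediction for the patient and for the baseline, and the interaction between the clinical and omic features is split equally between them. Exact enumeration scales exponentially with the number of features, which is why approximations are used in practice.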

First data collection

As of 1st February 2025, 52 Italian oncology centers had been screened for inclusion in the APOLLO11 network. Of these, 32 centers were selected and agreed to participate in the consortium. All 32 centers submitted the protocol to their local ethics committee, and 20 received approval. To date, 7 centers have started enrolling patients, collecting RWD and baseline CT and FDG-PET scans, and digitizing slides, for a total of 2020 patients, and 4 centers have started biological specimen collection. The different pipelines are being implemented concurrently to ensure synchronization and interoperability across modalities.
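The activation figures reported above can be expressed as shares with a short calculation. The counts are those given in the text; expressing the later stages as a percentage of the 32 consortium centers is a presentational choice made for this sketch.

```python
# Center counts as reported at the 1st February 2025 data cut-off
SCREENED = 52
CONSORTIUM = 32
STAGES = {
    "ethics approval received": 20,
    "enrolling patients": 7,
    "collecting biological specimens": 4,
}


def percent(part: int, whole: int) -> float:
    """Percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


# Share of screened centers that joined the consortium
selected_share = percent(CONSORTIUM, SCREENED)

# Share of consortium centers that reached each activation stage
stage_shares = {name: percent(count, CONSORTIUM) for name, count in STAGES.items()}

for name, share in stage_shares.items():
    print(f"{name}: {STAGES[name]}/{CONSORTIUM} centers ({share}%)")
```

On these counts, 61.5% of screened centers joined the consortium, and 62.5%, 21.9%, and 12.5% of consortium centers had received ethics approval, started enrollment, and started specimen collection, respectively.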

One example of a research query within the APOLLO11 consortium aims to identify, through collaborative multi-omic data collection, factors potentially implicated in the response to ICIs in aNSCLC patients treated with immunotherapy in any line of treatment. The final aim of this integrated approach is the creation of a predictive AI model to improve response prediction and treatment customization. For the present data collection, part of the samples collected in the context of the APOLLO study have been retrieved; APOLLO is a single-institution observational clinical trial involving the collection of clinical data and blood and tissue samples from NSCLC patients who had received IO at the “Fondazione IRCCS Istituto Nazionale dei Tumori”^17,48. Through these initiatives, this task of APOLLO11 seeks to advance understanding and treatment efficacy in NSCLC immunotherapy.

The objective of this first scientific aim is to develop a predictive multi-omic algorithm of IO efficacy in aNSCLC patients. To pursue this aim, peripheral immune profiling was obtained on 264 fresh patient blood samples processed at baseline, while longitudinal immune profiling was available for 197 patients. This characterization encompasses the assessment of monocytes and neutrophils with specific markers, including CD11b, CD66b, HLA-DR, CD14, CD15, CD10, CD16, CD117, and CD71. In addition, a subgroup of patients (N = 42) underwent single-cell transcriptomic analysis (scRNAseq) to specifically profile various immune subpopulations, including T-lymphocytes, B-lymphocytes, plasma cells, NK cells, monocytes, and neutrophil granulocytes. The association of the composition of PBMCs, and in particular T cells, with clinical outcomes (tumor response, PFS, and OS) during IO treatment is currently being evaluated. Based on the clinical outcome data available to date, the predictive role of the composition of PBMCs has been preliminarily assessed, confirming the positive prognostic role of T cells. As follow-up continues, it will be possible to identify gene expression patterns predictive of benefit from immune checkpoint inhibitor therapy.

Discussion

The APOLLO11 project has established a nationwide, multicenter, continuously updated collection of data and biological samples from lung cancer patients treated with innovative systemic therapies. This model provides a practical framework for data-driven research, which relies on large, updated, and comprehensive datasets to address clinical and translational questions arising from clinical practice. The study follows a strict regulatory framework to ensure compliance with data protection laws, patient confidentiality, and ethical guidelines, thereby maintaining the integrity and reliability of the collected data while facilitating collaborative research efforts. The selected centers are geographically distributed across Northern, Central, and Southern Italy, and include academic, community, and research institutions, thereby reflecting the diversity of clinical practice across the country.

The decentralized structure of data collection and the approach to scientific queries ensure research democracy, facilitating the availability of data among research groups either inside or outside the consortium and supporting the sharing of hypotheses among researchers; they also guarantee meritocracy, prioritizing research questions that are more likely to positively impact cancer care; finally, the APOLLO11 structure promotes scientific fairness, supporting centers with fewer structural and financial resources. This comprehensive approach has the potential to enhance disease understanding and to support the development of more tailored treatments, with the aim of ultimately improving lung cancer patients' outcomes. In addition, the integration of XAI methodologies is expected to increase the transparency and trustworthiness of these models, helping clinicians to make informed, personalized treatment decisions that optimize outcomes and minimize toxicity. The widespread collection of tumor samples across centers will allow us to emphasize the diversity and richness of available data, encompassing circulating immune profiling, genomic, and scRNAseq data, with high versatility of the project in terms of data acquisition and analysis. The material and pre-analytical data collection pipeline is designed to ensure the availability of readily usable information, facilitating the initiation of research activities. In essence, this pipeline serves as a “ball pit of information” for translational researchers, providing them with a comprehensive array of biological samples and associated data and facilitating in-depth investigations into the molecular mechanisms underlying lung cancer and potential therapeutic targets. Moreover, the APOLLO11 project will enable collaboration with external centers or existing networks, following a scientific query-centered approach and encompassing the use of synthetic data, which allows secure data sharing while overcoming ethical barriers.

Other efforts are underway to assemble multisource data for building predictive models^49,50,51. The I3LUNG project, similarly to APOLLO11, focuses on personalized medicine through the development of AI tools based on multi-modal patient data in lung cancer, integrating clinical information and multi-omics data from international cancer institutions into a data storage and processing platform. Through the use of AI methodology, I3LUNG aims to improve the clinical decision-making process specifically for aNSCLC patients receiving IO by tailoring treatments to individual needs. Another example of data-driven research is the MOSAIC project, a large European initiative aimed at developing a federated infrastructure for multi-omics and clinical data integration in oncology, in order to support clinicians with an AI-based framework for multi-modal analysis, classification, and personalized prognostic assessment in rare cancers. However, while MOSAIC primarily focuses on establishing technical standards, data interoperability frameworks, and ethical-legal guidelines to enable cross-border research collaboration, APOLLO11 is a national, multicenter, disease-specific program that applies an innovative infrastructural framework to a real-world, clinically integrated setting in NSCLC.

Similar approaches have been reported in the literature, including the federated distribution of digitized slides and the systematic collection of cancer radiological images^44,50. However, APOLLO11 is unique in its decentralized data collection, multi-modal integration, and federated learning approach. This holistic strategy addresses previous limitations and provides a comprehensive framework for future translational oncology research. In particular, APOLLO11’s creation of a biobank and easily accessible data collection for scientific purposes is a significant innovation. Finally, thanks to the broad inclusion criteria and one-time consent, APOLLO11 will allow studies to be conducted based on novel questions arising directly from unmet clinical or translational needs.

Initial descriptive results from APOLLO11 demonstrate the potential of this collaborative effort. By identifying factors influencing the response to ICIs in advanced NSCLC patients, the consortium has started developing predictive models to guide treatment decisions. This success underscores the feasibility and effectiveness of the APOLLO11 approach, paving the way for further advancements in the field.

Despite its promising design and nationwide scope, the implementation of APOLLO11 is not without challenges. The multicentric nature of the initiative inevitably introduces heterogeneity in data quality and biospecimen handling across participating centers, even with standardized operating procedures in place. Ensuring long-term sustainability in terms of funding, infrastructure maintenance, and the continuous engagement of both hub and spoke centers will be critical to preserve the integrity and expansion of the network over time. Moreover, while the federated learning framework mitigates some privacy and data-sharing concerns, regulatory and legal complexities surrounding data governance may still pose barriers to broader interoperability and scalability. Proactively addressing these issues through continuous quality control, transparent governance, and alignment with European data frameworks will be essential to ensure the feasibility, scalability, and long-term integration of APOLLO11 within the evolving landscape of international precision oncology.

In summary, the APOLLO11 consortium aims to provide a shift in lung cancer research, allowing data-driven approaches to be implemented alongside traditional hypothesis-driven ones by enabling new hypotheses to emerge directly from large-scale RWD. Given the existing ethical and legal constraints that characterize the current Italian scenario, the establishment of robust and transparent federated frameworks at the national level could facilitate the active participation of Italian centers in European data ecosystems. The ongoing commitment of participating centers and the continuous integration of new data and technologies are pivotal for sustaining progress. Insights gleaned from APOLLO11 are intended to direct future research, signaling a shift towards translational lung cancer research.

Methods

Ethics approval and one-shot consent

During the visit, patients will be thoroughly informed and subsequently given the opportunity to sign an informed consent form. This decree regulates the retrospective and prospective collection of clinical data and the storage of biological samples. In cases where it is not feasible to obtain consent for the use of data from patients retrospectively included due to ethical or administrative reasons, the study complies with the guidelines established by the Data Protection Authority, as specified in the General Authorization for the processing of personal data for scientific research purposes and the General Authorization for the processing of genetic data. The informed consent form will also allow patients to state their preference regarding whether they wish to be informed about any unexpected findings related to their health that could have therapeutic implications or influence their reproductive decisions. Eligibility will be verified using a checklist, ensuring that only patients diagnosed with SCLC or NSCLC, who have received or are candidates to receive innovative therapy, are included. Upon consenting, patients will begin innovative therapy, during which blood, stool, urine, and histological material, as well as data from CT and PET scans, will be collected and recorded in REDCap at baseline and/or specific time points. If consent is given during ongoing innovative therapy or if the patient has previously received such therapy, data collection will continue in REDCap. Patients who complete their current therapy and are candidates for a new innovative therapy will transition to Scenario 1, where samples will be collected at the start and at predetermined time points for each new line of therapy.

REDCap is a secure web application designed for data collection and management in research studies. The activation process involves the engineer from the Coordinating Center transferring data to the engineer at the Recruiting Center, who will structure the platform’s pages. To access REDCap, users will receive a link and credentials generated by the IT engineer at the Recruiting Center. The platform features a registration page that captures the patient’s basic information and multiple questions with either multiple-choice or single-choice responses, ensuring a comprehensive and organized data collection process.

Each center will have access only to its own data on the platform, with oversight provided by the Coordinating Center. Once the patient has signed the consent form for the study, the data will be stored anonymously and in compliance with privacy laws. Access to the data will be granted based on the queries proposed and approved by the consortium, ensuring that all data usage adheres to the established guidelines and regulations.

This study was conducted in accordance with the principles of the Declaration of Helsinki. Ethical approval was granted by the Comitato Etico Territoriale Lombardia 4, which acted as the central ethics committee for the study. The single national opinion (Parere Unico Nazionale) was issued on October 10th, 2024, under the reference code INT 128/22, and is valid for all participating centers involved in the study.

FACS and single-cell transcriptomics methodology

Collected EDTA/heparin blood samples are processed to generate, for each patient/timepoint, a plasma biobank, stored at -80°C in a dedicated freezer, and a PBMC biobank, stored in nitrogen. For each patient/timepoint, 5 plasma samples in EDTA, 2 plasma samples in heparin, and 2 to 3 viable PBMC samples are stored.

For each patient, immediate fresh sample analysis of monocytes and neutrophils is performed with specific markers, including CD11b, CD66b, HLA-DR, CD14, CD15, CD10, CD16, CD117, and CD71; these results are recorded in a dedicated, updated database, along with the original .fcs file generated by the Celesta cytometer.

For a subgroup of patients, a second PBMC sample is analyzed, from which CD66+ CD15+ neutrophils are sorted for scRNAseq. For each scRNAseq analysis, 2 samples are available: 1 sample of FACS-sorted neutrophils (10,000 cells) and 1 sample of total PBMCs (10,000 cells); these data are collected on a dedicated database with all original analyses and all sorting data generated by the FACS Melody cytometer.

The scRNAseq analyses are performed using samples consisting of 50% PBMCs and 50% FACS-sorted neutrophils, in order to produce transcriptomic data from both the mononuclear and the polymorphonuclear cell components, the latter being known to be more labile and difficult to handle in single-cell experiments. The viability of the cells obtained is assessed using Trypan Blue staining. Subsequently, single-cell suspensions are prepared with approximately 10,000 cells per sample. scRNAseq libraries are created with the Chromium Next GEM kit (10X Genomics). Cells are washed, resuspended in a PBS-BSA solution, and encapsulated in droplets to form gel bead emulsions (GEMs). The GEMs are subjected to reverse transcription in a thermal cycler and then disrupted, and the cDNA is purified and amplified by PCR. For the TCR-seq library, TCR transcripts are enriched from the amplified cDNA and then subjected to various preparation steps. The resulting libraries are purified, assessed for quality, and then sequenced on a NovaSeq 6000 sequencer. Data analysis begins with Cell Ranger, followed by further analysis in R using the Seurat package to identify cell types, reduce dimensionality, perform clustering, and visualize data. Cell identification uses supervised algorithms (SingleR, AUCell) and manual curation.

Radiomics methodology

A team of four radiologists with experience in CT scan segmentation will identify the region of interest (ROI) through a semi-automated 3D segmentation process performed with Syngo.via software. The ROI will be selected to include the whole lesion of interest, and it will also encompass the peritumoral region. The segmentation will be performed by two radiologists and compared with an automated segmentation obtained with the U-Net architecture, to ensure the reproducibility of the process. After image pre-processing through different techniques (e.g., gray-level discretization, intensity normalization, and voxel resampling), radiomic features will be extracted from ROIs and peritumoral areas using the PyRadiomics library.
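Two of the pre-processing steps named above can be illustrated with a short, self-contained sketch. It shows z-score intensity normalization and gray-level discretization with a fixed bin width, one common convention in radiomics toolkits; it does not call PyRadiomics, and the voxel values and the bin width of 25 are made-up assumptions rather than parameters of the study.

```python
from math import floor
from statistics import mean, pstdev


def zscore_normalize(values):
    """Center intensities on 0 and scale them to unit standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def discretize_fixed_bin_width(values, bin_width):
    """Map intensities to integer gray levels; the lowest occupied bin is 1."""
    lowest = floor(min(values) / bin_width)
    return [floor(v / bin_width) - lowest + 1 for v in values]


# Made-up ROI intensities (Hounsfield units), for illustration only
roi_hu = [-120.0, -35.0, 10.0, 42.0, 58.0, 130.0]

normalized = zscore_normalize(roi_hu)
bins = discretize_fixed_bin_width(roi_hu, bin_width=25)
print(bins)  # [1, 4, 6, 7, 8, 11]
```

A fixed bin width keeps the meaning of a gray level comparable across patients, whereas a fixed bin count would rescale each ROI to its own intensity range; which option suits a given modality is a study-level choice.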

Given the multicenter nature of the APOLLO11 project, inter-center variability in scanners and acquisition protocols will be systematically assessed, as it represents a major source of batch effects and a critical barrier to the clinical reproducibility of radiomics. A preliminary analysis of acquisition parameters and scanner characteristics will be performed to guide the selection of the most appropriate harmonization strategy. Among the approaches considered, ComBat harmonization will be applied, using the center of image acquisition as the primary batch variable, and, when appropriate, the scanner model or contrast agent use, depending on metadata availability and distribution. These harmonization procedures will be conducted prior to feature selection to minimize technical bias across centers.
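The intuition behind batch harmonization can be sketched with a plain location-scale alignment of one feature across centers. This is deliberately not full ComBat: ComBat also uses a location-scale model but estimates the batch parameters with empirical Bayes shrinkage and can preserve covariates of interest. The function name, center labels, and feature values below are hypothetical.

```python
from collections import defaultdict
from statistics import mean, pstdev


def harmonize_location_scale(values, batches):
    """Align each batch's mean and SD to the pooled mean and SD (one feature)."""
    grand_mean = mean(values)
    grand_sd = pstdev(values)

    by_batch = defaultdict(list)
    for v, b in zip(values, batches):
        by_batch[b].append(v)
    stats = {b: (mean(vs), pstdev(vs)) for b, vs in by_batch.items()}

    harmonized = []
    for v, b in zip(values, batches):
        b_mean, b_sd = stats[b]
        # A batch with no spread cannot be rescaled; leave its scale untouched
        scale = grand_sd / b_sd if b_sd > 0 else 1.0
        harmonized.append(grand_mean + (v - b_mean) * scale)
    return harmonized


# Made-up values of one radiomic feature acquired at two centers
feature = [10.0, 12.0, 14.0, 20.0, 24.0, 28.0]
center = ["A", "A", "A", "B", "B", "B"]

harmonized = harmonize_location_scale(feature, center)
print([round(v, 3) for v in harmonized])
```

In this toy case the two centers end up with identical distributions. On real data such an unconditional alignment would also remove genuine clinical differences between centers, which is one reason covariate-aware methods such as ComBat are preferred.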

To prevent signature overfitting, the dimensionality of the feature set will be reduced before signature construction, first retaining only features with a high intraclass correlation coefficient that differ significantly between the two outcome groups, as assessed by one-way analysis of variance (ANOVA). Least absolute shrinkage and selection operator (LASSO) regression and/or Maximum Relevance Minimum Redundancy (MRMR) will then be used, with 5-fold cross-validation, to select the features to be included in the final model.
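The LASSO step with 5-fold cross-validation can be sketched as follows. This is an illustrative pure-Python coordinate-descent implementation on synthetic data, not the study's pipeline: the feature matrix, the penalty grid, and the continuous toy outcome are assumptions made for brevity, and a binary endpoint would more commonly be handled with L1-penalized logistic regression in an established library.

```python
import random


def soft_threshold(rho, lam):
    """Shrink rho toward zero by lam; values within [-lam, lam] become 0."""
    if rho > lam:
        return rho - lam
    if rho < -lam:
        return rho + lam
    return 0.0


def lasso_fit(X, y, lam, n_sweeps=100):
    """Coordinate descent for (1/2n)*||y - b0 - X b||^2 + lam*||b||_1."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    intercept = sum(y) / n
    resid = [yi - intercept for yi in y]
    col_sq = [sum(row[j] ** 2 for row in X) / n for j in range(p)]
    for _ in range(n_sweeps):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Correlation of feature j with the residual that ignores feature j
            rho = sum(X[i][j] * (resid[i] + X[i][j] * beta[j]) for i in range(n)) / n
            new_bj = soft_threshold(rho, lam) / col_sq[j]
            delta = new_bj - beta[j]
            if delta != 0.0:
                resid = [resid[i] - X[i][j] * delta for i in range(n)]
                beta[j] = new_bj
        # The intercept is not penalized: re-center the residuals
        shift = sum(resid) / n
        intercept += shift
        resid = [r - shift for r in resid]
    return intercept, beta


def cv_error(X, y, lam, k=5, seed=0):
    """Mean squared error of the LASSO fit under k-fold cross-validation."""
    order = list(range(len(X)))
    random.Random(seed).shuffle(order)
    folds = [order[f::k] for f in range(k)]
    squared_error = 0.0
    for fold in folds:
        held_out = set(fold)
        train = [i for i in order if i not in held_out]
        b0, beta = lasso_fit([X[i] for i in train], [y[i] for i in train], lam)
        for i in fold:
            pred = b0 + sum(b * x for b, x in zip(beta, X[i]))
            squared_error += (y[i] - pred) ** 2
    return squared_error / len(X)


# Synthetic stand-in for a radiomic feature matrix: only features 0 and 3 matter
rng = random.Random(1)
X = [[rng.gauss(0, 1) for _ in range(6)] for _ in range(60)]
y = [2.0 * row[0] - 1.5 * row[3] + rng.gauss(0, 0.1) for row in X]

lambda_grid = [0.01, 0.05, 0.1, 0.3, 1.0]  # hypothetical penalty grid
best_lam = min(lambda_grid, key=lambda lam: cv_error(X, y, lam))
_, beta = lasso_fit(X, y, best_lam)
selected = [j for j, b in enumerate(beta) if b != 0.0]
print("best lambda:", best_lam, "selected features:", selected)
```

The L1 penalty sets some coefficients exactly to zero, so the surviving indices form the candidate signature, while cross-validation chooses how aggressive that pruning should be.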

Model generation and other AI analyses

The entire cohort will be split into a training cohort (80%) and a testing cohort (20%), leaving a sample of cases belonging to one of the centers of the consortium as an external validation cohort. Different standard ML classifiers, such as Random Forest, Multilayer Perceptron, Logistic Regression, Support Vector Machine, CatBoost, AdaBoost, and XGBoost, will be trained and evaluated for this task. Different metrics will be adopted to evaluate the performance of the model on the training, cross-validation, and testing sets and on the external validation cohort, such as AUC, sensitivity, and specificity for classification tasks and the c-index for survival tasks.
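The splitting scheme and two of the metrics mentioned above can be sketched in a few lines. The record layout, center labels, and toy scores are hypothetical, and the sketch assumes that all cases from one center are set aside as the external cohort before the 80/20 split; the AUC is computed through its rank (Mann-Whitney) interpretation and the c-index follows Harrell's pairwise definition.

```python
import random


def split_cohort(records, external_center, test_fraction=0.2, seed=42):
    """Hold out one center as external validation, then split the rest 80/20."""
    external = [r for r in records if r["center"] == external_center]
    internal = [r for r in records if r["center"] != external_center]
    random.Random(seed).shuffle(internal)
    n_test = round(len(internal) * test_fraction)
    return internal[n_test:], internal[:n_test], external


def roc_auc(labels, scores):
    """Probability that a random positive case is scored above a random negative."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def concordance_index(times, events, risks):
    """Harrell's c-index: share of comparable pairs ranked correctly by risk."""
    concordant, comparable = 0.0, 0
    for i in range(len(times)):
        for j in range(len(times)):
            # A pair is comparable when i has an observed event before j's time
            if events[i] == 1 and times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    return concordant / comparable


# Made-up cohort: 5 centers with 20 patients each
records = [{"id": i, "center": "ABCDE"[i % 5]} for i in range(100)]
train, test, external = split_cohort(records, external_center="E")
print(len(train), len(test), len(external))  # 64 16 20

auc = roc_auc([1, 1, 0, 0, 1, 0], [0.9, 0.7, 0.4, 0.6, 0.5, 0.2])
c_index = concordance_index([5, 8, 12, 20], [1, 1, 0, 1], [0.9, 0.6, 0.7, 0.1])
print(round(auc, 3), round(c_index, 3))  # 0.889 0.8
```

Holding out a whole center, rather than a random subset of patients, tests whether a model transfers to a site whose scanners and workflows it has never seen.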

Data availability

The datasets generated during the current study are not publicly available, as this manuscript describes the study design of the APOLLO11 consortium, and the related analyses are still ongoing. However, the data are available from the corresponding author upon reasonable request.

Code availability

Not applicable, as no analyses are included in the present manuscript.

References

Bray, F. et al. Global cancer statistics 2018: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA Cancer J. Clin. 68, 394–424 (2018).
Howlader, N. et al. The effect of advances in lung-cancer treatment on population mortality. N. Engl. J. Med. 383, 640–649 (2020).
Siegel, R. L., Miller, K. D., Wagle, N. S. & Jemal, A. Cancer statistics, 2023. CA Cancer J. Clin. 73, 17–48 (2023).
Siegel, R. L., Miller, K. D. & Jemal, A. Cancer statistics, 2017. CA Cancer J. Clin. 67, 7–30 (2017).
Lynch, T. J. et al. Activating mutations in the epidermal growth factor receptor underlying responsiveness of non-small-cell lung cancer to gefitinib. N. Engl. J. Med. 350, 2129–39 (2004).
Rosell, R. et al. Erlotinib versus standard chemotherapy as first-line treatment for European patients with advanced EGFR mutation-positive non-small-cell lung cancer (EURTAC): a multicentre, open-label, randomised phase 3 trial. Lancet Oncol. 13, 239–285 (2012).
Maemondo, M. et al. Gefitinib or chemotherapy for non-small-cell lung cancer with mutated EGFR. N. Engl. J. Med. 362, 2380–8 (2010).
Mitsudomi, T. et al. Gefitinib versus cisplatin plus docetaxel in patients with non-small-cell lung cancer harbouring mutations of the epidermal growth factor receptor (WJTOG3405): an open-label, randomised phase 3 trial. Lancet Oncol. 11, 121–128 (2010).
Sequist, L. V. et al. Phase III study of afatinib or cisplatin plus pemetrexed in patients with metastatic lung adenocarcinoma with EGFR mutations. J. Clin. Oncol. 31, 3327–34 (2013).
Soria, J. C. et al. Osimertinib in untreated EGFR-mutated advanced non-small-cell lung cancer. N. Engl. J. Med. 378, 113–125 (2018).
Borghaei, H. et al. Nivolumab versus docetaxel in advanced nonsquamous non-small-cell lung cancer. N. Engl. J. Med. 373, 1627–1639 (2015).
Fehrenbacher, L. et al. Atezolizumab versus docetaxel for patients with previously treated non-small-cell lung cancer (POPLAR): a multicentre, open-label, phase 2 randomised controlled trial. Lancet 387, 1837–1846 (2016).
Reck, M. et al. Pembrolizumab versus chemotherapy for PD-L1-positive non-small-cell lung cancer. N. Engl. J. Med. 375, 1823–1833 (2016).
Antonia, S. J. et al. Durvalumab after chemoradiotherapy in stage III non-small-cell lung cancer. N. Engl. J. Med. 377, 1919–1929 (2017).
Gandhi, L. et al. Pembrolizumab plus chemotherapy in metastatic non-small-cell lung cancer. N. Engl. J. Med. 378, 2078–2092 (2018).
Ferrara, R. et al. Hyperprogressive disease in patients with advanced non-small-cell lung cancer treated with PD-1/PD-L1 inhibitors or with single-agent chemotherapy. JAMA Oncol. 4, 1543–1552 (2018).
Prelaj, A. et al. Machine learning using real-world and translational data to improve treatment selection for NSCLC patients treated with immunotherapy. Cancers 14, 435 (2022).
Prelaj, A. et al. Artificial intelligence for predictive biomarker discovery in immuno-oncology: a systematic review. Ann. Oncol. 35, 29–65 (2024).
Lo Russo, G. et al. PEOPLE (NTC03447678), a phase II trial to test pembrolizumab as first-line treatment in patients with advanced NSCLC with PD-L1 <50%: a multiomics analysis. J. Immunother. Cancer 11, e006833 (2023).
Hunter, D. J. & Holmes, C. Where medical statistics meets artificial intelligence. N. Engl. J. Med. 389, 1211–1219 (2023).
Bzdok, D., Altman, N. & Krzywinski, M. Points of significance: statistics versus machine learning. Nat. Methods 15, 233–234 (2018).
Haug, C. J. & Drazen, J. M. Artificial intelligence and machine learning in clinical medicine, 2023. N. Engl. J. Med. 388, 1201–1208 (2023).
Liang, W. et al. Advances, challenges and opportunities in creating data for trustworthy AI. Nat. Mach. Intell. 4, 669–677 (2022).
Evans, R. P., Bryant, L. D., Russell, G. & Absolom, K. Trust and acceptability of data-driven clinical recommendations in everyday practice: a scoping review. Int. J. Med. Inform. 183, 105342 (2024).
Cresswell, K. et al. Investigating the use of data-driven artificial intelligence in computerised decision support systems for health and social care: a systematic review. Health Inform. J. 26, 2138–2147 (2020).
Wang, H. et al. Scientific discovery in the age of artificial intelligence. Nature 620, 47–60 (2023).
Prelaj, A. et al. APOLLO 11 Project, Consortium in Advanced Lung Cancer Patients Treated With Innovative Therapies: Integration of Real-World Data and Translational Research. Clin. Lung Cancer 25, 190–195 (2024).
Fania, L. et al. Integrated care pathways and the hub-and-spoke model for the management of non-melanoma skin cancer: a proposal of the Italian Association of Hospital Dermatologists (ADOI). Dermatol. Rep. 13, 9278 (2021).
Munhoz, R., Sabesan, S., Thota, R., Merrill, J. & Hensold, J. O. Revolutionizing rural oncology: innovative models and global perspectives. Am. Soc. Clin. Oncol. Educ. Book 44, e432078 (2024).
Harris, P. A. et al. Research electronic data capture (REDCap)-a metadata-driven methodology and workflow process for providing translational research informatics support. J. Biomed. Inform. 42, 377–381 (2009).
Williams, M., Bagwell, J. & Nahm Zozus, M. Data management plans, the missing perspective. J. Biomed. Inform. 71, 130–142 (2017).
Kohlmayer, F., Lautenschläger, R. & Prasser, F. Pseudonymization for research data collection: is the juice worth the squeeze? BMC Med. Inform. Decis. Mak. 19, 178 (2019).
Penberthy, L. T. et al. An overview of real-world data sources for oncology and considerations for research. CA Cancer J. Clin. 72, 287–300 (2022).
Goel, A. K., Walter, C., Campbell, S. & Moldwin, R. Structured data capture for oncology. JCO Clin. Cancer Inform. 5, 194–201 (2021).
Elmahdy, M. & Sebro, R. Radiomics analysis in medical imaging research. J. Med. Radiat. Sci. 70, 3–7 (2023).
Van Griethuysen, J. J. M. et al. Computational radiomics system to decode the radiographic phenotype. Cancer Res. 77, e104–e107 (2017).
Zhao, J. et al. Radiomic and clinical data integration using machine learning predict the efficacy of anti-PD-1 antibodies-based combinational treatment in advanced breast cancer: a multicentered study. J. Immunother. Cancer 11, e006514 (2023).
Isensee, F., Jäger, P. F., Kohl, S. A. A., Petersen, J. & Maier-Hein, K. H. Automated design of deep learning methods for biomedical image segmentation. Nat. Methods 17, 1104–1114 (2020).
Fedorov, A. et al. 3D slicer as an image computing platform for the quantitative imaging network. Magn. Reson. Imaging 30, 1323–1341 (2012).
Dolezal, J. M. et al. Deep learning generates synthetic cancer histology for explainability and education. NPJ Precis. Oncol. 7, 49 (2023).
Scherer, J. et al. Joint imaging platform for federated clinical data analytics. JCO Clin. Cancer Inform. 4, 1027–1038 (2020).
Fu, R., Wu, Y., Xu, Q. & Zhang, M. FEAST: a communication-efficient federated feature selection framework for relational data. Proc. ACM Manag. Data 1, 1–28 (2023).
Teo, Z. L. et al. Federated machine learning in healthcare: a systematic review on clinical applications and technical architecture. Cell Rep. Med. 5, 101419 (2024).
Tayebi Arasteh, S. et al. Preserving fairness and diagnostic accuracy in private large-scale AI models for medical imaging. Commun. Med. 4, 462 (2024).
European Parliament. Regulation (EU) 2016/679 of the European Parliament and of the Council of 27 April 2016 on the protection of natural persons with regard to the processing of personal data. Off. J. Eur. Union L119, 1–88 (2016).
Linardatos, P., Papastefanopoulos, V. & Kotsiantis, S. Explainable AI: a review of machine learning interpretability methods. Entropy 23, 18 (2021).
Wells, L. & Bednarz, T. Explainable AI and reinforcement learning—a systematic review of current approaches and trends. Front. Artif. Intell. 4, 550030 (2021).
Prelaj, A. et al. Real-world data to build explainable trustworthy artificial intelligence models for prediction of immunotherapy efficacy in NSCLC patients. Front. Oncol. 12, 1078822 (2023).
Prelaj, A. et al. The EU-funded I3LUNG Project: Integrative Science, Intelligent Data Platform for Individualized LUNG Cancer Care With Immunotherapy. Clin. Lung Cancer 24, 381–387 (2023).
D'Amico, S. et al. MOSAIC: An Artificial Intelligence-Based Framework for Multimodal Analysis, Classification, and Personalized Prognostic Assessment in Rare Cancers. JCO Clin. Cancer Inform. 8, e2400008 (2024).
Martí-Bonmatí, L. et al. Empowering cancer research in Europe: the EUCAIM cancer imaging infrastructure. Insights Imaging 16, 47 (2025).


Acknowledgements

We thank all the patients who agreed to participate in the study and IPOP “Insieme per i Pazienti di Oncologia Polmonare” for supporting and sharing the project. This work was supported by 5 per 1000 Funds financial support for research 2019, Italian Ministry of University and Research (MUR), Institutional grant BRI2021. We would like to acknowledge donors for the Excalibur project in memory of Giorgiana Marchesi Bianchini. We especially thank all patients who took part in this clinical trial and their families.

Author information

These authors contributed equally: Arsela Prelaj, Leonardo Provenzano, Vanja Miskovic, Monica Ganzinelli.
These authors jointly supervised this work: Claudia Proto, Andrea Vingiani, Sabina Sangaletti, Giuseppe Lo Russo.

Authors and Affiliations

Medical Oncology Department, Fondazione IRCCS Istituto Nazionale dei Tumori, Milan, Italy
Arsela Prelaj, Leonardo Provenzano, Vanja Miskovic, Monica Ganzinelli, Laura Mazzeo, Cecilia Silvestri, Andrea Spagnoletti, Marta Brambilla, Mario Occhipinti, Teresa Beninato, Paolo Ambrosini, Elisa Sottotetti, Aleksandra Zec, Alberto Ferrarin, Giulia Corrao, Marco Meazza Prina, Andra Diana Dumitrascu, Rosa Maria Di Mauro, Claudia Giani, Chiara Cavalli, Roberta Serino, Federica Corso, Luca Invernizzi, Filippo De Braud, Claudia Proto, Giuseppe Lo Russo, Giorgia Di Liberti, Claudia Agosta, Ghazal Farhikhteh, Daniela Miliziano & Giorgia Corbo
Department of Electronic, Information and Bioengineering, Politecnico di Milano, Milan, Italy
Leonardo Provenzano, Vanja Miskovic, Laura Mazzeo, Margherita Favali, Aleksandra Zec, Alessandra Pedrocchi & Francesco Trovò
Medical Oncology Unit, Istituto di Ricovero e Cura a Carattere Scientifico (IRCCS) MultiMedica, Milan, Italy
Maria Gemelli
Department of Oncology and Oncohematology, University of Milan, Milan, Italy
Cecilia Silvestri, Rebecca Romanò, Paolo Ambrosini, Claudia Giani, Chiara Cavalli, Roberta Serino, Giancarlo Pruneri, Daniele Lorenzini, Filippo De Braud, Andrea Vingiani, Beshoy Guirges & Cristina Licciardello
Niguarda Cancer Center, Grande Ospedale Metropolitano Niguarda, Milan, Italy
Rebecca Romanò, Noemi Salmistraro, Elio Gregory Pizzutilo & Diego Signorelli
Radiology Department, Fondazione IRCCS Istituto Nazionale dei Tumori, Milan, Italy
Margherita Ruggirello & Moreno Bruno Marino
Unit of Thoracic Oncology, Gavazzeni Humanitas Bergamo, Via Gavazzeni 21, Bergamo, Milan, Italy
Chiara Catania & Antonella Panzardi
Medical Oncology, Santa Maria della Misericordia Hospital, University of Perugia, Perugia, Italy
Giulio Metro
Department of Onco-Hematology, AUSL della Romagna, Ravenna, Italy
Chiara Bennati
Department of Medical Oncology, Scientific Institute for Research, Hospitalization and Healthcare (IRCCS), San Raffaele Scientific Institute, Milan, Italy
Roberto Ferrara
Department of Oncology, Azienda Sanitaria Universitaria Friuli Centrale (ASUFC), Udine, Italy
Marianna Macerelli
Department of Clinical Medicine and Surgery, University of Naples “Federico II”, Naples, Italy
Alberto Servetto
Department of Oncology, Luigi Sacco University Hospital, ASST Fatebenefratelli Sacco, Milan, Italy
Maria Silvia Cona & Nicla La Verde
Medical Oncology, Humanitas Cancer Center, IRCCS Humanitas Research Hospital, Rozzano, Milan, Italy
Luca Toschi
Data science unit, Fondazione IRCCS Istituto Nazionale dei Tumori, Milan, Italy
Paolo Baili
Information and communication technology Unit, Fondazione IRCCS Istituto Nazionale dei Tumori, Milano, Italy
Emanuela Zito
Medical Oncology Unit, Ospedale di Summa A. Perrino, Brindisi, Italy
Saverio Cinieri
Oncology Clinic, Università Politecnica delle Marche, Ospedali Riuniti di Ancona, Ancona, Italy
Rossana Berardi
Scientific Direction, Fondazione IRCCS Istituto Nazionale dei Tumori, Milano, Italy
Giovanni Scoazec & Lorenzo Antonuzzo
Department of Oncology, IRCCS Ospedale Sacro Cuore Don Calabria Hospital, Negrar di Valpolicella, Milano, Italy
Alessandro Inno & Stefania Gori
Medical Oncology Unit, San Giuseppe Moscati Hospital, Taranto, Italy
Salvatore Pisconti & Federica Buzzacchino
Oncology Unit, ASST Cremona, Cremona, Italy
Matteo Brighenti
Oncology Unit, Azienda Ospedaliera Universitaria Maggiore Della Carità, Novara, Italy
Federica Biello
Division of Medical Oncology, Department of Onco-Hematology, IRCCS-CROB, Rionero in Vulture, Italy
Alfredo Tartarone
Department of Diagnostic Innovation, Fondazione IRCCS Istituto Nazionale dei Tumori, Milan, Italy
Giancarlo Pruneri, Antonino Belfiore, Luca Agnelli, Alessandro Guidi & Daniele Lorenzini
Department of Radiotherapy, Fondazione IRCCS Istituto Nazionale dei Tumori, Milan, Italy
Andrea Riccardo Filippi & Andrea Vingiani
Department of Thoracic Surgery, Fondazione IRCCS Istituto Nazionale dei Tumori, Milan, Italy
Piergiorgio Solli
Department of Oncology, Fondazione IRCCS Policlinico San Matteo, Pavia, Italy
Giulia Galli
Medical Oncology Unit, IRCCS Ospedale Policlinico San Martino, Genova, Italy
Carlo Genova & Francesco Verderame
Dipartimento di Medicina Interna e Specialità Mediche (DIMI), Università degli Studi di Genova, Genova, Italy
Carlo Genova
Department of Precision Medicine, Università Degli Studi Della Campania “Luigi Vanvitelli”, Napoli, Italy
Carminia Maria Della Corte
Department of Pneumology and Oncology, PO Monaldi-AORN Ospedali dei Colli, Naples, Italy
Giuseppe Viscardi
Biological Science Division, University of Chicago Medical Center, Chicago, IL, USA
Marina Chiara Garassino
Department of Medicine and Surgery, Università Campus Bio-Medico di Roma, Rome, Italy
Alessio Cortellini
Department of Surgery and Cancer, Hammersmith Hospital Campus, Imperial College London, London, UK
Alessio Cortellini
Medical Oncology, Fondazione Policlinico Universitario Campus Bio-Medico, Roma, Italy
Alessio Cortellini, Emanuele Mingo, Marco Russano & Giulia Barletta
Molecular Immunology Unit, Department of Experimental Oncology, Fondazione IRCCS Istituto Nazionale Tumori, Milan, Italy
Sabina Sangaletti & Gianpaolo Spinelli
Clinical Oncology Unit, Careggi University Hospital, Florence, Italy
Rita Chiari
Medical Oncology Unit, Villa Sofia-Cervello Hospital, Palermo, Italy
Rita Emili
Oncologia Territoriale Ausl Latina, Aprilia, Italy
Federica Bertolini
Oncology Unit, Azienda Ospedaliera Ospedali Riuniti Marche Nord, Fano, Italy
Grisanti Salvatore
Oncology Unit, Ospedale Santa Maria della Misericordia, Urbino, Italy
Emanuele Vita
Division of Medical Oncology, Azienda Ospedaliero-Universitaria Policlinico, Modena, Italy
Chiara Bonalume
Medical Oncology Unit, ASST Spedali Civili, Brescia, Italy
Michele Aieta
Oncology Department, Policlinico Universitario Fondazione “A.Gemelli” IRCCS, Rome, Italy
Luigi Lacriola & Michele Borraccino
Medical Oncology Unit, Fondazione IRCCS Ca’ Granda Ospedale Maggiore di Milano Policlinico, Milano, Italy
Claudia Bareggi, Fabrizio Citarella & Giovanni Apolone
Technology Transfer Office, Fondazione IRCCS Istituto Nazionale dei Tumori, Milan, Italy
Silvia Taverna & Antonio Lugini
Medical Oncology Unit, Azienda Ospedaliera San Giovanni Addolorata Hospital, Rome, Italy
Cesare Fattoi & Alfonso Marchianò
Medical Oncology Unit, University Hospital of Parma, Parma, Italy
Alessandro Leonetti

Authors

Arsela Prelaj
Leonardo Provenzano
Vanja Miskovic
Monica Ganzinelli
Laura Mazzeo
Maria Gemelli
Cecilia Silvestri
Andrea Spagnoletti
Rebecca Romanò
Marta Brambilla
Mario Occhipinti
Teresa Beninato
Paolo Ambrosini
Elisa Sottotetti
Margherita Favali
Aleksandra Zec
Alberto Ferrarin
Giulia Corrao
Marco Meazza Prina
Margherita Ruggirello
Moreno Bruno Marino
Andra Diana Dumitrascu
Rosa Maria Di Mauro
Claudia Giani
Chiara Cavalli
Roberta Serino
Chiara Catania
Antonella Panzardi
Giulio Metro
Chiara Bennati
Roberto Ferrara
Marianna Macerelli
Alberto Servetto
Maria Silvia Cona
Nicla La Verde
Luca Toschi
Paolo Baili
Federica Corso
Emanuela Zito
Saverio Cinieri
Rossana Berardi
Giovanni Scoazec
Alessandro Inno
Stefania Gori
Salvatore Pisconti
Federica Buzzacchino
Matteo Brighenti
Federica Biello
Alfredo Tartarone
Giancarlo Pruneri
Antonino Belfiore
Luca Agnelli
Alessandro Guidi
Luca Invernizzi
Noemi Salmistraro
Andrea Riccardo Filippi
Piergiorgio Solli
Giulia Galli
Daniele Lorenzini
Elio Gregory Pizzutilo
Filippo De Braud
Alessandra Pedrocchi
Francesco Trovò
Carlo Genova
Carminia Maria Della Corte
Giuseppe Viscardi
Marina Chiara Garassino
Alessio Cortellini
Emanuele Mingo
Marco Russano
Diego Signorelli
Claudia Proto
Andrea Vingiani
Sabina Sangaletti
Giuseppe Lo Russo

Consortia

the APOLLO11 study group

Giorgia Di Liberti, Claudia Agosta, Ghazal Farhikhteh, Daniela Miliziano, Giorgia Corbo, Beshoy Guirges, Cristina Licciardello, Lorenzo Antonuzzo, Francesco Verderame, Giulia Barletta, Gianpaolo Spinelli, Rita Chiari, Rita Emili, Federica Bertolini, Grisanti Salvatore, Emanuele Vita, Chiara Bonalume, Michele Aieta, Luigi Lacriola, Michele Borraccino, Claudia Bareggi, Fabrizio Citarella, Giovanni Apolone, Silvia Taverna, Antonio Lugini, Cesare Fattoi, Alfonso Marchianò & Alessandro Leonetti

Contributions

Conceptualization: A.P., V.M., L.P., G.LR., M.G., L.M., A.V. Writing: L.P., V.M., M.G., A.P. Data Collection: A.P., L.P., V.M., M.G., L.M., M.G., C.S., A.S., R.R., M.B., M.O., T.B., P.A., E.S., M.F., A.Z., A.F., G.C., M.MP., M.R., MB.M., AD.D., RM.DM., C.G., C.C., R.S., C.C., A.P., G.M., C.B., R.F., M.M., A.S., MS.C., N.LV., L.T., P.B., F.C., E.Z., S.C., R.B., G.S., A.I., S.G., S.P., F.B., M.B., F.B., A.T., G.P., A.B., L.A., A.G., L.I., N.S., AR.F., P.S., G.G., D.L., EG.P., F.DB., A.P., F.T., C.G., CM.DC., G.V., MC.G., A.C., E.M., M.R., D.S., C.P., A.V., S.S., G.LR. Review: A.P., L.P., V.M., M.G., L.M., M.G., C.S., A.S., R.R., M.B., M.O., T.B., P.A., E.S., M.F., A.Z., A.F., G.C., M.MP., M.R., MB.M., AD.D., RM.DM., C.G., C.C., R.S., C.C., A.P., G.M., C.B., R.F., M.M., A.S., MS.C., N.LV., L.T., P.B., F.C., E.Z., S.C., R.B., G.S., A.I., S.G., S.P., F.B., M.B., F.B., A.T., G.P., A.B., L.A., A.G., L.I., N.S., AR.F., P.S., G.G., D.L., EG.P., F.DB., A.P., F.T., C.G., CM.DC., G.V., MC.G., A.C., E.M., M.R., D.S., C.P., A.V., S.S., G.LR. All authors have read and approved the manuscript.

Corresponding author

Correspondence to Arsela Prelaj.

Ethics declarations

Competing interests

A.P.: consulting/advisory role for BMS, AstraZeneca, Novartis, MSD, Lilly, Amgen, Pfizer, Johnson & Johnson; travel, accommodations, or other expenses paid or reimbursed by Roche and Johnson & Johnson; principal investigator of Spectrum Pharmaceuticals, BMS, Bayer, MSD, Lilly outside the submitted work. Guest Editor for the NPJ Precision Oncology journal special collection: “Artificial Intelligence Biomarkers in Precision Oncology”. L.P.: invited speaker for Pfizer, Novartis, Merck. V.M.: invited speaker for Novartis. L.M.: honoraria from MSD, Novartis; travel grants from Daiichi Sankyo, LeoPharma. A.S.: invited speaker for Novartis, BMS, MSD. C.G.: advisory role/invited speaker for Amgen, AstraZeneca, BMS, Daiichi Sankyo, Eli Lilly, Johnson & Johnson, MSD, Novartis, Pierre Fabre, Regeneron, Roche, Takeda. T.B.: travel accommodation and conference grants from MSD, Sanofi, Pfizer, and Lilly; honoraria from MSD. A.R.F.: grants or contracts from AstraZeneca; investigator for Merck Sharp & Dohme, and F. Hoffmann-La Roche; consulting fees from AstraZeneca and Radiomics; and payment or honoraria for lectures, presentations, speaker bureaus, manuscript writing, or educational events from AstraZeneca, F. Hoffmann-La Roche, Takeda, Merck Sharp & Dohme. Travel expenses from AstraZeneca. L.T.: consulting/advisory boards/speaker bureau fees from Roche, AstraZeneca, Sanofi, Beigene, Daiichi Sankyo, Takeda, Pfizer, Regeneron, MSD, Bristol Myers Squibb, Amgen, Johnson & Johnson, Novartis. Principal investigator of trials sponsored by AstraZeneca, ArriVent, Lilly, Roche, Amgen, BMS, PharmaMar, iTeos, and Daiichi Sankyo. A.C.: consultancies/advisory boards: MSD, OncoC4, Roche, Regeneron, BMS, Amgen, Daiichi Sankyo, AstraZeneca, Access Infinity, Ardelis Health, Alpha Sight, Capvision, Techspert, Alira Health, and Lightning Health.
He also received speaker fees from AstraZeneca, Roche, Pierre Fabre, MSD, Sanofi/Regeneron; compensation for writing/editorial activity: BMS, MSD; travel support from Sanofi, MSD, Roche, and funding (to institution) from the International Association for the Study of Lung Cancer. A.S.: consultancies/advisory boards for Novartis, Amgen, MSD; speaker fees from AstraZeneca, Regeneron, Roche, Sanofi, Johnson & Johnson, BMS; funding from Italian Association for Cancer Research (AIRC). G.G.: advisory role for Italpharma; travel accommodation or other expenses paid or reimbursed by Roche, Eli Lilly, Amgen; honoraria by AstraZeneca, BMS, MSD. A.Pe.: cofounder and shareholder of two start-up companies, Agade srl and AllyArm srl; speaker for Novartis. C.B.: consultancies/advisory role for AstraZeneca, Novartis, Roche, Amgen, Pfizer, Johnson & Johnson, Daiichi Sankyo; travel, accommodations, or other expenses paid or reimbursed by Roche, Johnson & Johnson, BMS. M.G.: consultancies/advisory boards from BMS, Roche, Regeneron, Amgen, Johnson & Johnson, MSD; speaker fees from AstraZeneca, MSD, Pfizer; compensation for writing/editorial activity from MSD; travel support from Roche, MSD, BMS, Amgen, AstraZeneca. M.M.: advisory board for MSD; speaker fees from AstraZeneca, MSD, Pfizer; compensation for writing/editorial activity from MSD; travel support from Roche, MSD, BMS, AstraZeneca. E.G.P.: speaking fee from AZ, BMS, Regeneron; travel grant from Janssen, Roche. D.S.: honoraria from AstraZeneca, BMS, MSD, Roche, Johnson & Johnson, Sanofi, Novartis, Daiichi. Travel grants from MSD, Sanofi, BMS, Roche, AstraZeneca, and Pfizer. M.S.C.: consulting or advisory role for Pfizer, Daiichi Sankyo, Lilly, Gentili, Accord; speaker bureau for Gentili, Techdow; travel expenses from Pfizer, Sanofi, Bayer; research funding from Gilead.
G.V.: grants for advisory boards from Amgen, MSD, Novartis; speaker fees from Amgen, AstraZeneca, BMS, Merck, Pfizer, Regeneron, Roche, Takeda; travel support from AstraZeneca, MSD, Novartis, Sanofi. A.I.: advisory board/honoraria from Amgen, AstraZeneca, Merck Sharp & Dohme, Novartis, Roche. Medical writing grant from Merck Serono. Travel support from Amgen, AstraZeneca, Roche, and Sanofi. S.C.: honoraria from Roche, Lilly Oncology, Menarini Stemline, Novartis; AIOM Foundation President. C.C.: honoraria from AstraZeneca, Roche. R.F.: advisory board for MSD and BeiGene. N.L.V.: consulting or advisory role for Novartis, Pfizer, Roche, MSD, AstraZeneca, EISAI; speaker bureau for Pfizer, Roche, Gentili, Lilly, Gilead, Daiichi Sankyo, Techdow; travel expenses from Pfizer, Roche; research funding from GSK, Gilead. R.B.: personal fees from Amgen, MSD, Bristol Myers Squibb, Eisai, Roche, and AstraZeneca. G.P.: personal fees from Roche Foundation One, Bayer, Novartis, Lilly. F.d.B: patent for PCT/IB2020/055956 pending and a patent for IT201900009954 pending; honoraria from Roche, EMD Serono, NMS Nerviano Medical Science, Sanofi, MSD, Novartis, Incyte, BMS, Menarini Healthcare Research & Pharmacoepidemiology, Merck Group, Pfizer, Servier, Amgen.
M.C.G.: honoraria from MSD Oncology, AstraZeneca/MedImmune, GlaxoSmithKline, Takeda, Roche, Bristol Myers Squibb; consulting or advisory role: Bristol Myers Squibb, MSD, AstraZeneca, Novartis, Takeda, Roche, Tiziana Life Sciences, Sanofi, Celgene, Daiichi Sankyo, Inivata, Incyte, Pfizer, Seattle Genetics, Lilly, GlaxoSmithKline, Bayer, Blueprint Medicines, Janssen; speakers’ bureau from AstraZeneca, Takeda, MSD Oncology, Celgene, Incyte, Roche, Bristol Myers Squibb, Otsuka, Lilly; research funding from Bristol Myers Squibb, MSD, Roche/Genentech, AstraZeneca/MedImmune, AstraZeneca, Pfizer, GlaxoSmithKline, Novartis, Merck, Incyte, Takeda, Spectrum Pharmaceuticals, Blueprint Medicines, Lilly, AstraZeneca, Ipsen, Turning Point Therapeutics, Janssen, Exelixis, MedImmune, Array BioPharma, Sanofi; travel and accommodations expenses from Pfizer, Roche, AstraZeneca. C.P.: personal fees from Italfarmaco, AstraZeneca, BMS, and Merck Sharp and Dohme. G.L.R.: consultation, advisory boards, honoraria, or education grants: Merck Sharp and Dohme, Takeda, Amgen, Eli Lilly, BMS, F. Hoffmann-La Roche, Italfarmaco, Novartis, Sanofi, Pfizer, and AstraZeneca. Other authors declare no financial competing interests.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

About this article

Cite this article

Prelaj, A., Provenzano, L., Miskovic, V. et al. APOLLO11: a bio-data-driven model for clinical and translational research in lung cancer. npj Precis. Onc. 10, 96 (2026). https://doi.org/10.1038/s41698-026-01295-3

Received: 03 April 2025
Accepted: 17 January 2026
Published: 29 January 2026
Version of record: 03 March 2026
DOI: https://doi.org/10.1038/s41698-026-01295-3