Abstract
Combinations of cancer drugs have the potential to overcome resistance, improve the response rate of existing drugs and reduce dose-limiting toxicity associated with single agents. Existing drug combination databases provide only response data, such as synergy scores between two drugs, without the contextual information that oncologists need to match their patients with these combinations in an evidence-based way. To address this gap, we developed a cancer drug combination database, OncoDrug+, by manually collecting and integrating drug combinations and the corresponding evidence from FDA databases, clinical guidelines, clinical trials, clinical case reports, patient-derived tumor xenograft (PDX) models, cell line models and bioinformatics predictions. OncoDrug+ includes 7895 data entries covering 77 cancer types, comprising 2201 unique drug combination therapies and involving 1200 biomarkers, 763 published reports and seven types of evidence. Unlike many previous databases, which include only treatment regimens and drug response data, OncoDrug+ provides detailed genetic evidence, pharmacological target information and evidence scores supporting each combination strategy, making evidence-based experimental or clinical application of cancer drug combinations possible.
Background & Summary
With the advancement of sequencing technologies, an increasing number of patients are undergoing testing for the mutation status of cancer driver genes. These driver mutations play a crucial role in processes such as cancer proliferation, invasion and metastasis, and their inhibition often leads to the death of tumor cells1. Leveraging these insights, pharmaceutical companies have developed a series of small-molecule and antibody drugs that selectively inhibit the aberrantly activated proteins encoded by driver oncogenes. The paradigm of “one genetic abnormality — one drug” is considered a novel approach to drug development2. For instance, patients with melanoma harboring the BRAF V600E mutation typically receive treatment with the BRAF inhibitor vemurafenib, demonstrating a favorable response3. This genomic-based approach to cancer treatment is commonly referred to as “precision medicine”. However, therapies based on biomarkers and predetermined single drugs have limitations in patient matching rates, often ranging only from 5–10%4. Additionally, these drug treatments initially exhibit good efficacy, but due to factors such as compensatory signaling activation and tumor heterogeneity, drug resistance often occurs5.
Combination therapies using anticancer drugs have the potential to overcome these limitations observed in single drug treatment6. For example, dual inhibition of BRAF and MEK kinase can alleviate acquired BRAF inhibitor resistance and extend the duration of response, while also reducing skin toxicity7. Currently, large-scale clinical trials related to precision medicine initiatives are ongoing, such as the molecular analysis for therapy choice trial and Investigation of Profile-Related Evidence Determining Individualized Cancer Therapy, investigating the association between drug combination responses and somatic mutations4,8. Simultaneously, the rapidly accumulating data from cancer cell line chemical screens can provide information on the correlation between drug combination sensitivity and cancer cell line mutation characteristics. For example, a study evaluated the efficacy of 2025 clinically relevant drug combinations, generating a dataset encompassing 125 molecularly characterized breast, colorectal and pancreatic cancer cell lines9. Furthermore, relying on the accumulation of data from clinical trials and chemical screens, bioinformatics tools have been proven capable of predicting potential associations between combination therapies and biomarkers, significantly expanding the knowledge base of precision medicine. For instance, the recently developed REcurrent Features LEveraged for Combination Therapy (REFLECT) method utilizes multi-omics data to map features that repeatedly and concurrently change in patient cohorts to combination therapy, accurately predicting the synergistic effects and survival outcomes of drug combinations10.
With the emergence of such studies, several databases now collect and integrate drug combination data. Examples include DCDB11, VICC12, DrugCombDB13 and CDCDB14. However, most of these databases provide only response data, such as synergy scores between drug pairs, without the biomarker and cancer type information that oncologists need to match patients with drug combinations in an evidence-based manner. Therefore, there is an urgent need for a database that links biomarker and cancer type information to the use of drug combinations.
To this end, we have established OncoDrug+, a cancer drug combination therapy database that offers the following advantages: (1) systematic integration of drug combination response data with biomarkers and cancer types; (2) rapid retrieval of drug combination schemes related to specific drugs, biomarkers and cancer types through a highly interactive web panel; (3) prioritization of each data entry based on the genetic and clinical evidence supporting the combination (i.e. FDA approval status of drugs, type of evidence, the reliability and resolution of biomarkers and the outcomes in clinical trials or experimental tests), enabling clinicians to rank treatment options conveniently. We believe this novel platform will greatly assist clinicians in providing scientifically sound drug treatment schemes for patients. OncoDrug+ is free and open to all users without login or registration at http://www.mulinlab.org/oncodrug.
Methods
The data sources and entry headers of OncoDrug+ are displayed in the abstract graph (Fig. 1a,b). This section provides a detailed explanation of the data collection and integration processes employed in building OncoDrug+.
Overview of the data and functions of OncoDrug+. (a) The eight data sources of OncoDrug+ (top panel), along with the number of data entries (number in parentheses) and types of important information (icons with different shapes). There are six types of information: evidence level, drug combination, cancer type, response, action mutation and drug development stage. OncoDrug+ categorizes each data entry into different levels of evidence strength based on its provenance and methodology (bottom left panel). Level A indicates drug combinations sourced from professional guidelines or FDA-approved therapies. Level B indicates drug combinations sourced from clinical trials, well-powered studies and individual case reports from clinical journals or hospital cases. Level C indicates drug combinations sourced from cell lines and PDX experiments. Level D indicates drug combinations sourced from computational prediction. OncoDrug+ allows users to input patient mutation information for querying and displays feasible drug combination regimens ordered by prioritization score from high to low (bottom right panel). TMUSH is the abbreviation of The Second Hospital of Tianjin Medical University. (b) OncoDrug+ provides five primary functionalities: search, analysis, visualization, statistics and download. (c) The number of data entries from each data source.
Data sources
We systematically compiled drug combination information from three kinds of sources: six public online drug databases or cancer treatment knowledge bases, 757 biomedical publications, and electronic medical records from 233 cancer patients who were treated with combinatorial therapy at The Second Hospital of Tianjin Medical University (TMUSH) (Table 1). We required that drug combinations included in OncoDrug+ be annotated with cancer type and drug response information. Furthermore, the treatment regimen must incorporate at least one targeted anti-cancer drug. For entries with identical drug combination therapy, biomarkers and cancer types, only those with the highest evidence strength level (see the section Definition of the evidence level for more details) were retained. The details are described below (Fig. 1c):
1) Collecting data from drug databases or cancer treatment knowledge bases: The sources were as follows. (i) VICC aggregates, harmonizes and analyzes clinical interpretations of cancer variants12. We downloaded raw JSON files from its website (https://search.cancervariants.org/) and defined entries labeled as “responsive” or “sensitivity” in the “response” column as sensitive. From VICC, we collected 298 drug combination data entries. (ii) DCDB amasses data on approved or investigational drug combinations for various diseases11. We downloaded raw data from the DCDB database and defined drug combinations labeled as “Efficacious” as sensitive. From DCDB, we collected 177 drug combination data entries. (iii) DrugCombDB collects and integrates information from high-throughput drug screening experiments on cell lines13. We downloaded raw data from the DrugCombDB database and evaluated drug combinations based on four synergy scores (HSA, Bliss, Loewe and ZIP), as defined in the literature. For each model, drug combinations with synergy scores above the upper quartile of that model’s score distribution were considered synergistic. The four calls were then combined by a consensus voting strategy: a combination was designated as synergistic only when all four scoring models indicated synergy. From DrugCombDB, we collected 6 drug combination data entries. (iv) The NCI (National Cancer Institute) and National Comprehensive Cancer Network (NCCN) Clinical Practice Guidelines in Oncology include FDA-approved treatment regimens. For the NCI/NCCN guidelines, we manually searched for drug combination regimens on the corresponding websites (https://www.cancer.gov/about-cancer/treatment/drugs/cancer-type and https://www.nccn.org/guidelines/category_1). Entries containing terms such as “response” or “improve survival” were classified as sensitive. From the NCI/NCCN guidelines, we collected 60 drug combination data entries.
(v) We searched the ClinicalTrials.gov website (https://clinicaltrials.gov/) using the following parameters: condition or disease set to “cancer,” clinical phase set to “early phase 1” and “phase 1–4,” and primary completion date set to “from 2020/1/1 to 2024/6/30,” resulting in 11761 records. Based on the inclusion criteria, 349 records were retained. Studies whose “Detailed Description” included terms such as “better outcome,” “clinical benefit,” “partial or complete response,” “stable disease,” “sensitive” or “improved survival” were classified as sensitive. From the ClinicalTrials.gov database, we collected 349 drug combination data entries. (vi) Predictions from the bioinformatics algorithm REFLECT were incorporated into OncoDrug+. REFLECT was the only source of bioinformatics prediction data that we integrated, and all of its data entries were classified as sensitive. The rationale for this decision is as follows. First, the REFLECT method identifies precision drug combination therapies based on multi-omic co-alteration signatures, including mutations. In contrast, other predictive approaches, such as OncoTreat and CellNOpt, primarily rely on mRNA expression data and signaling pathway databases15,16. Compared to these methods, REFLECT aligns with the design principles of OncoDrug+, as both curate drug combination strategies based on specific mutation-driven molecular features, and drug combinations predicted by REFLECT are therefore more interpretable. Second, each REFLECT signature defines a patient sub-cohort, with members distinguished from other sub-cohorts based on recurrent co-alterations. Finally, the REFLECT pipeline has been validated with data from patient-derived xenografts, in vitro drug screens and clinical trials involving combination therapies, supporting the reliability of its predictions.
As REFLECT does not provide direct drug combination data, we employed the following strategy to integrate relevant drug information. For each gene identified in the REFLECT signatures, we queried the DGIdb database to retrieve all potential drugs and selected the FDA-approved agent with the highest interaction score as the most reliable candidate17. In DGIdb, the interaction score reflects the strength of evidence supporting a gene–drug interaction. It incorporates factors such as the number of supporting publications and data sources, the degree of overlap between the drug and gene with other interacting partners in the query set, and the extent of known interactions across the database. This means that genes and drugs with many overlapping interactions in the search set rank more highly, with the caveat that drugs or genes involved in many interactions in general receive lower scores. Accordingly, a higher interaction score indicates greater confidence in the gene–drug relationship. Using this approach, we assigned a single high-confidence drug to each gene, where applicable. Based on these gene–drug mappings, we annotated the REFLECT entries with drug combination information. Data entries for which no drug could be matched to the target gene were excluded from further analysis. The REFLECT data were extracted from the REFLECT website (https://bioinformatics.mdanderson.org/reflect/) using a bespoke Python script, yielding a total of 5066 data entries.
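The gene-to-drug assignment described above can be sketched as follows. This is a minimal illustration rather than the script used to build the database: the `Interaction` record, its field names, and the gene and drug labels in the usage note are hypothetical stand-ins for rows retrieved from DGIdb.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Interaction:
    """One gene-drug interaction row (hypothetical layout, not the DGIdb schema)."""
    gene: str
    drug: str
    fda_approved: bool
    interaction_score: float


def pick_drug(gene: str, interactions: list[Interaction]) -> Optional[str]:
    """Return the FDA-approved drug with the highest interaction score for a gene."""
    candidates = [i for i in interactions if i.gene == gene and i.fda_approved]
    if not candidates:
        return None  # no approved drug could be matched to this gene
    return max(candidates, key=lambda i: i.interaction_score).drug


def annotate_signature(genes: list[str],
                       interactions: list[Interaction]) -> Optional[list[str]]:
    """Map every gene of a signature to one drug.

    Returns None when any gene has no match, mirroring the exclusion of
    entries for which no drug could be assigned.
    """
    drugs = [pick_drug(g, interactions) for g in genes]
    if any(d is None for d in drugs):
        return None
    return drugs
```

For a signature such as `["GENE_A", "GENE_B"]` (placeholder names), `annotate_signature` returns one drug per gene, or `None` if any gene lacks an FDA-approved interacting drug.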
2) Collecting data from biomedical literature: We compiled synergistic drug combinations through a literature survey. As a result, we obtained 1577 drug combination data entries from six large-scale cell line-based drug screening experiments and 30 data entries from PDX-based drug screening experiments18,19,20,21,22. The ALMANAC, AZ-DREAM, and O’Neil et al. datasets provide high-throughput platforms for the unbiased identification of synergistic drug combinations18,23,24. As the study by Menden et al. (2019) quantified drug combination synergy and predicted associated biomarkers, we extracted these datasets directly from their publication24. Drug combinations with synergy scores above the upper quartile in each dataset were defined as exhibiting synergistic effects. Based on this criterion and after data pre-processing, we identified 312 unique drug combination entries from a total of 2,032 records annotated with both synergy scores and synergy biomarkers. Another 13 drug combination data entries were collected from 13 reports published between 2010 and 202425,26,27,28,29,30,31,32,33,34,35,36,37. For these entries, synergistic effects were defined based on the descriptions of drug combination outcomes reported in the original literature. The corresponding textual descriptions are provided in the “Drug combination response in sources” column of the figshare table. The list of these articles can also be found in Table S1 (see Supplementary Information document).
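The upper-quartile criterion used for the screening datasets, combined with the requirement that all four DrugCombDB scoring models agree, can be sketched as below. This is an assumption-laden illustration: the input layout (`{combination: {model: score}}`) is hypothetical, and the source does not state which quartile estimator was used, so the "inclusive" method of `statistics.quantiles` is chosen here only for concreteness.

```python
from statistics import quantiles

MODELS = ("HSA", "Bliss", "Loewe", "ZIP")


def upper_quartile(values):
    """Third quartile (Q3) of a score distribution (estimator is an assumption)."""
    return quantiles(values, n=4, method="inclusive")[2]


def call_synergy(scores):
    """Flag combinations whose score exceeds Q3 under every model.

    scores: {combination: {model: score}} (hypothetical layout).
    Returns {combination: bool}.
    """
    thresholds = {
        m: upper_quartile([s[m] for s in scores.values()]) for m in MODELS
    }
    return {
        combo: all(s[m] > thresholds[m] for m in MODELS)
        for combo, s in scores.items()
    }
```

A combination that clears the threshold in three models but not the fourth is not flagged, which is what distinguishes this consensus rule from a simple majority vote.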
3) Collecting data from electronic medical records: We collected electronic medical record data because they serve as a valuable resource for drug combination information. Case-based studies are valuable because these real-world data enhance the mapping of mutational features to drug responses, contributing to the advancement of precision medicine in cancer research. Such insights may further optimize treatment strategies for heterogeneous cancers through tailored drug combinations. Representative examples include the I-PREDICT study and the study by Gainor et al.4,38. We retrospectively collected drug response data for combination drug therapies from the electronic medical records of 233 patients treated with targeted therapy at the TMUSH between 2018 and 2023. These patients underwent targeted sequencing or exome sequencing, and possible driver mutations in their tumor cells were identified. The molecular pathology and treatment history of the patients are listed in Table S2 (see Supplementary Information document). The most prevalent cancer types were non-small cell lung cancer (27%, N = 62), invasive breast carcinoma (18%, N = 41) and colorectal adenocarcinoma (17%, N = 39). The median number of detectable biomarkers per patient was 1 (range: 1–5 biomarkers), with the most common biomarkers being HER2 mutation (13%, N = 31), EGFR mutation (9%, N = 22) and VEGFR mutation (5%, N = 13). Patients received a median of 2 drugs in their regimen (range: 2–6 drugs). The most frequent drug combination therapies were bevacizumab + paclitaxel (4%, N = 11), bevacizumab + pemetrexed (3%, N = 8) and ruxolitinib + lenvatinib + sintilimab (3%, N = 8). The response outcomes of combination drug therapy were defined according to the Response Evaluation Criteria in Solid Tumors as recorded in patient medical records39.
Overall, 92% of patients were assessed as sensitive to treatment (defined as complete response (N = 3), partial response (N = 171) or stable disease (N = 41)), while the remaining patients were evaluated as resistant to treatment (defined as progressive disease, N = 18). Additionally, 99% of patients had metastatic lesions (N = 231).
The clinical study on drug combination was approved by hospital ethics committees of the TMUSH (KY2021K105). Participants were recruited through purposive sampling from the Department of Oncology at the TMUSH. Each recruited participant was a patient with advanced cancer who had exhausted standard treatment options or was intolerant to standard therapy. All participants provided written informed consent to receive the study treatment. Before treatment, all patients were fully informed of the study’s objectives, the principle of voluntary participation, and the data usage policy. Participants were required to be 18 years of age or older and diagnosed with advanced cancer. The patients’ ages ranged from 33 to 89 years. Since minors were not included in the study, guardian consent was not required.
All participants signed a written informed consent form, which included the following: (i) Study Background: This section primarily outlines (a) the study objective, which is a prospective observational study exploring personalized targeted therapy for patients with advanced refractory solid tumors, and (b) ethical approval from the Medical Ethics Committee of the TMUSH. (ii) Study Design and Procedures: The study design involves the collection of clinical baseline data, next-generation sequencing reports, drug therapy information, efficacy assessment data, and outpatient follow-up records for cancer patients treated in the oncology department. The study procedures include sample collection processes, sample data annotation and management, and measures for ensuring the confidentiality of personal information. (iii) Potential Risks and Benefits and (iv) Study Costs and Compensation. Participants consented to the public disclosure of anonymized data and its use in this study.
Data anonymization and protection measures were used, including: (i) Direct identifiers (e.g., patient names) were pseudonymized. Each enrolled patient was assigned a unique identifier, which was used during data processing and analysis. (ii) Indirect identifiers (e.g., patient occupation, address) were blurred during data collection and directly removed during data processing and analysis. (iii) Upon hospital admission, all patients’ clinical baseline data were recorded in the clinical patient management system of the TMUSH. Data storage was secured using encryption and access control systems to prevent unauthorized disclosure.
The current version of OncoDrug+ comprises a comprehensive collection of 7895 entries documenting various drug combinations. Among these, entries sourced from FDA approvals, guidelines, case reports, clinical trials, PDX, cell lines and bioinformatics predictions are 16, 58, 259, 701, 30, 1665 and 5066, respectively. Within this dataset, there are 2201 unique drug combination strategies, involving 1200 biomarkers and spanning across 77 different cancer types (Figure S1, see Supplementary Information document).
Drug nomenclature and classification
To standardize drug nomenclature, we aligned the names of the collected drugs with the drug accession numbers of the DrugBank database40. There are 4591, 3177, 82, 17 and 2 entries involving two, three, four, five and six drugs, respectively (Fig. 2a). We also downloaded drug background, mechanism of action, target, pharmacodynamics and indication information from DrugBank and integrated it into OncoDrug+. All drugs were categorized into targeted and non-targeted classes using the ATC classification system provided by DrugBank. Specifically, drugs with ATC codes L04 (immunosuppressants, e.g. sirolimus), L03 (immunostimulants, e.g. filgrastim), L02 (endocrine therapy, e.g. tamoxifen), L01F (monoclonal antibodies and antibody-drug conjugates, e.g. trastuzumab), L01E (protein kinase inhibitors, e.g. afatinib) and L01X (other antineoplastic agents, e.g. vismodegib) were classified as targeted drugs, whereas the remaining drugs were classified as non-targeted. Additionally, within categories L04, L03, L02 and L01X, we manually selected the drugs regarded as classic targeted drugs. Within OncoDrug+, 4766 entries exclusively involve targeted drugs, while 3129 entries encompass both targeted and non-targeted drugs (Fig. 2b).
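The ATC-based split into targeted and non-targeted drugs amounts to a prefix match on the ATC code, as in the minimal sketch below. The function names and input layout are hypothetical, and the manual curation step applied within L04, L03, L02 and L01X is not modelled. The ATC codes in the usage note are illustrative and should be checked against DrugBank.

```python
# ATC prefixes treated as "targeted" in the text above.
TARGETED_ATC_PREFIXES = ("L04", "L03", "L02", "L01F", "L01E", "L01X")


def is_targeted(atc_codes):
    """True if any ATC code of the drug falls under a targeted prefix."""
    return any(code.upper().startswith(TARGETED_ATC_PREFIXES) for code in atc_codes)


def classify_entry(drug_atc):
    """Classify a combination given {drug name: [ATC codes]} (hypothetical layout)."""
    flags = [is_targeted(codes) for codes in drug_atc.values()]
    if all(flags):
        return "targeted only"
    if any(flags):
        return "targeted + non-targeted"
    return "non-targeted only"  # such entries would not meet the inclusion criteria
```

For example, `classify_entry({"trastuzumab": ["L01FD01"], "paclitaxel": ["L01CD01"]})` places the entry in the mixed class, because only the first code matches a targeted prefix.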
Statistical information on drugs, evidence levels, cancer types and biomarkers in OncoDrug+. (a) Distribution of drug combination entries by the number of drugs included. (b) Distribution of drug combination entries containing only targeted drugs versus those containing both targeted and non-targeted drugs. (c) Distribution of mutation types in mutated genes and HGVSp. (d) Distribution of entries categorized by different levels of evidence. (e) Distribution of entries related to different cancer types. (f) Distribution of sensitive and resistant drug combination data entries.
Definition of the evidence level
Because the drug combination records come from diverse sources with disparate evidence strengths, we designed a universal standard to map each entry to a reasoned evidence strength level. (1) Level A includes drug combinations substantiated by evidence from professional guidelines or FDA-approved therapies related to a specific biomarker and disease. An example is the FDA approval of dabrafenib in combination with trametinib for treating melanoma patients with the BRAF V600E mutation41. (2) Level B consists of drug combinations supported by evidence from clinical trials, well-powered studies and individual case reports from clinical journals or hospital cases. An example is the responsiveness of non-small cell lung cancer patients with the EGFR T790M mutation to the combination of erlotinib and bevacizumab, as recorded in ClinicalTrials.gov42. (3) Level C comprises drug combinations supported by evidence from in vivo or in vitro experiments, such as mouse studies, cell lines and molecular assays. For example, the combined use of fluorouracil and trametinib significantly inhibits proliferation in breast cancer cell lines carrying BRCA2 mutations9. (4) Level D is computational prediction (i.e. the REFLECT records in this work), such as the prediction that melanoma patients with CCND1 amplification can be sensitive to ribociclib and venetoclax10. As a result, 74, 960, 1795 and 5066 entries were classified as evidence levels A, B, C and D, respectively (Fig. 2d).
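The mapping from evidence source to level, together with the rule from the Data sources section that only the highest-level entry is kept for an identical drug combination, biomarker set and cancer type, can be sketched as follows. The dictionary keys and source labels are hypothetical, and keeping exactly one record per key is a simplification of the retention rule.

```python
# Evidence source -> level, following the four definitions above.
SOURCE_TO_LEVEL = {
    "FDA approval": "A", "guideline": "A",
    "clinical trial": "B", "case report": "B",
    "PDX": "C", "cell line": "C",
    "bioinformatics prediction": "D",
}
LEVEL_RANK = {"A": 4, "B": 3, "C": 2, "D": 1}


def deduplicate(entries):
    """Keep the highest-evidence entry per (drugs, biomarkers, cancer type)."""
    best = {}
    for entry in entries:
        # Order of drugs or biomarkers within an entry should not matter.
        key = (frozenset(entry["drugs"]),
               frozenset(entry["biomarkers"]),
               entry["cancer_type"])
        level = SOURCE_TO_LEVEL[entry["source"]]
        kept = best.get(key)
        if kept is None or LEVEL_RANK[level] > LEVEL_RANK[kept["level"]]:
            best[key] = {**entry, "level": level}
    return list(best.values())
```

With this rule, a combination reported both in a cell line screen and in an FDA approval for the same biomarker and cancer type would appear once, at level A.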
Unifying mutation annotations and cancer type
Biomarkers were standardized according to the mutation format of the Cancer Genome Interpreter database, which follows the Human Genome Variation Society (HGVS) conventions for the cDNA sequence (c.), protein sequence (p.) and genomic sequence (g.) of the corresponding genes and also covers copy number alterations and translocations43. The collected entries in OncoDrug+ encompass a total of 812 genetic mutations and 1200 HGVSp variants (Fig. 2c). The five genes most frequently associated with genetic mutations in the collected data entries are CCND1, PTEN, EGFR, CDKN2B and CCND3. Cancer types related to drug combinations were mapped to the secondary level of the OncoTree cancer classification system44. By cancer type, OncoDrug+ contains 791, 653, 416, 415, 409 and 5211 data entries for invasive breast carcinoma, non-small cell lung cancer, esophagogastric adenocarcinoma, colorectal adenocarcinoma, endometrial carcinoma and other cancer types, respectively (Fig. 2e).
Calculation of prioritization score
To prioritize the quality of each drug combination, we used a ranking system based on several key criteria: reliability of the evidence (evidence level score), FDA approval status of the drugs, degree of match between biomarkers and drug combinations, precision of biomarkers and outcome of the drug combination treatment. Users can rank drug combination therapies by credibility from high to low using this prioritization score, facilitating the selection of highly credible drug combination therapies for patients.
Specifically, for the evidence level score (E_score), data entries in levels A, B, C and D were assigned scores of 4, 3, 2 and 1, respectively. For the FDA approved score (F_score), a score of 3 indicates that all drugs are FDA-approved, a score of 2 signifies that some are approved, and a score of 1 is given when none of the drugs has received FDA approval. For the biomarker matching score (M_score), a score of 3 is given when the mutation indicates a response to the entire drug combination, a score of 2 when it indicates a response to one or some of the drugs, and a score of 1 when there is no annotation of biomarkers. For the biomarker precision score (P_score), the score is 3 if all annotated biomarkers in the data entry are at the single amino acid change level, 2 if some biomarkers are at the single amino acid change level and others are at the gene level, and 1 if all annotated biomarkers are at the gene level. For the response score (R_score), the score is 1 if the drug combination response outcome is sensitive and −1 if it is resistant. OncoDrug+ encompasses 7664 entries annotated with a sensitive response and 231 entries annotated with a resistant response (Fig. 2f). The prioritization score is then calculated according to the following formula:
Prioritization score = (E_score + F_score + M_score + P_score) × R_score
where E_score is the evidence level score, F_score is the FDA approved score, M_score is the biomarker matching score, P_score is the biomarker precision score, and R_score is the response score (Fig. 3a).
The prioritization score system evaluates the feasibility of known sensitive results based on factors such as the experimental approach used to assess drug combinations, the FDA approval status of the drugs, and the accuracy of genetic feature matching. However, it does not assess the potential for combining any two or more drugs for cancer treatment. Furthermore, we have collected drug combination data with definitive evidence of drug resistance or ineffectiveness in patient response and prognosis post-treatment, and these entries are recorded with a “Response score” of −1 and as “resistant” in the “Response” column. Such drug combinations are assigned negative prioritization scores by the system. To illustrate the meaning of the prioritization score, we compared two entries for melanoma treatment, OD-A-0010 and OD-C-2783, and the factors contributing to their differing scores. OD-A-0010 has a prioritization score of 13, supported by FDA website data indicating that both Dabrafenib and Vemurafenib are FDA-approved drugs. The mutation BRAF V600E is explicitly identified and predicts responsiveness to the entire drug combination. Consequently, the evidence level score is 4, the FDA approval score is 3, the biomarker matching score is 3, the biomarker precision score is 3, and the response score is 1. In contrast, OD-C-2783 has a prioritization score of 6. This entry is based on data from a PDX model, where MK-2206 remains unapproved, and no specific mutation feature is identified. The response outcome is classified as “sensitive.” Accordingly, the evidence level score is 2, the FDA approval score is 2, the biomarker matching score is 1, the biomarker precision score is 1, and the response score is 1.
The distribution of drug prioritization scores indicates that the top three scores, ranked by the number of entries, are 8, 9 and 10, with corresponding entry counts of 3529, 2468 and 899, respectively. Furthermore, the highest prioritization scores for datasets categorized by evidence levels A, B, C and D are 13, 12, 11 and 10, respectively, with the respective numbers of data entries being 30, 143, 59 and 331 (Fig. 3b).
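The scoring scheme can be illustrated with a short sketch. Both worked examples (13 and 6) and the per-level maxima (13, 12, 11 and 10) are consistent with summing the four component scores, with Level A scored as 4, and multiplying by the response score; the code below follows that reading. The `Entry` class and its field names are hypothetical.

```python
from dataclasses import dataclass

EVIDENCE_SCORE = {"A": 4, "B": 3, "C": 2, "D": 1}


@dataclass(frozen=True)
class Entry:
    """Component scores of one data entry (hypothetical field names)."""
    evidence_level: str  # "A", "B", "C" or "D"
    f_score: int         # FDA approved score, 1-3
    m_score: int         # biomarker matching score, 1-3
    p_score: int         # biomarker precision score, 1-3
    sensitive: bool      # True: sensitive, False: resistant


def prioritization_score(entry: Entry) -> int:
    """(E_score + F_score + M_score + P_score) x R_score."""
    r_score = 1 if entry.sensitive else -1
    total = (EVIDENCE_SCORE[entry.evidence_level]
             + entry.f_score + entry.m_score + entry.p_score)
    return total * r_score


# Component scores taken from the two melanoma examples in the text.
od_a_0010 = Entry("A", f_score=3, m_score=3, p_score=3, sensitive=True)
od_c_2783 = Entry("C", f_score=2, m_score=1, p_score=1, sensitive=True)
print(prioritization_score(od_a_0010), prioritization_score(od_c_2783))
```

Because the response score multiplies the sum, a resistant entry receives the mirror-image negative score and sorts to the bottom of any ranking from high to low.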
Data Records
To keep the database up to date, we manually review articles from PubMed and Google Scholar every six months to extract new literature-supported drug combinations and the corresponding evidence. Specifically, we use the keywords ‘drug combination(s)’, ‘combination drug(s)’, ‘combinatorial drug(s)’, and ‘anti-cancer’ to search the PubMed database, focusing on publications that include at least one of these terms in their titles or abstracts. We plan to update OncoDrug+ annually, and new results will be released on figshare and the OncoDrug+ website. All drug combination entries, segmented by evidence strength level, can be downloaded from figshare at https://doi.org/10.6084/m9.figshare.27795573.v4 (ref. 45). To access the latest version of OncoDrug+, please also visit our website: http://www.mulinlab.org/oncodrug.
The following are the details and definitions of each column in the data tables: (1) Drug combination ID: a unique OncoDrug+ accession number for each entry. (2) Evidence level: OncoDrug+ classifies the data into four evidence levels based on the provenance and methodology of each data entry. (3) Prioritization score: a credibility score assigned to drug combinations. (4) Targeted drug & (5) Non-targeted drug: the names of the targeted (non-targeted) drugs in the drug combination. (6) Cancer type: cancer types related to the drug combination, corresponding to the secondary level of the OncoTree cancer classification system. (7) Biomarker: biomarkers used to predict a patient’s response to the drug combination. (8) Response: for patient datasets, drug response is defined according to the Response Evaluation Criteria in Solid Tumors. For cell line or mouse model data, drug response is determined based on the drug synergy scores reported in the literature. These synergy scores are calculated from changes in cell viability or tumor volume in cell lines or mouse models treated with single-agent targeted therapies and with combination therapies. High synergy scores are defined as positive drug response outcomes. For computational prediction models, drug combinations predicted to have synergistic effects are considered to have positive response outcomes. Based on response information, drug combinations are classified as sensitive or resistant. (9) Adverse effect: an undesired effect of the drug combination treatment, if available. (10) Drug dosage: the specific dosage or weight of medication administered. (11) Drug targets for targeted drug & (12) Drug targets for non-targeted drug: target information for the targeted (non-targeted) drugs, obtained from DrugBank.
(13) Evidence level score, (14) Biomarker matching score, (15) FDA approved score, (16) Response score and (17) Biomarker precision score: see the section of Calculation of prioritization score in Methods for more details. (18) Sources of evidence level: the sources of data for each entry, which include FDA approvals, guidelines, case reports, clinical trials, patient-derived xenografts, cell lines, and bioinformatics predictions. (19) Stage: The cancer stage reported with the drug indication, including metastatic and primary stages. (20) Drug combination indications in sources: the drug indications recorded in the raw data. (21) Drug combination response in sources: the drug response recorded in the raw data. (22) Detailed descriptions in sources: detailed descriptions about drug indications recorded in the raw data. (23) Data source: the hyperlink to the evidence source.
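As an illustration of how these columns support patient matching, the sketch below filters records by biomarker and cancer type and orders the hits by prioritization score. It is a minimal example under stated assumptions: the dictionary keys mirror the column names above but the exact headers in the downloadable tables may differ, the function name is hypothetical, and the third record is invented for the demonstration.

```python
def rank_combinations(records, biomarker, cancer_type=None):
    """Return records matching a biomarker (and optional cancer type),
    ordered by prioritization score from high to low."""
    hits = [
        r for r in records
        if biomarker.lower() in r["Biomarker"].lower()
        and (cancer_type is None
             or r["Cancer type"].lower() == cancer_type.lower())
    ]
    return sorted(hits, key=lambda r: r["Prioritization score"], reverse=True)


records = [
    # The first two rows reuse the IDs and scores quoted in the Methods.
    {"Drug combination ID": "OD-C-2783", "Cancer type": "Melanoma",
     "Biomarker": "", "Prioritization score": 6},
    {"Drug combination ID": "OD-A-0010", "Cancer type": "Melanoma",
     "Biomarker": "BRAF V600E", "Prioritization score": 13},
    # Hypothetical row, for illustration only.
    {"Drug combination ID": "OD-X-0000", "Cancer type": "Melanoma",
     "Biomarker": "BRAF V600E", "Prioritization score": 9},
]

for row in rank_combinations(records, "BRAF", "Melanoma"):
    print(row["Drug combination ID"], row["Prioritization score"])
```

Substring matching on the biomarker is a deliberate simplification: a production query would parse the gene and HGVS variant separately.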
Technical Validation
Through a systematic literature survey, we compiled data on combination drug therapy from the NCI, the NCCN Clinical Practice Guidelines in Oncology and ClinicalTrials.gov. The clinical records system at the TMUSH documented clinical baseline data, next-generation sequencing reports, drug therapy, efficacy assessment data and outpatient follow-up information for cancer patients treated in the oncology department.
To ensure the comprehensiveness of our data collection, we conducted a thorough search for databases, literature and clinical guidelines related to combination drug therapy for cancer from January 2000 to July 2024. In addition, we gathered case data for cancer patients admitted to the TMUSH from January 2018 to April 2023. We filtered the entries to include only those related to anti-cancer drug combination therapies that incorporated at least one targeted drug and contained detailed information on biomarkers and cancer types. To guarantee the accuracy and reliability of the data collection process, data were extracted from the literature, clinical guideline materials and clinical records independently by TY and NL. This independent extraction was crucial for maintaining consistency and verifying the integrity of the collected data.
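The inclusion filter and the independent double extraction described above can be sketched as follows. This is an illustration under stated assumptions rather than the curation pipeline itself: the dictionary keys (`targeted_drugs`, `biomarker`, `cancer_type`) and both function names are hypothetical.

```python
from typing import Any, Dict, List, Mapping


def passes_inclusion(record: Mapping[str, Any]) -> bool:
    """Keep a record only if it has at least one targeted drug and
    non-empty biomarker and cancer type fields (assumed key names)."""
    targeted = record.get("targeted_drugs") or []
    biomarker = str(record.get("biomarker") or "").strip()
    cancer_type = str(record.get("cancer_type") or "").strip()
    return len(targeted) >= 1 and bool(biomarker) and bool(cancer_type)


def extraction_disagreements(
    first: Dict[str, Mapping[str, Any]],
    second: Dict[str, Mapping[str, Any]],
) -> List[str]:
    """Return the record keys on which two independent extractions differ,
    including records present in only one of them."""
    keys = set(first) | set(second)
    return sorted(k for k in keys if first.get(k) != second.get(k))
```

Records flagged by `extraction_disagreements` would then be the ones to resolve by discussion between the two curators.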
To evaluate data consistency across different levels, we quantified the overlap in drug combinations, drug-cancer type combinations, and drug-cancer-biomarker combinations (Figure S2, see Supplementary Information document). As shown in Fig. S2a, level A includes 50 drug combinations, with a 50% overlap with combinations recorded at other levels. Level B contains 412 drug combinations, with a 6% overlap, while level C has 298 drug combinations, with a 3% overlap. Level D comprises 1448 drug combinations, with a 0.6% overlap. The overall overlap across all levels is 4%. In Fig. S2b, the overall overlap across levels is 5%, while in Fig. S2c, it is 0.3%. In summary, level A shows the highest overlap with other levels, indicating greater accuracy than the levels with lower evidence strength. Furthermore, the complementary nature of the data across evidence levels suggests a relatively comprehensive data collection.
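An overlap analysis of this kind can be sketched with set operations. The definitions used here are assumptions, since the exact formula is not given in this section: per-level overlap is taken as the fraction of a level's combinations that also occur at any other level, and overall overlap as the fraction of unique combinations that occur at more than one level. The function names and the order-insensitive `combo` key are hypothetical.

```python
from collections import Counter
from typing import Dict, FrozenSet, Set

Combo = FrozenSet[str]


def combo(*drugs: str) -> Combo:
    """Order-insensitive, case-insensitive key for a drug combination."""
    return frozenset(d.strip().lower() for d in drugs)


def per_level_overlap(levels: Dict[str, Set[Combo]]) -> Dict[str, float]:
    """Fraction of each level's combinations also recorded at another level."""
    result = {}
    for name, combos in levels.items():
        others = set().union(*(c for n, c in levels.items() if n != name))
        result[name] = len(combos & others) / len(combos) if combos else 0.0
    return result


def overall_overlap(levels: Dict[str, Set[Combo]]) -> float:
    """Fraction of unique combinations that appear at more than one level."""
    counts = Counter(c for combos in levels.values() for c in combos)
    if not counts:
        return 0.0
    shared = sum(1 for n in counts.values() if n > 1)
    return shared / len(counts)
```

The same functions apply to drug-cancer type or drug-cancer-biomarker overlap if the key is extended to include those fields.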
Given that our database contains multiple records on melanoma-related treatment regimens, and considering the rapid evolution of combination therapies for melanoma, we compared clinical guidelines for BRAF V600E-mutant melanoma with our scoring system to check its alignment with real-world clinical practice. OD-A-0010 was classified as level A with a prioritization score of 13, supported by data from the FDA website indicating that both Dabrafenib and Vemurafenib are FDA-approved drugs. BRAF V600E-mutant melanoma is associated with a response to this drug combination, and the recommended regimen is considered a first-line treatment for this cancer type46. OD-B-0135 was classified as level B with a prioritization score of 12, based on an individual case report. Its recommended regimen, consisting of Palbociclib and Vemurafenib, includes FDA-approved drugs. The mutation feature suggests a response to the entire drug combination, and the regimen is typically considered a second- or third-line treatment for this cancer type47. OD-C-1425 was classified as level C with a prioritization score of 10, based on cell line experiments. Its recommended regimen includes Dactolisib, which is not FDA-approved. However, the mutation feature suggests a response to the combination, with Selumetinib and Dactolisib currently under clinical investigation48. OD-D-5123 was classified as level D with a prioritization score of 9, based on bioinformatics predictions. Its recommended regimen, Dabrafenib and Plerixafor, has not been recognized as a standard treatment for any malignancy in international clinical guidelines. In addition, Dinaciclib has not received FDA approval for marketing. These examples show that higher prioritization scores tend to correlate with treatment regimens more closely aligned with established standards for BRAF V600E-mutant melanoma.
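The ordering in these four examples can be checked mechanically. The sketch below uses only the accession IDs, evidence levels and prioritization scores quoted above; the function name and the criterion (every score at a stronger level is at least as high as every score at the next weaker populated level) are assumptions made for illustration, not the validation procedure used in this study.

```python
from typing import Dict, Iterable, List, Tuple

LEVEL_ORDER = "ABCD"  # strongest evidence first

# (accession ID, evidence level, prioritization score) as quoted in the text.
EXAMPLES: List[Tuple[str, str, float]] = [
    ("OD-A-0010", "A", 13),
    ("OD-B-0135", "B", 12),
    ("OD-C-1425", "C", 10),
    ("OD-D-5123", "D", 9),
]


def scores_follow_levels(entries: Iterable[Tuple[str, str, float]]) -> bool:
    """True if scores never increase when moving to a weaker evidence level."""
    by_level: Dict[str, List[float]] = {}
    for _accession, level, score in entries:
        by_level.setdefault(level, []).append(score)
    populated = [by_level[lv] for lv in LEVEL_ORDER if lv in by_level]
    # Compare each populated level with the next weaker populated level.
    return all(min(hi) >= max(lo) for hi, lo in zip(populated, populated[1:]))


if __name__ == "__main__":
    print(scores_follow_levels(EXAMPLES))
```

This is a deliberately strict criterion. Across a full database, a rank correlation between level and score would be a softer alternative, since the text claims a tendency rather than a strict ordering.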
Code availability
All of the source code for OncoDrug+ database generation has been uploaded to GitHub (https://github.com/mulinlab/OncoDrug). No custom code was used during this study for the curation and/or validation of the dataset.
References
Martinez-Jimenez, F. et al. A compendium of mutational cancer driver genes. Nat Rev Cancer 20, 555–572, https://doi.org/10.1038/s41568-020-0290-x (2020).
Cacabelos, R. New Paradigms in Pharmaceutical Development. Life (Basel) 12, https://doi.org/10.3390/life12091433 (2022).
Kim, A. & Cohen, M. S. The discovery of vemurafenib for the treatment of BRAF-mutated metastatic melanoma. Expert Opin Drug Discov 11, 907–916, https://doi.org/10.1080/17460441.2016.1201057 (2016).
Sicklick, J. K. et al. Molecular profiling of cancer patients enables personalized combination therapy: the I-PREDICT study. Nat Med 25, 744–750, https://doi.org/10.1038/s41591-019-0407-5 (2019).
Vasan, N., Baselga, J. & Hyman, D. M. A view on drug resistance in cancer. Nature 575, 299–309, https://doi.org/10.1038/s41586-019-1730-1 (2019).
Al-Lazikani, B., Banerji, U. & Workman, P. Combinatorial drug therapy for cancer in the post-genomic era. Nat Biotechnol 30, 679–692, https://doi.org/10.1038/nbt.2284 (2012).
Eroglu, Z. & Ribas, A. Combination therapy with BRAF and MEK inhibitors for melanoma: latest evidence and place in therapy. Ther Adv Med Oncol 8, 48–56, https://doi.org/10.1177/1758834015616934 (2016).
O’Dwyer, P. J. et al. The NCI-MATCH trial: lessons for precision oncology. Nat Med 29, 1349–1357, https://doi.org/10.1038/s41591-023-02379-4 (2023).
Jaaks, P. et al. Effective drug combinations in breast, colon and pancreatic cancer cells. Nature 603, 166–173, https://doi.org/10.1038/s41586-022-04437-2 (2022).
Li, X. et al. Precision Combination Therapies Based on Recurrent Oncogenic Coalterations. Cancer Discov 12, 1542–1559, https://doi.org/10.1158/2159-8290.CD-21-0832 (2022).
Liu, Y. et al. DCDB 2.0: a major update of the drug combination database. Database (Oxford) 2014, bau124, https://doi.org/10.1093/database/bau124 (2014).
Wagner, A. H. et al. A harmonized meta-knowledgebase of clinical interpretations of somatic genomic variants in cancer. Nat Genet 52, 448–457, https://doi.org/10.1038/s41588-020-0603-8 (2020).
Liu, H. et al. DrugCombDB: a comprehensive database of drug combinations toward the discovery of combinatorial therapy. Nucleic Acids Res 48, D871–D881, https://doi.org/10.1093/nar/gkz1007 (2020).
Shtar, G., Azulay, L., Nizri, O., Rokach, L. & Shapira, B. CDCDB: A large and continuously updated drug combination database. Sci Data 9, 263, https://doi.org/10.1038/s41597-022-01360-z (2022).
Mundi, P. S. et al. A Transcriptome-Based Precision Oncology Platform for Patient-Therapy Alignment in a Diverse Set of Treatment-Resistant Malignancies. Cancer Discov 13, 1386–1407, https://doi.org/10.1158/2159-8290.CD-22-1020 (2023).
Gjerga, E. et al. Converting networks to predictive logic models from perturbation signalling data with CellNOpt. Bioinformatics 36, 4523–4524, https://doi.org/10.1093/bioinformatics/btaa561 (2020).
Cannon, M. et al. DGIdb 5.0: rebuilding the drug-gene interaction database for precision medicine and drug discovery platforms. Nucleic Acids Res 52, D1227–D1235, https://doi.org/10.1093/nar/gkad1040 (2024).
O’Neil, J. et al. An Unbiased Oncology Compound Screen to Identify Novel Combination Strategies. Mol Cancer Ther 15, 1155–1162, https://doi.org/10.1158/1535-7163.MCT-15-0843 (2016).
Gao, H. et al. High-throughput screening using patient-derived tumor xenografts to predict clinical trial drug response. Nat Med 21, 1318–1325, https://doi.org/10.1038/nm.3954 (2015).
Nair, N. U. et al. A landscape of response to drug combinations in non-small cell lung cancer. Nat Commun 14, 3830, https://doi.org/10.1038/s41467-023-39528-9 (2023).
Narayan, R. S. et al. A cancer drug atlas enables synergistic targeting of independent drug vulnerabilities. Nat Commun 11, 2935, https://doi.org/10.1038/s41467-020-16735-2 (2020).
Bashi, A. C. et al. Large-scale Pan-cancer Cell Line Screening Identifies Actionable and Effective Drug Combinations. Cancer Discov, OF1-OF20, https://doi.org/10.1158/2159-8290.CD-23-0388 (2024).
Holbeck, S. L. et al. The National Cancer Institute ALMANAC: A Comprehensive Screening Resource for the Detection of Anticancer Drug Pairs with Enhanced Therapeutic Activity. Cancer Res 77, 3564–3576, https://doi.org/10.1158/0008-5472.CAN-17-0489 (2017).
Menden, M. P. et al. Community assessment to advance computational prediction of cancer drug combinations in a pharmacogenomic screen. Nat Commun 10, 2674, https://doi.org/10.1038/s41467-019-09799-2 (2019).
Lee, E. Q. et al. NRG/RTOG 1122: A phase 2, double-blinded, placebo-controlled study of bevacizumab with and without trebananib in patients with recurrent glioblastoma or gliosarcoma. Cancer 126, 2821–2828, https://doi.org/10.1002/cncr.32811 (2020).
Moran, T. et al. A phase Ib trial of continuous once-daily oral afatinib plus sirolimus in patients with epidermal growth factor receptor mutation-positive non-small cell lung cancer and/or disease progression following prior erlotinib or gefitinib. Lung Cancer 108, 154–160, https://doi.org/10.1016/j.lungcan.2017.03.009 (2017).
Flaherty, K. T. et al. BEST: A Randomized Phase II Study of Vascular Endothelial Growth Factor, RAF Kinase, and Mammalian Target of Rapamycin Combination Targeted Therapy With Bevacizumab, Sorafenib, and Temsirolimus in Advanced Renal Cell Carcinoma–A Trial of the ECOG-ACRIN Cancer Research Group (E2804). J Clin Oncol 33, 2384–2391, https://doi.org/10.1200/JCO.2015.60.9727 (2015).
Mokdad, A. A. et al. Efficacy and Safety of Bavituximab in Combination with Sorafenib in Advanced Hepatocellular Carcinoma: A Single-Arm, Open-Label, Phase II Clinical Trial. Target Oncol 14, 541–550, https://doi.org/10.1007/s11523-019-00663-3 (2019).
Bardia, A. et al. Phase Ib Study of Combination Therapy with MEK Inhibitor Binimetinib and Phosphatidylinositol 3-Kinase Inhibitor Buparlisib in Patients with Advanced Solid Tumors with RAS/RAF Alterations. Oncologist 25, e160–e169, https://doi.org/10.1634/theoncologist.2019-0297 (2020).
Gelderblom, H. et al. Imatinib in combination with phosphoinositol kinase inhibitor buparlisib in patients with gastrointestinal stromal tumour who failed prior therapy with imatinib and sunitinib: a Phase 1b, multicentre study. Br J Cancer 122, 1158–1165, https://doi.org/10.1038/s41416-020-0769-y (2020).
Voss, M. H. et al. A randomized phase II trial of CRLX101 in combination with bevacizumab versus standard of care in patients with advanced renal cell carcinoma. Ann Oncol 28, 2754–2760, https://doi.org/10.1093/annonc/mdx493 (2017).
Grande, E. et al. Sunitinib and Evofosfamide (TH-302) in Systemic Treatment-Naive Patients with Grade 1/2 Metastatic Pancreatic Neuroendocrine Tumors: The GETNE-1408 Trial. Oncologist 26, 941–949, https://doi.org/10.1002/onco.13885 (2021).
van den Bent, M. J. et al. Bevacizumab and temozolomide in patients with first recurrence of WHO grade II and III glioma, without 1p/19q co-deletion (TAVAREC): a randomised controlled phase 2 EORTC trial. Lancet Oncol 19, 1170–1179, https://doi.org/10.1016/S1470-2045(18)30362-0 (2018).
Kelly, C. M. et al. A phase Ib study of BGJ398, a pan-FGFR kinase inhibitor in combination with imatinib in patients with advanced gastrointestinal stromal tumor. Invest New Drugs 37, 282–290, https://doi.org/10.1007/s10637-018-0648-z (2019).
Joly, F. et al. Paclitaxel with or without pazopanib for ovarian cancer relapsing during bevacizumab maintenance therapy: The GINECO randomized phase II TAPAZ study. Gynecol Oncol 166, 389–396, https://doi.org/10.1016/j.ygyno.2022.06.022 (2022).
Hussain, M. et al. A randomized phase 2 trial of gemcitabine/cisplatin with or without cetuximab in patients with advanced urothelial carcinoma. Cancer 120, 2684–2693, https://doi.org/10.1002/cncr.28767 (2014).
Soria, J. C. et al. Gefitinib plus chemotherapy versus placebo plus chemotherapy in EGFR-mutation-positive non-small-cell lung cancer after progression on first-line gefitinib (IMPRESS): a phase 3 randomised trial. Lancet Oncol 16, 990–998, https://doi.org/10.1016/S1470-2045(15)00121-7 (2015).
Gainor, J. F. et al. Dramatic Response to Combination Erlotinib and Crizotinib in a Patient with Advanced, EGFR-Mutant Lung Cancer Harboring De Novo MET Amplification. J Thorac Oncol 11, e83–85, https://doi.org/10.1016/j.jtho.2016.02.021 (2016).
Eisenhauer, E. A. et al. New response evaluation criteria in solid tumours: revised RECIST guideline (version 1.1). Eur J Cancer 45, 228–247, https://doi.org/10.1016/j.ejca.2008.10.026 (2009).
Wishart, D. S. et al. DrugBank: a knowledgebase for drugs, drug actions and drug targets. Nucleic Acids Res 36, D901–906, https://doi.org/10.1093/nar/gkm958 (2008).
Winn, R. J. & McClure, J. The NCCN clinical practice guidelines in oncology: a primer for users. J Natl Compr Canc Netw 1, 5–13, https://doi.org/10.6004/jnccn.2003.0003 (2003).
Piccirillo, M. C. et al. Addition of Bevacizumab to Erlotinib as First-Line Treatment of Patients With EGFR-Mutated Advanced Nonsquamous NSCLC: The BEVERLY Multicenter Randomized Phase 3 Trial. J Thorac Oncol 17, 1086–1097, https://doi.org/10.1016/j.jtho.2022.05.008 (2022).
Tamborero, D. et al. Cancer Genome Interpreter annotates the biological and clinical relevance of tumor alterations. Genome Med 10, 25, https://doi.org/10.1186/s13073-018-0531-8 (2018).
Kundra, R. et al. OncoTree: A Cancer Classification System for Precision Oncology. JCO Clin Cancer Inform 5, 221–230, https://doi.org/10.1200/CCI.20.00108 (2021).
Dong, X. OncoDrug+ data 2.0. https://doi.org/10.6084/m9.figshare.27795573.v4 (2024).
Robert, C. et al. Improved overall survival in melanoma with combined dabrafenib and trametinib. N Engl J Med 372, 30–39, https://doi.org/10.1056/NEJMoa1412690 (2015).
Louveau, B. et al. Phase I-II Open-Label Multicenter Study of Palbociclib + Vemurafenib in BRAF (V600MUT) Metastatic Melanoma Patients: Uncovering CHEK2 as a Major Response Mechanism. Clin Cancer Res 27, 3876–3883, https://doi.org/10.1158/1078-0432.CCR-20-4050 (2021).
El Zaoui, I. et al. Conjunctival Melanoma Targeted Therapy: MAPK and PI3K/mTOR Pathways Inhibition. Invest Ophthalmol Vis Sci 60, 2764–2772, https://doi.org/10.1167/iovs.18-26508 (2019).
Acknowledgements
We thank the described patients for participating in our studies. The work was supported by the following grants: National Natural Science Foundation of China, 32070675 (M.J.L.) and 31801122 (X.D.). Youth Research Incubation Fund of School of Basic Medical Sciences, Tianjin Medical University, 2024FY05 (X.D.).
Author information
Contributions
M.J.L. and X.D. conceived this work. T.Y., L.W., J.W., X.X. and N.L. conducted data retrieval, processing and quality control. T.Y. performed the analyses and developed the website. X.D. designed the website. T.Y., D.X. and N.L. performed technical validation. H.W., M.J.L. and X.D. supervised the project. T.Y. and X.D. wrote the manuscript. All authors approved the final version of the manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
You, T., Wang, L., Wang, J. et al. A highly annotated drug combination resource for catalyzing precision combinatorial therapy. Sci Data 12, 1284 (2025). https://doi.org/10.1038/s41597-025-05630-4