Abstract
Although significant advances have been made in the early detection of many cancers, challenges remain in the early diagnosis of rare cancers, including Wilms Tumor, Clear Cell Sarcoma of the Kidney, Neuroblastoma, Osteosarcoma, and Acute Myeloid Leukemia, perhaps because of their relative obscurity and the scarcity of data compared with common cancers. Artificial intelligence, and deep learning in particular, has shown promising results in disease diagnosis, including in diagnosing cancers and detecting their tissue of origin. However, the ability of such models to detect rare cancers has yet to be comprehensively assessed. This motivated us to develop RareNet, which leverages transfer learning from an established deep learning model, CancerNet, to classify rare cancers. The transfer learning framework of RareNet uses DNA methylation data from biopsied rare cancers to learn their epigenetic signatures. RareNet achieved an overall accuracy (F1 score) of ~96%, outperforming other machine learning models including Random Forest, K Nearest Neighbors, Decision Tree Classifier, and Support Vector Classifier.
Introduction
Rare cancers are generally defined as cancers with an incidence of fewer than 6 cases per 100,000 people per year1. Although each rare cancer affects only a small portion of the population, rare cancers collectively account for approximately 22% of cancer diagnoses2. Notably, patients with these cancers often face worse outcomes: the five-year relative survival for rare cancers is a dismal 47%, compared to 65% for common cancers3.
One potential reason for the low survival rate of rare cancer patients is incorrect or late diagnosis. Current diagnostic measures are typically based on histopathology, with techniques that include staining with haematoxylin and eosin, immunohistochemistry, conventional karyotyping, and fluorescence in situ hybridization (FISH). However, diagnosis through these techniques is subject to interpretational error at a rate of approximately 4%4. This discrepancy is even more pronounced in rare cancers. Soft tissue sarcomas are an umbrella classification that contains over fifty different tumor subtypes, all of which are considered rare5. When initial histological diagnoses of sarcoma patients were compared with those from a panel of experts, the diagnoses differed in approximately 42% of cases6. Furthermore, because histological diagnosis requires tissue samples from a tumor, patients must undergo an invasive procedure to obtain them, and samples cannot be collected at all until the tumor has reached a noticeable size. Both factors can delay diagnosis and treatment, resulting in worse patient outcomes7.
While histopathology is still by far the most common diagnostic method, Whole Genome Sequencing (WGS) has gained traction in recent years. WGS sequences the entire DNA complement of a cell, including tumor DNA. It remains a promising way to identify and classify cancers; however, large-scale sequencing can be cost prohibitive. DNA from cancerous tissues displays methylation patterns that are distinct from those of normal tissues, and these patterns also differ among cancers. Quantification of DNA methylation through techniques such as Whole Genome Bisulfite Sequencing (WGBS) is thus a promising alternative that could be leveraged to diagnose and classify cancers8.
Indeed, several recent studies have exploited DNA methylation patterns to classify cancers. One such diagnostic tool, CancerNet9, uses a deep learning framework built on a variational autoencoder to diagnose cancer and predict the tissue of origin from DNA methylation patterns. Taking a cue from CancerNet, we propose a new deep learning model, RareNet, for rare cancer detection. Rare cancers, by definition, occur infrequently, resulting in sparse datasets that hinder the development of accurate and robust diagnostic models. To address this data scarcity, along with the limited expertise available for diagnosing such conditions, RareNet performs transfer learning on CancerNet: features learned by a robust pre-trained model are transferred to a model trained on scarce data.
By employing transfer learning, which leverages the CancerNet model pre-trained on larger and more diverse datasets, RareNet is able to capitalize on the information gained from commonly occurring cancers and apply it to the diagnosis of rare cancers. This approach allows for the extraction of relevant features and patterns from existing models designed for similar purposes, which can then be used to fine-tune a model designed for a related task. Transfer learning thus augmented the power of RareNet in its use of limited rare cancer data for rare cancer classification. RareNet compared favorably with the established machine learning models including Random Forest, K Nearest Neighbors, Decision Tree Classifier, and Support Vector Classifier. Our findings indicate that the RareNet model offers a more robust solution to diagnosing rare cancers. The application of transfer learning to rare cancer diagnostics offers a promising avenue to enhance accuracy, improve early detection, and enable timely intervention, leading to more effective treatment strategies and improved patient outcomes.
Materials and methods
Datasets
RareNet was assessed on three different rare cancer datasets. The inclusion of diverse datasets facilitated the assessment of RareNet’s generalization to yet unseen data. These datasets are described below.
The TCGA dataset
The TCGA dataset is comprised of DNA methylation data for a total of 13,325 samples, covering 33 different types of cancer along with a normal class derived from these datasets. The cancers in the TCGA dataset include Adrenocortical carcinoma (ACC), Bladder urothelial carcinoma (BLCA), Breast invasive carcinoma (BRCA), Cervical squamous cell carcinoma and endocervical adenocarcinoma (CESC), Cholangiocarcinoma (CHOL), Colon adenocarcinoma (COAD), Lymphoid neoplasm diffuse large B-cell lymphoma (DLBC), Esophageal carcinoma (ESCA), Glioblastoma multiforme (GBM), Head and Neck squamous cell carcinoma (HNSC), Kidney chromophobe (KICH), Kidney renal clear cell carcinoma (KIRC), Kidney renal papillary cell carcinoma (KIRP), Acute myeloid leukemia (LAML), Brain lower grade glioma (LGG), Liver hepatocellular carcinoma (LIHC), Lung adenocarcinoma (LUAD), Lung squamous cell carcinoma (LUSC), Mesothelioma (MESO), Ovarian serous cystadenocarcinoma (OV), Pancreatic adenocarcinoma (PAAD), Pheochromocytoma and paraganglioma (PCPG), Prostate adenocarcinoma (PRAD), Rectum adenocarcinoma (READ), Sarcoma (SARC), Skin cutaneous melanoma (SKCM), Stomach adenocarcinoma (STAD), Testicular germ cell tumors (TGCT), Thyroid carcinoma (THCA), Thymoma (THYM), Uterine corpus endometrial carcinoma (UCEC), Uterine carcinosarcoma (UCS), and Uveal melanoma (UVM). Samples for the “Normal” class were sourced from the data encompassing 33 types of cancer, providing a baseline for comparing cancerous tissues and enhancing the reliability and accuracy of genomic analyses9.
The TARGET dataset
DNA methylation data for 5 rare cancers, namely, Wilms Tumor (WT) (11 samples), Clear Cell Sarcoma of the Kidney (CCSK) (86 samples), Osteosarcoma (OST) (171 samples), Neuroblastoma (NB) (221 samples), and Acute Myeloid Leukemia (AML) (130 samples), were obtained from the Therapeutically Applicable Research to Generate Effective Treatments (TARGET) database. The “healthy” samples were assigned to the normal class (158 samples). Note that these cancers were represented in CancerNet’s training dataset. In total, the TARGET dataset comprises 777 DNA methylation samples covering the five rare cancers and the normal class.
The NCBI GEO dataset
The DNA methylation data from various rare cancers, including neuroblastoma (31 samples), clear cell sarcoma of the kidney (CCSK) (55 samples), and acute myeloid leukemia (AML) (73 samples), as well as normal samples (29 samples), were obtained from the Gene Expression Omnibus (GEO) database. This dataset comprises a total of 188 DNA methylation samples (accession numbers: GSE54719, GSE113501, GSE125645, GSE59157, GSE62298, GSE58477).
Each dataset was split into 80% training set, 10% validation set, and 10% test set. The source code of RareNet and associated data have been made available at the project’s GitHub site: https://github.com/DanyangShao/Rare-CancerNet.
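As a rough illustration of the 80/10/10 partition described above, the following sketch splits sample indices class by class so that each class keeps approximately the same proportions in every subset. The helper name `stratified_split`, the seed, and the stratified strategy itself are assumptions made for illustration; the text states only the proportions. The class sizes are the TARGET counts listed earlier.

```python
import random
from collections import defaultdict

def stratified_split(labels, fractions=(0.8, 0.1, 0.1), seed=0):
    """Split sample indices into train/validation/test sets, class by class.

    Hypothetical helper: stratification and the seed are illustrative choices,
    not details taken from the paper.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train, val, test = [], [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_train = int(round(fractions[0] * len(indices)))
        n_val = int(round(fractions[1] * len(indices)))
        train.extend(indices[:n_train])
        val.extend(indices[n_train:n_train + n_val])
        test.extend(indices[n_train + n_val:])  # remainder goes to the test set
    return train, val, test

# Class sizes of the TARGET dataset as listed in the text.
target_counts = {"WT": 11, "CCSK": 86, "OST": 171, "NB": 221, "AML": 130, "Normal": 158}
labels = [name for name, n in target_counts.items() for _ in range(n)]
train_idx, val_idx, test_idx = stratified_split(labels)
print(len(train_idx), len(val_idx), len(test_idx))
```

Because of per-class rounding, the subset sizes are only approximately 80%, 10%, and 10% of the total, and they need not match the per-class test counts reported in the Results.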
Variational autoencoder
We chose a variational autoencoder (VAE) for rare cancer classification because it supports both dimensionality reduction and classification. A VAE uses an encoder to compress the input data into a lower-dimensional latent space and a decoder to reconstruct, from that latent space, an output as close as possible to the input. Training in this way ensures that only the most vital information is retained in the latent space.
We employed a methodology similar to that used in CancerNet, using the CpG density clustering approach to preprocess the methylation data before they were input into the model. CpGs not associated with CpG islands were excluded, and the remaining data were scanned for Illumina 450K probes located within 100 bp of each other. These probes were then concatenated into clusters, and clusters containing fewer than 3 CpGs were removed10. The CpG (beta) values within each of the resulting 24,565 clusters were averaged. These averaged beta values served as the input features, giving 24,565 input nodes. Our VAE used these 24,565 inputs to generate a 100-dimensional latent space embedding.
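The clustering and averaging steps above can be sketched as follows for probes on a single chromosome. This is a minimal illustration, not the authors' pipeline: the function names are hypothetical, "within 100 bp of each other" is interpreted here as a gap of at most 100 bp between consecutive probes, the CpG island filter is assumed to have been applied already, and the toy positions and beta values are made up.

```python
from statistics import mean

def cluster_probes(positions, max_gap=100, min_size=3):
    """Group probe positions on one chromosome into clusters.

    Consecutive probes no more than `max_gap` bp apart join the same cluster;
    clusters with fewer than `min_size` probes are discarded.
    """
    clusters, current = [], []
    for pos in sorted(positions):
        if current and pos - current[-1] > max_gap:
            if len(current) >= min_size:
                clusters.append(current)
            current = []
        current.append(pos)
    if len(current) >= min_size:
        clusters.append(current)
    return clusters

def cluster_features(beta_by_position, clusters):
    """Average the beta values of the probes in each cluster."""
    return [mean(beta_by_position[pos] for pos in cluster) for cluster in clusters]

# Toy data: probe position (bp) -> beta value; all values are made up.
betas = {1000: 0.10, 1050: 0.20, 1120: 0.30, 5000: 0.90, 5040: 0.80, 9000: 0.50}
clusters = cluster_probes(betas.keys())
features = cluster_features(betas, clusters)
print(clusters, features)
```

In this toy case only the first three probes form a cluster of at least 3 CpGs, so a single averaged feature is produced; in the paper the same kind of reduction yields 24,565 features per sample.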
VAEs have been used in the past for detection of cancers using methylated DNA biopsied directly from tumor sites. RareNet implements transfer learning on the CancerNet model, which was previously developed to classify 33 cancer types and a normal class9. Before implementing the transfer learning procedure, we retrieved methylation data for 5 rare cancers from TCGA. To evaluate model performance, we applied a tenfold cross-validation strategy. In each of the ten rounds, the data were divided into ten folds: one fold was held out as the test dataset, while the remaining nine folds were used for model development. Of these nine folds, eight were used as the training dataset and one as the validation dataset. In each round, RareNet was trained on the training dataset, with the validation dataset used during training to adjust model parameters for optimal performance and generalizability, and was then evaluated on the test dataset. Each performance metric was reported as the average of its values over the ten test rounds.
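The fold bookkeeping of this procedure can be sketched as below. The generator name `tenfold_rounds` is hypothetical, and two details are assumptions: the validation fold is taken to be the fold following the test fold (cyclically), and the sample indices are taken to be already shuffled. The text specifies neither.

```python
def tenfold_rounds(n_samples, n_folds=10):
    """Yield (train, val, test) index lists for each cross-validation round.

    Assumption: in round r, fold r is the test fold and the next fold
    (cyclically) is the validation fold; the remaining eight folds are
    used for training.
    """
    folds = [list(range(i, n_samples, n_folds)) for i in range(n_folds)]
    for r in range(n_folds):
        v = (r + 1) % n_folds
        test = folds[r]
        val = folds[v]
        train = [i for k, fold in enumerate(folds) if k not in (r, v) for i in fold]
        yield train, val, test

rounds = list(tenfold_rounds(777))
for train, val, test in rounds:
    # A model would be fit on `train`, monitored on `val`, and scored on `test`;
    # the reported metric is the mean of the ten test scores.
    assert len(train) + len(val) + len(test) == 777
```

Each sample appears in exactly one test fold across the ten rounds, so every sample contributes once to the averaged test metrics.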
The architecture of RareNet is similar to that of CancerNet and uses the same hyperparameters. The existing weights from CancerNet were loaded into RareNet. Note that, in contrast to the classifier of CancerNet, which has 34 output nodes (33 cancer types and 1 normal class), RareNet’s classifier has 6 output nodes: 5 for rare cancers and 1 for normal. The classifier was trained with the weights of the encoder and decoder frozen. This allowed the classifier to be trained without modifying the latent space, thus performing the transfer learning (Fig. 1).
Fig. 1. RareNet’s Variational Auto-Encoder (VAE) architecture. The model’s input has 24,565 nodes and takes in DNA methylation data. The process starts similarly to that of CancerNet9. The model processes DNA methylation data through ReLU-activated layers, compressing it into a probabilistic latent space. The data is reconstructed via a sigmoid-activated decoder. From the latent space, latent variables are sampled and classification of rare cancer and normal samples is performed using a softmax layer. Initially, transfer learning from CancerNet is performed, wherein the parameters of the encoder and decoder layers, as obtained from CancerNet, are frozen and only the softmax classifier is trained.
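The core idea of training only the classifier on top of a frozen encoder can be sketched in NumPy as follows. This is a simplified illustration under several assumptions, not the RareNet implementation: the encoder is reduced to a single ReLU layer, its weights are random stand-ins rather than CancerNet's pre-trained weights, the decoder and latent sampling are omitted, the layer sizes are shrunk to toy values, and the data and labels are random.

```python
import numpy as np

rng = np.random.default_rng(0)
n_inputs, n_latent, n_classes = 50, 10, 6  # toy sizes; the paper uses 24,565 / 100 / 6

# Stand-in for pre-trained encoder weights (random here, NOT CancerNet's weights).
W_enc = rng.normal(scale=0.1, size=(n_inputs, n_latent))
b_enc = np.zeros(n_latent)
W_enc_before = W_enc.copy()

def encode(x):
    """Frozen encoder: one ReLU layer mapping inputs to latent features."""
    return np.maximum(x @ W_enc + b_enc, 0.0)

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)  # subtract row maximum for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

# Toy data: random "beta values" in [0, 1] with random labels.
X = rng.random((120, n_inputs))
y = rng.integers(0, n_classes, size=120)
Y = np.eye(n_classes)[y]  # one-hot targets

# New classifier head: the only parameters that are updated.
W_clf = np.zeros((n_latent, n_classes))
b_clf = np.zeros(n_classes)

Z = encode(X)  # latent features from the frozen encoder, computed once
learning_rate = 0.1
losses = []
for _ in range(200):
    P = softmax(Z @ W_clf + b_clf)
    losses.append(-np.mean(np.sum(Y * np.log(P + 1e-12), axis=1)))
    grad_logits = (P - Y) / len(X)  # gradient of mean cross-entropy w.r.t. logits
    W_clf -= learning_rate * (Z.T @ grad_logits)
    b_clf -= learning_rate * grad_logits.sum(axis=0)
```

Because gradients are applied only to `W_clf` and `b_clf`, the latent representation produced by the encoder stays exactly as it was loaded, which is the property the frozen-weight training step relies on.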
Comparative assessment
Comparative assessment of RareNet with other machine learning models was performed using the Scikit-Learn Python library that offers a suite of machine learning algorithms for classification. We used the same datasets as used with RareNet for training, validating, and testing the other machine learning models, namely, Random Forest, K Nearest Neighbors, Decision Tree Classifier, Support Vector Classifier, and additionally, a deep learning baseline, a Multi-Layer Perceptron (MLP) neural network. The Random Forest model and the Decision Tree Classifier model were tuned over parameters such as the number of estimators, maximum tree depth, and minimum samples per split. The K Nearest Neighbors model was optimized based on the number of neighbors and distance weighting and the Support Vector Classifier model was evaluated using both linear and RBF kernels with varying values of regularization (C) and gamma. For the MLP model, we tested different hidden layer sizes and regularization strengths while using ReLU activation and the Adam optimizer. The final performance metrics were obtained through stratified tenfold cross-validation to ensure model stability, fairness, and generalizability.
Results
RareNet was first implemented with transfer learning turned off; that is, the model was trained and validated on DNA methylation data from rare cancer and normal tissues alone and then evaluated on a held-out test set. Application to the TCGA dataset yielded an overall F1 score of only 34%, highlighting the difficulty of training such models on very limited data. We then explored whether transfer learning, that is, the transfer of learned parameter values from CancerNet to RareNet followed by fine-tuning, could improve the classification. Indeed, the overall accuracy (F1 score) of RareNet leaped to ~96%, more than a two-fold increase. The performance of RareNet can be further assessed through confusion matrices, one for cancer diagnosis (cancer versus normal, Fig. 2) and the other for cancer classification, which gives a cancer-by-cancer breakdown (with normal also included, Fig. 3). The former shows a false negative rate of 5% and a false positive rate of 0%: 5% of cancer samples were predicted as normal, while none of the normal samples were predicted as cancer by RareNet (Fig. 2). Application of RareNet to the NCBI GEO dataset yielded an F1 score of 85.5%, further supporting the model’s generalizability.
Fig. 2. Performance of RareNet in cancer diagnosis, illustrated through a confusion matrix that shows the ability of RareNet to distinguish between cancer (positive) and normal (negative) samples. The matrix shows a false negative rate of 5%, meaning that 5% of cancer samples were incorrectly predicted as normal. The false positive rate is 0%; that is, none of the normal samples were misclassified as cancer by RareNet.
Fig. 3. Confusion matrix showing the classification performance of RareNet across the rare cancer types. The actual classes include Wilms Tumor (WT) with 2 samples (both correctly classified), Clear Cell Sarcoma of the Kidney (CCSK) with 9 samples (8 correctly classified and 1 misclassified), Osteosarcoma (OST) with 17 samples (15 correctly classified and 2 misclassified), Neuroblastoma (NB) with 24 samples (all correctly classified), Acute Myeloid Leukemia (AML) with 11 samples (all correctly classified), and Healthy (H) with 12 samples (all correctly classified).
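The per-class counts in the caption above are enough to recompute per-class recall and the overall fraction of correctly classified test samples. The short sketch below does only that arithmetic; it cannot recover precision, because the caption does not say which classes the misclassified samples were assigned to.

```python
# (total test samples, correctly classified) per class, as reported in the caption.
counts = {
    "WT": (2, 2), "CCSK": (9, 8), "OST": (17, 15),
    "NB": (24, 24), "AML": (11, 11), "H": (12, 12),
}
recall = {name: correct / total for name, (total, correct) in counts.items()}
n_total = sum(total for total, _ in counts.values())
n_correct = sum(correct for _, correct in counts.values())
accuracy = n_correct / n_total
print(f"{n_correct}/{n_total} correct, accuracy = {accuracy:.3f}")
# prints "72/75 correct, accuracy = 0.960"
```

The result, 72 of 75 samples, agrees with the ~96% figure quoted in the text; the lowest per-class recalls are for OST (15/17) and CCSK (8/9).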
Additionally, a model was trained on a dataset representing DNA methylation patterns of 33 common cancers and 5 rare cancers, without transfer learning. This model achieved an F1 score of 58.4% for the rare cancers, substantially lower than the performance obtained with transfer learning (~96%). This further demonstrates the effectiveness of transfer learning in improving the classification accuracy for rare cancers.
RareNet demonstrated superior performance compared to several widely used machine learning models (Random Forest, K Nearest Neighbors, Decision Tree Classifier, and Support Vector Classifier) as well as a deep learning baseline model, Multi-Layer Perceptron (MLP), across multiple datasets. All baseline models were tuned using grid search with fivefold cross-validation to ensure fair comparisons. For instance, as illustrated in Figs. 4 and 5, Random Forest achieved F1 scores of 91.3% (Precision: 91.4%, Recall: 91.3%) on the TARGET test dataset and 61.8% (Precision: 73.6%, Recall: 67.2%) on the GEO test dataset. K Nearest Neighbors attained F1 scores of 82.7% (Precision: 85.8%, Recall: 83.1%) on the TARGET test dataset and 61.5% (Precision: 63.4%, Recall: 62.4%) on the GEO test dataset, whereas Decision Tree Classifier achieved F1 scores of 83% (Precision: 83.5%, Recall: 82.8%) on the TARGET test dataset and 73.7% (Precision: 79.7%, Recall: 76.6%) on the GEO test dataset. Support Vector Classifier produced F1 scores of 89.5% (Precision: 92.5%, Recall: 89.6%) on the TARGET test dataset and 71.8% (Precision: 75.0%, Recall: 73.4%) on the GEO test dataset. The MLP model, trained with optimized hidden layer configurations, yielded F1 scores of 88.3% (Precision: 92%, Recall: 88.3%) on TARGET and 77.6% (Precision: 81.5%, Recall: 75.9%) on GEO. In contrast, RareNet consistently outperformed these models, achieving F1 scores of 96% (Precision: 96.1%, Recall: 96%) and 85.5% (Precision: 86.1%, Recall: 84.9%) on the TARGET and GEO test datasets, respectively. This substantially superior performance highlights RareNet’s efficacy in cancer classification tasks (Table 1). To further evaluate RareNet’s classification performance, we plotted the receiver operating characteristic (ROC) curve for each rare cancer type. As shown in Fig. 6, RareNet achieved high AUC values across all rare cancer types.
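The text does not state how precision, recall, and F1 were averaged across classes. As one possibility, the sketch below computes macro-averaged scores from a confusion matrix; macro averaging is an assumption here, and the 3-class matrix is hypothetical rather than taken from the paper. Under macro averaging, the F1 score is the mean of the per-class F1 values and generally differs from the harmonic mean of the averaged precision and recall, which is one way a reported F1 can sit outside the range one might expect from the precision and recall shown next to it.

```python
def macro_scores(cm):
    """Macro-averaged precision, recall and F1 from a confusion matrix.

    cm[i][j] = number of samples of true class i predicted as class j.
    """
    n = len(cm)
    precisions, recalls, f1s = [], [], []
    for k in range(n):
        tp = cm[k][k]
        predicted = sum(cm[i][k] for i in range(n))  # column sum
        actual = sum(cm[k])                          # row sum
        p = tp / predicted if predicted else 0.0
        r = tp / actual if actual else 0.0
        f1 = 2 * p * r / (p + r) if (p + r) else 0.0
        precisions.append(p)
        recalls.append(r)
        f1s.append(f1)
    return sum(precisions) / n, sum(recalls) / n, sum(f1s) / n

# Hypothetical 3-class confusion matrix (not taken from the paper).
cm = [[8, 2, 0],
      [1, 9, 0],
      [0, 5, 5]]
precision, recall, f1 = macro_scores(cm)
harmonic = 2 * precision * recall / (precision + recall)
print(f"P={precision:.3f} R={recall:.3f} macro-F1={f1:.3f} harmonic={harmonic:.3f}")
```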
Fig. 4. Comparison of Precision, Recall, and F1 scores for various machine learning models assessed on rare cancer datasets. The light blue bars represent the Precision values, the green bars represent the Recall values, and the salmon bars represent the F1 scores. This comparison demonstrates the strengths of each model, with RareNet exhibiting the highest scores across all metrics, indicating its effectiveness in accurately identifying rare cancer cases.
Fig. 5. Comparison of Precision, Recall, and F1 scores for various machine learning models evaluated on the GEO dataset. The light blue bars represent the Precision values, the green bars represent the Recall values, and the salmon bars represent the F1 scores. This comparison highlights RareNet’s superior performance across all metrics, demonstrating its effectiveness in identifying rare cancer cases compared to other models.
Fig. 6. Receiver Operating Characteristic (ROC) curves for RareNet across all rare cancer types and the healthy class. The area under the curve (AUC) values demonstrate the model’s ability to distinguish each rare cancer type from the rest, with higher values indicating stronger classification performance. The legend includes six classes: Healthy, Wilms Tumor (WT), Clear Cell Sarcoma of the Kidney (CCSK), Neuroblastoma (NB), Acute Myeloid Leukemia (AML), and Osteosarcoma (OST).
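For readers unfamiliar with one-vs-rest AUC, the following sketch computes it directly from its ranking interpretation: the probability that a randomly chosen sample of the class of interest receives a higher score than a randomly chosen sample of any other class. The function name and the toy scores are illustrative assumptions; the paper does not describe how its AUC values were computed.

```python
def auc_one_vs_rest(scores, labels, positive):
    """AUC for `positive` versus all other classes via pairwise ranking.

    Equals the probability that a random positive sample scores higher than
    a random negative one (ties count one half).
    """
    pos = [s for s, y in zip(scores, labels) if y == positive]
    neg = [s for s, y in zip(scores, labels) if y != positive]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy predicted probabilities for the class "NB" (made-up numbers).
labels = ["NB", "NB", "NB", "OST", "H", "AML"]
nb_scores = [0.9, 0.8, 0.4, 0.5, 0.1, 0.2]
nb_auc = auc_one_vs_rest(nb_scores, labels, "NB")
print(round(nb_auc, 3))
```

This pairwise form is quadratic in the number of samples, which is acceptable for test sets of the size used here; a rank-based computation would be preferable for large datasets.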
Following dimensionality reduction of the latent space with t-SNE (from 100 to 2 dimensions), the rare cancer and normal samples were visualized in the reduced latent space (Fig. 7). Samples from each rare cancer lie in a distinct cluster of their own, and healthy samples also form a distinct cluster. This shows that the model effectively differentiates rare cancer samples from normal samples, and rare cancer samples from one another by cancer type, based on their latent representations.
Fig. 7. Visualization of test samples in the latent space. t-SNE was employed to reduce the latent space dimension from 100 to 2. Samples from Wilms Tumor (WT), Clear Cell Sarcoma of the Kidney (CCSK), Osteosarcoma (OST), Neuroblastoma (NB), Acute Myeloid Leukemia (AML), and Healthy (H) segregate into distinct clusters, differentiated based on their DNA methylation patterns by the model.
Discussion
Our results demonstrate the usefulness of transfer learning for rare cancer diagnosis. RareNet, designed specifically to detect and classify rare cancers, achieved high accuracy even with a small amount of rare cancer data to train on. RareNet achieved an overall accuracy of 96.1% across all rare cancers, whereas, in a past study, pathologists identified the correct tissue for a cancer as their first choice only 49% of the time. Transfer learning on CancerNet allowed our model to identify rare cancers with as few as 11 registered samples. Benchmarking analyses established the superior performance of our transfer learning model vis-à-vis other machine learning models (specifically, Random Forest, K Nearest Neighbors, Decision Tree Classifier, Support Vector Classifier, and Multi-Layer Perceptron), suggesting that it should be the model of choice in similar scenarios.
We envisage the use of transfer-learning-powered neural network models to detect cancers based on circulating tumor DNA (ctDNA) in the bloodstream11. In the past, experiments have been conducted to detect ctDNA and predict cancer locations, but the lack of preferred samples, such as blood, urine, sputum, and saliva containing cfDNA, ctDNA, and methylated DNA fragments released by tumor cells, hinders the development of an accurate test11,12. We posit that by leveraging transfer learning from established cancer models, these bottlenecks could be overcome, leading to a reliable test system based on liquid biopsy.
To address the potential influence of batch effects or platform differences across datasets, we employed a consistent preprocessing and normalization pipeline for all samples. Although we did not apply explicit batch correction techniques such as ComBat, RareNet maintained robust performance on the independent GEO dataset, which was collected using a different experimental platform, achieving an F1 score of 85.5%. This result suggests that RareNet is resilient to potential non-biological variations and supports its applicability across heterogeneous data sources. Prior studies have emphasized the importance of harmonization across cohorts, and our findings are consistent with such conclusions13,14,15,16. To further enhance the interpretability of, and biological insights from, models such as RareNet, we are actively integrating SHAP (SHapley Additive Explanations) analyses to quantify feature contributions. This will help us identify key methylation sites and class-specific patterns, potentially linked to rare cancer mechanisms, which we intend to deliver to the cancer research community in the near future. We consider this work an important foundation toward building interpretable, clinically relevant rare cancer diagnostic models.
Data availability
The software and associated datasets are available at https://github.com/DanyangShao/Rare-CancerNet. All other datasets are provided with the article. We also declare that no biomolecular data (e.g. proteomics data and protein sequences, DNA and RNA sequences, genetic polymorphisms, linked genotype and phenotype data, macromolecular structure, gene expression data, crystallographic data for small molecules) were generated from this work. The GEO datasets used in this study were obtained from NCBI GEO repository (https://www.ncbi.nlm.nih.gov/geo/), with accession numbers GSE54719, GSE113501, GSE125645, GSE59157, GSE62298, GSE58477.
References
1. Gatta, G. et al. Burden and centralised treatment in Europe of rare tumours: results of RARECAREnet: a population-based study. Lancet Oncol. 18, 1022–1039 (2017).
2. Pillai, R. & Jayasree, K. Rare cancers: challenges & issues. Indian J. Med. Res. 145, 17 (2017).
3. Gatta, G. et al. Rare cancers are not so rare: the rare cancer burden in Europe. Eur. J. Cancer 47, 2493–2511 (2011).
4. Crabtree, M., Cai, J. & Qing, X. Conventional karyotyping and fluorescence in situ hybridization for detection of chromosomal abnormalities in multiple myeloma. J. Hematol. 11, 87–91 (2022).
5. Fletcher, C. D. M. The evolving classification of soft tissue tumours: an update based on the new WHO classification. Histopathology 48, 3–12 (2006).
6. Ray-Coquard, I. et al. Sarcoma: concordance between initial diagnosis and centralized expert review in a population-based study within three European regions. Ann. Oncol. 23, 2442–2449 (2012).
7. Lawrence, R., Watters, M., Davies, C. R., Pantel, K. & Lu, Y.-J. Circulating tumour cells for early detection of clinically relevant cancer. Nat. Rev. Clin. Oncol. 20, 487–500 (2023).
8. Lee, S.-T. & Wiemels, J. L. Genome-wide CpG island methylation and intergenic demethylation propensities vary among different tumor sites. Nucleic Acids Res. 44, 1105–1117 (2016).
9. Gore, S. & Azad, R. K. CancerNet: a unified deep learning network for pan-cancer diagnostics. BMC Bioinform. 23, 229 (2022).
10. Kang, S. et al. CancerLocator: non-invasive cancer diagnosis and tissue-of-origin prediction using methylation profiles of cell-free DNA. Genome Biol. 18, 53 (2017).
11. Ma, M. et al. “Liquid biopsy”: ctDNA detection with great potential and challenges. Ann. Transl. Med. 3, 235 (2015).
12. Chen, M. & Zhao, H. Next-generation sequencing in liquid biopsy: cancer screening and early detection. Hum. Genom. 13, 34 (2019).
13. Wang, Y., Li, J. & Zhang, H. Addressing batch effects in methylation data integration using adversarial training. bioRxiv (2024).
14. Takasawa, K., Shimo, A., Miyagawa, H. & Yamashita, K. Cross-cohort harmonization of DNA methylation datasets for robust cancer subtype detection. Exp. Mol. Med. (2024).
15. Akulenko, R., Merl, M. & Helms, V. Detection of batch effects in DNA methylation data and the impact on sample classification. PLoS ONE 11, e0159921 (2016).
16. Sun, W., Reich, B. J. & Perou, C. M. ComBat-adjusted gene expression data improve prediction of cancer outcome. BMC Med. Genom. 4, 84 (2011).
Funding
The authors did not receive funds for publication from any funding organization for the submitted work.
Author information
Contributions
D.S., S.G., and R.K.A. conceived and designed the experiments. D.S., S.A., J.C., A.J., and L.D. performed the experiments. D.S., S.A., J.C., and R.K.A. analyzed the data. D.S., A.J., L.D., and R.K.A. wrote the paper. R.K.A. edited and finalized the paper for submission. D.S. and R.K.A. completed the revision following the first round of peer-review. All authors read and approved the final manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Ethical approval and consent to participate
Please note that we have used only publicly available data. Our work presented here does not involve any human subjects and this is not considered human subjects research. The relevant guidelines and regulations (Helsinki declarations/national/institutional guidelines) are therefore not applicable here.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Shao, D., Addagudi, S., Cowles, J. et al. RareNet: a deep learning model for rare cancer diagnosis. Sci Rep 15, 22732 (2025). https://doi.org/10.1038/s41598-025-08829-y
DOI: https://doi.org/10.1038/s41598-025-08829-y