Multimodal fusion of radio-pathology and proteogenomics identify integrated glioma subtypes with prognostic and therapeutic opportunities

Liu, Zaoqu; Wu, Yushuai; Xu, Hui; Wang, Minkai; Weng, Siyuan; Pei, Dongling; Chen, Shuang; Wang, WeiWei; Yan, Jing; Cui, Li; Duan, Jingxian; Zhao, Yuanshen; Wang, Zilong; Ma, Zeyu; Li, Ran; Duan, Wenchao; Qiu, Yuning; Su, Dingyuan; Li, Sen; Liu, Haoran; Li, Wenyuan; Ma, Caoyuan; Yu, Miaomiao; Yu, Yinhui; Chen, Te; Fu, Jing; Zhen, YingWei; Yu, Bin; Ji, Yuchen; Zheng, Hairong; Liang, Dong; Liu, Xianzhi; Yan, Dongming; Han, Xinwei; Wang, Fubing; Li, Zhi-Cheng; Zhang, Zhenyu

doi:10.1038/s41467-025-58675-9

Published: 13 April 2025

Zaoqu Liu ORCID: orcid.org/0000-0002-0452-742X^1,2,3,4^na1,
Yushuai Wu⁵^na1,
Hui Xu¹^na1,
Minkai Wang⁶,
Siyuan Weng¹,
Dongling Pei⁶,
Shuang Chen⁷,
WeiWei Wang⁸,
Jing Yan⁹,
Li Cui⁸,
Jingxian Duan ORCID: orcid.org/0000-0002-9453-2651¹⁰,
Yuanshen Zhao¹⁰,
Zilong Wang⁶,
Zeyu Ma⁶,
Ran Li¹¹,
Wenchao Duan⁶,
Yuning Qiu⁶,
Dingyuan Su⁶,
Sen Li⁶,
Haoran Liu⁶,
Wenyuan Li⁶,
Caoyuan Ma⁶,
Miaomiao Yu⁶,
Yinhui Yu⁶,
Te Chen⁶,
Jing Fu⁶,
YingWei Zhen⁶,
Bin Yu⁶,
Yuchen Ji⁶,
Hairong Zheng ORCID: orcid.org/0000-0002-8558-5102^10,12,
Dong Liang^10,12,
Xianzhi Liu⁶,
Dongming Yan⁶,
Xinwei Han ORCID: orcid.org/0000-0003-4407-4864^1,3,4,
Fubing Wang ORCID: orcid.org/0000-0002-5971-2622^13,14,15,
Zhi-Cheng Li ORCID: orcid.org/0000-0003-4140-0580^10,12 &
…
Zhenyu Zhang ORCID: orcid.org/0000-0001-5570-090X⁶

Nature Communications volume 16, Article number: 3510 (2025)

Abstract

Integrating multimodal data can uncover causal features hidden in single-modality analyses, offering a comprehensive understanding of disease complexity. This study introduces a multimodal fusion subtyping (MOFS) framework that integrates radiological, pathological, genomic, transcriptomic, and proteomic data from 122 patients with IDH-wildtype adult glioma, identifying three subtypes: MOFS1 (proneural) with favorable prognosis, elevated neurodevelopmental activity, and abundant neurocyte infiltration; MOFS2 (proliferative) with the worst prognosis, superior proliferative activity, and genome instability; MOFS3 (TME-rich) with intermediate prognosis, abundant immune and stromal components, and sensitivity to anti-PD-1 immunotherapy. STRAP emerges as a prognostic biomarker and potential therapeutic target for MOFS2, associated with its proliferative phenotype. Stromal infiltration in MOFS3 serves as a crucial prognostic indicator, allowing for further prognostic stratification. Additionally, we develop a deep neural network (DNN) classifier based on radiological features to further enhance the clinical translatability, providing a non-invasive tool for predicting MOFS subtypes. Overall, these findings highlight the potential of multimodal fusion in improving the classification, prognostic accuracy, and precision therapy of IDH-wildtype glioma, offering an avenue for personalized management.

Introduction

Diffuse gliomas in adults represent the most common primary malignant central nervous system (CNS) neoplasms¹. Isocitrate dehydrogenase (IDH) mutations are the most impactful molecular markers, dividing diffuse glioma patients into two groups with different prognoses, genetic traits, and potential treatment options^2,3. The recent World Health Organization (WHO) Classification of Tumors of the Central Nervous System classified IDH-mutant gliomas and IDH-wildtype gliomas as distinct tumor entities in adult patients⁴. Notably, IDH-wildtype glioblastoma (GBM) is the most prevalent and aggressive glioma subtype in adults, with a 5-year survival rate of less than 10%⁵. Standard treatment for GBM includes gross total resection followed by radiotherapy and temozolomide (TMZ) chemotherapy⁶. Nevertheless, GBM is marked by substantial intertumoral and intratumoral heterogeneity, with significant discrepancies at the genomic, transcriptomic, proteomic, and epigenomic levels, posing considerable challenges for homogenized management⁷. Thus, further investigation of the heterogeneity and stratification of IDH-wildtype GBMs is imperative.

Over the past decade, several studies have applied large-scale, high-throughput sequencing to investigate GBM heterogeneity, identify key oncogenic events and potential therapeutic targets, and establish molecular subtypes for patient stratification^8,9,10,11,12,13,14. A landmark study by The Cancer Genome Atlas (TCGA) in 2010 classified GBM into four transcriptional subtypes: proneural, neural, mesenchymal, and classical¹³. Subsequent research highlighted the profound influence of the tumor microenvironment (TME) on GBM subtyping¹⁵. To address this, Wang et al.¹² refined the classification system, retaining the proneural, mesenchymal, and classical subtypes while excluding the neural subtype due to contamination by normal cellular components. More recent studies have proposed further insights. Wu et al.¹⁶ introduced a classification based on pharmacological response and RNA expression in cancer cell lines, defining proneural, oxidative phosphorylation, and mesenchymal subtypes. White et al.¹⁷ proposed three TME-based subtypes (TME^low, TME^medium, and TME^high) with prognostic relevance. Moreover, single-cell RNA sequencing (scRNA-seq) has revealed multiple transcriptional states within GBMs, reflecting distinct biological processes such as hypoxia and cell cycle regulation¹⁸. Neftel et al.¹⁹ demonstrated that these transcriptional states converge into four tumor cell subpopulations, underscoring the transcriptional complexity. However, transcriptome-based subtyping remains limited in its ability to predict patient survival and treatment response¹¹.

With advances in technology, the integration of multimodal data has brought insights into tumor stratification¹⁰. Generally, radiology, pathology, DNA, RNA, and protein could reflect the anatomical, cellular, genetic, transcriptional, and functional levels of disease, respectively. The integration of multimodal profiling enables us to better understand the linkages between complex disease phenotypes and biological mechanisms. Indeed, cellular-level microscopic phenotypes that arise from gene mutations, aberrant signaling, and abnormal expression have profound impacts on crucial cellular processes, such as cell proliferation, inflammatory response, and angiogenesis²⁰. These alterations can be captured by histological images. Similarly, macroscopic phenotypes, such as tumor shape, texture, edema, and necrosis, can be visualized using advanced imaging techniques like multiparametric MRIs. Previous studies have also shown that radiological or histopathological images can reflect the mutational status or expression level of genes^21,22,23. This not only offers a bridge between radio-histopathological images and molecular omics but also serves as the theoretical basis for multimodal fusion analysis.

Here, by integrating multimodal data from MRI-derived radiomics, whole slide image (WSI)-derived pathomics, whole-exome sequencing (WES), RNA sequencing (RNA-seq), and mass spectrometry (MS)-based proteomics, we identify and validate three distinct subtypes in IDH-wildtype adult gliomas. The MOFSR package is designed for multimodal data fusion and analysis (https://github.com/Zaoqu-Liu/MOFS). Our multimodal analysis provides insight into GBM intertumoral heterogeneity in phenotypic manifestation, oncogenic signaling, immune response, and genomic alterations. Moreover, this work highlights potentially effective prognostication and therapeutic strategies for IDH-wildtype GBM patients.

Results

Identification of multimodal fusion subtypes in IDH-wildtype adult gliomas

Based on the 2021 WHO Classification of CNS tumors (Fig. S1A), this study enrolled 1,194 adult patients with IDH-wildtype glioma, whose preoperative multiparametric MRIs (T1WI, CE-T1WI, T2WI, FLAIR, and ADC) underwent image segmentation and feature extraction. Among these cases, 202 gliomas underwent RNA sequencing (RNA-seq), 180 underwent mass spectrometry (MS) for proteomics, and 122 underwent whole-exome sequencing (WES). Histological whole-slide images (WSIs) of 122 patients with corresponding radiology and sequencing data were generated by scanning hematoxylin and eosin (HE)-stained pathological sections (Fig. S1B). We divided the 1,194 patients into three cohorts: 122 patients with all multimodal data (FAHZZU1), 80 patients with RNA-seq and partial proteomics data (FAHZZU2), and 992 patients with only MRI data (FAHZZU3) (Fig. S1B). Details of all multimodal and clinicopathological data are provided in Supplementary Data 1.
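
The cohort sizes reported above can be cross-checked with simple bookkeeping. The short Python sketch below uses only the counts stated in the text; the number of FAHZZU2 patients with proteomics is not reported directly and is derived here under the assumption that every FAHZZU1 patient has proteomics data and no FAHZZU3 patient does.

```python
# Cohort sizes as stated in the text.
cohorts = {"FAHZZU1": 122, "FAHZZU2": 80, "FAHZZU3": 992}
total_enrolled = 1194
rna_seq_cases, proteomics_cases, wes_cases = 202, 180, 122

# The three cohorts partition the enrolled patients.
cohorts_sum = sum(cohorts.values())

# RNA-seq is available for FAHZZU1 and FAHZZU2 only.
rna_seq_expected = cohorts["FAHZZU1"] + cohorts["FAHZZU2"]

# Derived (assumption: all FAHZZU1 cases have proteomics, no FAHZZU3 case does).
fahzzu2_with_proteomics = proteomics_cases - cohorts["FAHZZU1"]

print(cohorts_sum == total_enrolled, rna_seq_expected == rna_seq_cases,
      fahzzu2_with_proteomics)
```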

Integrating multimodal data reveals causal features that may be obscured in single-modality analyses, offering a holistic understanding of diseases. Multimodal data fusion can be categorized into early, intermediate, and late fusion based on the integration timing. Intermediate fusion is generally more advanced than early and late fusion, but it places greater demands on the integration algorithm^7,24. In this study, we introduced a multimodal fusion subtyping (MOFS) framework (Fig. 1A). Briefly, we performed intermediate fusion of the multimodal data (FAHZZU1 cohort) using 11 algorithms based on different principles (Supplementary Data 2), followed by late fusion of the results from the 11 algorithms to yield the final clustering results.
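
To make the late-fusion idea concrete, the following is a minimal Python sketch, not the MOFSR implementation. It assumes that each algorithm contributes one cluster label per sample, that samples are compared by the Jaccard distance between their (algorithm, cluster) membership profiles, and that the consensus partition is obtained by average-linkage agglomerative clustering. The three toy label vectors stand in for the 11 algorithms and are purely illustrative.

```python
from itertools import combinations

def jaccard_distance_matrix(label_sets):
    """Pairwise Jaccard distance between samples' membership profiles."""
    n = len(label_sets[0])
    # Each sample is described by the set of (algorithm index, cluster label) pairs.
    profiles = [{(a, labels[i]) for a, labels in enumerate(label_sets)}
                for i in range(n)]
    dist = [[0.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        shared = len(profiles[i] & profiles[j])
        union = len(profiles[i] | profiles[j])
        dist[i][j] = dist[j][i] = 1.0 - shared / union
    return dist

def average_linkage(dist, k):
    """Merge the closest pair of clusters until k clusters remain."""
    clusters = [[i] for i in range(len(dist))]
    while len(clusters) > k:
        best = None
        for a, b in combinations(range(len(clusters)), 2):
            d = sum(dist[i][j] for i in clusters[a] for j in clusters[b])
            d /= len(clusters[a]) * len(clusters[b])
            if best is None or d < best[0]:
                best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]  # b > a, so index a is unaffected
    labels = [0] * len(dist)
    for c, members in enumerate(clusters):
        for i in members:
            labels[i] = c
    return labels

# Toy input: three hypothetical algorithms, six samples.
# Label values are only compared within an algorithm, so permutations are harmless.
algo_labels = [
    [0, 0, 1, 1, 2, 2],
    [1, 1, 0, 0, 2, 2],
    [0, 0, 1, 2, 2, 2],  # disagrees with the others on sample 3
]
dist = jaccard_distance_matrix(algo_labels)
consensus = average_linkage(dist, k=3)
print(consensus)
```

In this toy case the two concordant algorithms outvote the third, so sample 3 stays with sample 2 in the consensus partition.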

**Fig. 1: Multimodal data integration and clustering analysis for identifying multimodal fusion subtypes.**

Initially, the clustering prediction index (CPI)²⁵ and GAP statistic²⁶ were calculated for different combinations of modal variables, and the optimal clustering number was achieved when K = 3 (Fig. 1B and Supplementary Data 3). Additional support for this decision was provided by the proportion of ambiguous clustering (PAC)²⁷ and Calinski-Harabasz index (CHI)²⁸, which indicated that the classification was more robust with three subtypes (Fig. S2A, B). Using 11 distinct intermediate fusion algorithms with various principles²⁹, we next performed multimodal fusion clustering on 122 IDH-wildtype glioma patients with all modalities available. To further generate the consensus clustering results, the late fusion²⁴ was conducted on the Jaccard distance matrix between the samples, which revealed three MOFS subtypes (Fig. 1C and Fig. S2C). Principal component analysis (PCA) suggested distinct separation among three subtypes in two-dimensional spatial coordinates (Fig. 1D). The silhouette statistic was utilized to identify the samples that best represented one of three subtypes, yielding a core set of 116 cases for MOFS identification (Fig. S2D).
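
The core-set selection can be sketched as follows. This Python example computes per-sample silhouette widths from a distance matrix and keeps samples with a positive width; the positive-width cutoff and the one-dimensional toy data are assumptions for illustration, as the exact criterion is not stated here.

```python
def silhouette_widths(dist, labels):
    """Silhouette width s(i) = (b - a) / max(a, b) for every sample."""
    n = len(labels)
    widths = []
    for i in range(n):
        by_cluster = {}
        for j in range(n):
            if j != i:
                by_cluster.setdefault(labels[j], []).append(dist[i][j])
        own = by_cluster.pop(labels[i], [])
        if not own or not by_cluster:
            widths.append(0.0)  # singleton cluster or a single cluster overall
            continue
        a = sum(own) / len(own)                                  # mean intra-cluster distance
        b = min(sum(d) / len(d) for d in by_cluster.values())    # nearest other cluster
        widths.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return widths

# Toy data: six samples on a line; the last one sits between the two groups.
points = [0.0, 0.1, 0.2, 5.0, 5.1, 2.6]
labels = [0, 0, 0, 1, 1, 0]
dist = [[abs(p - q) for q in points] for p in points]

widths = silhouette_widths(dist, labels)
# Assumed rule: keep samples with positive silhouette width as the "core set".
core = [i for i, s in enumerate(widths) if s > 0]
print(core)
```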

Radio-pathological and biological peculiarities of MOFS subtypes

Radiological and pathological examinations unveiled distinct characteristics among the three MOFS subtypes identified through multimodal fusion analysis. First, MOFS1 gliomas were characterized by insignificant or limited enhancement on CE-T1WI MRI, paired with relatively regular cell morphology, weak atypia, and uniform cell density (Figs. 2A and S2E), indicative of a relatively less aggressive radio-pathological phenotype. Among the 34 identified MOFS1 gliomas, there were 23 histological GBM, 6 molecular GBM, and 5 IDH-wildtype diffuse gliomas. Second, MOFS2 gliomas were characterized by mass-like enhancement on CE-T1WI MRI, suggesting a more invasive nature. Pathologically, MOFS2 was marked by a heterogeneous cellular architecture with varied cell sizes and shapes, pronounced atypia, and high cellular heterogeneity (Figs. 2A and S2E). Among the 33 identified MOFS2 gliomas, there were 30 histological GBM, 2 molecular GBM, and 1 IDH-wildtype diffuse glioma. Third, MOFS3 gliomas were distinguished by ring-like enhancement with prominent central necrosis on CE-T1WI MRI. Pathologically, these tumors exhibited cellular atypia and focal immune infiltration (Figs. 2A and S2E). All 49 identified MOFS3 gliomas were histological GBM.

To further investigate the biological characteristics of the MOFS subtypes, functional enrichment analyses were performed using over-representation analysis (ORA), gene-set enrichment analysis (GSEA), and single-sample gene-set testing (ssGST) on both transcriptomic and proteomic data (Supplementary Data 4). MOFS1 was enriched in pathways related to neurodevelopment, including distal axon, GABA receptor binding, long-term synaptic depression, and neuron-to-neuron synapse (Figs. 2B, C and S3–4). MOFS2 showed significant enrichment in proliferation-related pathways such as G1/S-specific transcription, G2/M checkpoint, E2F targets, and cell cycle (Figs. 2B, C and S3–4). MOFS3 was predominantly associated with TME-related pathways, including cell-extracellular matrix interaction, TNF-alpha signaling via NF-kB, immune cell activation, and interferon-gamma response (Figs. 2B, C and S3–4).
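
Of the three enrichment approaches, ORA is the simplest to state: it asks whether a gene set overlaps a list of selected genes more than expected by chance, typically with a one-sided hypergeometric test. The Python sketch below illustrates that test under this assumption; the gene identifiers are hypothetical placeholders, and multiple-testing correction is omitted.

```python
from math import comb

def ora_pvalue(selected, pathway, background):
    """One-sided hypergeometric test for over-representation.

    Returns (overlap, p) where p = P(X >= overlap) when drawing
    len(selected) genes from the background without replacement.
    """
    background = set(background)
    selected = set(selected) & background
    pathway = set(pathway) & background
    N, K, n = len(background), len(pathway), len(selected)
    k = len(selected & pathway)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(K, n) + 1))
    return k, tail / comb(N, n)

# Hypothetical placeholder genes: 20 in the background, 5 in the pathway.
background = [f"g{i}" for i in range(20)]
pathway = ["g0", "g1", "g2", "g3", "g4"]
selected = ["g0", "g1", "g2", "g10"]  # e.g. subtype-specific genes

overlap, p_value = ora_pvalue(selected, pathway, background)
print(overlap, round(p_value, 4))
```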

Notably, approximately one-third of MOFS3 samples exhibited high cell cycle activity (Figs. S3–4 and S5A, B), suggesting the presence of heterogeneity within the MOFS3 subtype. Further analysis found no significant correlation between cell cycle activity and stromal scores in this subgroup (Fig. S5C). Functional enrichment analysis indicated that the high proliferation samples within MOFS3 remained enriched in TME-related pathways, highlighting the TME-rich identity of MOFS3 rather than a proliferation-driven phenotype (Fig. S5D). Jaccard similarity analysis further supported this conclusion, showing a high degree of concordance between the high and low proliferation subgroups within MOFS3 (Fig. S5E). While high cell cycle activity was associated with poorer prognosis across all samples (P = 0.042), it showed no prognostic significance within the MOFS3 subtype (P = 0.63) (Fig. S5F). These findings highlight the stability of MOFS3 as a distinct TME-rich subtype, despite its internal heterogeneity.

Based on these findings, MOFS1 was termed the proneural subtype due to its enrichment in neurodevelopmental pathways and less aggressive nature. MOFS2 was defined as the proliferative subtype due to its significant enrichment in proliferation-related pathways and high cellular heterogeneity. MOFS3 was classified as the TME-rich subtype, reflecting its association with abundant TME components.

Evaluation of modality contributions and validation of MOFS subtypes

To further evaluate the robustness of the MOFS framework, we performed single-modality clustering and assessed the impact of excluding individual modalities from the multimodal integration. Single-modality clustering showed limited concordance with MOFS subtypes and failed to achieve significant prognostic separation (Figs. 3A and S6A). In contrast, excluding individual modalities from the multimodal framework maintained relatively high concordance with MOFS subtypes and retained statistically significant prognostic power (Figs. 3A and S6B). Notably, Kaplan-Meier survival analysis demonstrated that the full MOFS framework outperformed both single-modality clustering and partial multimodal clustering in terms of prognostic discrimination (Fig. 3B). These findings underscore the importance of integrating multiple data layers to achieve a more robust and accurate classification system.

We next compared the MOFS taxonomy with the traditional transcriptome-based classifications^12,13,16,17. To validate the prognostic performance and biological functions of MOFS subtypes in public cohorts, we developed an integrated classification framework based on transcriptome expression profiles. Given the lack of multimodal data in public datasets, which predominantly consist of high-quality transcriptome data, this approach inevitably sacrificed some information. To mitigate overfitting and extract as much information as possible from the available transcriptome data, we constructed and validated an ensemble classifier (Fig. S7). Our results indicated that MOFS subtypes exhibited moderate correlation with previous classification systems (Fig. 3C). This discrepancy highlights the potential for our MOFS subtyping to offer insights into GBM heterogeneity. Kaplan-Meier survival analysis demonstrated significant survival differences among the three MOFS subtypes across eight cohorts (P < 0.05), with MOFS1 associated with the best prognosis and MOFS2 with the worst (Fig. 3D). The consistency and significance of survival trends for MOFS subtypes in seven external cohorts underscore the robustness of this taxonomy. In contrast, the traditional classifications exhibited only limited performance in predicting prognosis (Fig. S8–9). Considering that clinical variables such as age, sex, MGMT promoter methylation status, and treatment strategies significantly influence survival outcomes, we performed a multivariate Cox regression analysis to account for these factors. Even after adjusting for these variables, MOFS1 consistently correlated with better overall survival, MOFS2 remained predictive of poor prognosis, and MOFS3 showed no significant association with survival (Supplementary Data 5A). 
Additionally, MOFS1 and MOFS3 patients treated with TMZ exhibited significantly improved survival (HR < 1, P < 0.05), while no survival benefit was observed for MOFS2 patients receiving TMZ compared to untreated cases (P = 0.179) (Supplementary Data 5B). This finding suggested that MOFS2 patients exhibit resistance to TMZ, which may partially account for their overall worse prognosis. As TMZ is the standard chemotherapeutic agent for GBM, the lack of benefit in MOFS2 further underscores the need for alternative therapeutic strategies in this subtype. These findings further validate the robustness and predictive value of the MOFS classification system. Additionally, functional enrichment analyses in these seven external cohorts confirmed the stability and reproducibility of the biological characteristics associated with each MOFS subtype (Fig. S10).
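
For readers less familiar with the survival analyses used throughout, the following is a minimal Python sketch of the Kaplan-Meier product-limit estimator. The follow-up times and event indicators are synthetic toy values, not patient data, and the log-rank and Cox analyses reported above are not reproduced.

```python
def kaplan_meier(times, events):
    """Return [(event_time, survival_probability)] using the product-limit estimator.

    times  : follow-up time per patient
    events : 1 if death was observed at that time, 0 if censored
    """
    event_times = sorted({t for t, e in zip(times, events) if e})
    survival, curve = 1.0, []
    for t in event_times:
        at_risk = sum(1 for ti in times if ti >= t)
        deaths = sum(1 for ti, e in zip(times, events) if ti == t and e)
        survival *= 1 - deaths / at_risk
        curve.append((t, survival))
    return curve

def median_survival(curve):
    """First time at which the estimated survival drops to 0.5 or below."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None  # median not reached

# Synthetic toy cohort (months); 0 marks a censored observation.
times = [2, 4, 4, 6, 8]
events = [1, 1, 0, 1, 0]

curve = kaplan_meier(times, events)
median = median_survival(curve)
print(curve, median)
```

Applying the same function to each subtype separately yields the per-subtype curves that a log-rank test would then compare.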

Genomic alteration characteristics of MOFS subtypes

To identify the genetic traits peculiar to each MOFS subtype, the genomic landscape was characterized. The analysis revealed no significant differences among subtypes in single nucleotide variations (SNVs) (Fig. S11A), insertions/deletions (INDELs) (Fig. S11B), or tumor mutation burden (TMB) (Fig. 4A). Mutations in TP53, LRP2, and MCM10 were associated with unfavorable prognosis (P < 0.05) (Fig. 4B and Supplementary Data 6). Notably, SCN5, USH2A, PLEC, and DNAH3 showed significant differences in mutation frequency among the three subtypes (P < 0.05) (Figs. 4C and S11C). MOFS2 exhibited a high frequency of mutations in SCN5, USH2A, and PLEC, while DNAH3 mutations were more prevalent in MOFS1 (Fig. 4C and Supplementary Data 6).

Further analysis of mutational profiles in 10 canonical pathways revealed a higher frequency of mutations in the RTK-RAS pathway, with mutations in the TP53 pathway particularly evident in MOFS2 (Fig. S11D). Additionally, MOFS2 demonstrated significant copy number variations (CNVs), exhibiting a heavy broad and focal CNV burden (Fig. 4D, E). This suggests that MOFS2 is characterized by a chromosomal instability (CIN) phenotype, a hallmark of human malignancies associated with poor prognosis, tumor metastasis, and drug resistance³⁰. The MOFS2-specific functional CNV genes were significantly associated with proliferation-related pathways (Fig. 2B), indicating a potential link between genome instability and the proliferative nature of MOFS2.

STRAP amplification and its prognostic significance in MOFS2

To further explore the relationship between CNVs and gene expression in MOFS subtypes, 1023 subtype-specific functional CNV genes were identified (Supplementary Data 7). Cox analysis revealed that STRAP, SCFD2, FIP1L1, and EXOC1 were risk factors in MOFS2, while KIF21A was a protective factor (P < 0.05) (Fig. S12). Notably, STRAP amplification was observed exclusively in MOFS2 (P < 0.0001) (Fig. S12), with significantly higher STRAP expression levels compared to other subtypes (P < 0.0001) (Fig. 4F). This finding was further validated by immunohistochemistry (IHC) results of glioma tissues (Fig. 4G).

ROC analysis indicated that STRAP could accurately predict the MOFS2 subtype (AUC = 0.802) (Fig. 4H). Furthermore, STRAP was predominantly associated with the dismal prognosis of MOFS2, with no significant association observed in the other subtypes (Fig. S12). Kaplan-Meier analysis further demonstrated that high CNV or expression of STRAP was associated with the worst prognosis (P < 0.05) (Fig. 4I). IHC results from tissue microarray (TMA) confirmed that high protein levels of STRAP were associated with poorer prognosis (P = 0.00015) (Fig. 4J and Supplementary Data 8). Functional analysis of the top 500 genes positively correlated with STRAP demonstrated significant enrichment in proliferation-related pathways (Supplementary Data 9). Together, these results demonstrated that STRAP was specifically overexpressed and amplified in MOFS2, suggesting its potential role in promoting the proliferative phenotype of MOFS2.
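
The AUC of a single marker has a simple interpretation: it is the probability that a randomly chosen positive case (here, a MOFS2 tumor) has a higher marker value than a randomly chosen negative case. The Python sketch below computes it directly from that definition; the expression values are invented toy numbers, not STRAP measurements.

```python
def roc_auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair (Mann-Whitney formulation).
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))

# Toy marker values; label 1 = target subtype, 0 = any other subtype.
marker = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
is_target = [1, 1, 1, 0, 0, 0]

auc = roc_auc(marker, is_target)
print(round(auc, 3))
```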

MOFS3 tumors displayed rich immune infiltration and sensitivity to immunotherapy

MOFS3 tumors were characterized by lower tumor purity but significantly higher immune and stromal components compared to the other subtypes (P < 0.0001) (Fig. 5A–C). Our analysis revealed that immune cells and immunomodulators were predominantly more abundant in MOFS3, reinforcing its TME-rich features (Fig. 5D). MOFS1, identified as the proneural subtype, exhibited higher infiltration levels of neurons, astrocytes, and oligodendrocytes (P < 0.0001) (Fig. 5E–G). To systematically assess the immunotherapeutic potential of the three subtypes, we constructed an immunogram for the cancer-immunity cycle (CIC)³¹, which underscores the dynamic and multifaceted nature of intratumoral immunity (Fig. 5H). Despite comparable tumor antigenicity among the three subtypes, likely due to similar tumor mutation burdens (TMB) (Fig. 4A), MOFS3 stood out with heightened activation of other immune pathways (Fig. 5H). This points to a potentially greater benefit from immunotherapy for MOFS3 tumors, given their enriched immune contexture. To further substantiate this, we analyzed transcriptome expression profiles from GBM patients who received anti-PD-1 immunotherapy³². Our findings indicated that responders exhibited higher MOFS3 activity, while non-responders were more associated with MOFS2 activity (Fig. 5I, J). Notably, most immunotherapy responders were classified as MOFS3, compared to only 25% in MOFS2 (P = 0.09) (Fig. 5K). This disparity underscores the differential immunotherapy responsiveness across MOFS subtypes and highlights MOFS3 as particularly amenable to immune checkpoint blockade.

**Fig. 5: Immune infiltration and immunotherapy response in MOFS subtypes.**

Stroma refined the prognostic stratification of MOFS3

MOFS3 tumors exhibited not only significant enrichment in various immune pathways but also a higher abundance of stromal components (P < 0.0001) (Fig. 5C). Moreover, endothelial cells and pericytes were also found to be significantly more abundant in MOFS3 than in MOFS1 and MOFS2 (Fig. S13A). These findings suggested that MOFS3 tumors contain a higher proportion of non-malignant cellular components. Further survival analyses revealed a strong association between stromal abundance and prognosis within MOFS3 (P < 0.01), a relationship not observed in the other subtypes (Fig. S13B). Taken as a whole, MOFS3 displayed no significant prognostic value in our cohort (P = 0.57) (Fig. S13C). Intriguingly, upon integration of stroma into the MOFS taxonomy, stromal abundance within MOFS3 showed significant prognostic value (P < 0.05) (Fig. 5L). More specifically, MOFS3 tumors with low stromal content demonstrated a median survival similar to that of MOFS1 (P = 0.832), whereas those with high stromal content resembled MOFS2 in terms of prognosis (P = 0.974, Fig. 5L).
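
The stroma-refined taxonomy amounts to splitting MOFS3 by stromal score while leaving MOFS1 and MOFS2 untouched. A minimal Python sketch follows; the median cutoff, the label names, and the toy scores are assumptions made for illustration, as the cutoff used for Fig. 5L is not stated here.

```python
from statistics import median

def refine_mofs3(subtypes, stromal_scores):
    """Split MOFS3 into low/high stroma groups at the within-MOFS3 median."""
    mofs3_scores = [s for t, s in zip(subtypes, stromal_scores) if t == "MOFS3"]
    cutoff = median(mofs3_scores)  # assumed cutoff; requires at least one MOFS3 case
    refined = []
    for subtype, score in zip(subtypes, stromal_scores):
        if subtype != "MOFS3":
            refined.append(subtype)
        elif score > cutoff:
            refined.append("MOFS3-high-stroma")
        else:
            refined.append("MOFS3-low-stroma")
    return refined, cutoff

# Toy subtype calls and stromal scores.
subtypes = ["MOFS1", "MOFS2", "MOFS3", "MOFS3", "MOFS3", "MOFS3"]
stromal_scores = [0.1, 0.2, 0.3, 0.9, 0.5, 0.7]

refined, cutoff = refine_mofs3(subtypes, stromal_scores)
print(refined, cutoff)
```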

To assess the predictive capacity of stromal biomarkers, ROC analysis was conducted. The canonical stromal marker S100A4 showed relatively accurate predictive ability at both the mRNA and protein levels (RNA: AUC = 0.72; protein: AUC = 0.83) (Fig. 5M). IHC analysis also indicated that MOFS3 tumors were characterized by higher expression levels of S100A4 (Fig. 5N). Furthermore, subsequent survival analysis of the tissue microarray (TMA) results confirmed the association between high levels of stromal content and worse prognosis (P = 0.042) (Fig. S13D and Supplementary Data 8). These findings underscore the importance of stromal components in refining the prognostic stratification within the MOFS3 subtype.

Development of MRI classifier for non-invasively predicting MOFS

To enhance the clinical applicability of our MOFS classification system, we leveraged readily accessible, non-invasive radiological imaging to predict MOFS subtypes. We first selected 22 quantitative imaging features derived from MRI scans to construct a deep neural network (DNN) model optimized through elastic backpropagation (Fig. 6A and Supplementary Data 10). Hyperparameter tuning yielded a final model comprising two hidden layers.
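
The shape of such a classifier can be sketched in a few lines of Python: 22 input features, two hidden layers, and a three-way softmax output. The hidden-layer widths, the tanh activation, and the random untrained weights below are assumptions for illustration only; training is not shown, and the tuned architecture is the one given in Fig. 6A and Supplementary Data 10.

```python
import math
import random

N_FEATURES, N_CLASSES = 22, 3
HIDDEN = (16, 8)  # hypothetical widths; the text only specifies two hidden layers
SUBTYPES = ("MOFS1", "MOFS2", "MOFS3")

def init_layer(n_in, n_out, rng):
    """Random weights (n_out x n_in) and zero biases; placeholders for trained values."""
    scale = 1 / math.sqrt(n_in)
    weights = [[rng.gauss(0, scale) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out

def dense(x, layer, activation):
    weights, biases = layer
    return [activation(sum(w * xi for w, xi in zip(row, x)) + b)
            for row, b in zip(weights, biases)]

def softmax(z):
    m = max(z)  # subtract the maximum for numerical stability
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

def forward(x, layers):
    """Two tanh hidden layers followed by a linear layer and softmax."""
    h = x
    for layer in layers[:-1]:
        h = dense(h, layer, math.tanh)
    return softmax(dense(h, layers[-1], lambda v: v))

rng = random.Random(0)
sizes = (N_FEATURES, *HIDDEN, N_CLASSES)
layers = [init_layer(a, b, rng) for a, b in zip(sizes, sizes[1:])]

features = [rng.gauss(0, 1) for _ in range(N_FEATURES)]  # stand-in radiomic vector
probs = forward(features, layers)
predicted = SUBTYPES[probs.index(max(probs))]
print(predicted, [round(p, 3) for p in probs])
```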

**Fig. 6: Neural network radiomics classifier for predicting MOFS subtypes.**

The DNN was trained using the FAHZZU1 dataset, with 70% of the samples allocated to the training set and 30% to the testing set. Additional testing was conducted using the FAHZZU2 and FAHZZU3 datasets. On the FAHZZU1 training set, the model achieved a perfect area under the curve (AUC) of 1 for each MOFS subtype. On the FAHZZU1 testing set, the AUCs were 0.9, 0.968, and 0.889 for MOFS1, MOFS2, and MOFS3, respectively. For the FAHZZU2 dataset, where MOFS labels were predicted using an RNA-based ensemble classifier, the model achieved AUCs of 0.862, 0.958, and 0.898 for MOFS1, MOFS2, and MOFS3, respectively (Fig. S14A). The slight reduction in AUC values for this cohort may stem from differences in label precision, as the RNA-seq-based predictions lack the comprehensive biological context provided by the original multimodal clustering approach. These variations underscore the potential impact of modality on the accuracy of subtype assignments. The confusion matrix further confirmed the robust performance of the model, showing accuracies of 1, 0.917, and 0.825 in the FAHZZU1 training set, testing set, and FAHZZU2 validation set, respectively (Fig. 6B). Due to the lack of molecular multi-omics data in FAHZZU3 cohort, MOFS labels were unavailable. Despite this limitation, survival analysis of the predicted MOFS subtypes in FAHZZU3 demonstrated significant prognostic differences consistent with prior findings (P = 0.00025), supporting the predictive relevance of the DNN classifications (Fig. 6C). The MRI classifier demonstrates a reasonable level of concordance with the RNA-based MOFS ensemble classifier, capturing the general trends of MOFS1, MOFS2, and MOFS3 subtypes (Fig. S14B). This suggests that imaging-derived features can approximate transcriptomic-driven subtypes to a significant extent. However, some discrepancies highlight the inherent limitations of relying solely on a single modality, such as MRI or RNA, to fully capture the complexity of multimodal subtypes. 
To facilitate practical application by researchers and clinicians, we developed an accessible tool that allows users to input imaging feature data and obtain MOFS subtype predictions (Fig. 6D).
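
The accuracies quoted above are read off a confusion matrix of true versus predicted subtypes. A minimal Python sketch of that computation is given below; the label vectors are toy values rather than cohort data.

```python
from collections import Counter

SUBTYPES = ("MOFS1", "MOFS2", "MOFS3")

def confusion_matrix(y_true, y_pred, classes=SUBTYPES):
    """Rows are true subtypes, columns are predicted subtypes."""
    counts = Counter(zip(y_true, y_pred))
    return [[counts[(t, p)] for p in classes] for t in classes]

def accuracy(matrix):
    """Fraction of samples on the diagonal."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total

# Toy labels for eight samples.
y_true = ["MOFS1", "MOFS1", "MOFS2", "MOFS2", "MOFS3", "MOFS3", "MOFS3", "MOFS1"]
y_pred = ["MOFS1", "MOFS1", "MOFS2", "MOFS3", "MOFS3", "MOFS3", "MOFS1", "MOFS1"]

matrix = confusion_matrix(y_true, y_pred)
acc = accuracy(matrix)
print(matrix, acc)
```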

Discussion

The 2016 and 2021 WHO Classifications of CNS tumors have markedly improved the classification of adult diffuse gliomas based on IDH mutations^4,33, and the latest edition has prioritized IDH mutations over histological features⁴. Owing to its poor prognosis and high prevalence, IDH-wildtype GBM represents a particularly noteworthy subpopulation³⁴. Previous subtyping systems predate the latest edition of the WHO Classification; they focus only on histologically determined GBM irrespective of IDH mutations and neglect molecularly determined GBM. In addition, these studies were derived only from transcriptome data and yielded limited prognostic value^8,12,13. With the advancement of artificial intelligence, it is imperative to integrate multimodal data to subtype IDH-wildtype gliomas with high prevalence and much greater homogeneity. In this study, we integrated radio-pathology and proteogenomics to systematically investigate the heterogeneity of IDH-wildtype gliomas. Our findings demonstrated that MOFS subtypes offer superior prognostic value compared to traditional classification systems, provide insights into the biological heterogeneity of GBM, and suggest specific therapeutic strategies for different subtypes.

MOFS1 is a proneural subtype with relatively favorable prognosis, elevated neurodevelopmental activity, and abundant neurocyte infiltration. This subtype also emerged as a transcriptional subtype proposed by Phillips et al.⁸ in 2006 and Wang et al.¹² in 2017, which likewise displayed relatively better prognosis and neurodevelopmental properties. However, the transcriptional subtyping systems proposed by Phillips et al.⁸ and Wang et al.¹² encompassed patients harboring IDH-mutant GBM and therefore lag behind the 2021 WHO classification of CNS tumors, which classifies previously defined IDH-mutant GBM as IDH-mutant astrocytomas, Grade 4⁴. In this study, we employed multimodal data from IDH-wildtype glioma patients to identify the proneural subtype (MOFS1), which demonstrated moderate correlation with traditional classification systems. This suggests that patient selection and data modality have a significant impact on study results. Notably, MOFS1 gliomas comprise 23 histological GBM (102 in total), 6 molecular GBM (8 in total), and 5 IDH-wildtype diffuse gliomas (6 in total). This composition suggests that MOFS1 has identified a proportion of histological GBM with relatively better survival, together with nearly all the molecular GBM and IDH-wildtype diffuse gliomas. Although molecular GBM were determined by TERT promoter mutations/EGFR amplification/whole chromosome 7 gain and whole chromosome 10 loss, and were reclassified as IDH-wildtype GBM in the latest WHO classification⁴, their prognosis was found to be significantly better than that of IDH-wildtype histologically determined GBM³⁵. This result accords with ours in that the less aggressive MOFS1 embraces the majority of molecular GBM. 
IDH-wildtype lower-grade diffuse gliomas are indistinguishable from molecular GBM in terms of radiology and pathology, and MOFS1 includes nearly all the IDH-wildtype lower-grade diffuse gliomas, suggesting these rare entities may also share similar biological traits and clinical outcomes with molecular GBM. Genetically, MOFS1 showed a higher mutation frequency of DNAH3, a gene implicated in microtubule motility and ATP binding. A previous WES study identified DNAH3 as a high-risk variant in breast cancer³⁶.

MOFS2 is a proliferative subtype characterized by dismal prognosis, superior proliferative activity, genome instability, and high tumor purity. MOFS2 mainly includes 30 histological GBM, which may be the most aggressive subpopulation of histological GBM. Compared to other subtypes, MOFS2 exhibited notable CNV, with its subtype-specific CNV genes significantly enriched in proliferation-related pathways. This implies that the proliferative traits of MOFS2 might be driven by gene CNV. Consequently, we further explored the clinical significance and functional features of the MOFS2-specific gene STRAP. This gene was exclusively amplified and overexpressed in MOFS2, and correspondingly, prognostic analysis revealed that STRAP held prognostic value solely in MOFS2, with high expression or amplification suggesting poor prognosis for these patients. Therefore, the subtype-specific gene STRAP may be an essential component of the proliferative phenotype of MOFS2, and targeting this gene may improve the clinical outcomes of MOFS2 patients. Additionally, further analysis revealed that immunotherapy non-responders displayed higher MOFS2 activity, possibly linked to superior tumor purity and diminished immune components.

MOFS3 comprises a proportion of histological GBM and is a TME-rich subtype with intermediate prognosis, featuring abundant immune and stromal components. Immune infiltration analysis demonstrated that MOFS3 displayed high levels of immunomodulators (e.g., PD-1 and PD-L1) and CD8+ T cell infiltration, indicating its potential to benefit from immunotherapy, which was validated in a GBM cohort treated with anti-PD-1 immunotherapy. Furthermore, MOFS3 was also rich in stromal content, which served as a prognostic risk factor solely in MOFS3. Intriguingly, upon integration of stroma abundance into the MOFS taxonomy, stroma abundance within MOFS3 exhibited significant prognostic value. More specifically, the low- and high-stroma groups demonstrated median survival similar to that of MOFS1 and MOFS2, respectively. These findings suggest that further stratifying MOFS3 by high or low stroma provides additional prognostic information. To facilitate clinical application, we identified the classic stromal marker S100A4, which could accurately predict stroma levels and identify MOFS3 patients.

Despite the encouraging results, our study has limitations. First, the reliance on transcriptomic data for classifier development, due to the lack of multi-omics data in public cohorts, may have limited the capture of the full molecular complexity of GBM. Future studies should aim to incorporate more comprehensive multi-omics datasets to enhance the classifier’s accuracy and robustness. Second, while the radiology classifier demonstrated strong predictive performance, its validation in larger and external cohorts is essential to confirm its generalizability. Assigning a single MOFS class to an entire tumor, based on dominant imaging features, oversimplifies the spatial heterogeneity often observed in GBM. Incorporating spatial resolution to differentiate tumor regions could improve prognostic accuracy, particularly in post-treatment recurrence scenarios, where distinguishing treatment effects from true tumor recurrence remains challenging. Moreover, Al-Dalahmah et al.³⁷ utilized scRNA-seq data to describe prognostic GBM tissue states based on cellular composition, offering an alternative perspective that captures some aspects of the MOFS framework. A deeper integration of single-cell RNA sequencing with multimodal approaches may complement the MOFS system and further refine its prognostic and biological insights. Lastly, due to the absence of MGMT methylation status data in the FAHZZU cohort, survival analyses could not control for this crucial prognostic factor. The reliability of survival analysis results will be affected to a certain extent.

By leveraging artificial intelligence in multimodal integration for oncology, the MOFS classification system may represent a significant advancement in the understanding of GBM heterogeneity, offering superior prognostic value and informing precision oncology. The development of a neural network radiology classifier further enhances the clinical translatability of our findings, providing a non-invasive tool for predicting MOFS subtypes. Our study underscores the importance of integrating multi-omics data in cancer classification and paves the way for more personalized and effective GBM treatments.

Taken together, this study integrated radio-histology and proteogenomics to refine three subgroups in IDH-wildtype gliomas with prognostic and therapeutic opportunities. The diverse biological and clinical characteristics of the MOFS taxonomy improve the understanding of GBM heterogeneity and facilitate clinical stratification and individualized management. STRAP is significantly associated with the prognosis and proliferative phenotype of MOFS2 patients, thereby representing a potential therapeutic target for this subtype. The abundance of stroma serves as a vital prognostic index, which could reassess the survival risk of MOFS3 patients. To support researchers and clinical practitioners, we developed an MRI-based classifier for predicting MOFS subtypes. We believe this high-resolution taxonomy could facilitate more effective management of patients with IDH wild-type GBM.

Methods

Data and sample collection

The study was approved by the Human Scientific Ethics Committee of the First Affiliated Hospital of Zhengzhou University (FAHZZU; Approval No. 2019-KY-176 and 2023-KY-1028). Informed consent was obtained from the patients for all fresh tumor specimens used in this study. This study retrospectively collected data on IDH wild-type glioma patients who underwent radical resection at FAHZZU between 2015 and 2021. Inclusion criteria were: age ≥18 years; primary glioma; integrated diagnosis of IDH wild-type GBM or lower-grade diffuse glioma, reclassified according to the 2021 WHO classification⁴ (Fig. S1A); no radiation or chemotherapy before admission; complete clinical data and follow-up information; no serious systemic abnormalities before surgery; preoperative MRI data including T1WI, CE-T1WI, T2WI, FLAIR, and ADC maps obtained from DWI with good image quality and no significant differences; clear HE-stained pathological slices with high-quality scanned images; and well-preserved pathological tissues. Exclusion criteria were: history of brain surgery or trauma; radiation or chemotherapy before surgery; and presence of artifacts on MRI that would affect lesion observation or delineation.

A total of 1194 IDH wild-type glioma patients with complete and qualified MRI data were included in the study. Among these cases, fresh surgical tumor specimens were collected from 202 patients. These specimens were immediately frozen in liquid nitrogen and stored at −80 °C for tissue sequencing. Among them, 202 tissues underwent RNA sequencing (RNA-seq), 180 underwent mass spectrometry, and 122 underwent whole-exome sequencing (WES). Histological whole-slide images (WSIs) of the 122 patients with all radiomics and sequencing data were obtained by scanning HE-stained pathological slices (Fig. S1B). Furthermore, 5 adjacent brain tissues were collected from the tumor margin and 19 peripheral blood samples were taken before surgery as normal controls for WES. This study designated the 122 samples with all modal data as the FAHZZU1 cohort, the 80 samples with transcriptome or mass spectrometry data as the FAHZZU2 cohort, and the 992 samples with only MRI data as the FAHZZU3 cohort (Fig. S1B).

MRI scanning and imaging feature extraction

Patient MRI images were acquired during routine examination using a 3.0 T MRI scanner (Siemens Magnetom Skyra/Trio TIM; GE Discovery MR750; Philips Ingenia). Sequences included: axial and sagittal T1-weighted imaging (T1WI), axial T2-weighted imaging (T2WI), axial T2-weighted fluid-attenuated inversion recovery (FLAIR) imaging, as well as axial, sagittal, and coronal post-contrast T1-weighted imaging (CE-T1WI) immediately after intravenous injection of a 0.1 mmol/kg dose of gadolinium-based contrast agent. Apparent diffusion coefficient (ADC) maps were obtained from axial diffusion-weighted imaging (DWI). The acquisition parameters for each sequence were as follows:

a. T1WI and CE-T1WI: repetition time (TR) 220–1750 ms; echo time (TE) 2.3–24 ms; echo train length (ETL) 1–12; slice thickness 5 mm; averages/excitations 1; flip angle (FA) 70°–111°; field of view (FOV) 220 × 192–240 × 240 mm²; matrix 256 × 162–320 × 256.
b. T2WI: TR 1873–5390 ms; TE 70–117 ms; ETL 16–32; slice thickness 5 mm; averages/excitations 1; FA 90°–142°; FOV 220 × 192–240 × 240 mm²; matrix 320 × 238–512 × 512.
c. FLAIR: TR 4500–8400 ms; TE 85–150 ms; inversion time (TI) 1670–2250 ms; ETL 1–38; slice thickness 5 mm; averages/excitations 1; FA 90°–150°; FOV 220 × 192–240 × 240 mm²; matrix 256 × 179–256 × 256.
d. DWI: Images were processed by the corresponding post-processing workstation, and ADC images were calculated from DWI acquired at b-values of 0 and 1000 s/mm². Sequence parameters included: TR 2121–6000 ms; TE 77–119 ms; ETL 1–82; slice thickness 5 mm; averages/excitations 1; FA 90°; FOV 220 × 220–240 × 240 mm²; matrix 152 × 114–192 × 192. ADC maps for all imaging planes were generated on a voxel-by-voxel basis using a single-exponential model.
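For readers unfamiliar with the single-exponential model, the signal at b-value b is usually written as S_b = S_0 · exp(−b · ADC), so two acquisitions at b = 0 and b = 1000 s/mm² give ADC = ln(S_0 / S_1000) / 1000. The following is a minimal Python sketch of that voxel-wise calculation; the function names, the toy 2 × 2 images, and the choice to set non-positive voxels to zero are illustrative assumptions, not the workstation's actual implementation.

```python
import math

def adc_from_two_b_values(s_low, s_high, b_low=0.0, b_high=1000.0):
    """ADC (mm^2/s) for one voxel under S_b = S_0 * exp(-b * ADC)."""
    if s_low <= 0 or s_high <= 0:
        return 0.0  # assumption: background or invalid voxels are set to zero
    return math.log(s_low / s_high) / (b_high - b_low)

def adc_map(img_low, img_high, b_low=0.0, b_high=1000.0):
    """Apply the two-point formula voxel by voxel to 2D images (lists of rows)."""
    return [
        [adc_from_two_b_values(a, b, b_low, b_high) for a, b in zip(row_l, row_h)]
        for row_l, row_h in zip(img_low, img_high)
    ]

# Toy example: simulate a b=1000 image from a known ADC, then recover it.
true_adc = 0.8e-3
b0 = [[1000.0, 800.0], [0.0, 500.0]]
b1000 = [[v * math.exp(-1000.0 * true_adc) for v in row] for row in b0]
adc = adc_map(b0, b1000)
```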

First, the N4ITK algorithm was employed to correct bias field distortions for all sequences. After isotropic voxel resampling to 1 × 1 × 1 mm³ through trilinear interpolation, multi-sequence MRI rigid registration for each patient was performed using the axial resampled CE-T1WI as a template and mutual information as the similarity measure. This process was completed using the 3D Slicer software, generating registered images rT1WI, rCE-T1WI, rT2WI, rFLAIR, and rADC. Histogram matching was used for gray-level normalization on rT1WI, rCE-T1WI, rT2WI, and rFLAIR. We set the number of histogram levels to 1024 and the number of matching points to 10 to achieve a finer match while preserving more detail. A deputy chief physician in neuroradiology with over 10 years of experience in head MRI diagnosis manually delineated the tumor region of interest (ROI) on the axial plane of rFLAIR, rT2WI, and rCE-T1WI images using ITK-SNAP software, obtaining the tumor volume of interest (VOI). The VOI was defined as the enhanced area, non-enhanced area, and necrotic area of the tumor. The VOI contour was drawn based on FLAIR images, while rT2WI and rCE-T1WI were used for cross-checking the tumor extent and fine-tuning the tumor contour. Z-score normalization was applied within the VOI for all sequences so that the VOI intensities had a mean of 0 and a standard deviation of 1. This radiologist and a deputy chief physician in neurosurgery with over 10 years of work experience each redrew the VOI for 100 patients selected from the cohort by simple random sampling. Intraclass correlation coefficients (ICC) were used to evaluate intra-rater reliability on the test-retest dataset and inter-rater reliability on the multi-rater delineation dataset, retaining features with ICC ≥ 0.75. The obtained VOI was then overlaid with co-registered rT1WI, rCE-T1WI, rT2WI, rFLAIR, and rADC images.
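The Z-score step above can be illustrated with a short Python sketch. It assumes the image is flattened to a list of voxel intensities with a matching boolean VOI mask, uses the population standard deviation, and sets voxels outside the VOI to zero; these choices and the function name are illustrative assumptions rather than the study's code.

```python
import statistics

def zscore_within_voi(intensities, mask):
    """Rescale voxels inside the VOI to mean 0 and SD 1; zero elsewhere."""
    voi = [v for v, inside in zip(intensities, mask) if inside]
    mean = statistics.fmean(voi)
    sd = statistics.pstdev(voi)  # assumption: population SD over VOI voxels
    if sd == 0:
        raise ValueError("VOI has constant intensity; Z-score is undefined")
    return [(v - mean) / sd if inside else 0.0
            for v, inside in zip(intensities, mask)]

# Toy example: the fourth voxel lies outside the VOI and is ignored.
intensities = [10.0, 12.0, 14.0, 100.0, 16.0]
mask = [True, True, True, False, True]
normalized = zscore_within_voi(intensities, mask)
voi_values = [v for v, inside in zip(normalized, mask) if inside]
```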

PyRadiomics was used to extract three categories of features, including first-order intensity statistics, shape descriptors, and higher-order texture features. Five basic matrices were employed to define texture features: the gray-level co-occurrence matrix (GLCM), gray-level run length matrix (GLRLM), gray-level size zone matrix (GLSZM), gray-level dependence matrix (GLDM), and neighborhood gray-tone difference matrix (NGTDM). In this study, imaging features were extracted from three types of images: original images, wavelet images, and Laplacian of Gaussian (LoG) images. A PyRadiomics parameters file is provided in the GitHub repository to enhance the reproducibility of feature extraction (https://github.com/Zaoqu-Liu/MOFS). Ultimately, 5929 features were extracted from the five MRI sequences, retaining 4271 features with ICC ≥ 0.75.
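The ICC ≥ 0.75 retention rule can be sketched as follows. The text does not state which ICC form was used, so this example assumes the two-way random-effects, absolute-agreement, single-measurement form, often written ICC(2,1); the function name, feature names, and toy ratings are hypothetical.

```python
def icc_2_1(ratings):
    """ICC(2,1) for a subjects x raters table (list of equal-length rows)."""
    n, k = len(ratings), len(ratings[0])
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols
    msr = ss_rows / (n - 1)                 # between-subject mean square
    msc = ss_cols / (k - 1)                 # between-rater mean square
    mse = ss_error / ((n - 1) * (k - 1))    # residual mean square
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)

# Toy data: each feature measured on 3 patients from 2 delineations.
features = {
    "stable_feature": [[1.0, 1.0], [2.0, 2.0], [3.0, 3.0]],
    "unstable_feature": [[1.0, 3.0], [2.0, 1.0], [3.0, 2.0]],
}
retained = [name for name, table in features.items() if icc_2_1(table) >= 0.75]
```

In this toy case only the feature with perfect agreement between delineations passes the cutoff.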

Hematoxylin and eosin (H&E) histological slide scanning and feature analysis

Pathology slides were scanned at ×20 magnification using a digital pathology scanner (KF-PRO-120-HI) to obtain the original whole slide images (WSI). Subsequently, the original WSI underwent color space conversion, tissue segmentation, patch selection, and feature extraction. Specifically, the WSI at ×5 magnification was converted from RGB to Lab color space, and Otsu’s algorithm was then applied to calculate a threshold for segmenting the tissue from the WSI. The obtained tissue image was tiled into 1024 × 1024 patches at ×20 magnification, with adjacent patches covering the WSI. The Python package Yottixel was used to select the optimal patches for further analysis³⁸. Finally, CellProfiler (v4.2.5) software was used to extract features from each selected patch.
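As an illustration of the tissue segmentation step, the sketch below implements Otsu's method on 8-bit intensities by maximizing the between-class variance. It assumes the thresholded channel is a lightness-like channel in which tissue is darker than the slide background; the channel choice, function name, and toy pixel values are assumptions, since the text does not specify them.

```python
def otsu_threshold(pixels):
    """Return the 8-bit threshold t that maximizes between-class variance."""
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_t, best_var = 0, -1.0
    w_low, sum_low = 0, 0
    for t in range(256):
        w_low += hist[t]          # pixels with value <= t
        sum_low += t * hist[t]
        w_high = total - w_low    # pixels with value > t
        if w_low == 0 or w_high == 0:
            continue
        mean_low = sum_low / w_low
        mean_high = (sum_all - sum_low) / w_high
        between = w_low * w_high * (mean_low - mean_high) ** 2
        if between > best_var:
            best_t, best_var = t, between
    return best_t

# Toy example: dark "tissue" pixels and bright "background" pixels.
lightness = [30] * 50 + [40] * 50 + [220] * 100 + [230] * 100
t = otsu_threshold(lightness)
tissue_mask = [p <= t for p in lightness]  # assumption: tissue is the darker class
```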

Whole exome sequencing (WES) and analysis

Tumor tissue and adjacent brain tissue DNA were extracted from samples using the QIAamp Fast DNA Tissue Kit (Qiagen). Blood samples were collected in tubes containing EDTA and centrifuged at 1600 × g for 10 min at 4 °C within 2 h of collection. Peripheral blood lymphocyte (PBL) pellets were stored at −20 °C until further use, and PBL DNA was extracted using the RelaxGene Blood DNA System (Tiangen Biotech Co., Ltd., Beijing, China). DNA quantification was performed using the Qubit 3.0 Fluorometer and Qubit dsDNA HS Assay Kit (Thermo Fisher Scientific, Inc., Waltham, MA, USA). DNA collected from tissue and PBL samples were fragmented using dsDNA Fragmentase enzyme (New England BioLabs, Inc., Ipswich, MA, USA), followed by size selection of DNA fragments (150–250 bp) using Ampure XP beads (Beckman Coulter, Inc., Brea, CA, USA). The KAPA Library Preparation Kit (Kapa Biosystems, Inc., Wilmington, MA, USA) was employed for the construction of DNA fragment libraries. Cleanup steps were performed using Agencourt AMPure XP beads (Beckman Coulter, Inc., Brea, CA, USA). After DNA fragmentation, end repair and 3’ A-tailing were conducted, followed by exon capture using the Agilent SureSelect Human All Exon V6 kit. The Qubit 3.0 Fluorometer and Qubit dsDNA HS Assay Kit were utilized to assess the purity and concentration of DNA fragments. Fragment length was measured using the DNA 1000 kit (Agilent Technologies, Inc., Santa Clara, CA, USA) on a 4200 Bioanalyzer (Agilent Technologies, Inc., Santa Clara, CA, USA). DNA libraries with 150 bp end sequences were sequenced using the Illumina Novaseq 6000 system. Raw data were converted to FASTQ files, and adapter and low-quality reads were trimmed using Trimmomatic (v0.39). We achieved a median coverage depth of 112x for tumor specimens and 128x for non-tumor specimens.

GATK (v4.2) tools were used to identify single nucleotide variants (SNVs) and insertions or deletions (INDELs). Paired-end WES reads were mapped to the human reference genome (hg38) using BWA-mem (v0.7.17). BAM files were further processed by reordering, sorting, marking duplicates, and adding read groups using Picard (v2.24.2). Base quality score recalibration was performed using the BaseRecalibrator module in GATK, followed by the assessment of cross-sample contamination using the GetPileupSummaries and CalculateContamination modules. Somatic variants were detected by MuTect2 and annotated using ANNOVAR, with patient-matched normal DNA sequencing reads serving as reference. Candidate somatic variants were distinguished based on the following filtering criteria: ① Variants outside of exonic regions and splice sites were excluded; ② Variants with a variant allele fraction (VAF) ≥ 5% and at least 2 supporting variant reads in tumor samples were retained; ③ Variants with a minor allele frequency (MAF) ≥ 5% in at least one database, including 1000 Genomes, ESP6500, gnomAD, and ExAC, were removed. Normal samples were sequenced using the same scheme; each sample was reduced to 4% and then pooled as a reference. To obtain high-quality and reliable somatic variants, we employed stringent downstream filtering criteria: ① Variants outside of exonic regions and splice sites were excluded; ② Variants with a VAF ≥ 5%, at least 5 supporting variant reads in tumor samples, and a VAF in the tumor more than five times the VAF in the normal sample were retained; ③ Variants with more than 100 occurrences in COSMIC (v92) were retained; ④ Variants with a MAF ≥ 1% in at least one variant database (1000 Genomes, ESP6500, gnomAD, and ExAC) were removed; ⑤ Variants predicted as benign in at least two of the following tools: MutationAssessor, MutationTaster2, Polyphen2, and SIFT, were removed.
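The stringent downstream criteria can be read as a sequence of checks on each annotated variant. The sketch below is one possible reading: it treats criterion ③ (more than 100 COSMIC occurrences) as a rescue that bypasses the population-frequency and benign-prediction filters, which is an assumption because the text does not say how the criteria combine. The `Variant` fields and function name are hypothetical and do not correspond to ANNOVAR column names.

```python
from dataclasses import dataclass

@dataclass
class Variant:
    region: str               # e.g. "exonic", "splicing", "intronic"
    tumor_vaf: float          # variant allele fraction in the tumor
    tumor_alt_reads: int      # supporting variant reads in the tumor
    normal_vaf: float         # variant allele fraction in the normal sample
    cosmic_count: int         # occurrences in COSMIC
    max_population_af: float  # highest frequency across the four databases
    benign_calls: int         # how many of the four predictors say benign

def passes_stringent_filter(v):
    if v.region not in {"exonic", "splicing"}:
        return False                      # criterion 1
    if v.tumor_vaf < 0.05 or v.tumor_alt_reads < 5:
        return False                      # criterion 2 (VAF and read support)
    if v.tumor_vaf <= 5 * v.normal_vaf:
        return False                      # criterion 2 (tumor vs normal VAF)
    if v.cosmic_count > 100:
        return True                       # criterion 3, assumed to act as a rescue
    if v.max_population_af >= 0.01:
        return False                      # criterion 4
    if v.benign_calls >= 2:
        return False                      # criterion 5
    return True

hotspot = Variant("exonic", 0.30, 40, 0.0, 500, 0.02, 3)
common_snp = Variant("exonic", 0.45, 60, 0.0, 0, 0.10, 0)
germline_like = Variant("exonic", 0.40, 50, 0.20, 0, 0.0, 0)
splice_hit = Variant("splicing", 0.12, 9, 0.01, 3, 0.0001, 1)
kept = [v for v in (hotspot, common_snp, germline_like, splice_hit)
        if passes_stringent_filter(v)]
```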
Somatic CNVs were inferred by CNVkit (v0.9.9) based on BAM files generated during the somatic mutation detection process, using the default circular binary segmentation algorithm. Segment-level log2 ratios were calculated and transformed as input for the GISTIC2.0 software to identify significantly amplified or deleted chromosomal regions in the tumors. CNV amplifications and deletions were defined using a ± 0.3 log2 ratio threshold.

RNA sequencing (RNA-seq) and analysis

Total RNA was extracted from tissue samples using the TRIzol Reagent Kit (Ambion, Invitrogen, USA). RNA concentration and integrity were assessed using the Qubit RNA Assay Kit, Qubit 2.0 Fluorometer (Life Technologies), and Agilent 2100 Bioanalyzer (Agilent Technologies). Samples with an RNA integrity number greater than 5 were included in the study. Libraries were prepared from samples with high RNA integrity, no contaminants, and sufficient RNA quantity. Poly-T oligonucleotide magnetic beads were used to purify mRNA from total RNA. RNA was fragmented in NEBNext First Strand Synthesis Reaction Buffer (5X) using divalent cations at elevated temperatures. cDNA synthesis, end repair, A-tailing, and NEBNext Adaptor ligation were performed using the NEBNext Ultra RNA Library Prep Kit. Library fragments were purified with AMPure XP (Beckman Coulter, Beverly, USA), selecting cDNA fragments of 150–200 bp in length. Library quality was assessed using the Agilent Bioanalyzer 2100. Libraries were sequenced on the Illumina HiSeq X Ten platform, generating 150 bp paired-end reads. Sequencing data were filtered using Trimmomatic software to remove adaptors and low-quality sequences, followed by data quality assessment using FastQC. STAR (v2.7.6a) was used to align sequences to the reference genome (hg38). Gene expression values were calculated using RSEM (v1.3.3) based on the GENCODE (v35) gene annotation file. HTSeq (v0.6.0) was used to count the number of reads aligned to each gene, and gene expression levels were quantified as FPKM (Fragments Per Kilobase of exon model per Million mapped fragments) and TPM (Transcripts Per Million).
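To make the two expression units concrete, the following sketch computes FPKM and TPM from a vector of fragment counts and gene lengths using their standard definitions. It is a simplified illustration that uses annotated gene length rather than the effective length that tools such as RSEM estimate; the function names and toy numbers are assumptions.

```python
def fpkm(counts, lengths_bp):
    """Fragments per kilobase of exon model per million mapped fragments."""
    total = sum(counts)
    return [c * 1e9 / (length * total) for c, length in zip(counts, lengths_bp)]

def tpm(counts, lengths_bp):
    """Transcripts per million: length-normalized rates rescaled to sum to 1e6."""
    rates = [c / length for c, length in zip(counts, lengths_bp)]
    rate_sum = sum(rates)
    return [r / rate_sum * 1e6 for r in rates]

# Toy example: three genes whose counts scale with their lengths.
counts = [100, 200, 700]
lengths_bp = [1000, 2000, 7000]
fpkm_values = fpkm(counts, lengths_bp)
tpm_values = tpm(counts, lengths_bp)

# TPM can also be obtained by rescaling FPKM so that it sums to one million.
fpkm_sum = sum(fpkm_values)
tpm_from_fpkm = [f / fpkm_sum * 1e6 for f in fpkm_values]
```

Because TPM sums to the same total in every sample, it is often preferred for comparing relative abundance across samples.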

Mass spectrometry

Samples were removed from −80 °C storage, and an appropriate amount of tissue was weighed and placed into a liquid nitrogen pre-chilled mortar. Liquid nitrogen was added, and the tissue was thoroughly ground into a powder. Lysis buffer (1% Triton X-100, 1% protease inhibitor, 1% phosphatase inhibitor, 3 μM TSA, 50 mM NAM) was added to each sample at four times the volume of the powder, followed by ultrasonic lysis. Samples were centrifuged at 4 °C, 12,000 × g for 10 min to remove cell debris, and the supernatant was transferred to a centrifuge tube. Protein concentration was determined using a BCA assay kit.

Equal amounts of protein from each sample were digested with trypsin, and the volume was adjusted using lysis buffer. One volume of pre-chilled acetone was added, vortexed, and then four volumes of pre-chilled acetone were added, followed by precipitation at −20 °C for two hours. Samples were centrifuged at 4500 × g for 5 min, and the supernatant was discarded. The pellet was washed twice with pre-chilled acetone. After air-drying the pellet, it was resuspended in 200 mM TEAB, and trypsin was added at a 1:50 ratio (protease: protein, w/w) for overnight digestion. Dithiothreitol (DTT) was added to a final concentration of 5 mM, and samples were reduced at 56 °C for 30 min. Iodoacetamide (IAA) was added to a final concentration of 11 mM, and samples were incubated in the dark at room temperature for 15 min.

Samples were separated using an Agilent 300Extend C18 column (4.6 × 250 mm), with detection at 214 nm, a column oven temperature of 35 °C, and 95% buffer A for 30 min to equilibrate the column. After baseline stabilization, the fractionation gradient method was initiated, and peptide samples were loaded onto the high-performance liquid chromatography (HPLC) fractionation column. Samples were collected at 1-min intervals, with fractions 11 to 46 combined into 12 groups and vacuum-dried. Peptides were dissolved in mobile phase A and separated using an EASY-nLC 1200 ultra-HPLC system. Mobile phase A consisted of a 0.1% formic acid and 2% acetonitrile aqueous solution, while mobile phase B consisted of a 0.1% formic acid and 90% acetonitrile aqueous solution. The gradient was set as follows: 0–96 min, 6%–25% B; 96–114 min, 25%–35% B; 114–117 min, 35%–80% B; and 117–120 min, 80% B, with a flow rate maintained at 500 nL/min. Separated peptides were ionized in the NSI ion source and data were collected using the Orbitrap Exploris 480 mass spectrometer.

Liquid chromatography (LC) parameters were consistent with those used during library construction. Peptides were separated using the ultra-high-performance liquid chromatography system and analyzed using the Orbitrap Exploris 480 mass spectrometer. Precursor ions and their fragment ions were detected and analyzed using the high-resolution Orbitrap. FAIMS compensation voltage (CV) settings were −40 V, −55 V, and −70 V. The primary mass scan range was set at 350–1350 m/z with a resolution of 120,000; the secondary scan resolution was set at 30,000. The secondary data acquisition mode was set to DIA mode, which followed a primary scan with 20 m/z window peptide ions entering the HCD collision cell using 32% collision energy for fragmentation, and subsequent secondary mass analysis. The automatic gain control (AGC) for the secondary spectrum was set at 600%.

Collection and processing of GBM data from public databases

In this study, we aimed to collect as much glioblastoma multiforme (GBM) sequencing data as possible from public databases to validate conclusions and enrich research content. We gathered 16 GBM datasets from the Gene Expression Omnibus (GEO; https://www.ncbi.nlm.nih.gov/geo/) and the Chinese Glioma Genome Atlas (CGGA; http://www.cgga.org.cn/), including GSE72951 (GPL14951), GSE43289 (GPL570), GSE43378 (GPL570), GSE7696 (GPL570), GSE13041 (GPL570 and GPL96), GSE15824 (GPL570), GSE33331 (GPL570), GSE74187 (GPL6480), GSE83300 (GPL6480), GSE4271 (GPL96 and GPL97), GSE4412 (GPL96 and GPL97), CGGA-array (GPL4133), and CGGA-RNAseq (Illumina HiSeq). Transcriptomic expression profiles and clinical data of the samples were obtained from the series matrix files uploaded by the authors. We used the normalizeBetweenArrays function in the limma package to perform quantile normalization on the microarray data. Subsequently, we merged datasets from the same platform after removing batch effects with the ComBat function in the sva package, resulting in the GBM-GPL570 (n = 215), GBM-GPL6480 (n = 110), GBM-GPL96 (n = 326), and GBM-GPL97 (n = 135) cohorts. Finally, each dataset was standardized using the scale function. Raw transcriptome sequencing data of GBM patients with anti-PD-1 immunotherapy information were obtained from the Sequence Read Archive (SRA, https://www.ncbi.nlm.nih.gov/sra) database, as uploaded by Zhao et al. (SRA accession: PRJNA482620)³².
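The idea behind quantile normalization can be shown with a small Python sketch: each sample's values are replaced by the mean of the values that share the same rank across samples, so all samples end up with an identical distribution. This is a simplified illustration with naive tie handling, not a reimplementation of limma's normalizeBetweenArrays; the function name and toy matrix are assumptions.

```python
def quantile_normalize(matrix):
    """Quantile-normalize a genes x samples matrix (list of rows)."""
    n_genes, n_samples = len(matrix), len(matrix[0])
    columns = [[matrix[g][s] for g in range(n_genes)] for s in range(n_samples)]
    sorted_columns = [sorted(col) for col in columns]
    # Mean across samples of the values at each rank.
    rank_means = [sum(col[r] for col in sorted_columns) / n_samples
                  for r in range(n_genes)]
    out = [[0.0] * n_samples for _ in range(n_genes)]
    for s, col in enumerate(columns):
        order = sorted(range(n_genes), key=lambda g: col[g])
        for rank, g in enumerate(order):
            out[g][s] = rank_means[rank]  # ties broken by gene order (naive)
    return out, rank_means

# Toy example: 4 genes measured in 3 samples.
expression = [
    [5.0, 4.0, 3.0],
    [2.0, 1.0, 4.0],
    [3.0, 4.0, 6.0],
    [4.0, 2.0, 8.0],
]
normalized, rank_means = quantile_normalize(expression)
normalized_columns = [[row[s] for row in normalized] for s in range(3)]
```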

Multimodal fusion for unsupervised clustering

Integrating multimodal data can reveal causal features that may be obscured in single-modality analyses, and offers a holistic understanding of disease complexity by exploring the interplay across modalities and how these relationships drive differences in patient outcomes such as survival and drug response^24,39. Multimodal data fusion strategies can be classified into early, intermediate, and late fusion according to the stage at which the modalities are combined²⁴. Early fusion concatenates all modalities into a single matrix, which can lead to the “curse of dimensionality” and variable shift in subsequent analyses, and cannot correct for imbalance among modalities, which may adversely affect downstream analyses. Late fusion analyzes each omics layer separately and then integrates the results into a consistent output. However, this approach sacrifices the complementary, interactional information among modalities. Intermediate fusion usually integrates and clusters simultaneously to capture the dependencies between different omics layers, identify multimodal joint clusters, and infer patient stratification and molecular mechanisms^24,40. In general, intermediate fusion is more advanced, but it places greater demands on the integration algorithm²⁴. Here we performed intermediate fusion of multimodal data (FAHZZU1 cohort) using 11 algorithms based on different principles, followed by late fusion of the results obtained from the 11 algorithms to yield the final clustering results (Fig. 1A). Our process is as follows:

1. Data preprocessing: For mutation data, a binary matrix was generated with “0” representing wild-type and “1” indicating mutations. Copy number variation (CNV) data were analyzed as segment mean values, capturing chromosomal amplifications and deletions. RNA-seq data were preprocessed to log2(FPKM) values. Protein data were similarly normalized to log2 intensity values to ensure comparability. For pathology and radiology data, feature extraction resulted in quantitative metrics. These values represented diverse descriptors, including first-order statistics, texture, and shape-based features for MRI and cellular-level attributes for pathology.
2. Selection of clustering variables: To determine the optimal number of clustering variables, we first calculated the median absolute deviation (MAD) of variables in each modality layer including radiology, WSI, transcriptomics, and proteomics. We then selected the top-n variables from each layer, and combined them into 2640 variable combinations (Supplementary Data 3). The cluster prediction index (CPI)²⁵ and GAP statistic²⁶ were calculated for each combination, and the optimal number of clusters and input features for the final clustering were determined based on the combination with the highest sum of CPI and GAP. The CPI calculation was implemented using the IntNMF R package. The GAP statistic was calculated using the mogsa R package.
3. Intermediate fusion of multimodal data was performed using 11 algorithms based on different principles, including CIMLR, CPCA, iClusterBayes, IntNMF, LRAcluster, MCIA, NEMO, PINSPlus, RGCCA, SGCCA, and SNF²⁹ (Supplementary Data 2).
4. The clustering results were converted into a binary matrix⁴¹. Each cluster was represented by a separate column, and each sample was assigned a value of 1 in the column corresponding to its cluster and 0 in all others. The Jaccard index was calculated using the binary results from the 11 algorithms to evaluate the similarity between samples.
5. Cluster-of-clusters analysis (COCA)²⁴ was used to obtain consensus results from the 11 algorithms based on the Jaccard distance matrix. Specifically, 70% of the samples were randomly selected in each iteration, and this process was repeated 10,000 times to generate a consensus matrix.
6. The proportion of ambiguous clustering (PAC)²⁷ and the Calinski and Harabasz index (CHI)²⁸ were used to evaluate the fitness of clustering numbers.
7. Silhouette coefficients⁴² were calculated for each cluster, and samples with a silhouette coefficient below 0.4 were removed to obtain a core sample set.
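Steps 4 and 5 of the process above can be sketched in Python: each algorithm's cluster labels are one-hot encoded, the encodings are concatenated per sample, and the Jaccard index between two samples measures how often the algorithms place them in the same cluster. The function names and the three toy labelings are hypothetical; the real analysis used 11 algorithms and a resampling-based consensus step that is not shown here.

```python
def one_hot_memberships(labelings):
    """Concatenate one-hot cluster indicators from every algorithm, per sample."""
    n_samples = len(labelings[0])
    rows = [[] for _ in range(n_samples)]
    for labels in labelings:
        clusters = sorted(set(labels))
        for i, label in enumerate(labels):
            rows[i].extend(1 if label == c else 0 for c in clusters)
    return rows

def jaccard(a, b):
    """Jaccard index between two binary vectors."""
    intersection = sum(1 for x, y in zip(a, b) if x and y)
    union = sum(1 for x, y in zip(a, b) if x or y)
    return intersection / union if union else 0.0

# Toy example: 3 algorithms clustering 4 samples (label names are arbitrary).
labelings = [
    [1, 1, 2, 2],
    [2, 2, 1, 1],
    [1, 2, 2, 2],
]
membership = one_hot_memberships(labelings)
n = len(membership)
similarity = [[jaccard(membership[i], membership[j]) for j in range(n)]
              for i in range(n)]
distance = [[1.0 - s for s in row] for row in similarity]
```

Samples 3 and 4 are grouped together by all three toy algorithms, so their Jaccard distance is zero, whereas samples 1 and 3 never share a cluster and sit at distance one.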

Functional enrichment analysis

This study integrated C2-CP, C5-GO, and Hallmark gene sets from the MSigDB database (http://www.gsea-msigdb.org/gsea/msigdb/). Three methods were used for functional enrichment analysis of transcriptomic or proteomic data from different subtypes, including single-sample gene-set testing (ssGST)¹¹, over-representation analysis (ORA)⁴³, and gene-set enrichment analysis (GSEA)⁴⁴. ssGST analysis was performed using the yaGST package to obtain normalized enrichment scores (NES) for each pathway in each sample. Then, differential analysis was conducted on each pathway for different subtypes, and pathways with FDR < 0.001 and NES difference >1 were considered significantly enriched. ORA analysis was conducted using default parameters in the Metascape tool. GSEA analysis was performed using the clusterProfiler package, and pathways with FDR < 0.001 were considered significantly enriched.

Genomic alteration analysis

Using the maftools package, mutation data were processed and single nucleotide variations (SNVs), insertions/deletions (INDELs), and tumor mutation burden (TMB) were calculated for each sample. Genes with mutation frequencies greater than 5% across all samples were retained. Broad and focal CNV burden were defined as the sum of CNVs occurring in chromosome arms and focal segments, respectively. Due to the limited number of proteins detected by MS in this study, mRNA expression profiles were associated with CNVs to identify functional CNV genes. Pearson correlation was calculated between the mRNA expression and CNV variation scores for each gene, and genes with FDR < 0.05 and Pearson coefficient > 0.3 were retained (n = 3888). Subsequently, CNV amplification and deletion were defined using a critical value of ±0.3, and genes with alteration rates > 5% were retained. If a certain CNV variation frequency of a gene was relatively high, it was considered as the dominant variation for that gene. Genes with dominant variations more than two times greater than non-dominant variations were retained (n = 2168). Fisher’s exact test was used to test the CNV differences between the three subtypes for each gene, and genes with FDR < 0.05 were retained (n = 1023).
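The gene-level CNV filtering described above can be illustrated with a short sketch. It assumes that values strictly beyond ±0.3 count as amplification or deletion, that the 5% alteration-rate cutoff applies to the dominant alteration, and that "more than two times" means the dominant rate must exceed twice the non-dominant rate; these readings, the function names, and the toy data are assumptions.

```python
def call_cnv(log2_ratio, cutoff=0.3):
    """Classify one sample's copy number value for a gene."""
    if log2_ratio > cutoff:
        return "amp"
    if log2_ratio < -cutoff:
        return "del"
    return "neutral"

def dominant_alteration(log2_ratios, min_rate=0.05, min_fold=2.0):
    """Return "amp" or "del" if the gene passes the filters, else None."""
    calls = [call_cnv(x) for x in log2_ratios]
    amp_rate = calls.count("amp") / len(calls)
    del_rate = calls.count("del") / len(calls)
    if amp_rate >= del_rate:
        dominant, rate, other = "amp", amp_rate, del_rate
    else:
        dominant, rate, other = "del", del_rate, amp_rate
    if rate <= min_rate:
        return None  # altered in too few samples
    if rate <= min_fold * other:
        return None  # no clearly dominant direction
    return dominant

# Toy genes across 10 samples.
gene_a = [0.5] * 6 + [-0.5] * 1 + [0.0] * 3   # mostly amplified
gene_b = [0.5] * 3 + [-0.5] * 2 + [0.0] * 5   # mixed, no dominant direction
gene_c = [0.0] * 10                           # unaltered
gene_d = [-0.6] * 4 + [0.0] * 6               # mostly deleted
results = [dominant_alteration(g) for g in (gene_a, gene_b, gene_c, gene_d)]
```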

Transcriptome and proteome expression profiling analysis

Differential gene expression between the three subtypes was compared using Kruskal-Wallis test, and genes with an FDR < 0.05 were retained. Subsequently, Wilcoxon rank sum test was used to further compare the expression differences between two groups, and genes with an FDR < 0.05 were considered as subtype-specific genes.
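Both tests above are followed by an FDR cutoff. The text does not name the FDR procedure, so the sketch below assumes the widely used Benjamini-Hochberg adjustment and shows how adjusted p-values would be derived from a list of per-gene p-values; the function name and toy p-values are illustrative.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Toy example: four genes tested across the three subtypes.
pvalues = [0.01, 0.04, 0.03, 0.20]
fdr = benjamini_hochberg(pvalues)
significant = [i for i, q in enumerate(fdr) if q < 0.05]
```

In the two-stage design described above, only genes passing the omnibus test would proceed to the pairwise comparisons, where the same adjustment would be applied again.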

Development of the MOFS ensemble classifier

Given the abundance of high-quality GBM transcriptome data in public databases, we developed an integrated classification framework based on transcriptome expression profiles to identify MOFS subtypes in external cohorts. The development process was based on the FAHZZU1 cohort as a training set, as illustrated in Fig. S7:

1. Logistic regression and receiver operating characteristic (ROC) analysis were performed on all genes for each subtype. Genes with an FDR < 0.05 and an area under the ROC curve (AUC) > 0.7 were retained for each subtype.
2. The Lasso algorithm was used for feature selection and dimension reduction, and genes with non-zero Lasso coefficients were used as input variables for modeling.
3. The classifier was developed using an ensemble of 17 algorithms, including GST, adaptive boosting (AdaBoost), decision trees (DT), elastic net (Enet), gradient-boosted decision trees (GBDT), k-nearest neighbors (KNN), Lasso, linear discriminant analysis (LDA), naive Bayes (NBayes), neural network (NNet), principal component analysis (PCA), random forest (RF), ridge regression, stepwise logistic regression (StepLR), singular value decomposition (SVD), support vector machine (SVM), and XGBoost. The output of each algorithm was the probability of the three subtypes, and the sum of the probabilities for the three subtypes equaled 1. Algorithms that showed median survival trends inconsistent with the training set were considered ineligible.
4. For each subtype, the average discriminant probability of all qualified algorithms was used as the final subtype probability, and the subtype with the highest probability was the final classification result (Fig. S7A).
5. The classifier was initially trained on all samples with the true MOFS labels. Within the FAHZZU1 cohort, we performed a 60:40 random split to generate two independent subsets: FAHZZU1-test1 and FAHZZU1-test2. The resulting confusion matrices for both subsets consistently demonstrated high classification accuracy across the three MOFS subtypes, with a robust alignment between predicted and actual subtype labels (Fig. S7B-C). Subsequently, we assessed the model’s robustness under two distinct normalization schemes, TPM (Transcripts Per Million) and FPKM, across multiple datasets. The confusion matrices for both the FAHZZU1 (Fig. S7D) and CGGA-RNAseq (Fig. S7E) cohorts further affirmed that the classifier exhibits comparable accuracy. To ensure the model’s fairness and reliability, we then performed stratified analyses, factoring in a range of clinical and treatment-related variables. The results from these analyses consistently revealed balanced performance across all demographic and treatment subgroups. Taken together, these results demonstrate the stability of our ensemble MOFS classifier.
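The probability averaging in steps 3 and 4 can be sketched as follows. The per-algorithm probabilities and the eligibility list are made-up toy values, and the function name is hypothetical; the sketch only shows how averaging over qualified algorithms and taking the most probable subtype could work, not the trained models themselves.

```python
SUBTYPES = ("MOFS1", "MOFS2", "MOFS3")

def ensemble_predict(prob_by_algorithm, eligible):
    """Average subtype probabilities over eligible algorithms and pick the top one."""
    used = [prob_by_algorithm[name] for name in eligible]
    mean_probs = {s: sum(p[s] for p in used) / len(used) for s in SUBTYPES}
    label = max(mean_probs, key=mean_probs.get)
    return label, mean_probs

# Toy probabilities for one patient; each algorithm's output sums to 1.
prob_by_algorithm = {
    "RF": {"MOFS1": 0.6, "MOFS2": 0.3, "MOFS3": 0.1},
    "SVM": {"MOFS1": 0.2, "MOFS2": 0.5, "MOFS3": 0.3},
    "KNN": {"MOFS1": 0.5, "MOFS2": 0.4, "MOFS3": 0.1},
    "NBayes": {"MOFS1": 0.0, "MOFS2": 1.0, "MOFS3": 0.0},
}
# Assumption for illustration: NBayes was judged ineligible and is excluded.
eligible = ["RF", "SVM", "KNN"]
label, mean_probs = ensemble_predict(prob_by_algorithm, eligible)
```

Because every algorithm outputs probabilities that sum to 1, the averaged probabilities also sum to 1.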

Tumor microenvironment (TME) analysis

The ESTIMATE package⁴⁵ was utilized to evaluate the abundance of immune components, stromal components, and tumor purity in the GBM expression data. The single-sample Gene Set Enrichment Analysis (ssGSEA) algorithm⁴⁶ was used to calculate the infiltration abundance of immune and stromal cells based on markers from previous single-cell studies of GBM^9,47,48. The ssGSEA-based enrichment scores are relative indicators of gene signature activity across samples, rather than reflecting absolute cell counts or histological observations. The immunoregulatory gene set was downloaded from the TISIDB database (http://cis.hku.hk/TISIDB/)⁴⁹, which includes five categories: antigen presentation, immune co-stimulation, immune checkpoint, chemokines, and receptors. Differences in TME components, cell abundances, and immunoregulatory factors were compared among different subtypes.

To better characterize the cancer-immunity cycle (CIC), we constructed an Immunogram⁵⁰ consisting of eight aspects: tumor antigenicity, T cell chemotaxis and infiltration, T cell immunity, tumor cell recognition, T cell priming and activation, immune stimulatory factors, immune inhibitory molecules, and cytotoxicity. Tumor antigenicity was represented by log2(TMB). Cytotoxicity was calculated based on the formula proposed by Rooney et al.⁵¹, and the other pathways were calculated using published gene sets for the immune cycle and the GSVA package. When plotting the Immunogram, the score of each of the eight immune pathways for each patient was transformed into a Z score. If M represents the mean value of the score and SD represents the standard deviation of the score, the final score for each patient was calculated as 3 + 1.5 × (Score-M)/SD⁵⁰.
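The Immunogram rescaling above can be sketched in a few lines of Python. The function name `immunogram_scores` and the input values are hypothetical, and the use of the sample standard deviation is an assumption, since the text does not state which estimator was used.

```python
from statistics import mean, stdev


def immunogram_scores(raw_scores):
    """Rescale one pathway's raw scores across patients to 3 + 1.5 * Z."""
    m = mean(raw_scores)
    sd = stdev(raw_scores)  # sample SD (n - 1 denominator); an assumption
    if sd == 0:
        raise ValueError("scores are constant; the Z score is undefined")
    return [3 + 1.5 * (score - m) / sd for score in raw_scores]


# Hypothetical raw scores of one immune pathway for three patients.
scaled = immunogram_scores([1.0, 2.0, 3.0])
```

With this scaling, a patient at the cohort mean receives a score of 3, and each standard deviation above or below the mean shifts the score by 1.5.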

Tissue microarray

Two glioma tissue microarrays (NGL1001) were purchased from SUPERBIOTEK (Shanghai Superbiotek Pharmaceutical Technology, Shanghai, China). The clinical data of the patients were obtained from the company’s official website.

Immunohistochemical staining

Immunohistochemistry (IHC) experiments were performed using anti-STRAP (Cat No. 18277-1-AP, Proteintech; 1:200) and anti-S100A4 (Cat No. 16105-1-AP, Proteintech; 1:200) antibodies. The percentage of positively stained cells was scored as 1 (1–25%), 2 (26–50%), 3 (51–75%), or 4 (76–100%), while staining intensity was scored as 0 (no signal), 1 (light yellow), 2 (brown), or 3 (dark brown). The final IHC score was calculated by multiplying the percentage score by the nuclear staining intensity score.
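As a worked illustration of the scoring scheme, the following Python sketch reads the final IHC score as the product of the percentage score (1 to 4) and the intensity score (0 to 3). The function names are hypothetical, and because the text does not define a percentage score below 1% positivity, the sketch rejects such inputs rather than guessing one.

```python
def percentage_score(percent_positive):
    """Map the percentage of positively stained cells to a score of 1-4."""
    if not 1 <= percent_positive <= 100:
        # Values below 1% are not covered by the scheme described in the text.
        raise ValueError("percentage must lie between 1 and 100")
    if percent_positive <= 25:
        return 1
    if percent_positive <= 50:
        return 2
    if percent_positive <= 75:
        return 3
    return 4


def ihc_score(percent_positive, intensity):
    """Final IHC score: percentage score times intensity score (0-3)."""
    if intensity not in (0, 1, 2, 3):
        raise ValueError("intensity must be 0, 1, 2, or 3")
    return percentage_score(percent_positive) * intensity


# Example: 60% positive cells with intensity 2 gives 3 * 2 = 6.
example = ihc_score(60, 2)
```

Under this reading, the final score ranges from 0 to 12.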

A deep neural network model based on MRI features to predict MOFS subtypes

In clinical practice, radiological images are more convenient, less costly, and less invasive to acquire than molecular omics data. To promote the clinical translation of our work, we trained a neural network with the resilient backpropagation algorithm to predict MOFS subtypes from MRI features. The process was as follows:

1.
Feature selection: For each MOFS subtype, MRI features with a univariate logistic regression P value less than 0.01 were retained. Next, a bootstrapping procedure randomly drew 70% of all samples for logistic regression, and this resampling was repeated 1000 times. Features that remained significant (P < 0.05) in more than 95% of the resamples were retained. Finally, the Lasso algorithm was used for further dimensionality reduction and model simplification, and the variables with non-zero Lasso coefficients were used as the inputs for modeling.
2.
Hyperparameter optimization: We divided the FAHZZU1 cohort into training and testing sets in a 7:3 ratio. We used the neuralnet package to construct the neural network model, with parameters including learning rate, loss function, activation function, number of hidden layers, and number of nodes in each layer. We performed hyperparameter optimization via grid search, selecting the parameter combination with the highest accuracy on the testing set as the final model.
3.
Model validation: Validation was performed using the confusion matrix and ROC analysis on the training set, testing set, FAHZZU2 validation set, and FAHZZU3 validation set.
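The resampling filter in the feature selection step can be sketched in Python under several stated assumptions. The per-feature test here is a Welch-type z test on group means, used only as a standard-library stand-in for the univariate logistic regression in the text. The 70% draw is taken without replacement, and the function names and toy data are hypothetical.

```python
import random
from statistics import NormalDist, mean, variance


def welch_z_p_value(rows, feature):
    """Two-sided p-value for a difference in means between labels 0 and 1.

    Stand-in for the univariate logistic regression (an assumption).
    """
    g0 = [r[feature] for r in rows if r["label"] == 0]
    g1 = [r[feature] for r in rows if r["label"] == 1]
    if len(g0) < 2 or len(g1) < 2:
        return 1.0  # too few samples in one group to run the test
    se = (variance(g0) / len(g0) + variance(g1) / len(g1)) ** 0.5
    diff = mean(g1) - mean(g0)
    if se == 0:
        return 0.0 if diff != 0 else 1.0
    return 2 * (1 - NormalDist().cdf(abs(diff) / se))


def stable_features(rows, features, p_value_fn, n_resamples=1000,
                    fraction=0.7, alpha=0.05, min_rate=0.95, seed=0):
    """Keep features significant in more than min_rate of the resamples."""
    rng = random.Random(seed)
    k = max(1, round(fraction * len(rows)))
    hits = dict.fromkeys(features, 0)
    for _ in range(n_resamples):
        subset = rng.sample(rows, k)  # draw without replacement (assumption)
        for f in features:
            if p_value_fn(subset, f) < alpha:
                hits[f] += 1
    return [f for f in features if hits[f] / n_resamples > min_rate]


# Toy data: "signal" separates the two labels, "noise" does not.
rows = [
    {"label": i % 2,
     "signal": 10.0 * (i % 2) + 0.1 * (i % 3),
     "noise": float((i // 2) % 5)}
    for i in range(20)
]
selected = stable_features(rows, ["signal", "noise"], welch_z_p_value,
                           n_resamples=200)
```

Passing the test as a function keeps the resampling logic independent of the statistical model, so a logistic regression P value could be substituted without changing `stable_features`.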

Statistical analysis

All data processing, statistical analyses, and graphing were performed using R software (version 4.2.2). The Wilcoxon rank-sum test or t test was used to compare continuous variables between two groups, while the Kruskal-Wallis test or analysis of variance was used to compare continuous variables among three groups. The Fisher exact test was used for categorical variables. The correlation between two continuous variables was evaluated using the Spearman or Pearson correlation coefficient. Survival analysis and Kaplan-Meier curve plotting were performed using the survival and survminer packages. The Benjamini-Hochberg method was used to adjust P values for multiple comparisons and control the false discovery rate (FDR). All statistical tests were two-sided, with P < 0.05 indicating statistical significance.
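For readers unfamiliar with the Benjamini-Hochberg procedure, the adjustment can be written out in a short Python sketch. The analyses in this study were run in R, so this is an illustration of the standard step-up procedure rather than the code that was used. The function name and the example P values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    m = len(p_values)
    # Indices of the P values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw P values from four comparisons.
adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
```

Each raw P value is multiplied by the number of tests divided by its rank, and the running minimum ensures that adjusted values never decrease as the raw P values increase.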

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.

Data availability

The raw WES and RNA-seq data generated in this study have been deposited in the Genome Sequence Archive (GSA) database under accession code HRA006184. The raw MS-based proteomics data supporting the findings of this study have been deposited in the iProX database under accession code PXD062023. Raw transcriptome sequencing data of GBM patients with anti-PD-1 immunotherapy information were obtained from the Sequence Read Archive (SRA, https://www.ncbi.nlm.nih.gov/sra) database, as uploaded by Zhao et al. (SRA accession: PRJNA482620). The raw radiomics data and pathomics data are protected and are not available due to data privacy laws. Processed omics data are available at https://doi.org/10.5281/zenodo.14898297. The data utilized in this study are under controlled access and are available only to qualified researchers upon reasonable request. Restrictions: Access is limited to researchers affiliated with recognized institutions and requires approval from the relevant ethics committee and data management committee. Access Procedure: Interested researchers must submit a formal request outlining their research objectives and provide approval letters from their institution’s ethics and data management committees. The review and approval process typically takes approximately one month. Contact Information: For inquiries regarding data access, please contact Dr. Zhenyu Zhang, Department of Neurosurgery, The First Affiliated Hospital of Zhengzhou University, via email at fcczhangzy1@zzu.edu.cn.

Code availability

The MOFSR package is available on GitHub (https://github.com/Zaoqu-Liu/MOFS).

References

1. Ostrom, Q. T. et al. CBTRUS statistical report: primary brain and other central nervous system tumors diagnosed in the United States in 2015–2019. Neuro Oncol. 24, v1–v95 (2022).
2. Yan, H. et al. IDH1 and IDH2 mutations in gliomas. N. Engl. J. Med. 360, 765–773 (2009).
3. Mellinghoff, I. K. et al. Vorasidenib in IDH1- or IDH2-mutant low-grade glioma. N. Engl. J. Med. 389, 589–601 (2023).
4. Louis, D. N. et al. The 2021 WHO classification of tumors of the central nervous system: A summary. Neuro Oncol. 23, 1231–1251 (2021).
5. Delgado-Martin, B. & Medina, M. A. Advances in the knowledge of the molecular biology of glioblastoma and its impact in patient diagnosis, stratification, and treatment. Adv. Sci. (Weinh.) 7, 1902971 (2020).
6. Osuka, S. & Van Meir, E. G. Overcoming therapeutic resistance in glioblastoma: The way forward. J. Clin. Invest 127, 415–426 (2017).
7. White, K. et al. New hints towards a precision medicine strategy for IDH wild-type glioblastoma. Ann. Oncol. 31, 1679–1692 (2020).
8. Phillips, H. S. et al. Molecular subclasses of high-grade glioma predict prognosis, delineate a pattern of disease progression, and resemble stages in neurogenesis. Cancer Cell 9, 157–173 (2006).
9. Wang, L. B. et al. Proteogenomic and metabolomic characterization of human glioblastoma. Cancer Cell 39, 509–528.e520 (2021).
10. Yanovich-Arad, G. et al. Proteogenomics of glioblastoma associates molecular patterns with survival. Cell Rep. 34, 108787 (2021).
11. Garofano, L. et al. Pathway-based classification of glioblastoma uncovers a mitochondrial subtype with therapeutic vulnerabilities. Nat. Cancer 2, 141–156 (2021).
12. Wang, Q. et al. Tumor evolution of glioma-intrinsic gene expression subtypes associates with immunological changes in the microenvironment. Cancer Cell 33, 152 (2018).
13. Verhaak, R. G. et al. Integrated genomic analysis identifies clinically relevant subtypes of glioblastoma characterized by abnormalities in PDGFRA, IDH1, EGFR, and NF1. Cancer Cell 17, 98–110 (2010).
14. Ceccarelli, M. et al. Molecular profiling reveals biologically discrete subsets and pathways of progression in diffuse glioma. Cell 164, 550–563 (2016).
15. Cooper, L. A. et al. The tumor microenvironment strongly impacts master transcriptional regulators and gene expression class of glioblastoma. Am. J. Pathol. 180, 2108–2119 (2012).
16. Wu, M. et al. Multi-omics and pharmacological characterization of patient-derived glioma cell lines. Nat. Commun. 15, 6740 (2024).
17. White, K. et al. Identification, validation and biological characterisation of novel glioblastoma tumour microenvironment subtypes: implications for precision immunotherapy. Ann. Oncol. 34, 300–314 (2023).
18. Patel, A. P. et al. Single-cell RNA-seq highlights intratumoral heterogeneity in primary glioblastoma. Science 344, 1396–1401 (2014).
19. Neftel, C. et al. An integrative model of cellular states, plasticity, and genetics for glioblastoma. Cell 178, 835–849.e821 (2019).
20. Weller, M. et al. Glioma. Nat. Rev. Dis. Prim. 1, 15017 (2015).
21. Bi, W. L. et al. Artificial intelligence in cancer imaging: Clinical challenges and applications. CA Cancer J. Clin. 69, 127–157 (2019).
22. Aftab, K. et al. Radiomics for precision medicine in glioblastoma. J. Neurooncol 156, 217–231 (2022).
23. Schmauch, B. et al. A deep learning model to predict RNA-Seq expression of tumours from whole slide images. Nat. Commun. 11, 3877 (2020).
24. Lipkova, J. et al. Artificial intelligence for multimodal data integration in oncology. Cancer Cell 40, 1095–1110 (2022).
25. Chalise, P. & Fridley, B. L. Integrative clustering of multi-level ‘omic data based on non-negative matrix factorization algorithm. PLoS One 12, e0176278 (2017).
26. Meng, C. et al. MOGSA: Integrative single sample gene-set analysis of multiple omics data. Mol. Cell Proteom. 18, S153–S168 (2019).
27. Senbabaoglu, Y., Michailidis, G. & Li, J. Z. Critical limitations of consensus clustering in class discovery. Sci. Rep. 4, 6207 (2014).
28. Caliński, T. & Harabasz, J. A dendrite method for cluster analysis. Commun. Stat. 3, 1–27 (1974).
29. Pierre-Jean, M., Deleuze, J. F., Le Floch, E. & Mauger, F. Clustering and variable selection evaluation of 13 unsupervised methods for multi-omics data integration. Brief. Bioinform 21, 2011–2030 (2020).
30. Drews, R. M. et al. A pan-cancer compendium of chromosomal instability. Nature 606, 976–983 (2022).
31. Chen, D. S. & Mellman, I. Oncology meets immunology: the cancer-immunity cycle. Immunity 39, 1–10 (2013).
32. Zhao, J. et al. Immune and genomic correlates of response to anti-PD-1 immunotherapy in glioblastoma. Nat. Med 25, 462–469 (2019).
33. Louis, D. N. et al. The 2016 World Health Organization Classification of Tumors of the Central Nervous System: a summary. Acta Neuropathol. 131, 803–820 (2016).
34. McKinnon, C., Nandhabalan, M., Murray, S. A. & Plaha, P. Glioblastoma: Clinical presentation, diagnosis, and management. BMJ 374, n1560 (2021).
35. Nakasu, S., Deguchi, S. & Nakasu, Y. IDH wild-type lower-grade gliomas with glioblastoma molecular features: A systematic review and meta-analysis. Brain Tumor Pathol. 40, 143–157 (2023).
36. Hamdi, Y. et al. Family specific genetic predisposition to breast cancer: Results from Tunisian whole exome sequenced breast cancer cases. J. Transl. Med. 16, 158 (2018).
37. Al-Dalahmah, O. et al. Re-convolving the compositional landscape of primary and recurrent glioblastoma reveals prognostic and targetable tissue states. Nat. Commun. 14, 2586 (2023).
38. Kalra, S. et al. Yottixel - An image search engine for large archives of histopathology whole slide images. Med Image Anal. 65, 101757 (2020).
39. Cantini, L. et al. Benchmarking joint multi-omics dimensionality reduction approaches for the study of cancer. Nat. Commun. 12, 124 (2021).
40. Lobato-Delgado, B., Priego-Torres, B. & Sanchez-Morillo, D. Combining molecular, imaging, and clinical data analysis for predicting cancer prognosis. Cancers (Basel) 14, 3215 (2022).
41. Nassiri, F. et al. A clinically applicable integrative molecular classification of meningiomas. Nature 597, 119–125 (2021).
42. Rousseeuw, P. J. Silhouettes: A graphical aid to the interpretation and validation of cluster analysis. J. Comput. Appl. Math. 20, 53–65 (1987).
43. Zhou, Y. et al. Metascape provides a biologist-oriented resource for the analysis of systems-level datasets. Nat. Commun. 10, 1523 (2019).
44. Wu, T. et al. clusterProfiler 4.0: A universal enrichment tool for interpreting omics data. Innov. (Camb.) 2, 100141 (2021).
45. Yoshihara, K. et al. Inferring tumour purity and stromal and immune cell admixture from expression data. Nat. Commun. 4, 2612 (2013).
46. Barbie, D. A. et al. Systematic RNA interference reveals that oncogenic KRAS-driven cancers require TBK1. Nature 462, 108–112 (2009).
47. Abdelfattah, N. et al. Single-cell analysis of human glioma and immune cells identifies S100A4 as an immunotherapy target. Nat. Commun. 13, 767 (2022).
48. Becht, E. et al. Estimating the population abundance of tissue-infiltrating immune and stromal cell populations using gene expression. Genome Biol. 17, 218 (2016).
49. Ru, B. et al. TISIDB: an integrated repository portal for tumor-immune system interactions. Bioinformatics 35, 4200–4202 (2019).
50. Karasaki, T. et al. An immunogram for the cancer-immunity cycle: Towards personalized immunotherapy of lung cancer. J. Thorac. Oncol. 12, 791–803 (2017).
51. Rooney, M. S., Shukla, S. A., Wu, C. J., Getz, G. & Hacohen, N. Molecular and genetic properties of tumors associated with local immune cytolytic activity. Cell 160, 48–61 (2015).

Acknowledgements

This study was supported by the National Natural Science Foundation of China (grant numbers: 82273493, 82173090, 62201557, U20A20171), the Natural Science Foundation of Henan Province for Excellent Young Scholars (grant number: 232300421057), the Science and Technology Research and Development Joint Fund of Henan Province (242301420014), the Henan Provincial Science and Technology Research Project (221100310100), the Science and Technology Program of Henan Province (grant number: 242102311107), the Young and Middle-aged Discipline Leaders Project in Health of Henan Province (to Zhenyu Zhang), the Support Plan for Scientific and Technological Innovation Talents in Colleges and Universities of Henan Province (to Zhenyu Zhang), the Shenzhen Science and Technology Program (JCYJ20241202125014018), and the Shenzhen Medical Research Fund (A2303008). The funders played no role in study design, data collection, analysis and interpretation of data, or the writing of this manuscript.

Author information

These authors contributed equally: Zaoqu Liu, Yushuai Wu, Hui Xu.

Authors and Affiliations

Department of Interventional Radiology, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, Henan, 450052, China
Zaoqu Liu, Hui Xu, Siyuan Weng & Xinwei Han
Institute of Basic Medical Sciences, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
Zaoqu Liu
Interventional Institute of Zhengzhou University, Zhengzhou, Henan, 450052, China
Zaoqu Liu & Xinwei Han
Interventional Treatment and Clinical Research Center of Henan Province, Zhengzhou, Henan, 450052, China
Zaoqu Liu & Xinwei Han
Shanghai Academy of Artificial Intelligence for Science, Shanghai, China
Yushuai Wu
Department of Neurosurgery, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, Henan, China
Minkai Wang, Dongling Pei, Zilong Wang, Zeyu Ma, Wenchao Duan, Yuning Qiu, Dingyuan Su, Sen Li, Haoran Liu, Wenyuan Li, Caoyuan Ma, Miaomiao Yu, Yinhui Yu, Te Chen, Jing Fu, YingWei Zhen, Bin Yu, Yuchen Ji, Xianzhi Liu, Dongming Yan & Zhenyu Zhang
Center of Reproductive Medicine, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, Henan, China
Shuang Chen
Department of Pathology, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, Henan, China
WeiWei Wang & Li Cui
Department of MRI, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, Henan, China
Jing Yan
Institute of Biomedical and Health Engineering, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China
Jingxian Duan, Yuanshen Zhao, Hairong Zheng, Dong Liang & Zhi-Cheng Li
School of Medicine, Hangzhou City University, Hangzhou, Zhejiang, China
Ran Li
Key Laboratory of Biomedical Imaging Science and System, Chinese Academy of Sciences, State Key Laboratory of Biomedical Imaging Science and System, Shenzhen, China
Hairong Zheng, Dong Liang & Zhi-Cheng Li
Center for Single-Cell Omics and Tumor Liquid Biopsy, Zhongnan Hospital of Wuhan University, Wuhan, China
Fubing Wang
Department of Laboratory Medicine, Zhongnan Hospital of Wuhan University, Wuhan, China
Fubing Wang
Wuhan Research Center for Infectious Diseases and Cancer, Chinese Academy of Medical Sciences, Wuhan, China
Fubing Wang

Contributions

Z.Y.Z., Z.C.L., F.B.W. and X.W.H. conceived the research. Z.Q.L. contributed to study design, data analysis and paper writing. Y.S.W. contributed to study design and paper revision. H.X. contributed to the collection and processing of public data. S.Y.W. and S.C. contributed to the drawing of flow charts and model diagrams. M.K.W., D.L.P., W.W.W., J.Y., L.C., J.X.D., Y.S.Z., Z.L.W., Y.N.Q., W.C.D., Z.Y.M., J.Y., D.Y.S., S.L., H.R.L., W.Y.L., C.Y.M., M.M.Y., Y.H.Y., T.C. and J.F. performed the data acquisition. M.K.W., Z.L.W., Y.N.Q., W.C.D., Z.Y.M., J.Y., D.Y.S., S.L., H.R.L., W.Y.L., C.Y.M., M.M.Y., Y.H.Y., H.R.Z., D.L., T.C. and J.F. performed the data processing. Z.Y.Z., Z.C.L., F.B.W., X.W.H., D.M.Y., X.Z.L., Y.C.J., B.Y. and Y.W.Z. performed the project administration. All authors have read and approved the final version of the manuscript.

Corresponding authors

Correspondence to Xinwei Han, Fubing Wang, Zhi-Cheng Li or Zhenyu Zhang.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature Communications thanks Guillermo Gomez, who co-reviewed with Amin Zadeh Shirazi, Osama Al-Dalahmah and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. A peer review file is available.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Information

Description of Additional Supplementary Files

Supplementary Data 1

Supplementary Data 2

Supplementary Data 3

Supplementary Data 4

Supplementary Data 5

Supplementary Data 6

Supplementary Data 7

Supplementary Data 8

Supplementary Data 9

Supplementary Data 10

Reporting Summary

Transparent Peer Review file

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

About this article

Cite this article

Liu, Z., Wu, Y., Xu, H. et al. Multimodal fusion of radio-pathology and proteogenomics identify integrated glioma subtypes with prognostic and therapeutic opportunities. Nat Commun 16, 3510 (2025). https://doi.org/10.1038/s41467-025-58675-9

Received: 29 August 2024
Accepted: 26 March 2025
Published: 13 April 2025
Version of record: 13 April 2025
DOI: https://doi.org/10.1038/s41467-025-58675-9
