Introduction

Diffuse gliomas are the most common primary malignant central nervous system (CNS) neoplasms in adults1. Isocitrate dehydrogenase (IDH) mutations are the most impactful molecular markers, dividing diffuse glioma patients into two groups with different prognoses, genetic traits, and potential treatment options2,3. The recent World Health Organization (WHO) Classification of Tumors of the Central Nervous System classified IDH-mutant gliomas and IDH-wildtype gliomas as distinct tumor entities in adult patients4. Notably, IDH-wildtype glioblastoma (GBM) is the most prevalent and aggressive glioma subtype in adults, with a 5-year survival rate of less than 10%5. Standard treatment for GBM includes gross total resection followed by radiotherapy and temozolomide (TMZ) chemotherapy6. Nevertheless, GBM is marked by substantial intertumoral and intratumoral heterogeneity, with significant discrepancies at the genomic, transcriptomic, proteomic, and epigenomic levels, posing considerable challenges for uniform management7. Thus, further investigation of the heterogeneity and stratification of IDH-wildtype GBMs is imperative.

Over the past decade, several studies have applied large-scale, high-throughput sequencing to investigate GBM heterogeneity, identify key oncogenic events and potential therapeutic targets, and establish molecular subtypes for patient stratification8,9,10,11,12,13,14. A landmark study by The Cancer Genome Atlas (TCGA) in 2010 classified GBM into four transcriptional subtypes: proneural, neural, mesenchymal, and classical13. Subsequent research highlighted the profound influence of the tumor microenvironment (TME) on GBM subtyping15. To address this, Wang et al.12 refined the classification system, retaining the proneural, mesenchymal, and classical subtypes while excluding the neural subtype due to contamination by normal cellular components. More recent studies have offered further insights. Wu et al.16 introduced a classification based on pharmacological response and RNA expression in cancer cell lines, defining proneural, oxidative phosphorylation, and mesenchymal subtypes. White et al.17 proposed three TME-based subtypes (TMElow, TMEmedium, and TMEhigh) with prognostic relevance. Moreover, single-cell RNA sequencing (scRNA-seq) has revealed multiple transcriptional states within GBMs, reflecting distinct biological processes such as hypoxia and cell cycle regulation18. Neftel et al.19 demonstrated that these transcriptional states converge into four tumor cell subpopulations, underscoring the transcriptional complexity. However, transcriptome-based subtyping remains limited in its ability to predict patient survival and treatment response11.

With advances in technology, the integration of multimodal data has brought insights into tumor stratification10. Generally, radiology, pathology, DNA, RNA, and protein could reflect the anatomical, cellular, genetic, transcriptional, and functional levels of disease, respectively. The integration of multimodal profiling enables us to better understand the linkages between complex disease phenotypes and biological mechanisms. Indeed, cellular-level microscopic phenotypes that arise from gene mutations, aberrant signaling, and abnormal expression have profound impacts on crucial cellular processes, such as cell proliferation, inflammatory response, and angiogenesis20. These alterations can be captured by histological images. Similarly, macroscopic phenotypes, such as tumor shape, texture, edema, and necrosis, can be visualized using advanced imaging techniques like multiparametric MRIs. Previous studies have also shown that radiological or histopathological images can reflect the mutational status or expression level of genes21,22,23. This not only offers a bridge between radio-histopathological images and molecular omics but also serves as the theoretical basis for multimodal fusion analysis.

Here, by integrating multimodal data from MRI-derived radiomics, whole-slide image (WSI)-derived pathomics, whole-exome sequencing (WES), RNA sequencing (RNA-seq), and mass spectrometry (MS)-based proteomics, we identify and validate three distinct subtypes in IDH-wildtype adult gliomas. The MOFSR package is designed for multimodal data fusion and analysis (https://github.com/Zaoqu-Liu/MOFS). Our multimodal analysis provides insight into GBM intertumoral heterogeneity in phenotypic manifestation, oncogenic signaling, immune response, and genomic alterations. Moreover, this work highlights potentially effective prognostication and therapeutic strategies for IDH-wildtype GBM patients.

Results

Identification of multimodal fusion subtypes in IDH-wildtype adult gliomas

Based on the 2021 WHO Classification of CNS tumors (Fig. S1A), this study enrolled 1,194 adult patients with IDH-wildtype glioma, whose preoperative multiparametric MRIs (T1WI, CE-T1WI, T2WI, FLAIR, and ADC) underwent image segmentation and feature extraction. Among these cases, 202 gliomas underwent RNA sequencing (RNA-seq), 180 underwent mass spectrometry (MS) for proteomics, and 122 underwent whole-exome sequencing (WES). Histological whole-slide images (WSIs) of 122 patients with corresponding radiology and sequencing data were generated by scanning Hematoxylin and Eosin (HE)-stained pathological slices (Fig. S1B). We divided the 1,194 patients into three cohorts: 122 patients with all multimodal data (FAHZZU1), 80 patients with RNA-seq and partial proteomics data (FAHZZU2), and 992 patients with only MRI data (FAHZZU3) (Fig. S1B). Details of all multimodal and clinicopathological data are provided in Supplementary Data 1.

Integrating multimodal data reveals causal features that may be obscured in single-modality analyses, offering a holistic understanding of diseases. Multimodal data fusion can be categorized into early, intermediate, and late fusion according to when the integration takes place. Intermediate fusion is generally more advanced than early and late fusion, but it places greater demands on the integration algorithm7,24. In this study, we introduced a multimodal fusion subtyping (MOFS) framework (Fig. 1A). Briefly, we performed intermediate fusion of the multimodal data (FAHZZU1 cohort) using 11 algorithms based on different principles (Supplementary Data 2), followed by late fusion of the results obtained from the 11 algorithms to yield the final clustering results.

Fig. 1: Multimodal data integration and clustering analysis for identifying multimodal fusion subtypes.

A Overview of the multimodal data integration process. Feature matrices from whole-exome sequencing (WES), transcriptomic (RNA-seq), proteomic (LC-MS), pathomic (WSI), and radiomic data were integrated using 11 distinct algorithms for intermediate fusion, followed by late fusion to generate a consensus clustering. B Clustering prediction index (CPI) and GAP statistics were calculated to determine the optimal number of clusters, with K = 3 identified as the optimal clustering number. C Heatmap of the consensus matrix for 122 IDH-wildtype glioma patients, showing three distinct multimodal fusion subtypes (MOFS1, MOFS2, and MOFS3). D Principal component analysis (PCA) demonstrating distinct separation among the three identified MOFS subtypes in two-dimensional space.

Initially, the clustering prediction index (CPI)25 and GAP statistic26 were calculated for different combinations of modal variables, and the optimal clustering number was achieved when K = 3 (Fig. 1B and Supplementary Data 3). Additional support for this decision was provided by the proportion of ambiguous clustering (PAC)27 and Calinski-Harabasz index (CHI)28, which indicated that the classification was more robust with three subtypes (Fig. S2A, B). Using 11 distinct intermediate fusion algorithms with various principles29, we next performed multimodal fusion clustering on 122 IDH-wildtype glioma patients with all modalities available. To further generate the consensus clustering results, the late fusion24 was conducted on the Jaccard distance matrix between the samples, which revealed three MOFS subtypes (Fig. 1C and Fig. S2C). Principal component analysis (PCA) suggested distinct separation among three subtypes in two-dimensional spatial coordinates (Fig. 1D). The silhouette statistic was utilized to identify the samples that best represented one of three subtypes, yielding a core set of 116 cases for MOFS identification (Fig. S2D).
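The late-fusion step described above can be illustrated with a minimal sketch. It is not the MOFSR implementation: it assumes that each intermediate fusion algorithm returns one cluster label per sample, that each sample is then represented by its set of (algorithm, cluster) memberships, and that the Jaccard distance is taken between these sets. The label vectors and the function names `membership_sets` and `jaccard_distance_matrix` are hypothetical.

```python
from itertools import combinations


def membership_sets(label_runs):
    """Represent each sample by its (algorithm index, cluster label) memberships.

    label_runs: list of label vectors, one per clustering algorithm
    (hypothetical stand-ins for the outputs of the fusion algorithms).
    """
    n_samples = len(label_runs[0])
    return [
        {(run_idx, run[i]) for run_idx, run in enumerate(label_runs)}
        for i in range(n_samples)
    ]


def jaccard_distance_matrix(label_runs):
    """Pairwise Jaccard distance between samples across all clustering runs."""
    members = membership_sets(label_runs)
    n = len(members)
    dist = [[0.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        shared = len(members[i] & members[j])   # runs that co-cluster i and j
        total = len(members[i] | members[j])    # all memberships of i or j
        dist[i][j] = dist[j][i] = 1.0 - shared / total
    return dist


# Three toy clustering runs over six samples (illustrative values only).
runs = [
    [0, 0, 1, 1, 2, 2],
    [1, 1, 0, 0, 2, 2],  # same partition as the first run, labels permuted
    [0, 0, 0, 1, 2, 2],  # one sample assigned differently
]
D = jaccard_distance_matrix(runs)
for row in D:
    print([round(value, 2) for value in row])
```

Because memberships are keyed by algorithm, arbitrary label permutations between algorithms do not affect the distances. Under these assumptions, a consensus partition would then be obtained by clustering `D`, for example with hierarchical clustering cut at K = 3.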

Radio-pathological and biological peculiarities of MOFS subtypes

Radiological and pathological examinations revealed distinct characteristics among the three MOFS subtypes identified through multimodal fusion analysis. First, MOFS1 gliomas were characterized by absent or limited enhancement in CE-T1WI MRI, paired with relatively regular cell morphology, weak atypia, and uniform cell density (Figs. 2A and S2E), indicative of a relatively less aggressive radio-pathological phenotype. The 34 identified MOFS1 gliomas comprised 23 histological GBM, 6 molecular GBM, and 5 IDH-wildtype diffuse gliomas. Second, MOFS2 gliomas were characterized by mass-like enhancement in CE-T1WI MRI, suggesting a more invasive nature. Pathologically, MOFS2 was marked by a heterogeneous cellular architecture with varied cell sizes and shapes, pronounced atypia, and high cellular heterogeneity (Figs. 2A and S2E). The 33 identified MOFS2 gliomas comprised 30 histological GBM, 2 molecular GBM, and 1 IDH-wildtype diffuse glioma. Third, MOFS3 gliomas were distinguished by ring-like enhancement with prominent central necrosis in CE-T1WI MRI. Pathologically, these tumors exhibited cellular atypia and focal immune infiltration (Figs. 2A and S2E). All 49 identified MOFS3 gliomas were histological GBM.

Fig. 2: Radio-pathological and Biological Peculiarities of MOFS Subtypes.

A Pathological (H&E stain) and radiological (CE-T1WI MRI) images of MOFS1, MOFS2, and MOFS3. MOFS1: Regular cell morphology with weak atypia; absent or limited enhancement. MOFS2: Varied cell size and significant atypia; mass-like enhancement. MOFS3: Significant atypia and immune cell infiltration; ring-like enhancement with necrotic core (n = 116). Scale bars in the subfigures of pathological images, 50 μm. The size of the pathological image patch is 1024 × 1024 pixels, with each pixel representing 0.50 microns. B Multi-omics characteristics of MOFS subtypes (n = 116). C Enriched pathway analysis using the Metascape tool, highlighting distinct biological processes and signaling pathways for each MOFS subtype.

To further investigate the biological characteristics of the MOFS subtypes, functional enrichment analyses were performed using over-representation analysis (ORA), gene-set enrichment analysis (GSEA), and single-sample gene-set testing (ssGST) on both transcriptomic and proteomic data (Supplementary Data 4). MOFS1 was enriched in pathways related to neurodevelopment, including distal axon, GABA receptor binding, long-term synaptic depression, and neuron-to-neuron synapse (Figs. 2B, C and S3–4). MOFS2 showed significant enrichment in proliferation-related pathways such as G1/S-specific transcription, G2/M checkpoint, E2F targets, and cell cycle (Figs. 2B, C and S3–4). MOFS3 was predominantly associated with TME-related pathways, including cell-extracellular matrix interaction, TNF-alpha signaling via NF-kB, immune cell activation, and interferon-gamma response (Figs. 2B, C and S3–4).

Notably, approximately one-third of MOFS3 samples exhibited high cell cycle activity (Figs. S3–4 and S5A, B), suggesting the presence of heterogeneity within the MOFS3 subtype. Further analysis found no significant correlation between cell cycle activity and stromal scores in this subgroup (Fig. S5C). Functional enrichment analysis indicated that the high proliferation samples within MOFS3 remained enriched in TME-related pathways, highlighting the TME-rich identity of MOFS3 rather than a proliferation-driven phenotype (Fig. S5D). Jaccard similarity analysis further supported this conclusion, showing a high degree of concordance between the high and low proliferation subgroups within MOFS3 (Fig. S5E). While high cell cycle activity was associated with poorer prognosis across all samples (P = 0.042), it showed no prognostic significance within the MOFS3 subtype (P = 0.63) (Fig. S5F). These findings highlight the stability of MOFS3 as a distinct TME-rich subtype, despite its internal heterogeneity.

Based on these findings, MOFS1 was termed the proneural subtype due to its enrichment in neurodevelopmental pathways and less aggressive nature. MOFS2 was defined as the proliferative subtype due to its significant enrichment in proliferation-related pathways and high cellular heterogeneity. MOFS3 was classified as the TME-rich subtype, reflecting its association with abundant TME components.

Evaluation of modality contributions and validation of MOFS subtypes

To further evaluate the robustness of the MOFS framework, we performed single-modality clustering and assessed the impact of excluding individual modalities from the multimodal integration. Single-modality clustering showed limited concordance with MOFS subtypes and failed to achieve significant prognostic separation (Figs. 3A and S6A). In contrast, excluding individual modalities from the multimodal framework maintained relatively high concordance with MOFS subtypes and retained statistically significant prognostic power (Figs. 3A and S6B). Notably, Kaplan-Meier survival analysis demonstrated that the full MOFS framework outperformed both single-modality clustering and partial multimodal clustering in terms of prognostic discrimination (Fig. 3B). These findings underscore the importance of integrating multiple data layers to achieve a more robust and accurate classification system.

Fig. 3: Evaluation of modality contributions and validation of MOFS subtypes.

A Heatmap showing the clustering results derived from single modality (left panel) and clustering results when each modality was excluded from the multimodal MOFS framework (right panel). B Bar plot showing the significance of Kaplan-Meier survival analysis for different clustering strategies, represented as -log10(p-value) from log-rank tests. C Comparison of MOFS subtypes with traditional GBM classifications. D Kaplan-Meier survival curves across eight cohorts, demonstrating significant survival differences among MOFS subtypes. Statistic tests: log-rank test.

We next compared the MOFS taxonomy with the traditional transcriptome-based classifications12,13,16,17. To validate the prognostic performance and biological functions of MOFS subtypes in public cohorts, we developed an integrated classification framework based on transcriptome expression profiles. Because public datasets lack multimodal data and consist predominantly of high-quality transcriptome data, this approach inevitably sacrificed some information. To mitigate overfitting and extract as much information as possible from the available transcriptome data, we constructed and validated an ensemble classifier (Fig. S7). Our results indicated that MOFS subtypes exhibited only moderate correlation with previous classification systems (Fig. 3C). This discrepancy highlights the potential for our MOFS subtyping to offer insights into GBM heterogeneity. Kaplan-Meier survival analysis demonstrated significant survival differences among the three MOFS subtypes across eight cohorts (P < 0.05), with MOFS1 associated with the best prognosis and MOFS2 with the worst (Fig. 3D). The consistency and significance of survival trends for MOFS subtypes in seven external cohorts underscore the robustness of this taxonomy. In contrast, the traditional classifications exhibited only limited performance in predicting prognosis (Figs. S8–9). Considering that clinical variables such as age, sex, MGMT promoter methylation status, and treatment strategies significantly influence survival outcomes, we performed a multivariate Cox regression analysis to account for these factors. Even after adjusting for these variables, MOFS1 consistently correlated with better overall survival, MOFS2 remained predictive of poor prognosis, and MOFS3 showed no significant association with survival (Supplementary Data 5A).
Additionally, MOFS1 and MOFS3 patients treated with TMZ exhibited significantly improved survival (HR < 1, P < 0.05), while no survival benefit was observed for MOFS2 patients receiving TMZ compared to untreated cases (P = 0.179) (Supplementary Data 5B). This finding suggested that MOFS2 patients exhibit resistance to TMZ, which may partially account for their overall worse prognosis. As TMZ is the standard chemotherapeutic agent for GBM, the lack of benefit in MOFS2 further underscores the need for alternative therapeutic strategies in this subtype. These findings further validate the robustness and predictive value of the MOFS classification system. Additionally, functional enrichment analyses in these seven external cohorts confirmed the stability and reproducibility of the biological characteristics associated with each MOFS subtype (Fig. S10).
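For readers less familiar with the survival curves compared above, the Kaplan-Meier estimator can be sketched in a few lines. This is a generic illustration of the product-limit formula, not the analysis code used in this study; the follow-up times and event indicators are invented, and the function name `kaplan_meier` is hypothetical.

```python
def kaplan_meier(times, events):
    """Product-limit estimate of survival at each distinct event time.

    times:  follow-up time per patient
    events: 1 if death was observed at that time, 0 if the patient was censored
    Returns a list of (time, survival probability) pairs.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients who share the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        # Deaths and censored patients both leave the risk set after time t.
        at_risk -= leaving
    return curve


# Hypothetical follow-up in months for five patients (illustrative only).
toy_times = [5, 8, 8, 12, 15]
toy_events = [1, 1, 0, 1, 0]
toy_curve = kaplan_meier(toy_times, toy_events)
for time_point, probability in toy_curve:
    print(time_point, round(probability, 3))
```

In practice, one such curve is estimated per subtype and the curves are compared with a log-rank test, as reported in the figure legends.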

Genomic alteration characteristics of MOFS subtypes

To identify the genetic traits specific to each MOFS subtype, the genomic landscape was characterized. The analysis revealed no significant differences among subtypes in single nucleotide variations (SNVs) (Fig. S11A), insertions/deletions (INDELs) (Fig. S11B), or tumor mutation burden (TMB) (Fig. 4A). Mutations in TP53, LRP2, and MCM10 were associated with unfavorable prognosis (P < 0.05) (Fig. 4B and Supplementary Data 6). Notably, SCN5A, USH2A, PLEC, and DNAH3 showed significant differences in mutation frequency among the three subtypes (P < 0.05) (Figs. 4C and S11C). MOFS2 exhibited a high frequency of mutations in SCN5A, USH2A, and PLEC, while DNAH3 mutations were more prevalent in MOFS1 (Fig. 4C and Supplementary Data 6).

Fig. 4: Genomic alteration characteristics of MOFS subtypes.

A Tumor mutation burden across MOFS subtypes (n = 116, P = 0.37). Statistical test: Kruskal-Wallis test (two-sided). Data are presented as box plots, with the center line representing the median, the box indicating the interquartile range (IQR, from the 25th to the 75th percentile), and the whiskers extending to the most extreme data points within 1.5 times the IQR; points beyond this range are shown as individual outliers. B Kaplan-Meier survival curves for TP53, LRP2, and MCM10 mutations (n = 116). Statistic tests: log-rank test. C Mutational rates of SCN5A, USH2A, PLEC, and DNAH3 among MOFS subtypes (n = 116). Statistic tests: Fisher’s exact test. D Heatmap of CNV broad and focal burden across MOFS subtypes. E CNV analysis showing higher burden in MOFS2 (n = 116). Statistic tests: two-sided t test. Data are presented as box plots, with the center line representing the median, the box indicating the IQR (from the 25th to the 75th percentile), and the whiskers extending to the most extreme data points within 1.5 times the IQR; points beyond this range are shown as individual outliers. F STRAP expression levels significantly higher in MOFS2 (n = 116). Statistic tests: two-sided t test. Data are presented as box plots, with the center line representing the median, the box indicating the IQR (from the 25th to the 75th percentile), and the whiskers extending to the most extreme data points within 1.5 times the IQR; points beyond this range are shown as individual outliers. G Immunohistochemistry (IHC) results of STRAP expression in MOFS subtypes (n = 27). H ROC analysis showing STRAP as a predictor for MOFS2 subtype (AUC = 0.802). I Kaplan-Meier survival curves indicating high CNV or expression of STRAP associated with worse prognosis (n = 116). Statistic tests: log-rank test. J IHC results from tissue microarray (TMA) confirming high STRAP protein levels negatively associated with prognosis (n = 92). Statistic tests: log-rank test.

Further analysis of mutational profiles in 10 canonical pathways revealed a higher frequency of mutations in the RTK-RAS pathway, with mutations in TP53 pathway particularly evident in MOFS2 (Fig. S11D). Additionally, MOFS2 demonstrated significant copy number variations (CNVs), exhibiting a heavy CNV broad and focal burden (Fig. 4D, E). This suggests that MOFS2 is characterized by a chromosomal instability (CIN) phenotype, a hallmark of human malignancies associated with poor prognosis, tumor metastasis, and drug resistance30. The MOFS2-specific functional CNV genes were significantly associated with proliferation-related pathways (Fig. 2B), indicating a potential link between genome instability and the proliferative nature of MOFS2.

STRAP amplification and its prognostic significance in MOFS2

To further explore the relationship between CNVs and gene expression in MOFS subtypes, 1023 subtype-specific functional CNV genes were identified (Supplementary Data 7). Cox analysis revealed that STRAP, SCFD2, FIP1L1, and EXOC1 were risk factors in MOFS2, while KIF21A was a protective factor (P < 0.05) (Fig. S12). Notably, STRAP amplification was observed exclusively in MOFS2 (P < 0.0001) (Fig. S12), with significantly higher STRAP expression levels compared to other subtypes (P < 0.0001) (Fig. 4F). This finding was further validated by immunohistochemistry (IHC) results of glioma tissues (Fig. 4G).

ROC analysis indicated that STRAP could accurately predict the MOFS2 subtype (AUC = 0.802) (Fig. 4H). Furthermore, STRAP was predominantly associated with the dismal prognosis of MOFS2, with no significant association observed in the other subtypes (Fig. S12). Kaplan-Meier analysis further demonstrated that high CNV or expression of STRAP was associated with worse prognosis (P < 0.05) (Fig. 4I). IHC results from tissue microarray (TMA) confirmed that high protein levels of STRAP were negatively associated with prognosis (P = 0.00015) (Fig. 4J and Supplementary Data 8). Functional analysis of the top 500 genes positively correlated with STRAP demonstrated significant enrichment in proliferation-related pathways (Supplementary Data 9). Together, these results indicate that STRAP was specifically amplified and overexpressed in MOFS2, suggesting its potential role in promoting the proliferative phenotype of MOFS2.
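To make the AUC values reported for single markers concrete, the sketch below computes the area under the ROC curve for one continuous marker through its rank-based (Mann-Whitney) interpretation: the probability that a randomly chosen positive case scores higher than a randomly chosen negative case. The expression values and labels are invented for illustration, and `roc_auc` is a hypothetical helper rather than the software used in this study.

```python
def roc_auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    scores: marker values (for example, expression of a candidate gene)
    labels: 1 for the class of interest, 0 otherwise
    Ties between a positive and a negative case count as half a correct pair.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    correct = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                correct += 1.0
            elif p == q:
                correct += 0.5
    return correct / (len(positives) * len(negatives))


# Hypothetical marker values: label 1 marks the subtype of interest.
toy_scores = [5.1, 4.8, 3.9, 3.2, 4.0, 2.5]
toy_labels = [1, 1, 1, 0, 0, 0]
toy_auc = roc_auc(toy_scores, toy_labels)
print(round(toy_auc, 3))
```

An AUC of 0.5 corresponds to a marker with no discriminative ability and 1.0 to perfect separation, which places the reported values in context.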

MOFS3 tumors showed rich immune infiltration and sensitivity to immunotherapy

MOFS3 tumors were characterized by lower tumor purity but significantly higher immune and stromal components compared to the other subtypes (P < 0.0001) (Fig. 5A–C). Our analysis revealed that immune cells and immunomodulators were predominantly more abundant in MOFS3, reinforcing its TME-rich features (Fig. 5D). MOFS1, identified as the proneural subtype, exhibited higher infiltration levels of neurons, astrocytes, and oligodendrocytes (P < 0.0001) (Fig. 5E–G). To systematically assess the immunotherapeutic potential of the three subtypes, we constructed an immunogram for the cancer-immunity cycle (CIC)31, which underscores the dynamic and multifaceted nature of intratumoral immunity (Fig. 5H). Despite comparable tumor antigenicity among the three subtypes, likely due to similar tumor mutation burdens (TMB) (Fig. 4A), MOFS3 stood out with heightened activation of other immune pathways (Fig. 5H). This points to a potentially greater benefit from immunotherapy for MOFS3 tumors, given their enriched immune contexture. To further substantiate this, we analyzed transcriptome expression profiles from GBM patients who received anti-PD-1 immunotherapy32. Our findings indicated that responders exhibited higher MOFS3 activity, while non-responders were more associated with MOFS2 activity (Fig. 5I, J). Notably, most immunotherapy responders were classified as MOFS3, compared to only 25% in MOFS2 (P = 0.09) (Fig. 5K). Although this difference did not reach statistical significance, it suggests differential immunotherapy responsiveness across MOFS subtypes and points to MOFS3 as potentially more amenable to immune checkpoint blockade.

Fig. 5: Immune infiltration and immunotherapy response in MOFS subtypes.

A–C Tumor purity, ImmuneScore, and StromalScore across MOFS subtypes, showing lower tumor purity but higher immune and stromal components in MOFS3 (n = 116). Statistic tests: two-sided t test. Data are presented as box plots, with the center line representing the median, the box indicating the interquartile range (IQR, from the 25th to the 75th percentile), and the whiskers extending to the most extreme data points within 1.5 times the IQR; points beyond this range are shown as individual outliers. D Heatmap of immune cells and immunomodulators, indicating higher abundance in MOFS3, highlighting its TME-rich features (n = 116). E–G Infiltration levels of neurons, astrocytes, and oligodendrocytes, with higher levels observed in MOFS1 (n = 116). Statistic tests: two-sided t test. Data are presented as box plots, with the center line representing the median, the box indicating the IQR (from the 25th to the 75th percentile), and the whiskers extending to the most extreme data points within 1.5 times the IQR; points beyond this range are shown as individual outliers. H Immunogram of the cancer-immunity cycle, showing high activation of immune pathways in MOFS3, suggesting greater immunotherapeutic potential. I Activity of MOFS subtypes in GBM patients who received anti-PD-1 immunotherapy, with higher MOFS3 activity in responders. J Comparison of MOFS subtype activity between immunotherapy responders and non-responders (n = 17). Statistic tests: two-sided t test. Data are presented as box plots, with the center line representing the median, the box indicating the IQR (from the 25th to the 75th percentile), and the whiskers extending to the most extreme data points within 1.5 times the IQR; points beyond this range are shown as individual outliers. K Distribution of MOFS subtypes among responders and non-responders to anti-PD-1 therapy, with MOFS3 showing higher response rates.
L Kaplan-Meier survival curves stratified by stroma abundance within MOFS3, showing significant prognostic differences between high and low stroma groups. Statistic tests: log-rank test. M ROC analysis of stromal markers, with S100A4 demonstrating predictive ability for stroma abundance at both mRNA and protein levels. N IHC results of S100A4 expression, indicating higher levels in MOFS3 (n = 15).

Stroma refined the prognostic stratification of MOFS3

MOFS3 tumors exhibited not only significant enrichment in various immune pathways but also a higher abundance of stromal components (P < 0.0001) (Fig. 5C). Moreover, endothelial cells and pericytes were significantly more abundant in MOFS3 than in MOFS1 and MOFS2 (Fig. S13A). These findings suggested that MOFS3 tumors contain a higher proportion of non-malignant cellular components. Further survival analyses revealed a strong association between stromal abundance and prognosis within MOFS3 (P < 0.01), a relationship not observed in the other subtypes (Fig. S13B). As noted above, MOFS3 as a whole displayed no significant prognostic value in our cohort (P = 0.57) (Fig. S13C). Intriguingly, upon integration of stroma into the MOFS taxonomy, stromal abundance within MOFS3 showed significant prognostic value (P < 0.05) (Fig. 5L). More specifically, MOFS3 tumors with low stromal content demonstrated a median survival similar to that of MOFS1 (P = 0.832), whereas those with high stromal content resembled MOFS2 in terms of prognosis (P = 0.974, Fig. 5L).

To assess the predictive capacity of stromal biomarkers, ROC analysis was conducted. The canonical stromal marker S100A4 showed relatively accurate predictive ability at both the mRNA and protein levels (RNA: AUC = 0.72; protein: AUC = 0.83) (Fig. 5M). IHC analysis also indicated that MOFS3 tumors were characterized by higher expression levels of S100A4 (Fig. 5N). Furthermore, subsequent survival analysis of the tissue microarray (TMA) results confirmed the association between high levels of stromal content and worse prognosis (P = 0.042) (Fig. S13D and Supplementary Data 8). These findings underscore the importance of stromal components in refining the prognostic stratification within the MOFS3 subtype.

Development of MRI classifier for non-invasively predicting MOFS

To enhance the clinical applicability of our MOFS classification system, we leveraged readily accessible, non-invasive radiological imaging to predict MOFS subtypes. We first selected 22 quantitative imaging features derived from MRI scans to construct a deep neural network (DNN) model optimized through elastic backpropagation (Fig. 6A and Supplementary Data 10). Hyperparameter tuning yielded a DNN model comprising two hidden layers.

Fig. 6: Neural network radiomics classifier for predicting MOFS subtypes.

A Workflow for constructing a deep neural network (DNN) model using 22 radiomic features from MRI images, optimized through elastic backpropagation. B Confusion matrices showing DNN model accuracy on FAHZZU1 training, FAHZZU1 testing, and FAHZZU2 validation cohorts. C Kaplan-Meier survival analysis of predicted MOFS subtypes in the FAHZZU3 cohort, demonstrating significant survival differences (n = 992, P = 0.00025). Statistic tests: log-rank test. D Web tool interface for predicting MOFS subtypes using radiomic features.

The DNN was trained using the FAHZZU1 dataset, with 70% of the samples allocated to the training set and 30% to the testing set. Additional testing was conducted using the FAHZZU2 and FAHZZU3 datasets. On the FAHZZU1 training set, the model achieved a perfect area under the curve (AUC) of 1 for each MOFS subtype. On the FAHZZU1 testing set, the AUCs were 0.9, 0.968, and 0.889 for MOFS1, MOFS2, and MOFS3, respectively. For the FAHZZU2 dataset, where MOFS labels were predicted using an RNA-based ensemble classifier, the model achieved AUCs of 0.862, 0.958, and 0.898 for MOFS1, MOFS2, and MOFS3, respectively (Fig. S14A). The slight reduction in AUC values for this cohort may stem from differences in label precision, as the RNA-seq-based predictions lack the comprehensive biological context provided by the original multimodal clustering approach. These variations underscore the potential impact of modality on the accuracy of subtype assignments. The confusion matrix further confirmed the robust performance of the model, showing accuracies of 1, 0.917, and 0.825 in the FAHZZU1 training set, testing set, and FAHZZU2 validation set, respectively (Fig. 6B). Due to the lack of molecular multi-omics data in FAHZZU3 cohort, MOFS labels were unavailable. Despite this limitation, survival analysis of the predicted MOFS subtypes in FAHZZU3 demonstrated significant prognostic differences consistent with prior findings (P = 0.00025), supporting the predictive relevance of the DNN classifications (Fig. 6C). The MRI classifier demonstrates a reasonable level of concordance with the RNA-based MOFS ensemble classifier, capturing the general trends of MOFS1, MOFS2, and MOFS3 subtypes (Fig. S14B). This suggests that imaging-derived features can approximate transcriptomic-driven subtypes to a significant extent. However, some discrepancies highlight the inherent limitations of relying solely on a single modality, such as MRI or RNA, to fully capture the complexity of multimodal subtypes. 
To facilitate practical application by researchers and clinicians, we developed an accessible tool that allows users to input imaging feature data and obtain MOFS subtype predictions (Fig. 6D).
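As a small worked illustration of how the accuracies above are read off a confusion matrix, the sketch below tabulates true versus predicted subtype labels and computes overall accuracy as the share of samples on the diagonal. The labels are invented and much fewer than in any cohort of this study; `confusion_matrix` and `accuracy` are hypothetical helper names, not part of the classifier or web tool.

```python
def confusion_matrix(true_labels, predicted_labels, classes):
    """Count samples by (true class, predicted class); rows are true classes."""
    index = {name: position for position, name in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for truth, prediction in zip(true_labels, predicted_labels):
        matrix[index[truth]][index[prediction]] += 1
    return matrix


def accuracy(matrix):
    """Overall accuracy: correctly classified samples over all samples."""
    correct = sum(matrix[k][k] for k in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


subtypes = ["MOFS1", "MOFS2", "MOFS3"]

# Hypothetical reference labels and classifier outputs for ten samples.
toy_true = ["MOFS1", "MOFS1", "MOFS1", "MOFS2", "MOFS2",
            "MOFS2", "MOFS3", "MOFS3", "MOFS3", "MOFS3"]
toy_pred = ["MOFS1", "MOFS1", "MOFS3", "MOFS2", "MOFS2",
            "MOFS2", "MOFS3", "MOFS3", "MOFS2", "MOFS3"]

toy_matrix = confusion_matrix(toy_true, toy_pred, subtypes)
toy_accuracy = accuracy(toy_matrix)
for name, row in zip(subtypes, toy_matrix):
    print(name, row)
print(round(toy_accuracy, 3))
```

Off-diagonal cells show which subtypes are confused with one another, information that a single accuracy value does not convey.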

Discussion

The 2016 and 2021 WHO Classifications of CNS tumors have substantially improved the classification of adult diffuse gliomas based on IDH mutations4,33, with the latest edition prioritizing IDH mutations over histological features4. Owing to its poor prognosis and high prevalence, IDH-wildtype GBM represents a particularly noteworthy subpopulation34. Previous subtyping systems predate the latest edition of the WHO Classification; they focused only on histologically determined GBM irrespective of IDH mutations and neglected molecularly determined GBM. In addition, these studies were derived only from transcriptome data and yielded limited prognostic value8,12,13. With advances in artificial intelligence, it is imperative to integrate multimodal data to subtype IDH-wildtype gliomas, which have high prevalence and much greater homogeneity. In this study, we integrated radio-pathology and proteogenomics to systematically investigate the heterogeneity of IDH-wildtype gliomas. Our findings demonstrated that MOFS subtypes offer superior prognostic value compared to traditional classification systems, provide insights into the biological heterogeneity of GBM, and suggest specific therapeutic strategies for different subtypes.

MOFS1 is a proneural subtype with relatively favorable prognosis, elevated neurodevelopmental activity, and abundant neurocyte infiltration. A comparable subtype appears in the transcriptional classifications proposed by Phillips et al.8 in 2006 and Wang et al.12 in 2017, where it also displayed relatively better prognosis and neurodevelopmental properties. However, those transcriptional subtyping systems encompassed patients harboring IDH-mutant GBM and therefore lag behind the 2021 WHO classification of CNS tumors, which classifies previously defined IDH-mutant GBM as IDH-mutant astrocytoma, Grade 44. In this study, we employed multimodal data from IDH wild-type glioma patients to identify the proneural subtype (MOFS1), which demonstrated moderate correlation with traditional classification systems. This suggests that patient selection and data modality have a significant impact on subtyping results. Notably, MOFS1 comprises 23 of the 102 histological GBM, 6 of the 8 molecular GBM, and 5 of the 6 IDH-wildtype diffuse gliomas. This composition suggests that MOFS1 captures a proportion of histological GBM with relatively better survival, together with nearly all the molecular GBM and IDH-wildtype diffuse gliomas. Although molecular GBM are defined by TERT promoter mutation, EGFR amplification, or combined whole chromosome 7 gain and whole chromosome 10 loss, and were reclassified as IDH-wildtype GBM in the latest WHO classification4, their prognosis was found to be significantly better than that of IDH-wildtype histologically determined GBM35. This is consistent with our finding that the less aggressive MOFS1 contains the majority of molecular GBM.
IDH-wild-type lower-grade diffuse gliomas are indistinguishable from molecular GBM in terms of radiology and pathology, and MOFS1 includes nearly all the IDH-wildtype lower-grade diffuse gliomas, suggesting that these rare entities may share similar biological traits and clinical outcomes with molecular GBM. Genetically, MOFS1 showed a higher mutational frequency of DNAH3, a gene implicated in microtubule motility and ATP binding. A previous WES study identified DNAH3 as a high-risk variant in breast cancer36.

MOFS2 is a proliferative subtype with dismal prognosis, characterized by high proliferative activity, genome instability, and high tumor purity. MOFS2 mainly includes 30 histological GBM, which may be the most aggressive subpopulation of histological GBM. Compared to other subtypes, MOFS2 exhibited notable CNV, with its subtype-specific CNV genes significantly enriched in proliferation-related pathways. This implies that the proliferative traits of MOFS2 might be driven by gene-level CNV. Consequently, we further explored the clinical significance and functional features of the MOFS2-specific gene STRAP. This gene was exclusively amplified and overexpressed in MOFS2, and correspondingly, prognostic analysis revealed that STRAP held prognostic value solely in MOFS2, with high expression or amplification indicating poor prognosis for these patients. Therefore, the subtype-specific gene STRAP may be an essential component of the proliferative phenotype of MOFS2, and targeting this gene may improve the clinical outcomes of MOFS2 patients. Additionally, further analysis revealed that immunotherapy non-responders displayed higher MOFS2 activity, possibly linked to higher tumor purity and diminished immune components.

MOFS3 comprises a proportion of histological GBM and is a TME-rich subtype with intermediate prognosis, featuring abundant immune and stromal components. Immune infiltration analysis demonstrated that MOFS3 displayed high levels of immunomodulators (e.g., PD-1 and PD-L1) and CD8+ T cell infiltration, indicating its potential to benefit from immunotherapy, which was validated in a GBM cohort treated with anti-PD-1 immunotherapy. Furthermore, MOFS3 was also rich in stromal content, which served as a prognostic risk factor solely in MOFS3. Intriguingly, upon integration of stroma abundance into the MOFS taxonomy, different abundances of stroma within MOFS3 exhibited significant prognostic value. More specifically, the low and high stroma groups demonstrated median survival similar to that of MOFS1 and MOFS2, respectively. These findings suggest that further stratifying MOFS3 into high and low stroma groups adds prognostic information. To facilitate clinical feasibility, we identified the classic stromal marker S100A4, which could accurately predict stroma levels and identify MOFS3 patients.

Despite the encouraging results, our study has limitations. First, the reliance on transcriptomic data for classifier development, due to the lack of multi-omics data in public cohorts, may have limited the capture of the full molecular complexity of GBM. Future studies should aim to incorporate more comprehensive multi-omics datasets to enhance the classifier’s accuracy and robustness. Second, while the radiology classifier demonstrated strong predictive performance, its validation in larger and external cohorts is essential to confirm its generalizability. Assigning a single MOFS class to an entire tumor, based on dominant imaging features, oversimplifies the spatial heterogeneity often observed in GBM. Incorporating spatial resolution to differentiate tumor regions could improve prognostic accuracy, particularly in post-treatment recurrence scenarios, where distinguishing treatment effects from true tumor recurrence remains challenging. Moreover, Al-Dalahmah et al.37 utilized scRNA-seq data to describe prognostic GBM tissue states based on cellular composition, offering an alternative perspective that captures some aspects of the MOFS framework. A deeper integration of single-cell RNA sequencing with multimodal approaches may complement the MOFS system and further refine its prognostic and biological insights. Lastly, due to the absence of MGMT methylation status data in the FAHZZU cohort, survival analyses could not control for this crucial prognostic factor. The reliability of survival analysis results will be affected to a certain extent.

By leveraging artificial intelligence in multimodal integration for oncology, the MOFS classification system may represent a significant advancement in the understanding of GBM heterogeneity, offering superior prognostic value and informing precision oncology. The development of a neural network radiology classifier further enhances the clinical translatability of our findings, providing a non-invasive tool for predicting MOFS subtypes. Our study underscores the importance of integrating multi-omics data in cancer classification and paves the way for more personalized and effective GBM treatments.

Taken together, this study integrated radio-histology and proteogenomics to define three subgroups of IDH-wildtype gliomas with distinct prognostic and therapeutic implications. The diverse biological and clinical characteristics of the MOFS taxonomy improve the understanding of GBM heterogeneity and facilitate clinical stratification and individualized management. STRAP is significantly associated with the prognosis and proliferative phenotype of MOFS2 patients, thereby representing a potential therapeutic target for this subtype. The abundance of stroma serves as a vital prognostic index, which could be used to reassess the survival risk of MOFS3 patients. To support researchers and clinical practitioners, we developed the MRI-based classifier for predicting MOFS subtypes. We believe this high-resolution taxonomy could facilitate more effective management of patients with IDH wild-type GBM.

Methods

Data and sample collection

The study was approved by The Human Scientific Ethics Committee of the First Affiliated Hospital of Zhengzhou University (FAHZZU; Approval No. 2019-KY-176 and 2023-KY-1028). Informed consent was obtained from the patients for all fresh tumor specimens used in this study. This study retrospectively collected data on IDH wild-type glioma patients who underwent radical resection at FAHZZU between 2015 and 2021. Inclusion criteria were: age ≥18 years; primary glioma; integrated diagnosis of IDH wild-type GBM or lower-grade diffuse glioma, reclassified according to the 2021 WHO classification4 (Fig. S1A); no radiation or chemotherapy before admission; complete clinical data and follow-up information; no serious systemic abnormalities before surgery; preoperative MRI data including T1WI, CE-T1WI, T2WI, FLAIR, and ADC maps obtained from DWI with good image quality and no significant differences; clear HE-stained pathological slices with high-quality scanned images; and well-preserved pathological tissues. Exclusion criteria were: history of brain surgery or trauma; radiation or chemotherapy before surgery; and presence of artifacts on MRI that would affect lesion observation or delineation.

A total of 1194 IDH wild-type glioma patients with complete and qualified MRI data were included in the study. Among these cases, fresh surgical tumor specimens were collected from 202 patients. These specimens were immediately frozen in liquid nitrogen and stored at −80 °C for tissue sequencing. Among them, 202 tissues underwent RNA sequencing (RNA-seq), 180 underwent mass spectrometry, and 122 underwent whole-exome sequencing (WES). Histological whole-slide images (WSIs) of the 122 patients with all radiomics and sequencing data were obtained by scanning HE-stained pathological slices (Fig. S1B). Furthermore, 5 adjacent brain tissues were collected from the tumor margin and 19 peripheral blood samples were taken before surgery as normal controls for WES. This study designated the 122 samples with data from all modalities as the FAHZZU1 cohort, the 80 samples with transcriptome or mass spectrometry data as the FAHZZU2 cohort, and the 992 samples with only MRI data as the FAHZZU3 cohort (Fig. S1B).

MRI scanning and imaging feature extraction

Patient MRI images were acquired during routine examination using a 3.0 T MRI scanner (Siemens Magnetom Skyra/Trio TIM; GE Discovery MR750; Philips Ingenia). Sequences included: axial and sagittal T1-weighted imaging (T1WI), axial T2-weighted imaging (T2WI), axial T2-weighted fluid-attenuated inversion recovery (FLAIR) imaging, as well as axial, sagittal, and coronal post-contrast T1-weighted imaging (CE-T1WI) immediately after intravenous injection of a 0.1 mmol/kg dose of gadolinium-based contrast agent. Apparent diffusion coefficient (ADC) maps were obtained from axial diffusion-weighted imaging (DWI). The acquisition parameters for each sequence were as follows:

  a. T1WI and CE-T1WI: Repetition time (TR) 220–1750 ms; echo time (TE) 2.3–24 ms; echo train length (ETL) 1–12; slice thickness 5 mm; averages/excitations 1; flip angle (FA) 70°–111°; field of view (FOV) 220 × 192–240 × 240 mm²; matrix 256 × 162–320 × 256 mm².

  b. T2WI: TR 1873–5390 ms; TE 70–117 ms; ETL 16–32; slice thickness 5 mm; averages/excitations 1; FA 90°–142°; FOV 220 × 192–240 × 240 mm²; matrix 320 × 238–512 × 512 mm².

  c. FLAIR: TR 4500–8400 ms; TE 85–150 ms; inversion time (TI) 1670–2250 ms; ETL 1–38; slice thickness 5 mm; averages/excitations 1; FA 90°–150°; FOV 220 × 192–240 × 240 mm²; matrix 256 × 179–256 × 256 mm².

  d. DWI: Images were processed by the corresponding post-processing workstation, and ADC images were calculated from DWI acquired at b-values of 0 and 1000 s/mm². Sequence parameters included: TR 2121–6000 ms; TE 77–119 ms; ETL 1–82; slice thickness 5 mm; averages/excitations 1; FA 90°; FOV 220 × 220–240 × 240 mm²; matrix 152 × 114–192 × 192 mm². ADC maps for all imaging planes were generated on a voxel-by-voxel basis using a single-exponential model.
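The single-exponential model mentioned for the ADC maps assumes that the signal decays as S(b) = S(0)·exp(−b·ADC), so with two b-values the ADC follows directly from the log ratio of the signals. The sketch below illustrates this voxel-wise calculation; in the study the maps were produced by the scanner workstations, so the function names and the handling of non-positive voxels are assumptions made for illustration only.

```python
import math


def adc_voxel(signal_low, signal_high, b_low=0.0, b_high=1000.0):
    """Mono-exponential ADC (mm^2/s) from signals at two b-values (s/mm^2).

    Derived from S(b) = S(0) * exp(-b * ADC), which gives
    ADC = ln(S_low / S_high) / (b_high - b_low).
    """
    if signal_low <= 0 or signal_high <= 0:
        # Assumption: voxels without usable signal are set to zero.
        return 0.0
    return math.log(signal_low / signal_high) / (b_high - b_low)


def adc_map(image_low, image_high, b_low=0.0, b_high=1000.0):
    """Apply adc_voxel to two flattened images of equal length."""
    if len(image_low) != len(image_high):
        raise ValueError("images must have the same number of voxels")
    return [adc_voxel(lo, hi, b_low, b_high)
            for lo, hi in zip(image_low, image_high)]
```

The default arguments match the b-values of 0 and 1000 s/mm² stated above.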

First, the N4ITK algorithm was employed to correct bias field distortions for all sequences. After isotropic voxel resampling to 1 × 1 × 1 mm³ through trilinear interpolation, multi-sequence MRI rigid registration for each patient was performed using the axial resampled CE-T1WI as the template and mutual information as the similarity measure. This process was completed using the 3D Slicer software, generating registered images rT1WI, rCE-T1WI, rT2WI, rFLAIR, and rADC. Histogram matching was used for gray-level normalization on rT1WI, rCE-T1WI, rT2WI, and rFLAIR. We set the number of histogram levels to 1024 and the number of matching points to 10 to achieve a finer match while preserving more detail. A deputy chief physician in neuroradiology with over 10 years of experience in head MRI diagnosis manually delineated the tumor region of interest (ROI) on the axial plane of rFLAIR, rT2WI, and rCE-T1WI images using ITK-SNAP software, obtaining the tumor volume of interest (VOI). The VOI was defined as the enhanced area, non-enhanced area, and necrotic area of the tumor. The VOI contour was drawn based on FLAIR images, while rT2WI and rCE-T1WI were used for cross-checking the tumor extent and fine-tuning the tumor contour. Z-score normalization was applied within the VOI for all sequences so that the intensities had a mean of 0 and a standard deviation of 1. To assess reproducibility, this radiologist and a deputy chief physician in neurosurgery with over 10 years of work experience each redrew the VOI for 100 patients selected by simple random sampling. Intraclass correlation coefficients (ICC) were used to evaluate intra-rater reliability on the test-retest dataset and inter-rater reliability on the dataset delineated by multiple raters, and features with ICC ≥ 0.75 were retained. The obtained VOI was then overlaid with co-registered rT1WI, rCE-T1WI, rT2WI, rFLAIR, and rADC images.
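The Z-score step can be illustrated with a short sketch: only voxels inside the VOI contribute to the mean and standard deviation, and only those voxels are rescaled. The function name, the flattened-list representation of the image and mask, and the use of the population standard deviation are assumptions for illustration; the source does not specify these details.

```python
import statistics


def zscore_within_voi(intensities, mask):
    """Z-score normalize the voxels selected by `mask`.

    intensities: flattened image intensities.
    mask: flattened VOI mask of the same length (truthy = inside the VOI).
    Returns the normalized intensities of the VOI voxels, in order.
    """
    voi = [v for v, inside in zip(intensities, mask) if inside]
    if len(voi) < 2:
        raise ValueError("VOI must contain at least two voxels")
    mean = statistics.fmean(voi)
    # Assumption: population standard deviation (divide by N).
    sd = statistics.pstdev(voi)
    if sd == 0:
        raise ValueError("VOI intensities are constant")
    return [(v - mean) / sd for v in voi]
```

Voxels outside the VOI, such as the bright background value in a toy input like `[10, 20, 30, 100]` with mask `[1, 1, 1, 0]`, do not influence the normalization.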

PyRadiomics was used to extract three categories of features: first-order intensity statistics, shape descriptors, and higher-order texture features. Five basic matrices were employed to define texture features: the gray-level co-occurrence matrix (GLCM), gray-level run length matrix (GLRLM), gray-level size zone matrix (GLSZM), gray-level dependence matrix (GLDM), and neighborhood gray-tone difference matrix (NGTDM). In this study, imaging features were extracted from three types of images: original images, wavelet images, and Laplacian of Gaussian (LoG) images. A PyRadiomics parameter file is provided in the GitHub repository to enhance the reproducibility of feature extraction (https://github.com/Zaoqu-Liu/MOFS). Ultimately, 5929 features were extracted from the five MRI sequences, of which 4271 features with ICC ≥ 0.75 were retained.
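The ICC-based feature filter can be sketched as follows. The source does not state which ICC form was used, so this example assumes the two-way random-effects, absolute-agreement, single-measurement form, often written ICC(2,1); the function names and the dictionary layout are likewise hypothetical.

```python
def icc_2_1(ratings):
    """ICC(2,1) for a table with one row per subject and one column per rater."""
    n, k = len(ratings), len(ratings[0])
    if n < 2 or k < 2 or any(len(row) != k for row in ratings):
        raise ValueError("need a rectangular table with n >= 2 and k >= 2")
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]
    # Mean squares for subjects (rows), raters (columns) and residual error.
    msr = k * sum((m - grand) ** 2 for m in row_means) / (n - 1)
    msc = n * sum((m - grand) ** 2 for m in col_means) / (k - 1)
    sse = sum((ratings[i][j] - row_means[i] - col_means[j] + grand) ** 2
              for i in range(n) for j in range(k))
    mse = sse / ((n - 1) * (k - 1))
    denom = msr + (k - 1) * mse + k * (msc - mse) / n
    if denom == 0:
        raise ValueError("ICC is undefined for constant ratings")
    return (msr - mse) / denom


def retain_reliable_features(feature_tables, threshold=0.75):
    """Keep the names of features whose ICC reaches the threshold."""
    return [name for name, table in feature_tables.items()
            if icc_2_1(table) >= threshold]
```

Here each feature maps to a table of repeated measurements, for example the same radiomic feature computed from the original and the redrawn VOI of each resampled patient.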

Hematoxylin and eosin (H&E) histological slide scanning and feature analysis

Pathology slides were scanned at ×20 magnification using a digital pathology scanner (KF-PRO-120-HI) to obtain the original whole slide images (WSI). Subsequently, the original WSI underwent color space conversion, tissue segmentation, patch selection, and feature extraction. Specifically, the WSI at ×5 magnification was converted from RGB to Lab color space, and Otsu’s algorithm was then applied to calculate a threshold for segmenting the tissue from the background. The resulting tissue image was tiled into adjacent, non-overlapping 1024 × 1024 patches at ×20 magnification that together covered the tissue area. The Python package Yottixel was used to select the optimal patches for further analysis38. Finally, CellProfiler (v4.2.5) software was used to extract features from each selected patch.
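Otsu's algorithm chooses the threshold that maximizes the between-class variance of the two resulting pixel groups. The following is a minimal pure-Python sketch of that idea for 8-bit single-channel values; the function name is hypothetical, and the source does not state which Lab channel was thresholded, so the input here is simply a flat list of integers.

```python
def otsu_threshold(values, levels=256):
    """Return the threshold t that maximizes between-class variance.

    values: integer pixel values in the range [0, levels).
    Pixels with value <= t form one class, the rest form the other.
    """
    hist = [0] * levels
    for v in values:
        hist[v] += 1
    total = len(values)
    sum_all = sum(i * h for i, h in enumerate(hist))

    weight_low = 0      # number of pixels with value <= t
    sum_low = 0         # sum of those pixel values
    best_t, best_var = 0, -1.0
    for t in range(levels):
        weight_low += hist[t]
        sum_low += t * hist[t]
        if weight_low == 0:
            continue
        weight_high = total - weight_low
        if weight_high == 0:
            break
        mean_low = sum_low / weight_low
        mean_high = (sum_all - sum_low) / weight_high
        between_var = weight_low * weight_high * (mean_low - mean_high) ** 2
        if between_var > best_var:
            best_var, best_t = between_var, t
    return best_t
```

On a clearly bimodal input, such as dark tissue pixels mixed with bright background pixels, the returned threshold falls between the two modes.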

Whole exome sequencing (WES) and analysis

Tumor tissue and adjacent brain tissue DNA were extracted from samples using the QIAamp Fast DNA Tissue Kit (Qiagen). Blood samples were collected in tubes containing EDTA and centrifuged at 1600 × g for 10 min at 4 °C within 2 h of collection. Peripheral blood lymphocyte (PBL) pellets were stored at −20 °C until further use, and PBL DNA was extracted using the RelaxGene Blood DNA System (Tiangen Biotech Co., Ltd., Beijing, China). DNA quantification was performed using the Qubit 3.0 Fluorometer and Qubit dsDNA HS Assay Kit (Thermo Fisher Scientific, Inc., Waltham, MA, USA). DNA collected from tissue and PBL samples were fragmented using dsDNA Fragmentase enzyme (New England BioLabs, Inc., Ipswich, MA, USA), followed by size selection of DNA fragments (150–250 bp) using Ampure XP beads (Beckman Coulter, Inc., Brea, CA, USA). The KAPA Library Preparation Kit (Kapa Biosystems, Inc., Wilmington, MA, USA) was employed for the construction of DNA fragment libraries. Cleanup steps were performed using Agencourt AMPure XP beads (Beckman Coulter, Inc., Brea, CA, USA). After DNA fragmentation, end repair and 3’ A-tailing were conducted, followed by exon capture using the Agilent SureSelect Human All Exon V6 kit. The Qubit 3.0 Fluorometer and Qubit dsDNA HS Assay Kit were utilized to assess the purity and concentration of DNA fragments. Fragment length was measured using the DNA 1000 kit (Agilent Technologies, Inc., Santa Clara, CA, USA) on a 4200 Bioanalyzer (Agilent Technologies, Inc., Santa Clara, CA, USA). DNA libraries with 150 bp end sequences were sequenced using the Illumina Novaseq 6000 system. Raw data were converted to FASTQ files, and adapter and low-quality reads were trimmed using Trimmomatic (v0.39). We achieved a median coverage depth of 112x for tumor specimens and 128x for non-tumor specimens.

GATK (v4.2) tools were used to identify single nucleotide variants (SNVs) and insertions or deletions (INDELs). Paired-end WES reads were mapped to the human reference genome (hg38) using BWA-mem (v0.7.17). BAM files were further processed by reordering, sorting, marking duplicates, and adding read groups using Picard (v2.24.2). Base quality score recalibration was performed using the BaseRecalibrator module in GATK, followed by the assessment of cross-sample contamination using the GetPileupSummaries and CalculateContamination modules. Somatic variants were detected by Mutect2 and annotated using ANNOVAR, with patient-matched normal DNA sequencing reads serving as reference. Candidate somatic variants were distinguished based on the following filtering criteria: variants outside of exonic regions and splice sites were excluded; variants with a variant allele fraction (VAF) ≥ 5% and at least 2 supporting variant reads in tumor samples were retained; and variants with a minor allele frequency (MAF) ≥ 5% in at least one database, including 1000 Genomes, ESP6500, gnomAD, and ExAC, were removed. Normal samples were sequenced using the same scheme, each sample was reduced to 4%, and then pooled as reference. To obtain high-quality and reliable somatic variants, we employed stringent downstream filtering criteria: variants outside of exonic regions and splice sites were excluded; variants with a VAF ≥ 5%, at least 5 supporting variant reads in tumor samples, and a tumor VAF more than five times the VAF in the normal sample were retained; variants with more than 100 occurrences in COSMIC (v92) were retained; variants with a MAF ≥ 1% in at least one variant database (1000 Genomes, ESP6500, gnomAD, and ExAC) were removed; and variants predicted as benign by at least two of the following tools were removed: MutationAssessor, MutationTaster2, Polyphen2, and SIFT.
Somatic CNVs were inferred by CNVkit (v0.9.9) based on BAM files generated during the somatic mutation detection process, using the default circular binary segmentation algorithm. Segment-level log2 ratios were calculated and transformed as input for the GISTIC2.0 software to identify significantly amplified or deleted chromosomal regions in the tumors. CNV amplifications and deletions were defined using a ± 0.3 log2 ratio threshold.
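Two of the numeric rules in this section lend themselves to a short illustration: the read-level part of the stringent variant filter (tumor VAF ≥ 5%, at least 5 supporting reads, tumor VAF more than five times the normal VAF) and the ±0.3 log2-ratio rule for CNV calls. The sketch below covers only these thresholds; the function names are hypothetical, the annotation-based filters (COSMIC, population databases, pathogenicity predictors) are omitted, and treating a value exactly at ±0.3 as neutral is an assumption.

```python
def passes_read_level_filter(alt_reads, depth, normal_vaf,
                             min_vaf=0.05, min_alt_reads=5,
                             tumor_normal_ratio=5.0):
    """Apply the read-level thresholds of the stringent somatic filter."""
    if depth <= 0:
        return False
    tumor_vaf = alt_reads / depth
    return (tumor_vaf >= min_vaf
            and alt_reads >= min_alt_reads
            and tumor_vaf > tumor_normal_ratio * normal_vaf)


def call_cnv(log2_ratio, threshold=0.3):
    """Label a segment or gene using a symmetric log2-ratio threshold."""
    if log2_ratio > threshold:
        return "amplification"
    if log2_ratio < -threshold:
        return "deletion"
    return "neutral"
```

For instance, a variant with 10 supporting reads at a depth of 100 and a normal VAF of 1% passes the read-level filter, whereas the same variant with a normal VAF of 3% does not, because its tumor VAF is less than five times the normal VAF.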

RNA sequencing (RNA-seq) and analysis

Total RNA was extracted from tissue samples using the TRIzol Reagent Kit (Ambion, Invitrogen, USA). RNA concentration and integrity were assessed using the Qubit RNA Assay Kit, Qubit 2.0 Fluorometer (Life Technologies), and Agilent 2100 Bioanalyzer (Agilent Technologies). Samples with an RNA integrity number greater than 5 were included in the study. Libraries were prepared from samples with high RNA integrity, no contaminants, and sufficient RNA quantity. Poly-T oligonucleotide magnetic beads were used to purify mRNA from total RNA. RNA was fragmented in NEBNext First Strand Synthesis Reaction Buffer (5X) using divalent cations at elevated temperatures. cDNA synthesis, end repair, A-tailing, and NEBNext Adaptor ligation were performed using the NEBNext Ultra RNA Library Prep Kit. Library fragments were purified with AMPure XP (Beckman Coulter, Beverly, USA), selecting cDNA fragments of 150–200 bp in length. Library quality was assessed using the Agilent Bioanalyzer 2100. Libraries were sequenced on the Illumina HiSeq X Ten platform, generating 150 bp paired-end reads. Sequencing data were filtered using Trimmomatic software to remove adaptors and low-quality sequences, followed by data quality assessment using FastQC. STAR (v2.7.6a) was used to align sequences to the reference genome (hg38). Gene expression values were calculated using RSEM (v1.3.3) based on the GENCODE (v35) gene annotation file. HTSeq v0.6.0 was used to count the number of reads aligned to each gene, and gene expression levels were quantified as FPKM (Fragments Per Kilobase of exon model per Million mapped fragments) and TPM (Transcripts Per Million).
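The two expression units differ only in the order of normalization: FPKM divides by library size and gene length, whereas TPM first divides by gene length and then rescales so that each sample sums to one million. The sketch below shows the standard formulas; the function names are hypothetical, and in the study these values were produced by the quantification software rather than computed by hand.

```python
def fpkm(counts, lengths_bp):
    """FPKM per gene: count * 1e9 / (gene length in bp * total mapped fragments)."""
    total = sum(counts)
    if total == 0:
        raise ValueError("no mapped fragments")
    return [c * 1e9 / (length * total) for c, length in zip(counts, lengths_bp)]


def tpm(counts, lengths_bp):
    """TPM per gene: length-normalized rates rescaled to sum to one million."""
    rates = [c / length for c, length in zip(counts, lengths_bp)]
    scale = sum(rates)
    if scale == 0:
        raise ValueError("no mapped fragments")
    return [r / scale * 1e6 for r in rates]
```

Because TPM values always sum to one million within a sample, they are directly comparable as proportions across samples, which is not guaranteed for FPKM.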

Mass spectrometry

Samples were removed from −80 °C storage, and an appropriate amount of tissue was weighed and placed into a liquid nitrogen pre-chilled mortar. Liquid nitrogen was added, and the tissue was thoroughly ground into a powder. Lysis buffer (1% Triton X-100, 1% protease inhibitor, 1% phosphatase inhibitor, 3 μM TSA, 50 mM NAM) was added to each sample at four times the volume of the powder, followed by ultrasonic lysis. Samples were centrifuged at 4 °C, 12,000 × g for 10 min to remove cell debris, and the supernatant was transferred to a centrifuge tube. Protein concentration was determined using a BCA assay kit.

Equal amounts of protein from each sample were digested with trypsin, and the volume was adjusted using lysis buffer. One volume of pre-chilled acetone was added, vortexed, and then four volumes of pre-chilled acetone were added, followed by precipitation at −20 °C for two hours. Samples were centrifuged at 4500 × g for 5 min, and the supernatant was discarded. The pellet was washed twice with pre-chilled acetone. After air-drying the pellet, it was resuspended in 200 mM TEAB, and trypsin was added at a 1:50 ratio (protease: protein, w/w) for overnight digestion. Dithiothreitol (DTT) was added to a final concentration of 5 mM, and samples were reduced at 56 °C for 30 min. Iodoacetamide (IAA) was added to a final concentration of 11 mM, and samples were incubated in the dark at room temperature for 15 min.

Samples were separated using an Agilent 300Extend C18 column (4.6 × 250 mm), with detection at 214 nm, a column oven temperature of 35 °C, and 95% buffer A for 30 min to equilibrate the column. After baseline stabilization, the fractionation gradient method was initiated, and peptide samples were loaded onto the high-performance liquid chromatography (HPLC) fractionation column. Samples were collected at 1-min intervals, with fractions 11 to 46 combined into 12 groups and vacuum-dried. Peptides were dissolved in mobile phase A and separated using an EASY-nLC 1200 ultra-HPLC system. Mobile phase A consisted of a 0.1% formic acid and 2% acetonitrile aqueous solution, while mobile phase B consisted of a 0.1% formic acid and 90% acetonitrile aqueous solution. The gradient was set as follows: 0–96 min, 6%–25% B; 96–114 min, 25%–35% B; 114–117 min, 35%–80% B; and 117–120 min, 80% B, with a flow rate maintained at 500 nl/min. Separated peptides were ionized in the NSI ion source and data were collected using the Orbitrap Exploris 480 mass spectrometer.

Liquid chromatography (LC) parameters were consistent with those used during library construction. Peptides were separated using the ultra-high-performance liquid chromatography system and analyzed using the Orbitrap Exploris 480 mass spectrometer. Precursor ions and their fragment ions were detected and analyzed using the high-resolution Orbitrap. FAIMS compensation voltage (CV) settings were −40 V, −55 V, and −70 V. The primary mass scan range was set at 350–1350 m/z with a resolution of 120,000; the secondary scan resolution was set at 30,000. The secondary data acquisition mode was set to DIA mode, which followed a primary scan with 20 m/z window peptide ions entering the HCD collision cell using 32% collision energy for fragmentation, and subsequent secondary mass analysis. The automatic gain control (AGC) for the secondary spectrum was set at 600%.

Collection and processing of GBM data from public databases

In this study, we aimed to collect as much GBM sequencing data as possible from public databases to validate conclusions and enrich research content. We gathered 16 GBM datasets from the Gene Expression Omnibus (GEO; https://www.ncbi.nlm.nih.gov/geo/) and the Chinese Glioma Genome Atlas (CGGA; http://www.cgga.org.cn/), including GSE72951 (GPL14951), GSE43289 (GPL570), GSE43378 (GPL570), GSE7696 (GPL570), GSE13041 (GPL570 and GPL96), GSE15824 (GPL570), GSE33331 (GPL570), GSE74187 (GPL6480), GSE83300 (GPL6480), GSE4271 (GPL96 and GPL97), GSE4412 (GPL96 and GPL97), CGGA-array (GPL4133), and CGGA-RNAseq (Illumina HiSeq). Transcriptomic expression profiles and clinical data of the samples were obtained from the series matrix files uploaded by the authors. We used the normalizeBetweenArrays function in the limma package to perform quantile normalization on the microarray data. Subsequently, we removed batch effects using the ComBat function in the sva package and merged datasets from the same platform, resulting in the GBM-GPL570 (n = 215), GBM-GPL6480 (n = 110), GBM-GPL96 (n = 326), and GBM-GPL97 (n = 135) cohorts. Finally, each dataset was standardized using the scale function. Raw transcriptome sequencing data of GBM patients with anti-PD-1 immunotherapy information were obtained from the Sequence Read Archive (SRA, https://www.ncbi.nlm.nih.gov/sra) database, as uploaded by Zhao et al. (SRA accession: PRJNA482620)32.
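Quantile normalization forces every sample to share the same empirical distribution: values are ranked within each sample, and each rank is replaced by the mean of the values holding that rank across samples. The sketch below illustrates the basic procedure in Python; it is a simplified stand-in for the R function used in the study, the function name is hypothetical, and unlike a production implementation it does not treat tied values specially.

```python
def quantile_normalize(columns):
    """Quantile-normalize a list of samples (each a list of gene values).

    Every sample must contain the same number of values. Each value is
    replaced by the across-sample mean of the values at the same rank.
    """
    n = len(columns[0])
    if any(len(col) != n for col in columns):
        raise ValueError("all samples must have the same number of values")
    sorted_cols = [sorted(col) for col in columns]
    rank_means = [sum(col[r] for col in sorted_cols) / len(columns)
                  for r in range(n)]
    normalized = []
    for col in columns:
        order = sorted(range(n), key=lambda i: col[i])
        out = [0.0] * n
        for rank, idx in enumerate(order):
            out[idx] = rank_means[rank]
        normalized.append(out)
    return normalized
```

After this step all samples contain exactly the same set of values, differing only in which gene holds which value.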

Multimodal fusion for unsupervised clustering

Integrating multimodal data can reveal causal features that may be obscured in single-modality analyses and can offer a holistic understanding of disease complexity by exploring the interplay across modalities and how these relationships drive differences in patient outcomes such as survival and drug response24,39. Multimodal data fusion strategies can be classified into early, intermediate, and late fusion according to the stage at which the modalities are combined24. Early fusion concatenates all modalities into a single matrix, which can lead to the “curse of dimensionality” and variable shift in subsequent analyses and cannot correct for the imbalance between modalities, with potentially adverse effects on downstream analyses. Late fusion analyzes each omics layer separately and then integrates the results into a consistent output. However, this approach sacrifices the complementary, interactive information across modalities. Intermediate fusion usually integrates and clusters simultaneously to capture the dependency between omics layers, identify multimodal joint clusters, and infer patient stratification and molecular mechanisms24,40. In general, intermediate fusion is more advanced, but it places higher demands on the integration algorithm24. Here we performed intermediate fusion of the multimodal data (FAHZZU1 cohort) with 11 algorithms based on different principles, followed by late fusion of the results from the 11 algorithms to yield the final clustering results (Fig. 1A). Our process is as follows:

  1. Data preprocessing: For mutation data, a binary matrix was generated with “0” representing wild-type and “1” indicating mutation. Copy number variation (CNV) data were analyzed as segment mean values, capturing chromosomal amplifications and deletions. RNA-seq data were preprocessed to log2(FPKM) values. Protein data were similarly normalized to log2 intensity values to ensure comparability. For pathology and radiology data, feature extraction resulted in quantitative metrics. These values represented diverse descriptors, including first-order statistics, texture, and shape-based features for MRI and cellular-level attributes for pathology.

  2. Selection of clustering variables: To determine the optimal number of clustering variables, we first calculated the median absolute deviation (MAD) of variables in each modality layer, including radiology, WSI, transcriptomics, and proteomics. We then selected the top-n variables from each layer and combined them into 2640 variable combinations (Supplementary Data 3). The cluster prediction index (CPI)25 and GAP statistic26 were calculated for each combination, and the optimal number of clusters and input features for the final clustering were determined based on the combination with the highest sum of CPI and GAP. The CPI calculation was implemented using the IntNMF R package. The GAP statistic was calculated using the mogsa R package.

  3. Intermediate fusion of multimodal data was performed using 11 algorithms based on different principles, including CIMLR, CPCA, iClusterBayes, IntNMF, LRAcluster, MCIA, NEMO, PINSPlus, RGCCA, SGCCA, and SNF29 (Supplementary Data 2).

  4. The clustering results were converted into a binary matrix41. Each cluster was represented by a separate column, and each sample was assigned a value of 1 in the column corresponding to its cluster and 0 in all others. The Jaccard index was calculated using the binary results from the 11 algorithms to evaluate the similarity between samples.

  5. Cluster-of-clusters analysis (COCA)24 was used to obtain consensus results from the 11 algorithms based on the Jaccard distance matrix. Specifically, 70% of the samples were randomly selected in each iteration, and this process was repeated 10,000 times to generate a consensus matrix.

  6. The proportion of ambiguous clustering (PAC)27 and the Calinski-Harabasz index (CHI)28 were used to evaluate the fitness of clustering numbers.

  7. Silhouette coefficients42 were calculated for each cluster, and samples with a silhouette coefficient below 0.4 were removed to obtain a core sample set.
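Three of the steps above are simple enough to illustrate directly: ranking variables by MAD, encoding each algorithm's cluster assignment as binary membership, and computing the Jaccard distance between samples from that encoding. The following is a minimal sketch under stated assumptions; the function names, the dictionary-based data layout, and the algorithm labels in the usage note are hypothetical and do not reproduce the study's R implementation.

```python
import statistics


def mad(values):
    """Median absolute deviation of a list of numbers."""
    med = statistics.median(values)
    return statistics.median([abs(v - med) for v in values])


def top_n_by_mad(features, n):
    """Names of the n most variable features (dict: name -> values)."""
    ranked = sorted(features, key=lambda name: mad(features[name]), reverse=True)
    return ranked[:n]


def membership_sets(assignments):
    """One set per sample holding its (algorithm, cluster) memberships.

    assignments: dict mapping algorithm name -> list of cluster labels,
    one label per sample. Each set is the sparse form of that sample's
    row in the binary (one column per algorithm-cluster pair) matrix.
    """
    n_samples = len(next(iter(assignments.values())))
    return [{(alg, labels[i]) for alg, labels in assignments.items()}
            for i in range(n_samples)]


def jaccard_distance(a, b):
    """1 - |A intersect B| / |A union B| for two membership sets."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)
```

With three hypothetical algorithms, two samples that share a cluster in two of the three have a distance of 1 − 2/4 = 0.5, whereas samples that never co-cluster have a distance of 1. The resulting distance matrix is the kind of input that the consensus step then operates on.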

Functional enrichment analysis

This study integrated C2-CP, C5-GO, and Hallmark gene sets from the MSigDB database (http://www.gsea-msigdb.org/gsea/msigdb/). Three methods were used for functional enrichment analysis of transcriptomic or proteomic data from different subtypes, including single-sample gene-set testing (ssGST)11, over-representation analysis (ORA)43, and gene-set enrichment analysis (GSEA)44. ssGST analysis was performed using the yaGST package to obtain normalized enrichment scores (NES) for each pathway in each sample. Then, differential analysis was conducted on each pathway for different subtypes, and pathways with FDR < 0.001 and NES difference >1 were considered significantly enriched. ORA analysis was conducted using default parameters in the Metascape tool. GSEA analysis was performed using the clusterProfiler package, and pathways with FDR < 0.001 were considered significantly enriched.

Genomic alteration analysis

Using the maftools package, mutation data were processed and single nucleotide variations (SNVs), insertions/deletions (INDELs), and tumor mutation burden (TMB) were calculated for each sample. Genes with mutation frequencies greater than 5% across all samples were retained. Broad and focal CNV burden were defined as the sum of CNVs occurring in chromosome arms and focal segments, respectively. Due to the limited number of proteins detected by MS in this study, mRNA expression profiles were associated with CNVs to identify functional CNV genes. Pearson correlation was calculated between the mRNA expression and CNV variation scores for each gene, and genes with FDR < 0.05 and Pearson coefficient > 0.3 were retained (n = 3888). Subsequently, CNV amplification and deletion were defined using a critical value of ±0.3, and genes with alteration rates > 5% were retained. If a certain CNV variation frequency of a gene was relatively high, it was considered as the dominant variation for that gene. Genes with dominant variations more than two times greater than non-dominant variations were retained (n = 2168). Fisher’s exact test was used to test the CNV differences between the three subtypes for each gene, and genes with FDR < 0.05 were retained (n = 1023).

Transcriptome and proteome expression profiling analysis

Gene expression was compared among the three subtypes using the Kruskal-Wallis test, and genes with an FDR < 0.05 were retained. Subsequently, the Wilcoxon rank-sum test was used for pairwise comparisons between subtypes, and genes with an FDR < 0.05 were considered subtype-specific genes.

Development of the MOFS ensemble classifier

Given the abundance of high-quality GBM transcriptome data in public databases, we developed an integrated classification framework based on transcriptome expression profiles to identify MOFS subtypes in external cohorts. The development process was based on the FAHZZU1 cohort as a training set, as illustrated in Fig. S7:

  1. Logistic regression and receiver operating characteristic (ROC) analysis were performed on all genes for each subtype. Genes with an FDR < 0.05 and an area under the ROC curve (AUC) > 0.7 were retained for each subtype.

  2. The Lasso algorithm was used for feature selection and dimension reduction, and genes with non-zero Lasso coefficients were used as input variables for modeling.

  3. The classifier was developed using an ensemble of 17 algorithms, including GST, adaptive boosting (AdaBoost), decision trees (DT), elastic net (Enet), gradient-boosted decision trees (GBDT), k-nearest neighbors (KNN), Lasso, linear discriminant analysis (LDA), naive Bayes (NBayes), neural network (NNet), principal component analysis (PCA), random forest (RF), ridge regression, stepwise logistic regression (StepLR), singular value decomposition (SVD), support vector machine (SVM), and XGBoost. The output of each algorithm was the probability of the three subtypes, and the sum of the probabilities for the three subtypes equaled 1. Algorithms that showed median survival trends inconsistent with the training set were considered ineligible.

  4. For each subtype, the average discriminant probability of all qualified algorithms was used as the final subtype probability, and the subtype with the highest probability was the final classification result (Fig. S7A).

  5. The classifier was initially trained on all samples with the true MOFS labels. Within the FAHZZU1 cohort, we performed a 60:40 random split to generate two independent subsets: FAHZZU1-test1 and FAHZZU1-test2. The resulting confusion matrices for both subsets consistently demonstrated high classification accuracy across the three MOFS subtypes, with a robust alignment between predicted and actual subtype labels (Fig. S7B-C). Subsequently, we assessed the model’s robustness under two distinct normalization schemes, TPM (Transcripts Per Million) and FPKM across multiple datasets. The confusion matrices for both the FAHZZU1 (Fig. S7D) and CGGA-RNAseq (Fig. S7E) cohorts further affirmed that the classifier exhibits comparable accuracy. To ensure the model’s fairness and reliability, we then performed stratified analyses, factoring in a range of clinical and treatment-related variables. The results from these analyses consistently revealed balanced performance across all demographic and treatment subgroups. Taken together, these results demonstrate the stability of our ensemble MOFS classifier.
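The probability-averaging step of the ensemble can be sketched as below. This is a minimal illustration rather than the authors' implementation: the function name `ensemble_predict`, the data layout, and the subtype labels in the usage note are assumptions, and the individual algorithms are taken as already fitted.

```python
def ensemble_predict(per_algorithm_probs, qualified):
    """Average subtype probabilities over qualified algorithms and pick the top one.

    per_algorithm_probs: dict mapping algorithm name -> dict of
                         subtype -> probability (each inner dict sums to 1)
    qualified:           names of algorithms that passed the eligibility check
    """
    qualified = list(qualified)
    if not qualified:
        raise ValueError("at least one qualified algorithm is required")
    subtypes = list(per_algorithm_probs[qualified[0]])
    # Mean probability per subtype across the qualified algorithms only.
    mean_probs = {
        s: sum(per_algorithm_probs[a][s] for a in qualified) / len(qualified)
        for s in subtypes
    }
    label = max(mean_probs, key=mean_probs.get)
    return label, mean_probs
```

Given hypothetical outputs such as `{"RF": {"A": 0.6, "B": 0.3, "C": 0.1}, "SVM": {"A": 0.2, "B": 0.6, "C": 0.2}}` with both algorithms qualified, the averaged probabilities still sum to 1 and the sample is assigned to the subtype with the largest mean.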

Tumor microenvironment (TME) analysis

The ESTIMATE package45 was utilized to evaluate the abundance of immune components, stromal components, and tumor purity in the GBM expression data. The single-sample Gene Set Enrichment Analysis (ssGSEA) algorithm46 was used to calculate the infiltration abundance of immune and stromal cells based on markers from previous single-cell studies of GBM9,47,48. The ssGSEA-based enrichment scores are relative indicators of gene signature activity across samples, rather than reflecting absolute cell counts or histological observations. The immunoregulatory gene set was downloaded from the TISIDB database (http://cis.hku.hk/TISIDB/)49, which includes five categories: antigen presentation, immune co-stimulation, immune checkpoint, chemokines, and receptors. Differences in TME components, cell abundances, and immunoregulatory factors were compared among different subtypes.

To better characterize the cancer-immunity cycle (CIC), we constructed an Immunogram50 consisting of eight aspects: tumor antigenicity, T cell chemotaxis and infiltration, T cell immunity, tumor cell recognition, T cell priming and activation, immune stimulatory factors, immune inhibitory molecules, and cytotoxicity. Tumor antigenicity was represented by log2(TMB). Cytotoxicity was calculated based on the formula proposed by Rooney et al.51, and the other pathways were calculated using published gene sets for the immune cycle and the GSVA package. When plotting the Immunogram, the score of each of the eight immune pathways for each patient was transformed into a Z score. With M and SD denoting the mean and standard deviation of a given score, the final score for each patient was calculated as 3 + 1.5 × (Score − M)/SD, following the original Immunogram method50.
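The rescaling used for the Immunogram axes can be written as a short sketch. The function name is hypothetical, and the use of the sample standard deviation is an assumption, since the text does not state which estimator was used.

```python
from statistics import mean, stdev


def immunogram_scores(raw_scores):
    """Rescale one pathway's scores across patients to 3 + 1.5 * Z.

    Assumption: SD is the sample standard deviation (n - 1 denominator).
    """
    m = mean(raw_scores)
    sd = stdev(raw_scores)
    return [3 + 1.5 * (x - m) / sd for x in raw_scores]
```

Under this transformation a patient at the cohort mean receives a score of 3, and a patient one standard deviation above the mean receives 4.5.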

Tissue microarray

Two glioma tissue microarrays (NGL1001) were purchased from SUPERBIOTEK (Shanghai Superbiotek Pharmaceutical Technology, Shanghai, China). The clinical data of the tumor patients were obtained from the company’s official website.

Immunohistochemical staining

Immunohistochemistry (IHC) experiments were performed using anti-STRAP (Cat No. 18277-1-AP, Proteintech; 1:200) and anti-S100A4 (Cat No. 16105-1-AP, Proteintech; 1:200). Staining percentages were scored as 1 (1–25%), 2 (26–50%), 3 (51–75%), or 4 (76–100%), while staining intensity was scored as 0 (no signal), 1 (light yellow), 2 (brown), or 3 (dark brown). The final IHC score was calculated by multiplying the staining percentage score by the nuclear staining intensity score.
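The scoring scheme above can be expressed as a small sketch. The function names are hypothetical, and because the text does not say how samples with less than 1% positive cells were scored, the sketch simply rejects such input rather than guessing.

```python
def percentage_score(percent_positive: float) -> int:
    """Map the percentage of positively stained cells to a score of 1 to 4."""
    if not 1 <= percent_positive <= 100:
        # Handling below 1% is not specified in the text.
        raise ValueError("percent_positive must be between 1 and 100")
    if percent_positive <= 25:
        return 1
    if percent_positive <= 50:
        return 2
    if percent_positive <= 75:
        return 3
    return 4


def ihc_score(percent_positive: float, intensity: int) -> int:
    """Final IHC score = percentage score x intensity score (0 to 3)."""
    if intensity not in (0, 1, 2, 3):
        raise ValueError("intensity must be 0, 1, 2, or 3")
    return percentage_score(percent_positive) * intensity
```

With this scheme the final score ranges from 0 to 12; for instance, 60% positive cells with brown staining (intensity 2) gives 3 × 2 = 6.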

A deep neural network model based on MRI features to predict MOFS subtypes

In clinical practice, radiological images have advantages over molecular omics data, including convenience, low cost, and non-invasive acquisition. To promote the clinical translation of our work, we built a neural network model trained with resilient backpropagation to predict MOFS subtypes from MRI features. The process was as follows:

  1. Feature selection: For each MOFS subtype, MRI features with a univariate logistic regression P value < 0.01 were retained. Then, the bootstrapping method was used to randomly draw 70% of the samples for logistic regression, and this was repeated 1000 times. Features that remained significant (P < 0.05) in more than 95% of the resampling iterations were retained. Next, the Lasso algorithm was used for further dimensionality reduction and model simplification, and variables with non-zero Lasso coefficients were retained as the input variables for modeling.

  2. Hyperparameter optimization: We divided the FAHZZU1 cohort into training and testing sets in a 7:3 ratio. We used the neuralnet package to construct the neural network model, with parameters including learning rate, loss function, activation function, number of hidden layers, and number of nodes in each layer. We performed hyperparameter optimization via grid search, selecting the parameter combination with the highest accuracy on the testing set as the final model.

  3. Model validation: Validation was performed using the confusion matrix and ROC analysis on the training set, testing set, FAHZZU2 validation set, and FAHZZU3 validation set.
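The resampling filter in the feature-selection step can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: `pvalue_fn` is a hypothetical user-supplied function that fits a univariate logistic regression on a subset of samples and returns the P value for one feature, and the 70% subset is assumed to be drawn without replacement.

```python
import random


def stable_features(samples, features, pvalue_fn, n_iter=1000,
                    frac=0.7, alpha=0.05, min_freq=0.95, seed=0):
    """Keep features that are significant in more than min_freq of resamples.

    samples:   list of sample identifiers
    features:  list of candidate feature names
    pvalue_fn: hypothetical callable (subset, feature) -> P value
    """
    rng = random.Random(seed)
    k = round(frac * len(samples))
    hits = {f: 0 for f in features}
    for _ in range(n_iter):
        subset = rng.sample(samples, k)  # assumption: without replacement
        for f in features:
            if pvalue_fn(subset, f) < alpha:
                hits[f] += 1
    return [f for f in features if hits[f] / n_iter > min_freq]
```

The features returned by this filter would then be passed to the Lasso step for further reduction.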

Statistical analysis

All data processing, statistical analyses, and graphing were performed using R software (version 4.2.2). The Wilcoxon rank-sum test or t test was used to compare continuous variables between two groups, while the Kruskal-Wallis test or analysis of variance was used to compare continuous variables among three groups. Fisher’s exact test was used for categorical variables. The correlation between two continuous variables was evaluated using the Spearman or Pearson correlation coefficient. Survival analysis and Kaplan-Meier curve plotting were performed using the survival and survminer packages. P values from multiple comparisons were adjusted using the Benjamini-Hochberg method to control the FDR. All statistical tests were two-sided, with P < 0.05 indicating statistical significance.
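For reference, the Benjamini-Hochberg adjustment can be sketched in a few lines. The analyses in this study were run in R; the Python function below is only an illustrative equivalent of the standard procedure, and its name is an assumption.

```python
def bh_adjust(pvalues):
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    n = len(pvalues)
    # Indices of the P values sorted from smallest to largest.
    order = sorted(range(n), key=lambda i: pvalues[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * n / rank)
        adjusted[i] = running_min
    return adjusted
```

A test is then called significant at a given FDR threshold when its adjusted value falls below that threshold, for example 0.05 or 0.001 as used in the analyses above.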

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.