Fig. 4: Multilevel Multimodal Interpretation for GMLF.

a Modality-level importance attributions across all patients in the hold-out test dataset, obtained with a SHAP-based interpretation approach applied to a modality-level proxy model. b SHAP-based modality-level importance attribution for a representative patient (SAEAMD-0BS5RI-A1). c Comparison of prediction scores between responder and non-responder groups for the three individual unimodal branches of our multimodal framework GMLF, namely Neural Embeddings (NE), Cell-type and Morphology (CM), and Gene Expression (GE), and for the overall GMLF prediction score for response to NAC. P-values in the boxplots were computed using the Mann-Whitney U test, with “*” indicating P < 0.05. d Gene (per alias) importance attributions across all patients in the hold-out test dataset, determined by applying SHAP to a proxy model that takes as input the gene expression feature vector alongside predictions from the two GNN branches. The top 20 genes are shown. e Gene set enrichment analysis of the top 111 genes selected according to their SHAP-based importance attributions. Statistical significance was assessed by the hypergeometric test, using the full list of investigated genes as background. f Visualization of node importance for the cell-type and morphology branch overlaid on the original H&E slide for slide SADREE-0BGNRK-1A, correctly predicted as pathological complete response (pCR). g Representative patches around the top 10th quantile of nodal importance associated with non-pCR (top row) and pCR (bottom row), annotated with HoVer-Net-estimated cell types, for the same slide as in (f). h Analysis of cell-type-specific distributions based on the most contributive patches, i.e., the top 25% extremes of patch importance per slide. 
Boxplots show the average patch-level cell counts or tumor-stromal ratios for non-pCR-predictive (red) or pCR-predictive (blue) patches, normalized by the slide-wide average of the same patch-level attribute, with each point representing a distinct slide. The dotted line marks the average patch-level attribute (cell count or tumor-stromal ratio) for a given slide, i.e., no enrichment for a particular cell type.
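
The two quantitative steps named in panels e and h can be illustrated with a minimal sketch. It is not the authors' code: the function names, arguments, and toy numbers are hypothetical, and the ranking rule for panel h (keep the top fraction of patches by importance toward one class) is an assumption about how the "top 25% extremes" were selected.

```python
from math import comb
from statistics import mean


def hypergeom_enrichment_p(background, in_set, selected, overlap):
    """Upper-tail hypergeometric P-value, P(X >= overlap).

    background: number of genes in the background list
    in_set:     background genes annotated to the gene set
    selected:   number of top-ranked genes (e.g. 111 in panel e)
    overlap:    selected genes that fall in the gene set
    """
    upper = min(in_set, selected)
    total = comb(background, selected)
    tail = sum(
        comb(in_set, k) * comb(background - in_set, selected - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


def normalized_enrichment(values, importance, top_fraction=0.25):
    """Mean attribute of the most important patches over the slide-wide mean.

    values:     per-patch attribute (cell count or tumor-stromal ratio)
    importance: per-patch importance toward one class (assumed ranking key)
    A result of 1.0 means no enrichment; the slide-wide mean must be non-zero.
    """
    n_top = max(1, round(len(values) * top_fraction))
    ranked = sorted(range(len(values)), key=lambda i: importance[i], reverse=True)
    top_mean = mean(values[i] for i in ranked[:n_top])
    return top_mean / mean(values)


if __name__ == "__main__":
    # Toy numbers only, not data from the study.
    print(hypergeom_enrichment_p(background=10, in_set=3, selected=2, overlap=1))
    print(normalized_enrichment([4, 2, 2, 0], [0.9, 0.1, 0.2, 0.3]))
```

In this sketch a ratio above 1 would place a slide above the dotted line in panel h, and a ratio below 1 would place it under the line.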