Fig. 1: Experimental design, study overview and cohort characterization.

a A suspected case of colorectal cancer (CRC) prompts a biopsy or surgical resection to obtain a tissue sample. The sample is digitized into a whole slide image (WSI) for analysis by a clinical deep learning system, which could pre-screen for MSI and POLE cases, pending external validation and regulatory approval as a medical device.

b Our deep learning pipeline starts by tessellating WSIs into smaller tiles and discarding non-informative background areas. We then extract n feature vectors from n color-normalized tiles, which range in size from 100,000 to 50,000 pixels across three color channels. These vectors are compressed into a more compact feature space and processed by a two-layer, eight-head Vision Transformer (ViT). A ‘class token’ is trained alongside the tile features to generate the final MSI prediction. To aid pathological evaluation, we create heatmaps that visualize the regions the ViT’s attention mechanism focuses on.

c Molecular characterization of the TCGA (The Cancer Genome Atlas) cohort with respect to MSI and MSS (MSI-L/MSS). Combinations of microsatellite status and POLE/POLD1 (“d”: driver) mutations are shown (orange: POLE driver mutation, blue: MSI-H, yellow: MSI-L/MSS).

d Molecular characterization of the APHP (Assistance Publique–Hôpitaux de Paris, resection and biopsy) cohorts with respect to MSI and MSS cases. Combinations of microsatellite status and POLE/POLD1 (“d”: driver) mutations are shown (orange: POLE driver mutation, blue: MSI-H, yellow: MSS).

The icons in all panels were obtained from www.flaticon.com.
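The tessellation step in panel b can be illustrated with a minimal sketch. It is not the pipeline used in the study: the function name `tessellate`, the grayscale list-of-rows input, and the brightness threshold used to decide what counts as background are all assumptions made for illustration, since the caption does not state how background tiles are identified.

```python
from statistics import mean


def tessellate(image, tile_size, background_threshold=220):
    """Split a grayscale image (list of rows, values 0-255) into square tiles.

    Assumption: a tile is "background" when its mean intensity is at or
    above `background_threshold` (bright, empty glass). Partial tiles at
    the right and bottom edges are skipped.
    Returns a list of ((top, left), tile) pairs for the tiles that are kept.
    """
    height = len(image)
    width = len(image[0]) if height else 0
    kept = []
    for top in range(0, height - tile_size + 1, tile_size):
        for left in range(0, width - tile_size + 1, tile_size):
            # Cut out one tile_size x tile_size block.
            tile = [row[left:left + tile_size]
                    for row in image[top:top + tile_size]]
            # Keep the tile only if it is darker than the background cutoff.
            if mean(v for row in tile for v in row) < background_threshold:
                kept.append(((top, left), tile))
    return kept


# Toy 4x4 "slide": dark tissue in the top-left corner, white elsewhere.
toy_slide = [
    [50, 50, 255, 255],
    [50, 50, 255, 255],
    [255, 255, 255, 255],
    [255, 255, 255, 255],
]
tissue_tiles = tessellate(toy_slide, tile_size=2)
```

In this toy example only the top-left tile survives the filter. In the pipeline described in the caption, each retained tile would then be color-normalized and passed to a feature extractor before the ViT aggregates the resulting vectors.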