Fig. 1

Overview of the proposed workflow. Stage I extracts non-overlapping \(256 \times 256\) patches from the larger \(4000 \times 4000\) field-of-view (FOV) images, followed by resizing and generation of cellular density maps with the TILSeg-MobileViT segmentation model. In post-processing, the patch-level density maps are stitched together to reconstruct a map at the original \(4000 \times 4000\) FOV. Stage II classifies these larger FOV images into three TILs categories using joint embeddings of the raw H&E images and their corresponding cellular density maps.
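
The Stage I tiling and stitching steps can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names `extract_patches` and `stitch_patches` are hypothetical, and because 4000 is not a multiple of 256, the sketch assumes zero-padding of the bottom and right edges before tiling, with the padding cropped away after stitching. How the edges are actually handled is not stated in the caption. The segmentation model itself is omitted; any per-tile function could be applied between the two calls.

```python
import numpy as np

PATCH = 256  # patch side length, as given in the caption


def extract_patches(image, patch=PATCH):
    """Split an H x W (x C) image into non-overlapping patch x patch tiles.

    Assumption (not from the source): the bottom and right edges are
    zero-padded up to the next multiple of `patch`.
    """
    h, w = image.shape[:2]
    pad_h = (-h) % patch  # rows needed to reach a multiple of patch
    pad_w = (-w) % patch  # columns needed to reach a multiple of patch
    pad = [(0, pad_h), (0, pad_w)] + [(0, 0)] * (image.ndim - 2)
    padded = np.pad(image, pad, mode="constant")
    rows, cols = padded.shape[0] // patch, padded.shape[1] // patch
    # Tiles are collected in row-major order.
    tiles = [
        padded[r * patch:(r + 1) * patch, c * patch:(c + 1) * patch]
        for r in range(rows)
        for c in range(cols)
    ]
    return np.stack(tiles), (rows, cols)


def stitch_patches(tiles, grid, out_shape, patch=PATCH):
    """Reassemble row-major tiles into one image and crop off the padding."""
    rows, cols = grid
    canvas = np.zeros(
        (rows * patch, cols * patch) + tiles.shape[3:], dtype=tiles.dtype
    )
    for idx, tile in enumerate(tiles):
        r, c = divmod(idx, cols)
        canvas[r * patch:(r + 1) * patch, c * patch:(c + 1) * patch] = tile
    return canvas[:out_shape[0], :out_shape[1]]
```

Under the padding assumption, a \(4000 \times 4000\) input is padded to \(4096 \times 4096\) and split into a \(16 \times 16\) grid of 256 tiles; stitching the unmodified tiles and cropping recovers the input exactly.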