Fig. 1: Study design.

Dataset use and workflow from algorithm development through validation and deployment. Whole-slide images (WSIs) from three cohorts were used for model development: 407 from The Cancer Genome Atlas (TCGA) consortium, and 3161 from BLC3001 (NCT03390504) and 184 from BLC2002 (NCT03473743), both erdafitinib trials7,8. A subset of 350 samples from the BLC3001 cohort (150 FGFR+, 200 FGFR-; enriched for FGFR+ to achieve ~93% statistical power), the trial whose population most closely matches the deployment setting, and 188 samples from ANNAR (NCT03955913)6, the deployment trial, were held out for Retrospective Validation, which was performed after the algorithm had been packaged into a deployable device and onboarded onto the deployment platform. No patient contributed samples to both Development and Retrospective Validation. An additional cohort of 361 slides from multiple tumor tissues (i.e., PAN-Tumor), obtained from a data vendor, was used after deployment of the tool to evaluate the algorithm's performance on solid tumors as an exploratory analysis.