Fig. 1: Datasets and study design for HER2 status and trastuzumab response classification.

a Datasets generated and used for training and testing the models. b Number of tiles in each class. Whole TCGA-BRCA slides were used as an independent test set (only the proportion of tiles from tumor regions is shown here). For the response model, only tumor regions were used for training and testing; the proportion of tiles in each class is depicted here. c The main steps for slide preprocessing and model training. Our pathology team performed quality checks and annotated the ROIs in every slide to distinguish HER2+ tumor regions, HER2− tumor regions, and non-tumor regions. In the preprocessing step, slides were tiled into 512 × 512 pixel windows, and background tiles were removed. For both Yale cohorts, data were randomly split into 70% for training and 30% for testing. The TCGA-BRCA cohort was used to independently validate the HER2 status prediction model. Data augmentation and color normalization were used to increase reproducibility. Classes were balanced by down- and up-sampling. Tiles were randomly shuffled and converted into TFRecords to train the Inception v3-based model. Test data were used to assess model performance. Predictions were visualized on WSIs as heatmaps.
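
The preprocessing steps in panel c (tiling, background removal, 70/30 split, class balancing) can be illustrated with a minimal NumPy sketch. This is not the authors' pipeline. The function names, the near-white background rule and its thresholds, the dropping of partial edge tiles, and the choice of the mean class size as the balancing target are all assumptions made for illustration. The caption does not state whether the split was done per slide or per tile, so the split below operates on generic item indices.

```python
import numpy as np

TILE = 512  # tile edge in pixels, as stated in the caption


def tile_image(image, tile=TILE):
    """Cut an H x W x 3 array into non-overlapping tile x tile windows.

    Partial tiles at the right and bottom edges are dropped (an assumption).
    """
    h, w = image.shape[:2]
    return [
        image[y:y + tile, x:x + tile]
        for y in range(0, h - tile + 1, tile)
        for x in range(0, w - tile + 1, tile)
    ]


def is_background(tile_rgb, white_level=220, max_white_fraction=0.5):
    """Hypothetical rule: a tile is background if most pixels are near-white."""
    gray = tile_rgb.mean(axis=2)
    return (gray > white_level).mean() > max_white_fraction


def split_70_30(n_items, rng):
    """Randomly split item indices into 70% training and 30% testing."""
    idx = rng.permutation(n_items)
    cut = int(round(0.7 * n_items))
    return idx[:cut], idx[cut:]


def balance_classes(labels, rng):
    """Return shuffled indices in which every class appears equally often.

    Large classes are down-sampled without replacement and small classes are
    up-sampled with replacement. The target size (mean class count) is an
    assumption, not a value taken from the paper.
    """
    labels = np.asarray(labels)
    classes, counts = np.unique(labels, return_counts=True)
    target = int(counts.mean())
    picked = []
    for c, n in zip(classes, counts):
        members = np.flatnonzero(labels == c)
        picked.append(rng.choice(members, size=target, replace=bool(n < target)))
    return rng.permutation(np.concatenate(picked))
```

In practice, balancing would be applied to the training portion only, so that the test set keeps its original class distribution; the caption does not specify this detail.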