Figure 3

Using TraCE to detect shortcuts in deep predictive models. In this experiment, we synthetically introduced a nuisance feature (the text PNEUMONIA overlaid in the top-left corner) into all images from the abnormal group and trained the predictive model on this data. Because machine-learned solutions are entirely data-driven, there is a risk that the model infers a decision rule based on this irrelevant feature to discriminate between the normal and abnormal groups. (a–d) We used randomly chosen query images from the normal class and generated counterfactuals for the abnormal class. In each case, we show the query image, the counterfactual explanation from TraCE, and the absolute difference image between the two. (e, f) We introduced the nuisance feature into CXR images from the abnormal group and synthesized counterfactuals for the normal class. We observe that TraCE can effectively detect such shortcuts: counterfactuals for changing the diagnosis state are predominantly based on manipulating the text in the top-left corner of the query images.
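
The data manipulation and the difference image described in this caption can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes 8-bit grayscale images handled with Pillow and NumPy, and the function names (`add_text_shortcut`, `absolute_difference`, `top_left_fraction`), the text position, and the corner size are all hypothetical choices. It does not implement TraCE itself; the counterfactual image would come from that method.

```python
import numpy as np
from PIL import Image, ImageDraw


def add_text_shortcut(image: Image.Image, text: str = "PNEUMONIA") -> Image.Image:
    """Return a grayscale copy of `image` with `text` drawn near the top-left corner.

    The (4, 4) offset and Pillow's default font are illustrative assumptions.
    """
    marked = image.convert("L").copy()
    ImageDraw.Draw(marked).text((4, 4), text, fill=255)
    return marked


def absolute_difference(a: Image.Image, b: Image.Image) -> np.ndarray:
    """Pixel-wise absolute difference between two images of equal size."""
    # Cast to a signed type so the subtraction cannot wrap around.
    arr_a = np.asarray(a.convert("L"), dtype=np.int16)
    arr_b = np.asarray(b.convert("L"), dtype=np.int16)
    return np.abs(arr_a - arr_b).astype(np.uint8)


def top_left_fraction(diff: np.ndarray, frac: float = 0.25) -> float:
    """Share of the total difference that falls inside the top-left corner.

    `frac` is the corner's height and width as a fraction of the image size
    (a hypothetical summary statistic, not one reported in the source).
    """
    total = float(diff.sum())
    if total == 0.0:
        return 0.0
    h, w = diff.shape
    corner = diff[: int(h * frac), : int(w * frac)]
    return float(corner.sum()) / total
```

Under these assumptions, a value of `top_left_fraction` close to 1 for the difference between a query image and its counterfactual would indicate that the change is concentrated in the text region, which is the pattern the caption describes.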