Fig. 3: Attention analysis.

a, Attention allocated to each type of input for a patient with COPD: the radiograph, chief complaint (ChiComp), laboratory test results (LabTest) and demographics. b, Relative importance of individual laboratory test items. c, Comparison of the importance of sex and age in making a diagnostic decision. d, Visualization of the attention assigned to individual pixels in the radiograph. Left: input chest X-ray. Right: attention value of each pixel. e, The impact of cross attention on the relevance and importance of high-ranking words (from chief complaints) and image patches (from radiographs) in the pulmonary disease identification task. High-ranking words and patches are those whose tokens rank in the top 25% by cosine similarity with the CLS token. f, Normalized importance of every word in the chief complaint. g, Visualization of the distribution of attention between every image patch and each of the three top-ranked words. The colour bars in d and g indicate the confidence of IRENE that a pixel is abnormal: a bright colour denotes high confidence and a dark colour denotes low confidence.
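
The selection rule used in panel e (tokens in the top 25% by cosine similarity with the CLS token) can be illustrated with a minimal NumPy sketch. This is not the authors' code: the function name `high_ranking_tokens`, the `fraction` parameter, the rounding-up of the cut-off and the toy embeddings are all assumptions made for illustration, and the caption does not say which layer's embeddings are used or whether words and patches are ranked separately.

```python
import numpy as np

def high_ranking_tokens(token_embeddings, cls_embedding, fraction=0.25):
    """Return (indices, similarities) for the tokens most similar to CLS.

    Hypothetical helper: keeps the top `fraction` of tokens ranked by
    cosine similarity with the CLS embedding (cut-off rounded up).
    """
    tokens = np.asarray(token_embeddings, dtype=float)  # shape (n_tokens, dim)
    cls = np.asarray(cls_embedding, dtype=float)        # shape (dim,)

    # Cosine similarity = dot product divided by the product of the norms.
    # The small constant guards against division by zero (an assumption).
    norms = np.linalg.norm(tokens, axis=1) * np.linalg.norm(cls)
    sims = (tokens @ cls) / (norms + 1e-12)

    # Number of tokens to keep: at least one, rounded up.
    k = max(1, int(np.ceil(fraction * len(sims))))

    # Sort by descending similarity and keep the first k indices.
    order = np.argsort(-sims, kind="stable")
    return order[:k], sims

# Toy example with four 2-dimensional token embeddings (made-up values).
tokens = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [-1.0, 0.0]])
cls = np.array([1.0, 0.0])
top_idx, sims = high_ranking_tokens(tokens, cls)
print(top_idx, sims)
```

In this sketch the same function would be applied to word tokens and to image-patch tokens; whether the ranking is done per modality or over all tokens jointly is left open here.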