Fig. 1: Adversarially trained models are robust against adversarial attacks.

Adversarial perturbations of increasing strength (ϵ) were generated via a projected gradient descent (PGD) attack. To demonstrate the impact of adversarial attacks on state-of-the-art classifiers, we trained a ResNet-50 model on a large chest X-ray dataset (CheXpert) containing nearly 200,000 X-rays. a Original unmanipulated chest radiograph. b Adversarial noise with ϵ = 0.002. c Manipulated chest radiograph (original radiograph + noise), i.e. adversarial example. d The standard model was easily misled by small adversarial perturbations (b) that are imperceptible to the human eye in the resulting image (c), and its accuracy in classifying the disease dropped drastically as more pronounced perturbations were allowed. e The adversarially trained model (ϵ set to 0.005 during training) showed only a limited amount of performance degradation under the same attacks.
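For illustration, the following is a minimal PGD sketch in PyTorch, not the authors' implementation; the function name pgd_attack and the parameters eps, alpha and steps are hypothetical, and it assumes inputs scaled to [0, 1] and an L-infinity perturbation bound as is standard for PGD.

```python
import torch

def pgd_attack(model, x, y, loss_fn, eps=0.002, alpha=None, steps=10):
    """Illustrative PGD attack (not the paper's code): iteratively perturb x
    to increase the loss while keeping the perturbation inside an
    L-infinity ball of radius eps around the original image."""
    alpha = alpha if alpha is not None else eps / 4  # per-step size (assumed)
    x_adv = x.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = loss_fn(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        # take a signed-gradient ascent step on the loss
        x_adv = x_adv.detach() + alpha * grad.sign()
        # project back into the eps-ball around x and the valid pixel range
        x_adv = torch.min(torch.max(x_adv, x - eps), x + eps)
        x_adv = torch.clamp(x_adv, 0.0, 1.0)
    return x_adv.detach()
```

Under this sketch, adversarial training would replace clean batches with examples produced by such an attack at a fixed budget (here ϵ = 0.005) during optimization.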