Fig. 1: CNN models achieve melanoma discrimination equivalent to or exceeding dermatologists across known and new benchmarks. | npj Digital Medicine

Fig. 1: CNN models achieve melanoma discrimination equivalent to or exceeding dermatologists across known and new benchmarks.

From: Stress testing reveals gaps in clinic readiness of image-based diagnostic artificial intelligence models

Fig. 1

a Model A performs comparably to mean of dermatologists (gray circles) and previously published algorithms14,15 (orange diamonds) by ROC curves on the external MClass-D and MClass-ND benchmarks and our VAMC-T benchmark. No previous algorithm has been evaluated on VAMC-T. b AUROC is shown for each ensemble model and each benchmark, with darker shades corresponding to higher values. Labels show AUROC values and 95% confidence intervals, with highest per test dataset in bold. ROC curves from (a) are boxed. Differences in AUROC between models were not statistically significant. Abbreviations: AUROC area under the receiver operating characteristic curve, CNN convolutional neural network, D Dermoscopic, ISIC International Skin Imaging Collaboration, MClass Melanoma Classification Benchmark, ND Non-dermoscopic, PH2 Hospital Pedro Hispano, ROC Receiver operating characteristic, UCSF University of California, San Francisco, VAMC-C Veterans Affairs Medical Center clinic, VAMC-T Veterans Affairs Medical Center teledermatology.

Back to article page