Fig. 5: Test results for adverse events bleeding and perforation. | npj Digital Medicine

From: How machine learning on real world clinical data improves adverse event recording for endoscopy

a Test results (AUC-ROC and AUC-PR) for the adverse events bleeding and perforation. The model was trained on a training set (n = 1990) with labels generated by a large language model and evaluated on a manually labeled test set (n = 500). Error bars cannot be computed directly for this setup, because random subsampling would require manual labels for all cases. b Estimated errors when labels generated by a large language model are used for both training (n = 1990) and testing (n = 500). The procedure is repeated for 100 iterations of random subsampling, with a different split into training and test data in each iteration. AUC-ROC and AUC-PR, together with the dummy-classifier baseline, are reported as mean values; the error bars in the plot show the standard deviations.
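The repeated random subsampling described for panel b can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the data are synthetic, `fit_centroid_scorer` is a hypothetical stand-in for the actual model, the dummy baseline is taken to be the positive-class prevalence in each test split (the AUC-PR of a constant-score predictor), and the sample standard deviation is used for the error bars. Only the split sizes (1990/500) and the 100 iterations come from the caption.

```python
import random
import statistics

def auc_roc(labels, scores):
    """AUC-ROC as the Mann-Whitney statistic; tied pairs count as 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def auc_pr(labels, scores):
    """Average precision: mean precision at the rank of each positive case."""
    ranked = sorted(zip(scores, labels), key=lambda t: -t[0])
    hits, total = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank
    return total / hits

def fit_centroid_scorer(X, y):
    """Hypothetical stand-in model: project onto the class-mean difference."""
    dims = range(len(X[0]))
    pos = [x for x, t in zip(X, y) if t == 1]
    neg = [x for x, t in zip(X, y) if t == 0]
    w = [statistics.fmean(x[d] for x in pos) - statistics.fmean(x[d] for x in neg)
         for d in dims]
    return lambda x: sum(wi * xi for wi, xi in zip(w, x))

def repeated_subsampling(X, y, n_train=1990, n_test=500, n_iter=100, seed=0):
    """Re-split, re-train, and re-score n_iter times; return (mean, sd) per metric."""
    rng = random.Random(seed)
    idx = list(range(len(y)))
    roc, pr, dummy_pr = [], [], []
    for _ in range(n_iter):
        rng.shuffle(idx)  # a different train/test split in each iteration
        train, test = idx[:n_train], idx[n_train:n_train + n_test]
        score = fit_centroid_scorer([X[i] for i in train], [y[i] for i in train])
        y_test = [y[i] for i in test]
        s_test = [score(X[i]) for i in test]
        roc.append(auc_roc(y_test, s_test))
        pr.append(auc_pr(y_test, s_test))
        # Assumed dummy baseline: prevalence of positives in the test split.
        dummy_pr.append(sum(y_test) / len(y_test))
    runs = {"auc_roc": roc, "auc_pr": pr, "dummy_auc_pr": dummy_pr}
    return {k: (statistics.fmean(v), statistics.stdev(v)) for k, v in runs.items()}

def make_synthetic(n=2490, prevalence=0.1, seed=1):
    """Synthetic cases: positives are shifted by +1 in each of 3 features."""
    rng = random.Random(seed)
    y = [int(rng.random() < prevalence) for _ in range(n)]
    X = [[rng.gauss(1.0 * t, 1.0) for _ in range(3)] for t in y]
    return X, y

X, y = make_synthetic()
results = repeated_subsampling(X, y)
for name, (mean, sd) in results.items():
    print(f"{name}: {mean:.3f} ± {sd:.3f}")
```

Because every split here reuses the same pool of labels, the spread reflects sampling variability under those labels only. This matches the caption's point that the same procedure is not available for panel a, where manual labels exist for the 500 test cases alone.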
