Table 2 Performance of deep learning NLP models to characterize statin nonuse from unstructured clinical notes in persons with ASCVD.

Task	Dataset	Precision*	Recall*	F1 score*	AUC*
Binary classification of statin use	10-fold cross-validation (N = 1,393)	0.88 (0.86–0.90)	0.82 (0.77–0.87)	0.85 (0.83–0.87)	0.94 (0.93–0.95)
	Test set (N = 349)	0.87 (0.82–0.91)	0.82 (0.76–0.88)	0.84 (0.81–0.88)	0.94 (0.93–0.96)
Two-step classifier* for statin nonuse reasons	10-fold cross-validation (N = 800)	0.63 (0.59–0.65)	0.62 (0.54–0.72)	0.62 (0.59–0.64)	0.84 (0.81–0.85)
	Test set (N = 200)	0.68 (0.63–0.75)	0.69 (0.60–0.79)	0.68 (0.62–0.75)	0.88 (0.86–0.91)
Multilabel classification of statin nonuse reasons (simple multilabel model)	10-fold cross-validation (N = 800)	0.60 (0.58–0.64)	0.61 (0.56–0.66)	0.59 (0.56–0.63)	0.85 (0.83–0.87)
	Test set (N = 200)	0.64 (0.61–0.70)	0.66 (0.60–0.73)	0.64 (0.58–0.71)	0.86 (0.82–0.89)

*The two-step classifier represents the predicted probabilities of multiple classifiers (each reason for statin nonuse versus others) reconciled by a Random Forest.
ASCVD atherosclerotic cardiovascular disease, NLP natural language processing.
