Fig. 5: Comparison of the performances between Fu-LLM (finetune_qwen2_7b) and the SVM models (SVM_TFIDF, SVM_TFIDF_wo_aug, SVM_Word2Vec and SVM_Word2Vec_wo_aug) in the study dataset. | Nature Communications

From: A large language model for clinical outcome adjudication from telephone follow-up interviews: a secondary analysis of a multicenter randomized clinical trial

NPV, negative predictive value; PPV, positive predictive value; SVM, support vector machine; SVM_W2V, SVM_Word2Vec. Within-group differences in overall agreement, sensitivity and specificity between Fu-LLM and the SVM models were assessed using Cochran’s Q statistic. The NPV and PPV of Fu-LLM were compared with those of the SVM models using the χ² test. All statistical tests were two-sided, with significance set at p < 0.05. p values were adjusted by Bonferroni correction. Source data are provided as a Source data file.
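
The legend names Cochran’s Q statistic and Bonferroni correction but gives no computational detail. The following is a minimal sketch of both, not the authors' code. It assumes each case is coded 1 when a model's adjudication agrees with the reference and 0 otherwise. The function names (`cochran_q`, `chi2_sf_df1`, `bonferroni`), the toy data and the choice of four comparisons (one per SVM model) are illustrative assumptions. The p value helper covers only the pairwise case (one degree of freedom), and the χ² test for NPV and PPV is not shown.

```python
import math


def cochran_q(rows):
    """Return (Q, df) for binary outcomes.

    rows: one tuple per case, one 0/1 entry per model
    (assumed coding: 1 = model agrees with the reference).
    """
    k = len(rows[0])
    if k < 2 or any(len(r) != k for r in rows):
        raise ValueError("each case needs the same number (>= 2) of model outcomes")
    col_totals = [sum(r[j] for r in rows) for j in range(k)]
    row_totals = [sum(r) for r in rows]
    n = sum(row_totals)
    denom = k * n - sum(t * t for t in row_totals)
    if denom == 0:
        # Happens when every case has identical outcomes across all models.
        raise ValueError("Q is undefined: no discordant cases")
    q = (k - 1) * (k * sum(c * c for c in col_totals) - n * n) / denom
    return q, k - 1


def chi2_sf_df1(x):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))


def bonferroni(p_values, n_tests=None):
    """Multiply each p value by the number of tests, capped at 1."""
    m = len(p_values) if n_tests is None else n_tests
    return [min(1.0, p * m) for p in p_values]


# Toy data (hypothetical): 8 cases, columns = (model A, model B).
toy = [(1, 1), (1, 0), (1, 0), (1, 0), (0, 1), (1, 1), (0, 0), (1, 0)]
q, df = cochran_q(toy)
p_raw = chi2_sf_df1(q)  # valid here because df == 1
# Assumed: four pairwise comparisons, one per SVM model.
p_adj = bonferroni([p_raw], n_tests=4)[0]
print(f"Q = {q:.3f}, df = {df}, raw p = {p_raw:.4f}, adjusted p = {p_adj:.4f}")
```

With two models, Cochran’s Q reduces to McNemar’s statistic without continuity correction, which is why a one-degree-of-freedom p value is enough for a pairwise comparison. Comparing more than two models at once would need a general χ² survival function, for example from a statistics library.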
