Fig. 3: Comparison of Framework Performance.
From: A GPT-4o-powered framework for identifying cognitive impairment stages in electronic health records

USE Framework: Keyword-based sentence extraction with Universal Sentence Encoder (USE) embeddings and XGBoost classification. DementiaBERT Framework: Keyword-based sentence extraction with DementiaBERT embeddings (fine-tuned on dementia-related clinical language) and XGBoost classification. Hybrid Framework: GPT-4o-generated summaries with DementiaBERT embeddings and XGBoost classification. GPT-4o-Powered Framework: An end-to-end approach using GPT-4o-generated summaries and GPT-4o classification.

(a) Comparison of Weighted Cohen’s Kappa Scores of the Four Models: Bar plot of weighted Cohen’s kappa scores for the four models across 10 cross-validation folds. Each bar represents the kappa score of one model on one fold.

(b) Multi-Metric Evaluation of the Four Models’ Performance: Table summarizing the performance of each model on three evaluation metrics: Cohen’s kappa score, Spearman’s rank correlation, and Baccianella’s adapted MSE. Mean and standard deviation values are reported over the 10 folds.

(c) Box Plot of the Weighted Cohen’s Kappa Scores of the Four Models Stratified by Sex: Comparison of kappa scores across the four models, stratified by sex (male and female). The p-values come from statistical tests for differences in performance between the male and female groups.
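
For readers unfamiliar with the headline metric, the following is a minimal sketch of how a weighted Cohen’s kappa can be computed per fold and then summarized as a mean and standard deviation. It is not the authors' code. The function name `weighted_kappa`, the quadratic weighting default, the three-class label set, the use of the sample standard deviation, and the fold data are all assumptions made for illustration; the caption does not specify the weighting scheme or the number of stages.

```python
from statistics import mean, stdev


def weighted_kappa(y_true, y_pred, n_classes, weights="quadratic"):
    """Weighted Cohen's kappa for ordinal labels coded as 0..n_classes-1.

    The weighting scheme is an assumption: "quadratic" or "linear".
    """
    n = len(y_true)
    if n == 0 or n != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and equal length")
    if n_classes < 2:
        raise ValueError("n_classes must be at least 2")

    # Observed joint distribution of (true label, predicted label).
    observed = [[0.0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        observed[t][p] += 1.0 / n

    # Marginal distributions, used to build the chance-agreement table.
    row = [sum(observed[i]) for i in range(n_classes)]
    col = [sum(observed[i][j] for i in range(n_classes)) for j in range(n_classes)]

    def penalty(i, j):
        # Disagreement penalty grows with the distance between ordinal stages.
        d = abs(i - j) / (n_classes - 1)
        return d if weights == "linear" else d * d

    pairs = [(i, j) for i in range(n_classes) for j in range(n_classes)]
    num = sum(penalty(i, j) * observed[i][j] for i, j in pairs)
    den = sum(penalty(i, j) * row[i] * col[j] for i, j in pairs)
    if den == 0:
        return float("nan")  # undefined when both raters use a single class
    return 1.0 - num / den


# Hypothetical ordinal labels for two made-up folds; not data from the paper.
folds = [
    ([0, 1, 2, 2, 1, 0], [0, 1, 2, 1, 1, 0]),
    ([2, 2, 1, 0, 0, 1], [2, 1, 1, 0, 1, 1]),
]
fold_kappas = [weighted_kappa(t, p, n_classes=3) for t, p in folds]

# Mean and sample standard deviation across folds, as a panel-b style summary.
summary = (mean(fold_kappas), stdev(fold_kappas))
```

A weighted kappa penalizes a prediction that is two stages away from the reference more heavily than one that is a single stage away, which is why it suits ordered categories better than plain accuracy.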