Fig. 2: Relationship between training cases per DRG and prediction accuracy by DRG-LLaMA. | npj Digital Medicine

From: DRG-LLaMA: tuning LLaMA model to predict diagnosis-related group for hospitalized patients

Results from DRG-LLaMA-7B with a maximum input token size of 512. a Scatter plot of top-5 prediction accuracy versus DRG rank by number of training cases. The Y-axis is the top-5 prediction accuracy of each DRG label. The X-axis is the rank of the 723 DRGs by their number of training cases, where the DRG ranked 1st has the most training cases and the DRG ranked 723rd has the fewest. Black dots indicate individual DRGs. The solid line represents the relationship estimated by a smoothing spline (generalized cross-validation score: 0.055). The gray shaded area denotes a 95% Bayesian confidence interval for the smoothing spline estimated function. As expected, DRG-LLaMA's performance declined for less frequent DRGs. b Boxplot of training cases per DRG, grouped by prediction accuracy. DRGs are grouped by range of top-5 prediction accuracy, as shown on the X-axis. The Y-axis is the number of training cases per DRG. The green line represents the median value; the box limits show the interquartile range (IQR) from the first (Q1) to the third (Q3) quartile; the whiskers extend to the furthest data point within Q1 - 1.5*IQR (bottom) and Q3 + 1.5*IQR (top). DRG groups with better prediction performance generally have a greater number of training cases, although there is a large variance in the number of training cases within the best-performing group.
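
The two quantities plotted here, per-DRG top-5 accuracy (panel a) and the box and whisker limits (panel b), can be illustrated with a minimal Python sketch. It is not the authors' code: the function names, the input layout (one ranked list of predicted labels per case), and the use of linearly interpolated quartiles are all assumptions made for illustration, and the caption does not say how the quartiles were computed.

```python
from collections import defaultdict


def top5_accuracy_per_label(true_labels, ranked_predictions):
    """Per-label top-5 accuracy (hypothetical helper, not from the paper).

    ranked_predictions[i] is assumed to list the predicted labels for
    case i in descending order of model score.
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    for truth, ranked in zip(true_labels, ranked_predictions):
        totals[truth] += 1
        if truth in ranked[:5]:  # correct if the true label is among the top 5
            hits[truth] += 1
    return {label: hits[label] / totals[label] for label in totals}


def quantile(sorted_values, q):
    """Linearly interpolated quantile of an ascending, non-empty list.

    Assumption: the caption does not state the quartile method used.
    """
    pos = (len(sorted_values) - 1) * q
    lower = int(pos)
    upper = min(lower + 1, len(sorted_values) - 1)
    frac = pos - lower
    return sorted_values[lower] + frac * (sorted_values[upper] - sorted_values[lower])


def box_stats(values):
    """Median, quartiles, and whisker ends as described for panel b."""
    data = sorted(values)
    q1, median, q3 = (quantile(data, q) for q in (0.25, 0.5, 0.75))
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    # Whiskers stop at the furthest observed points inside the fences.
    inside = [v for v in data if low_fence <= v <= high_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whisker_low": inside[0],
        "whisker_high": inside[-1],
    }
```

For example, `box_stats([1, 2, 3, 4, 5, 6, 7, 8, 100])` places the upper whisker at 8 rather than 100, because 100 lies beyond Q3 + 1.5*IQR. In a boxplot that point would be drawn as an outlier.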