Fig. 2: Comparison of accuracy and efficiency of conventional, LLM-only, and LLM-assisted methods in extracting data and assessing ROB using Moonshot-v1-128k.
From: Language models for data extraction and risk of bias assessment in complementary medicine

This figure compares the accuracy (correct rate) and efficiency (time spent per RCT) of three methods for data extraction and risk of bias (ROB) assessment: conventional, LLM-only, and LLM-assisted. For data extraction, the conventional method had an estimated accuracy of 95.3% and took 86.9 min per RCT. The LLM-only method achieved an accuracy of 95.1% and took only 96 s per RCT, while the LLM-assisted method had the highest accuracy at 97.9% and took 14.7 min per RCT. For ROB assessment, the conventional method had an estimated accuracy of 90.0% and took 10.4 min per RCT. The LLM-only method achieved an accuracy of 95.7% and took only 42 s per RCT, while the LLM-assisted method had the highest accuracy at 97.3% and took 5.9 min per RCT. These results indicate that the LLM-assisted method can achieve higher accuracy than the conventional method while taking substantially less time; the LLM-only method was the fastest for both tasks.
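The relative gains implied by these figures can be worked out directly. The following is a minimal sketch, not part of the original analysis: it takes only the accuracy and time values quoted in the caption, converts the two times reported in seconds to minutes, and computes each LLM method's accuracy difference (in percentage points) and speedup factor relative to the conventional method. The names `RESULTS` and `compare_to_conventional` are hypothetical and introduced here for illustration only.

```python
# Figures quoted in the caption: (accuracy in %, time per RCT in minutes).
# Times reported in seconds (96 s, 42 s) are converted to minutes.
# Conventional-method accuracies are the caption's "estimated" values.
RESULTS = {
    "data_extraction": {
        "conventional": (95.3, 86.9),
        "llm_only": (95.1, 96 / 60),
        "llm_assisted": (97.9, 14.7),
    },
    "rob_assessment": {
        "conventional": (90.0, 10.4),
        "llm_only": (95.7, 42 / 60),
        "llm_assisted": (97.3, 5.9),
    },
}


def compare_to_conventional(task):
    """Return {method: (accuracy gain in points, speedup factor)} for a task."""
    base_acc, base_min = RESULTS[task]["conventional"]
    return {
        method: (acc - base_acc, base_min / minutes)
        for method, (acc, minutes) in RESULTS[task].items()
        if method != "conventional"
    }


if __name__ == "__main__":
    for task in RESULTS:
        for method, (gain, speedup) in compare_to_conventional(task).items():
            print(f"{task:16s} {method:13s} {gain:+.1f} points, {speedup:.1f}x faster")
```

Under these inputs, the LLM-assisted method comes out roughly 5.9 times faster than the conventional method for data extraction (+2.6 points) and roughly 1.8 times faster for ROB assessment (+7.3 points), while the LLM-only method is roughly 54 and 15 times faster, respectively. These ratios are derived here from the caption's numbers and are not reported in the figure itself.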