Fig. 2: Accuracy, error rate, and hallucination rate for each named entity with the GPT4o 2POP method.

The figure shows the accuracy (orange), error rate (pink), and hallucination rate (yellow) for each of the 49 mCODE named entities extracted by GPT4o using the 2POP method on 1,000 synthetic clinical notes. Error and hallucination rates were higher for complex biomedical entities, whereas demographic fields such as patient gender and patient birth date were extracted with near-perfect accuracy.