Fig. 4: The accuracy of each model and method on all named entities.

The figure is a heatmap of accuracy scores for 49 mCODE named entities, with one cell per entity and model-method pair. Darker shades indicate higher accuracy; GPT4o combined with the 2POP method performs consistently well across entities.