Fig. 5: Performance of LLMs in challenging cases.

Challenging cases (N = 39). The Challenging benchmark, along with the LLMs’ reasoning and classifications, can be found in Supplementary Data 1.
