Extended Data Fig. 3: Comparing model outputs on open-ended question answering, example 1.
From: A multimodal generative AI copilot for human pathology

An example question from PathQABench-Public regarding uveal melanoma, for which the response by PathChat is ranked higher (preferred by expert pathologists) than those of the other models because it clearly, correctly, and fully addresses the query. The other models misidentify the location the image is from, describe the image incorrectly, or give responses so general as to be unhelpful. Scale bar is 200 µm.