Extended Data Fig. 3: Evaluation of GPT-4 in recovery of GO-CC and GO-MF names.
From: Evaluation of large language models for discovery of gene set function

a, Cumulative number of GO-CC term names recovered by GPT-4 (y-axis) at a given similarity percentile (x-axis), where 0 is least similar and 100 is most similar. Blue curve: semantic similarities between GPT-4 names and assigned GO-CC term names. Grey dashed curve: semantic similarities between GPT-4 names and random GO-CC term names. The red dotted line marks that 642 of the 1,000 sampled GO-CC names are recovered by GPT-4 at the 95th similarity percentile. b, As in a, but for GO-MF terms rather than GO-CC. The red dotted line marks that 757 of the 1,000 sampled GO-MF names are recovered by GPT-4 at the 95th similarity percentile.
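
The counts marked by the red dotted lines can be illustrated with a minimal sketch. The caption does not state how a similarity percentile is computed, so this sketch assumes that the similarity between a GPT-4 name and its assigned term name is ranked against the similarities between that GPT-4 name and a background set of term names, and that a name counts as recovered when its percentile is at or above the threshold. The function names `similarity_percentile` and `count_recovered` and the toy values are hypothetical and are not taken from the paper.

```python
def similarity_percentile(assigned_sim, background_sims):
    """Percentile (0-100) of one similarity score within a background set.

    Assumption: the percentile is the share of background similarities
    that are less than or equal to the assigned-term similarity.
    """
    if not background_sims:
        raise ValueError("background_sims must not be empty")
    n_le = sum(1 for s in background_sims if s <= assigned_sim)
    return 100.0 * n_le / len(background_sims)


def count_recovered(percentiles, threshold=95.0):
    """Number of names whose similarity percentile is at or above threshold.

    Assumption: 'recovered at the 95th percentile' means percentile >= 95.
    """
    return sum(1 for p in percentiles if p >= threshold)


# Toy values for illustration only (not data from the figure).
toy_percentiles = [99.0, 95.0, 94.9, 10.0]
n_recovered = count_recovered(toy_percentiles, threshold=95.0)
```

Under this reading, applying `count_recovered` to the percentiles of the 1,000 sampled names in each panel would give the value marked by the red dotted line, and sweeping `threshold` across the x-axis range would trace a cumulative curve.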