Extended Data Fig. 5: Comparison between GeneAgent and the conventional GSEA method on the NeST (n = 50) and MSigDB (n = 56) datasets. | Nature Methods


From: GeneAgent: self-verification language agent for gene-set analysis using domain databases


a, ROUGE scores obtained by GeneAgent and GSEA. The standard deviation (SD) on each bar was calculated using 9-fold cross-validation based on batch size (bs) sampling, with bs = 20 for both NeST and MSigDB. The ROUGE scores corresponding to each batch size are also presented in the figure. The central value of the error bars represents the mean score across all samples. The results are presented as mean ± SD. b, Distribution of similarity scores obtained by GeneAgent and GSEA. The middle points represent the mean values; the bounds of the inner boxes of each violin plot represent the upper and lower percentiles; and the whiskers represent the minimum and maximum points across all data samples. P values were calculated using a one-tailed t-test with 95% confidence intervals. The specific p-value for the 106 gene sets is 4.5 × 10⁻⁵. The GSEA results were reproduced using the g:Profiler API.
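The error bars in panel a can be read as the spread of batch-level means around the overall mean. The following is a minimal sketch of that idea, not the authors' code: it assumes each of the 9 folds is a random batch of 20 scores drawn without replacement, which the caption does not specify. The function name `batch_mean_sd`, the seed, and the synthetic scores are all hypothetical.

```python
import random
import statistics


def batch_mean_sd(scores, batch_size=20, n_batches=9, seed=0):
    """Return (overall_mean, sd_of_batch_means) for per-gene-set scores.

    Assumption: each batch is an independent random sample of
    `batch_size` scores, drawn without replacement within the batch.
    """
    if batch_size > len(scores):
        raise ValueError("batch_size cannot exceed the number of scores")
    rng = random.Random(seed)
    batch_means = [
        statistics.mean(rng.sample(scores, batch_size))
        for _ in range(n_batches)
    ]
    overall_mean = statistics.mean(scores)  # centre of the error bar
    sd = statistics.stdev(batch_means)      # sample SD across the batches
    return overall_mean, sd


# Synthetic placeholder scores, not values from the paper.
demo_scores = [0.10 + 0.005 * i for i in range(50)]
center, sd = batch_mean_sd(demo_scores)
print(f"{center:.3f} ± {sd:.3f}")
```

With bs = 20 and 50 or 56 gene sets per dataset, batches drawn this way overlap heavily, so the SD reflects sampling variability of the batch mean rather than the spread of individual gene-set scores.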

