Extended Data Fig. 5: Benchmarking of embedding quality against existing methods on the COVID-19 dataset.

a, Adjusted Rand Index (ARI). b, Normalized Mutual Information (NMI). c, Cell type average silhouette width (ASW). d, Isolated cell type F1 score. e, Isolated cell type ASW. f, Graph cLISI score. g, SCIB overall bio-conservation score. h, Silhouette score. i, Davies-Bouldin index (DBI); a lower DBI signifies better clustering. j, Label score. From left to right, the benchmarked methods are UNAGI, GraphSCC, scGEN, scGGAN, scGPT, Geneformer, scGNN, scVI, Seurat and SCANPY. The boxes represent the interquartile ranges (IQRs), and the solid lines indicate the medians. The whiskers extend to the most extreme points within 1.5 IQRs of the lower and upper quartiles. The experiments in panels a-j were run with different seeds (n = 10).
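The box plot convention described in this legend (box spanning the IQR, line at the median, whiskers reaching the most extreme points within 1.5 IQRs of the quartiles) can be illustrated with a minimal sketch. The function name `box_stats` and the input values are hypothetical and are not taken from the figure. The quartile interpolation rule used here (`method="inclusive"`) is an assumption, and plotting libraries may compute quartiles slightly differently.

```python
import statistics


def box_stats(values):
    """Summarise one method's per-seed scores the way a Tukey-style box plot does.

    `values` is a hypothetical list of scores, one per seed (at least two values).
    """
    data = sorted(values)
    # Three quartile cut points; the interpolation rule is an assumption.
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    # Whiskers stop at the most extreme observed points inside the fences.
    whisker_low = min(x for x in data if x >= lower_fence)
    whisker_high = max(x for x in data if x <= upper_fence)
    # Points beyond the fences would be drawn individually as outliers.
    outliers = [x for x in data if x < lower_fence or x > upper_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "iqr": iqr,
        "whisker_low": whisker_low,
        "whisker_high": whisker_high,
        "outliers": outliers,
    }


if __name__ == "__main__":
    # Made-up values for ten seeds, chosen only to show an outlier.
    example_scores = [1, 2, 3, 4, 5, 6, 7, 8, 9, 30]
    print(box_stats(example_scores))
```

With ten values per method, as in n = 10 here, a single extreme seed can fall outside the whiskers without affecting the box or the median line.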