Table 1 Performance of different simulators in generating realistic scRNA-seq data using the PBMC-CTL dataset

From: GRouNdGAN: GRN-guided simulation of single-cell RNA-seq data using causal generative adversarial networks

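| Simulator | Cosine distance | Euclidean distance | MMD | RF AUROC | miLISI |
|---|---|---|---|---|---|
| GRouNdGAN | **0.00057** | **182** | **0.026** | **0.54** | **1.891** |
| scGAN [24] | 0.00095 | 222 | 0.031 | 0.59 | 1.888 |
| scDESIGN2 [25] | 0.00100 | 229 | 0.065 | 0.76 | 1.736 |
| SPARsim [26] | 0.00104 | 235 | 0.309 | 0.95 | 1.625 |
| Control | 0.00019 | 99 | 0.012 | 0.50 | 1.909 |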

The metrics are calculated between a simulated dataset of 1000 cells and the held-out test set of 1000 real cells (see Supplementary Data 1 – Sheet 2 for training set performance). Each gene in the imposed GRN of GRouNdGAN is regulated by 15 TFs (the GRN was constructed using GRNBoost2 from the experimental training set). For the first three metrics, a value closer to zero is preferred; for RF AUROC, a value closer to 0.5 is preferred; and for miLISI, a value closer to 2 is preferred. For the first two metrics, the values correspond to the distance between the mean centroids of the real and simulated cells. The RF AUROC of the control corresponds to the performance of a random classifier, which is the ideal value here. The other control metrics are calculated between the two halves of the real test dataset. Best performance values (excluding control) are in bold-face.
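To make the first two columns concrete, the following is a minimal sketch of a centroid-based cosine and Euclidean distance, assuming each dataset is a cells-by-genes matrix and the centroid is the per-gene mean over cells. The function names and the toy values are hypothetical, and the preprocessing used for the table (normalization, gene selection) is not specified here, so this illustrates the idea rather than the authors' pipeline.

```python
import math


def centroid(cells):
    """Per-gene mean over all cells (rows = cells, columns = genes)."""
    n_cells = len(cells)
    return [sum(gene_values) / n_cells for gene_values in zip(*cells)]


def euclidean_distance(u, v):
    """Straight-line distance between two expression profiles."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def cosine_distance(u, v):
    """One minus the cosine similarity of two expression profiles."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


# Hypothetical toy data: 2 cells x 3 genes per dataset (not from the paper).
real_cells = [[1.0, 0.0, 2.0], [3.0, 0.0, 2.0]]
simulated_cells = [[2.0, 1.0, 2.0], [2.0, 3.0, 2.0]]

real_centroid = centroid(real_cells)        # per-gene means of the real cells
sim_centroid = centroid(simulated_cells)    # per-gene means of the simulated cells

euclid = euclidean_distance(real_centroid, sim_centroid)
cosine = cosine_distance(real_centroid, sim_centroid)
print(f"Euclidean: {euclid:.3f}, cosine: {cosine:.5f}")
```

Both distances are zero when the two centroids coincide, which is why values closer to zero are preferred. Because only the centroids are compared, these two metrics do not capture differences in the spread or shape of the two distributions; the MMD, RF AUROC, and miLISI columns address that.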