Extended Data Table 4 Benchmarking cell embeddings, using scGraph

From: Limitations of cell embedding metrics assessed using drifting islands

| Method | HVG | Brain | Breast | COVID | Eye | Gut (F) | Heart | Lung (F,D) | Lung (F,O) | Lung | Pancreas | Skin |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Harmony | | 0.168 | 0.739 | 0.770 | 0.405 | 0.538 | 0.763 | 0.511 | 0.284 | 0.700 | 0.520 | 0.465 |
| Harmony | ✓ | 0.427 | 0.736 | 0.804 | 0.515 | 0.696 | 0.552 | 0.570 | 0.356 | 0.781 | 0.431 | 0.694 |
| Scanorama | | 0.239 | 0.645 | 0.776 | 0.522 | 0.706 | 0.628 | 0.594 | 0.263 | 0.351 | 0.439 | 0.559 |
| Scanorama | ✓ | 0.250 | 0.694 | 0.760 | 0.534 | 0.635 | 0.554 | 0.622 | 0.201 | 0.309 | 0.291 | 0.465 |
| BBKNN | | 0.091 | 0.644 | 0.775 | 0.524 | 0.596 | 0.684 | 0.579 | 0.314 | 0.685 | 0.563 | 0.626 |
| BBKNN | ✓ | 0.166 | 0.658 | 0.771 | 0.456 | 0.736 | 0.627 | 0.693 | 0.550 | 0.689 | 0.445 | 0.690 |
| scVI | | 0.065 | 0.632 | 0.719 | 0.393 | 0.650 | 0.316 | 0.493 | 0.478 | 0.704 | 0.378 | 0.387 |
| scVI | ✓ | 0.254 | 0.690 | 0.752 | 0.314 | 0.649 | 0.588 | 0.499 | 0.453 | 0.674 | 0.506 | 0.567 |
| scANVI | | 0.116 | 0.647 | 0.757 | 0.408 | 0.626 | 0.350 | 0.567 | 0.552 | 0.672 | 0.390 | 0.386 |
| scANVI | ✓ | 0.396 | 0.735 | 0.763 | 0.517 | 0.600 | 0.569 | 0.585 | 0.509 | 0.678 | 0.394 | 0.436 |
| scGen | ✓ | 0.217 | 0.600 | 0.779 | 0.436 | 0.526 | 0.606 | 0.331 | 0.161 | 0.337 | 0.354 | 0.692 |
| scPoli | | - | - | 0.573 | 0.431 | 0.588 | 0.679 | 0.394 | 0.572 | 0.588 | 0.311 | 0.462 |
| scPoli | ✓ | 0.295 | 0.519 | 0.672 | 0.360 | 0.594 | 0.706 | 0.455 | 0.590 | 0.518 | 0.401 | 0.422 |
| Geneformer | | - | - | - | 0.524 | 0.747 | 0.449 | 0.604 | 0.265 | 0.479 | - | 0.540 |
| scGPT | ✓ | - | - | 0.535 | 0.256 | 0.447 | 0.487 | 0.552 | 0.388 | 0.390 | - | 0.378 |
| Author’s | | 0.295 | 0.689 | - | 0.284 | 0.641 | 0.702 | 0.500 | - | 0.640 | - | 0.472 |
| Islander | | -0.071 | -0.032 | 0.361 | -0.335 | 0.098 | 0.013 | -0.011 | 0.022 | -0.061 | 0.234 | -0.093 |

“F”, “D” and “O” represent fetal, donor and organoid, respectively. “-” means the embeddings are not available, owing to memory limitations (>500 GB of RAM) or the unavailability of raw counts or Ensembl IDs (used in Geneformer and scGPT). We bold the highest and underline the lowest scores for each dataset.
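
The conventions in the note above (a “-” cell is a missing embedding, and each dataset has one highest and one lowest score) can be checked mechanically. The following is a minimal sketch, not the authors' code: it copies the Brain column from the table, in row order, and the helper names `parse_score` and `best_and_worst` are hypothetical, introduced only for illustration.

```python
import math

# Brain-column cells copied from the table, in row order.
# Methods that appear twice correspond to the two rows reported for them.
brain_column = [
    ("Harmony", "0.168"), ("Harmony", "0.427"),
    ("Scanorama", "0.239"), ("Scanorama", "0.250"),
    ("BBKNN", "0.091"), ("BBKNN", "0.166"),
    ("scVI", "0.065"), ("scVI", "0.254"),
    ("scANVI", "0.116"), ("scANVI", "0.396"),
    ("scGen", "0.217"),
    ("scPoli", "-"), ("scPoli", "0.295"),
    ("Geneformer", "-"),
    ("scGPT", "-"),
    ("Author's", "0.295"),
    ("Islander", "-0.071"),
]


def parse_score(cell):
    """Convert a table cell to a float; "-" (embedding unavailable) becomes NaN."""
    cell = cell.strip()
    return math.nan if cell == "-" else float(cell)


def best_and_worst(column):
    """Return the (method, score) pairs with the highest and lowest available score."""
    parsed = [(method, parse_score(cell)) for method, cell in column]
    available = [(m, s) for m, s in parsed if not math.isnan(s)]
    best = max(available, key=lambda pair: pair[1])
    worst = min(available, key=lambda pair: pair[1])
    return best, worst


best, worst = best_and_worst(brain_column)
print("highest:", best)
print("lowest:", worst)
```

Missing cells are mapped to NaN and filtered out before comparison, so an unavailable embedding is never mistaken for a low score. The same function applies unchanged to any other dataset column.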