Table 1 Overview of the reference tasks for benchmarking strategies for cross-species integration of scRNA-seq data

From: Benchmarking strategies for cross-species integration of single-cell RNA sequencing data

| Task name | Species | Technology | Number of cells per batch | Number of analysed O2O homologs | Number of analysed O2M and M2M homologs | Challenge presented |
| --- | --- | --- | --- | --- | --- | --- |
| Pancreas_hs_mm [45] | Homo sapiensᵃ, Mus musculus | inDrop | Homo sapiens: 8,402; Mus musculus: 1,875 | 11,248 | 664 | Basic performance |
| Hippocampus_hs_mu_ss [33] | Homo sapiens, Macaca mulatta, Sus scrofa | snRNA-seq | Homo sapiens: 9,170, 8,214, 7,907, 6,893, 6,091, 5,555; Macaca mulatta: 19,092, 8,678, 8,337; Sus scrofa: 13,168, 12,869, 10,814 | 12,557 | 24 | Large data size, complex cell type structure |
| Embryo_xt_dr [34,35] | Xenopus tropicalis, Danio rerio | inDrops | Xenopus tropicalis: 123,632; Danio rerio: 36,627 | 5,188 | 1,749 | Whole-body data, challenging homology, developmental trajectory |
| Heart_hs_mf [1,3] | Homo sapiens, Macaca fascicularis | scRNA-seq with 10× Genomics 3′ V3.1 (H. sapiens), snRNA-seq with DNBelab C Series Single-Cell Library Prep Set (M. fascicularis) | Homo sapiens: 12,747; Macaca fascicularis: 10,465 | 12,998 | 398 | Cross-study and cross-technology integration |
| Heart_hs_mf_mm [1,3,36] | Homo sapiens, Macaca fascicularis, Mus musculus | Same as Heart_hs_mf, plus scRNA-seq with 10× Genomics 3′ V2 (M. musculus) | Same as Heart_hs_mf, plus Mus musculus: 7,402 | 11,202 | 199 | Cross-study and cross-technology integration |
| Heart_hs_mf_mm_xl [1,3,36,46] | Homo sapiens, Macaca fascicularis, Mus musculus, Xenopus laevis | Same as Heart_hs_mf_mm, plus microwell-seq (X. laevis) | Same as Heart_hs_mf_mm, plus Xenopus laevis: 21,427 | 6,609 | 99 | Evolutionarily distant species, challenging homology |
| Heart_hs_mf_mm_xl_dr [1,3,5,36,46] | Homo sapiens, Macaca fascicularis, Mus musculus, Xenopus laevis, Danio rerio | Same as Heart_hs_mf_mm_xl, plus microwell-seq (D. rerio) | Same as Heart_hs_mf_mm_xl, plus Danio rerio: 3,193 | 4,267 | 82 | Evolutionarily distant species, challenging homology |

  1. The “Number of cells per batch” refers to the cells from each batch that passed the quality control criteria of the original publication and were included in the integration. The “Number of analysed O2O homologs” and “Number of analysed O2M and M2M homologs” count the genes analysed in each integration task that satisfy all of the following criteria: (1) quantified in all datasets involved, (2) mappable to an ENSEMBL gene ID, and (3) annotated with homology across all studied species. To identify one-to-one orthologs and in-paralogs, including one-to-many and many-to-many orthologs, we used the ENSEMBL multiple species comparison tool (version 106), accessed via biomaRt (v2.46.3). O2O, one-to-one; O2M, one-to-many; M2M, many-to-many.
  2. ᵃ The scientific names of species are in italic formatting.
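
The O2O, O2M and M2M categories in the footnote can be illustrated with a minimal Python sketch. It assumes the homology annotation has already been exported as a list of gene-ID pairs between two species, and it labels each pair by counting how many partners each gene has. This is a simplified, pair-level approximation for illustration, not the procedure used in the study, which relied on the ENSEMBL annotation retrieved through biomaRt. The function names `classify_homologs` and `keep_quantified` and all gene IDs are hypothetical.

```python
from collections import defaultdict


def classify_homologs(pairs):
    """Label each (gene_a, gene_b) homology pair as 'O2O', 'O2M' or 'M2M'.

    A pair is one-to-one when both genes have exactly one partner,
    many-to-many when both genes have several partners,
    and one-to-many otherwise. (Pair-level approximation.)
    """
    pairs = set(pairs)  # ignore duplicated annotation rows
    partners_a = defaultdict(set)
    partners_b = defaultdict(set)
    for a, b in pairs:
        partners_a[a].add(b)
        partners_b[b].add(a)

    labels = {}
    for a, b in pairs:
        n_a = len(partners_a[a])  # partners of the species-A gene
        n_b = len(partners_b[b])  # partners of the species-B gene
        if n_a == 1 and n_b == 1:
            labels[(a, b)] = "O2O"
        elif n_a > 1 and n_b > 1:
            labels[(a, b)] = "M2M"
        else:
            labels[(a, b)] = "O2M"
    return labels


def keep_quantified(labels, genes_a, genes_b):
    """Keep only pairs whose genes are quantified in both datasets (criterion 1)."""
    return {
        pair: kind
        for pair, kind in labels.items()
        if pair[0] in genes_a and pair[1] in genes_b
    }


# Hypothetical toy annotation between two species.
toy_pairs = [
    ("hsA", "mmA"),                      # one-to-one
    ("hsB", "mmB1"), ("hsB", "mmB2"),    # one-to-many
    ("hsC1", "mmC1"), ("hsC1", "mmC2"),  # many-to-many
    ("hsC2", "mmC1"), ("hsC2", "mmC2"),
]
toy_labels = classify_homologs(toy_pairs)
toy_kept = keep_quantified(toy_labels, {"hsA", "hsB"}, {"mmA", "mmB1"})
```

Classifying on the full annotation before filtering keeps a gene's homology type tied to the annotation rather than to which genes happen to be detected in a given dataset; whether the study applied the steps in this order is not stated in the footnote.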