Fig. 6: Integrating heart tissue from human, long-tailed macaque, mouse, Xenopus and zebrafish.
From: Benchmarking strategies for cross-species integration of single-cell RNA sequencing data

a Unscaled integrated score, species mixing score and biology conservation score of all strategies for the four heart tasks. The batch metrics and biology conservation metrics are not min-max scaled per task, so that scores can be compared across tasks. Each point shows the average score across the three homology mapping methods for each integration algorithm, and error bars indicate the standard deviation. b Divergence times among the studied species, in millions of years. c The isolated label F1 score of the different strategies in the four heart tasks. Nearest neighbour-based (NN-based) results (n = 9) come from fastMNN, SeuratV4 CCA and SeuratV4 RPCA; the other results (n = 18) come from non-NN-based algorithms. The bar in the boxplot shows the arithmetic mean, the lower and upper hinges correspond to the first and third quartiles, and the whiskers extend from each hinge to the most extreme value no further than 1.5 × the interquartile range from that hinge. There are no outliers. d The ALCS of SeuratV4 CCA O2O integrated data per species for each heart task. A high ALCS indicates a strong loss of cell type distinguishability due to overcorrection (see Methods for details). e UMAP visualisation of SeuratV4 CCA O2O integrated data in the heart tasks, coloured by species and by cell type. Unscaled SM and BC scores are also shown. Iso F1, isolated label F1 score; NN, nearest neighbour; ALCS, accuracy loss of cell type self-projection; SM, species mixing score; BC, biology conservation score; O2O, using only one-to-one orthologs; hs, Homo sapiens, human; mf, Macaca fascicularis, long-tailed macaque; mm, Mus musculus, mouse; xl, Xenopus laevis, African clawed frog; dr, Danio rerio, zebrafish.
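
The boxplot convention described for panel c can be sketched in a few lines of Python. This is an illustration only, not the authors' plotting code: the function name `box_stats` and the toy input values are hypothetical, and the quartiles use NumPy's default linear interpolation, which may differ slightly from the hinge definition of the plotting library used for the figure.

```python
import numpy as np

def box_stats(values):
    """Summary statistics for a boxplot whose centre bar is the mean
    and whose whiskers stop at 1.5 * IQR from each hinge."""
    x = np.asarray(values, dtype=float)
    # Hinges: first and third quartiles (NumPy default interpolation; an assumption).
    q1, q3 = np.percentile(x, [25, 75])
    iqr = q3 - q1
    # Whiskers end at the most extreme data point that lies within 1.5 * IQR of the hinge.
    lower_whisker = x[x >= q1 - 1.5 * iqr].min()
    upper_whisker = x[x <= q3 + 1.5 * iqr].max()
    # Anything beyond the whiskers would be drawn as an outlier.
    outliers = x[(x < lower_whisker) | (x > upper_whisker)]
    return {
        "mean": x.mean(),
        "q1": q1,
        "q3": q3,
        "lower_whisker": lower_whisker,
        "upper_whisker": upper_whisker,
        "outliers": outliers.tolist(),
    }

# Toy values chosen to show one outlier; they are not scores from the figure.
stats = box_stats([1, 2, 3, 4, 5, 6, 7, 8, 100])
print(stats)
```

In this toy input the value 100 lies beyond the upper fence, so the upper whisker stops at 8. The statement "There are no outliers" in the caption corresponds to the `outliers` list being empty for every box.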