Extended Data Fig. 2: Effect of prior knowledge and data size on integration performance.
From: Multi-omics single-cell data integration and regulatory inference with graph-linked embedding

a, Decrease in overall integration score at different prior knowledge corruption rates for integration methods that rely on prior feature relations (n = 8 repeats with different corruption random seeds). b, Overall integration score, and c, FOSCTTM (fraction of samples closer than the true match) with different schemes of connecting peaks and genes as prior regulatory knowledge, for integration methods that rely on prior feature relations (n = 8 repeats with different model random seeds). ‘Combined±0’ is the standard scheme, in which peaks overlapping gene body or promoter regions are linked. ‘Promoter±150k’ means that peaks are linked to genes if they are located within 150 kb of the gene promoter, with each link weighted by a power-law function that models chromatin contact probability (refs. 42,43). d, Overall integration score of different integration methods on subsampled datasets of varying sizes (n = 8 repeats with different subsampling random seeds). The error bars indicate mean ± s.d.
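
For readers unfamiliar with the FOSCTTM metric in panel c, the following is a minimal sketch of how it is commonly computed, not the authors' implementation. It assumes paired cells whose embeddings in the two modalities are given as equal-length lists of coordinate tuples, and it uses Euclidean distance and normalization by the number of cells; the function name `foscttm` and the arguments `x` and `y` are hypothetical. Lower values indicate better alignment, with 0 meaning every cell is closest to its true match.

```python
import math


def foscttm(x, y):
    """Fraction of samples closer than the true match (illustrative sketch).

    x, y: equal-length sequences of embedding coordinates, where x[i] and
    y[i] are the same cell measured in two modalities (an assumption here).
    """
    if len(x) != len(y) or len(x) == 0:
        raise ValueError("x and y must be non-empty and of equal length")
    n = len(x)

    # d[i][j] is the distance between cell i in modality 1
    # and cell j in modality 2.
    d = [[math.dist(xi, yj) for yj in y] for xi in x]

    total = 0.0
    for i in range(n):
        true_match = d[i][i]
        # Modality-2 cells that are closer to x[i] than its true match y[i].
        closer_to_x = sum(d[i][j] < true_match for j in range(n))
        # Modality-1 cells that are closer to y[i] than its true match x[i].
        closer_to_y = sum(d[j][i] < true_match for j in range(n))
        total += (closer_to_x + closer_to_y) / n

    # Average over both directions and all cells.
    return total / (2 * n)
```

Calling `foscttm(x, y)` on embeddings where each cell coincides with its true match returns 0.0, whereas embeddings in which every cell lies closer to another cell's counterpart than to its own give larger values.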