Extended Data Fig. 6: “ChromLinker” analysis scheme integrates CITE-seq atlas populations with TEA-seq and then infers a GRN.
From: A unified multimodal single-cell framework reveals a discrete state model of hematopoiesis in mice

a, Input data (CITE-seq and TEA-seq) and their underlying components. b, CITE-seq clusters defined by scTriangulate. c, harmonypy integration of CITE-seq labels to TEA-seq (RNA). d, UMAP representation of clusters captured by TEA-seq gates (HSC-MPP and MultiLin gates). e, TEA-seq BAM files (ATAC) were split according to pseudobulk cluster definitions. f, Peaks were called on individual pseudobulk cluster BAM files. g-h, Peaks were tested for association with genes within pre-defined TADs (g) using Pearson correlation of the pseudobulk TEA-seq ATAC accessibility profile with the pseudobulk CITE-seq gene expression across the 57 cluster profiles (h). i, A set of ~100,000 peaks significantly correlated with gene expression values (p-value < 0.001). j, ChromBPNet bias models were generated using the total merged peak set and the combined 10X Cell Ranger output BAM files from the MultiLin sort gate. k, ChromBPNet bias-factorized models were successfully generated for 32 of the 57 pseudobulk profiles. l, Contribution scores were calculated for each of the 32 models. m, TF-MoDISco was used to cluster the contribution score seqlets and identify CWM patterns, which were annotated with known transcription factor DNA-binding motifs using the CIS-BP2 database. n, The dynamic peak set was scanned and scored for the CWM profiles identified by TF-MoDISco. o-p, A merged database of seqlets was generated (o) and, within TADs, the dot product of the contribution scores was correlated with gene expression (p). q-r, Significantly correlated seqlets (r > 0.4) were identified for each gene (q) and annotated by their underlying transcription factors to generate a pairwise correlation matrix (r). s, These connections were filtered for significance to build an initial gene regulatory network.
t, The connections were scored for each cluster and aggregated to generate activity scores: a Z-score integrating target gene expression, transcription factor expression, and the regulatory contribution of the transcription factor to its putative target genes in each of the 32 clusters.
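
The within-TAD correlation step described in panels g-h and o-q can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy values, and the TAD labels are hypothetical, and only the r > 0.4 cutoff is taken from the legend. The sketch assumes that each feature (a peak or a seqlet) and each gene carries one value per cluster, that only feature-gene pairs sharing a TAD are tested, and that only positive correlations are retained; the legend does not state whether negative correlations were also kept.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mx, my = sum(x) / n, sum(y) / n
    dx = [a - mx for a in x]
    dy = [b - my for b in y]
    sxx = sum(a * a for a in dx)
    syy = sum(b * b for b in dy)
    if sxx == 0 or syy == 0:
        # Constant profile: treated as uncorrelated (an assumption of this sketch).
        return 0.0
    return sum(a * b for a, b in zip(dx, dy)) / sqrt(sxx * syy)


def link_features_to_genes(feature_scores, gene_expr, feature_tad, gene_tad, r_min=0.4):
    """Return (feature, gene, r) for same-TAD pairs whose correlation exceeds r_min."""
    links = []
    for feat, scores in feature_scores.items():
        for gene, expr in gene_expr.items():
            if feature_tad[feat] != gene_tad[gene]:
                continue  # only test pairs that share a TAD
            r = pearson_r(scores, expr)
            if r > r_min:
                links.append((feat, gene, r))
    return links


# Hypothetical toy data: one value per cluster (4 clusters here, not the 32 or 57 in the figure).
feature_scores = {
    "seqlet_A": [0.1, 0.4, 0.8, 1.0],
    "seqlet_B": [1.0, 0.7, 0.3, 0.1],
    "seqlet_C": [0.2, 0.5, 0.9, 1.1],
}
gene_expr = {
    "Gene1": [1.0, 2.0, 4.0, 5.0],
    "Gene2": [0.5, 1.5, 3.5, 4.0],
}
feature_tad = {"seqlet_A": "TAD1", "seqlet_B": "TAD1", "seqlet_C": "TAD2"}
gene_tad = {"Gene1": "TAD1", "Gene2": "TAD2"}

links = link_features_to_genes(feature_scores, gene_expr, feature_tad, gene_tad)
for feat, gene, r in links:
    print(f"{feat} -> {gene}: r = {r:.3f}")
```

In this toy example, seqlet_B is anticorrelated with Gene1 and is dropped, and seqlet_A is never tested against Gene2 because they lie in different TADs. For the p-value filter in panel i, a library routine such as `scipy.stats.pearsonr`, which returns both the coefficient and a p-value, could replace the hand-written `pearson_r`.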