Fig. 6: HALO uncovers chromatin lineage-priming in AT2 cells collected from SSc-ILD lungs.
From: HALO: hierarchical causal modeling for single cell multi-omics data

A UMAP visualization of pulmonary epithelial cells constructed from concatenated RNA and ATAC representations, colored by identified cell types. This UMAP embedding is used for the later panels in this figure. B Differential abundance test using Milo to compare cluster populations in control and SSc-ILD samples. C RNA velocity analysis showing inferred differentiation trajectories of pulmonary epithelial cells to two terminal states. D UMAP visualization of pulmonary epithelial cells, colored by pseudotime. E ATAC latent representations (3, 4, 5, 14, 15) visualized on the UMAP embedding. F Hypergeometric test-based enrichment of cell type-specific super enhancers (alveolar epithelial or airway epithelial) for ATAC latent representations (coupled 3, 4, 5 and decoupled 14, 15). The dashed line indicates a p-value of 0.05; values above the dashed line indicate significant enrichment. G Motif enrichment for ATAC decoupled representations 14 and 15, where TFs essential for proximal and distal airway patterning during lung development are colored blue and red, respectively. P-values are calculated using a hypergeometric test. H ChromVAR motif activity, local peaks, and gene expression of SOX4 in alveolar epithelial cells (clusters 0, 1, 3) from SSc and normal (NOR) lung, as well as cluster population ratio, sorted by pseudotime. I Volcano plot of SSc-associated changes in gene expression and ChromVAR motif activity in cluster 3 (AT1). J Gene-peak pairs of SOX4 during AT2-to-AT1 differentiation (clusters 0, 1, 3) in alveolar epithelium, sorted by pseudotime. K Scatter plot showing Granger causal relations of CASC15 gene expression and local peaks of SOX4 in Chr6: 21478948-21597288. The X-axis is the time lag (number of cells sorted by pseudotime), and the Y-axis is −log(p-value). L Scatter plot showing Granger causal relations of CASC15 gene expression and the AT1-specific super enhancer region (Chr6: 21587154-21601721). 
The X-axis is the time lag (number of cells sorted by pseudotime), and the Y-axis is −log(p-value). The likelihood ratio test was used for Granger causality-based regulation inference in (K and L). M The loops denote significant connections between local peaks and RNA expression of SOX4. Connections between the expression of SOX4 and peaks are identified by HALO's gene-peak-matching algorithm. N Transition probability matrix of AT2 cell (clusters 0, 1) differentiation trajectories under different conditions (NOR vs. SSc), calculated using optimal transport. The curves in (H and J) show the median across replicates, while the shaded bands indicate a custom central range spanning the 40th to 60th percentiles. SSc-ILD-related data are available at GEO under accession number GSE302151. Source data are provided as a Source Data file.
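
For readers unfamiliar with the hypergeometric enrichment test named in panels F and G, the following is a minimal sketch of a one-sided (upper-tail) p-value. It is an illustration under stated assumptions, not the authors' implementation: the function name, argument names, and toy counts are hypothetical, and the base-10 logarithm is an assumption, since the caption does not state the base used for −log(p-value).

```python
from math import comb, log10


def hypergeom_enrichment_pvalue(population, successes, draws, overlap):
    """Upper-tail p-value P(X >= overlap) for a hypergeometric variable.

    population: total number of features (e.g. all peaks considered)
    successes:  features carrying the annotation (e.g. peaks in super enhancers)
    draws:      features selected (e.g. top peaks of one latent representation)
    overlap:    selected features that also carry the annotation
    """
    if not (0 <= successes <= population and 0 <= draws <= population):
        raise ValueError("successes and draws must lie between 0 and population")
    upper = min(successes, draws)
    total = comb(population, draws)
    # Sum the probability mass of every outcome at least as extreme as observed.
    tail = sum(
        comb(successes, k) * comb(population - successes, draws - k)
        for k in range(max(overlap, 0), upper + 1)
    )
    return tail / total


# Toy counts, chosen only for illustration (not taken from the paper).
p_value = hypergeom_enrichment_pvalue(population=10, successes=4, draws=3, overlap=2)
print(p_value, -log10(p_value))
```

Under this convention, a bar or point lies above the dashed significance line when its p-value is below 0.05, i.e. when its −log(p-value) exceeds that of 0.05.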