Fig. 1: Haplotype-aware single-cell multiomics to functionally characterize SVs. | Nature Biotechnology

From: Functional analysis of structural variants in single cells using Strand-seq

a, Leveraging Strand-seq, scNOVA performs structural variant (SV) discovery and then uses phased nucleosome occupancy (NO) tracks to identify functional effects of SVs locally (via haplotype-specific NO) and globally (via clone-specific NO). Orange, Strand-seq reads mapped to the Watson (W) strand; green, reads mapped to the Crick (C) strand.

b, Strand-seq-based NO tracks in NA12878 reveal nucleosome positions highly concordant with bulk MNase-seq, depicted for a chromosome 12 locus with relatively regular nucleosome positioning (ref. 92). Red, NO tracks mapping to haplotype 1 (H1); blue, NO tracks mapping to haplotype 2 (H2); black, phased and unphased reads combined; gray, MNase-seq. The y axis depicts the mean per-bp read count in 10-bp bins.

c, Correlated NO at consensus DNase I hypersensitive sites (ref. 33) for NA12878.

d, Averaged nucleosome patterns at CTCF binding sites (ref. 34) in NA12878, using pseudobulk Strand-seq and MNase-seq.

e, Fold changes (FCs) of haplotype-resolved NO in gene bodies, plotted for chromosome X and chromosome 7 (a representative autosome) in NA12878. FCs of haplotype-resolved RNA expression measurements are shown to the right.

f, Pseudobulk haplotype-phased NO track of exons of the representative chromosome X gene SH3KBP1, based on Strand-seq. Boxplots comparing H1 and H2 use two-sided Wilcoxon rank sum tests followed by Benjamini–Hochberg false discovery rate (FDR) correction for multiple testing (boxplots defined by minima = 25th percentile – 1.5 × interquartile range (IQR), maxima = 75th percentile + 1.5 × IQR, center = median and bounds of box = 25th and 75th percentiles; n = 47 single cells). Bar charts show haplotype-specific RNA expression of SH3KBP1 (two-sided likelihood ratio test followed by FDR correction; n = 4 biological replicates; data are presented as mean values ± s.e.m.).

g, Inverse correlation of NO at gene bodies and gene expression. NO is based on pseudobulk Strand-seq libraries from RPE-1. Gene bodies were scaled to the same length.

h, Cell-typing based on NO at gene bodies (area under the curve (AUC) = 0.96). Cell line codes: blue, RPE-1; purple, BM510; magenta, C7. LV, latent variable.

i, Receiver operating characteristic curves for inferring altered gene activity by analyzing NO at gene bodies, using pseudobulk Strand-seq libraries from in silico cell mixing.
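
The boxplot bounds and the Benjamini–Hochberg correction named in panel f can be illustrated with a minimal Python sketch. This is not the authors' code: the function names `benjamini_hochberg` and `boxplot_bounds` are hypothetical, the quartile interpolation method ("inclusive") is an assumption because the legend does not state one, and the Wilcoxon rank sum test that would produce the input P values is omitted.

```python
from statistics import median, quantiles


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    m = len(p_values)
    # Indices of the P values, ordered from smallest to largest P value.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0  # also caps adjusted values at 1
    # Walk from the largest P value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def boxplot_bounds(values):
    """Compute the boxplot elements as defined in the legend of panel f.

    Assumption: quartiles use linear interpolation over the observed data
    (method="inclusive"); the legend does not specify the method.
    """
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return {
        "minimum": q1 - 1.5 * iqr,  # 25th percentile - 1.5 x IQR
        "box_low": q1,              # 25th percentile
        "center": median(values),   # median
        "box_high": q3,             # 75th percentile
        "maximum": q3 + 1.5 * iqr,  # 75th percentile + 1.5 x IQR
    }


if __name__ == "__main__":
    # Made-up numbers for illustration only; not data from the figure.
    print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
    print(boxplot_bounds([1, 2, 3, 4, 5]))
```

Walking the ranks from largest to smallest keeps the adjusted values non-decreasing in the P value ranking, which is the step most often missed in hand-written versions of this correction.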
