Extended Data Fig. 1: Properties of ASAPseq library for in vitro model.
From: Profound phenotypic and epigenetic heterogeneity of the HIV-1-infected CD4+ T cell reservoir

(A) UpSet plot of the unique cell barcodes collected from each modality (ATAC versus ADT), also showing whether each barcode was associated with proviral reads (HIV). Barcodes that passed both ATAC and ADT quality checks (see Methods) were used for downstream analyses. (B) Flow cytometry plots of the cell culture before ASAPseq analysis. The top two plots (from left to right) show the gating strategy. The value in the highlighted box of the bottom plot is the percentage of total live singlets that are p24+. (C) Mapped and unmapped read-segments by chromosome (as reported by samtools idxstats) from alignment of an ASAPseq dataset of uninfected PBMC (Mimitou et al., 2021) to chimeric reference genomes containing HXB2 (left) or SUMA (right). Each HIV genome was added as a separate chromosome when the chimeric reference genome was created. (D) (top) Sequenced regions that were aligned by bwa mem to the proviral genome (SUMA) and recovered by hiv-haystack. Each row is a cell and each column is a base pair of the proviral genome. Orange regions indicate the reported coverage. Blue regions indicate the coverage inferred if the provirus were intact, because paired-end sequencing reads at most 50 bp from either end of the genomic/transposed fragment when that fragment is longer than 50 bp. Many LTR alignments are ambiguous, so it is unclear whether a given read lies in the 3’ LTR or the 5’ LTR; the primary alignment from bwa mem is recorded here. (middle) Proportion of cells with coverage at each position across the entire proviral genome. (bottom) Genome map of SUMA. (E) UMAP representation of the ATAC component, with numeric cluster labels prior to manual annotation.
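
As an illustration of the per-chromosome tally in panel C, the following minimal sketch parses the tab-delimited output of samtools idxstats (reference name, sequence length, mapped read-segments, unmapped read-segments) and reports the fraction of mapped read-segments assigned to the added HIV chromosome. It is not the authors' code. The contig name `HIV_SUMA`, the helper names, and every number in the example text are hypothetical placeholders, not values from the study.

```python
def parse_idxstats(text):
    """Parse samtools idxstats output into {name: (length, mapped, unmapped)}."""
    stats = {}
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        name, length, mapped, unmapped = line.split("\t")
        stats[name] = (int(length), int(mapped), int(unmapped))
    return stats


def fraction_mapped_to(stats, contig):
    """Fraction of all mapped read-segments that were assigned to `contig`."""
    total_mapped = sum(mapped for _, mapped, _ in stats.values())
    if total_mapped == 0:
        return 0.0
    return stats[contig][1] / total_mapped


# Hypothetical idxstats output for a chimeric reference (made-up numbers).
# The final "*" row holds reads that have no reference assigned.
example = (
    "chr1\t248956422\t1000\t5\n"
    "HIV_SUMA\t9000\t0\t0\n"  # hypothetical contig name and length
    "*\t0\t0\t20\n"
)

stats = parse_idxstats(example)
print(fraction_mapped_to(stats, "HIV_SUMA"))
```

For an uninfected sample such as the one in panel C, this fraction gives a simple measure of how many read-segments are spuriously assigned to the HIV chromosome.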