Fig. 4: Cell-type-specific variant function.

A Sequencing tracks representing chromatin accessibility near the CCVs marked by rsIDs and vertical pink lines. Each track represents the aggregated snATAC signal of each cell type, normalized by the total number of reads in the regions (y-axis scale 0–71). The arrow depicts the transcriptional direction of TERT. B Normalized transcriptional activity of 145 bp sequences encompassing rs7726159 (upper) and rs7725218 (lower) tested by massively parallel reporter assays in A549 lung cells. TPM: tags per million, alt: alternative allele (lung cancer risk-associated), ref: reference allele, fwd: forward, rev: reverse. Center lines show the medians; box limits indicate the 25th and 75th percentiles; whiskers extend 1.5 times the interquartile range from the 25th and 75th percentiles; outliers are represented by dots. Density is reflected in the width of the shape. The numbers of tags combined from five biological replicates for statistical testing are n = 125, 120, 120, and 110 for rs7726159 and n = 105, 110, 105, and 115 for rs7725218, from left to right. FDR values were calculated by the Wald test and corrected by the Benjamini-Hochberg procedure. C The sequencing tracks of chromatin accessibility, CCVs, and gene transcriptional direction using the same style as (A) (y-axis scale 0–490). D Position weight matrix of the IRF8 motif shown as the height of the motif logos with the genomic location at the bottom. The variant position (rs3769823) within the motif is indicated with a red box. E The normalized mRNA expression of IRF8 across the 7 cell types with the highest IRF8 expression. The color and size of each dot correspond to the scaled average expression level and fraction of cells expressing IRF8, respectively. F The upper part displays the average footprint profile of IRF8 across all detected peaks in each cell type. The three cell types with the highest average footprint profiles for the IRF8 motif are shown in shades of pink and the remaining cell types in gray. Tn5 insertion bias was corrected by subtracting the Tn5 signals from the average footprint signals. The lower part shows the expected Tn5 enrichment based on distance from the motif. Source data are provided as a Source Data file.
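As an illustration of the Benjamini-Hochberg correction named in (B), the following is a minimal Python sketch of how a set of raw p-values can be converted to FDR-adjusted values. It is not the authors' analysis code: the function name `benjamini_hochberg` and the example p-values are hypothetical, and the Wald test that would produce the raw p-values is not shown.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order.

    Hypothetical helper for illustration; input is a list of raw p-values.
    """
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that the adjusted
    # values never increase as the rank decreases (monotonicity step).
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Illustrative p-values only; these are not values from the study.
example = [0.01, 0.04, 0.03, 0.005]
fdr = benjamini_hochberg(example)
```

Each raw p-value is scaled by the number of tests divided by its rank, and the running minimum keeps the adjusted values ordered consistently with the raw ones and capped at 1.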