Extended Data Fig. 2: Novel FIRE ACRs comprised of short FIRE elements are bona fide regulatory elements. | Nature Plants

From: The regulatory potential of transposable elements in maize

(A) Schematic describing short-read simulation and mappability calculation. We generated 2.1 billion fragments evenly distributed across chromosomes 1-10 of the B73 reference genome (see Methods). For each simulated fragment, 50 bp paired-end reads were generated (indicated with thick black arrows). Each read exactly matched the reference sequence from which it was generated. These simulated reads were then mapped back to the genome using BWA. The ‘fraction mapped’ for a given region or window was calculated as the number of correctly mapped reads with mapq score > 0 divided by the total number of simulated reads with the outer end (Tn5 insertion) falling in the region. Mapq scores are indicated by blue and red boxes; an incorrectly mapped simulated read is marked with an X in a red box (top row). The mappability of a region was defined as the percentage of correctly mapped reads with mapq > 0.

(B) Histograms of mappability as in (A) for all 21,318,473 non-overlapping 100 bp windows in the maize genome (top panel, grey), 51,817 ATAC ACRs (middle panel, gold), and 106,867 FIRE ACRs (bottom panel, purple). Low mappability only partly explains why Fiber-seq detects many more ACRs than ATAC-seq.

(C) FIRE ACRs comprised of short FIRE elements are not detected by ATAC-seq. Correlation between FIRE accessibility scores and Tn5 insertions/base (chromatin accessibility as measured by ATAC-seq) for FIRE ACRs comprised of FIRE elements of the indicated length (see inset for legend). Left, LOWESS curves fitted to FIRE ACRs in the respective length categories. Right, plots showing individual values for FIRE ACRs belonging to the five length categories.

(D) FIRE accessibility score by Tn5 insertions/base (that is, ATAC accessibility score) for ACRs stratified into 12 categories. Each dot represents an ACR with the labelled row and column properties. As the row categories overlap, ACRs were sorted hierarchically as follows: all ACRs with a low FIRE accessibility score were included in the ‘low FIRE acc. score’ rows; ACRs with FE length < 200 bp and a high FIRE accessibility score were included in the ‘FE length <200’ rows; ACRs with mappability < 80% and both a high FIRE accessibility score and FE length ≥ 200 bp were included in the ‘Unmappable’ rows.

(E) FIRE ACRs that do not overlap with ATAC ACRs show patterns of the m6A signal (top) and the 5mCpG signal (bottom) similar to those of FIRE ACRs that overlap with ATAC ACRs. Shifted control regions do not display these properties. The length of the FIRE elements underlying FIRE ACRs is indicated as in (C).

(F) FIRE ACRs that do not overlap with ATAC ACRs show a distribution across genomic compartments similar to that of FIRE ACRs that overlap with ATAC ACRs. Statistical analyses and p-values for Extended Data Fig. 2F are in Supplementary Table 1.
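
The ‘fraction mapped’ calculation in (A) and the hierarchical row assignment in (D) can be illustrated with a minimal Python sketch. It is not the authors' code. The `SimulatedRead` record, the function names, the single-chromosome coordinates, and the definition of "correctly mapped" as an exact match between simulated and reported position are all assumptions made for illustration. The legend does not state the cutoff that separates low from high FIRE accessibility scores, so it is left as a required hypothetical parameter (`low_score_cutoff`), and the label `"other"` for ACRs that match none of the three named rows is likewise an assumption.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class SimulatedRead:
    # Hypothetical record for one simulated read on a single chromosome.
    true_pos: int    # outer end (Tn5 insertion) the read was simulated from
    mapped_pos: int  # position reported by the aligner
    mapq: int        # mapping quality reported by the aligner


def fraction_mapped(reads: Iterable[SimulatedRead], start: int, end: int) -> Optional[float]:
    """Correctly mapped reads with mapq > 0, divided by all simulated reads
    whose outer end falls in the half-open window [start, end)."""
    in_window = [r for r in reads if start <= r.true_pos < end]
    if not in_window:
        return None  # no simulated read originates here, so the fraction is undefined
    # Assumption: "correctly mapped" means the reported position equals the simulated one.
    correct = sum(1 for r in in_window if r.mapped_pos == r.true_pos and r.mapq > 0)
    return correct / len(in_window)


def row_category(fire_score: float, fe_length: int, mappability: float,
                 low_score_cutoff: float) -> str:
    """Assign an ACR to one row, checking the overlapping criteria in the order
    given in the legend. mappability is a fraction between 0 and 1."""
    if fire_score < low_score_cutoff:  # hypothetical cutoff, not given in the legend
        return "low FIRE acc. score"
    if fe_length < 200:                # high score, short FIRE elements
        return "FE length <200"
    if mappability < 0.80:             # high score, FE length >= 200 bp, low mappability
        return "Unmappable"
    return "other"                     # assumed label for the remaining ACRs
```

Because the checks run in a fixed order, an ACR that satisfies several criteria (for example, a low score and low mappability) is always placed in the first matching row, which is what makes the overlapping categories mutually exclusive.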
