Extended Data Fig. 7: dpCoA-CapZyme-seq workflow and quality control.

a, Detailed workflow for dpCoA-CapZyme-seq library construction. CIP was first used to remove most of the 5′ p-RNA from poly(A) RNA. dpCoA-RNAs were then decapped by AtNUDT11, yielding RNAs with 5′-monophosphate ends, which were ligated to a 5′ adaptor oligonucleotide. Reverse transcription was performed using random primers containing a known sequence handle, enabling library construction. Cartoons of the objects were created with BioRender.com/huw159s. b, Agarose gel electrophoresis was used to assess the quality of Arabidopsis thaliana total RNA, confirming that AtNUDT11 exhibits no RNA degradation activity. RNase T1 served as a positive control for RNA degradation. Data are from one independent experiment. c, Pipeline for dpCoA-CapZyme-seq data analysis. The Arabidopsis genome was segmented into 10-bp bins, and the number of 5′ end reads in each bin was counted. Bins with significantly enriched read counts (fold change > 2, P value < 0.01) in the AtNUDT11-treated group compared with the mock-treated group were identified. The 5′ end sites with RPM > 1 within the enriched bins were identified as dpCoA-RNA sites and assigned as the TSSs for dpCoA-RNA. d, Principal component analysis of the RPM of all bins across all samples. PC1 and PC2 define the x and y axes, respectively. e, Correlation heatmap showing the Pearson correlation coefficients between replicates of the mock-treated and AtNUDT11-treated samples. f, Genome-wide distribution of 5′ end reads for each replicate of the mock-treated and AtNUDT11-treated samples. The y axis represents the log2(RPM) values of 5′ end reads in each bin, and the x axis indicates positions along the five chromosomes. Chr, chromosome.
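The bin-and-filter logic described in panel c can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function names, the 0-based coordinates, the pseudocount used in the fold-change ratio, and the assumption that per-bin P values are supplied by an unspecified upstream statistical test are all hypothetical choices made for illustration. Only the 10-bp bin width and the thresholds (fold change > 2, P value < 0.01, RPM > 1) come from the legend.

```python
from collections import Counter

BIN_SIZE = 10  # bin width in bp, as stated in the legend


def bin_counts(five_prime_ends, bin_size=BIN_SIZE):
    """Count 5' end reads per (chromosome, bin index).

    five_prime_ends: iterable of (chrom, pos) tuples, one per read,
    with 0-based positions (an assumption of this sketch).
    """
    counts = Counter()
    for chrom, pos in five_prime_ends:
        counts[(chrom, pos // bin_size)] += 1
    return counts


def to_rpm(counts, total_reads):
    """Convert raw counts to reads per million (RPM)."""
    return {key: n * 1e6 / total_reads for key, n in counts.items()}


def enriched_bins(treated_rpm, mock_rpm, p_values,
                  min_fc=2.0, max_p=0.01, pseudocount=1.0):
    """Return bins passing fold change > min_fc and P value < max_p.

    p_values is a dict of per-bin P values from some upstream test;
    the legend does not name the test. The pseudocount avoids division
    by zero and is an assumption, not part of the source.
    """
    hits = set()
    for bin_key, treated in treated_rpm.items():
        mock = mock_rpm.get(bin_key, 0.0)
        fold_change = (treated + pseudocount) / (mock + pseudocount)
        if fold_change > min_fc and p_values.get(bin_key, 1.0) < max_p:
            hits.add(bin_key)
    return hits


def call_sites(treated_ends, total_reads, hits,
               min_rpm=1.0, bin_size=BIN_SIZE):
    """Within enriched bins, keep single-nucleotide 5' end sites with RPM > min_rpm."""
    site_counts = Counter(treated_ends)
    sites = []
    for (chrom, pos), n in site_counts.items():
        in_enriched_bin = (chrom, pos // bin_size) in hits
        if in_enriched_bin and n * 1e6 / total_reads > min_rpm:
            sites.append((chrom, pos))
    return sorted(sites)
```

In this sketch, binning is used only to decide where enrichment occurs, while the final site call is made at single-nucleotide resolution inside the enriched bins, mirroring the two-stage description in the legend.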