Extended Data Fig. 2: PRIM-seq data processing.
From: Genome-wide mapping of RNA-protein associations through sequencing

(a) A cartoon showing the decoding of the protein-end and the RNA-end reads. As the sequencing reads from both ends are always read from 5′ to 3′, the protein-end reads are always antisense sequences and the RNA-end reads are always sense-strand sequences. (b) The contingency table for testing the independence of an RNA (RNA A) and a protein (Protein B) from PRIM-seq data. Xij are the read counts. A Chi-square test is constructed from this contingency table for each pair of RNA and protein. (c) Flowchart of PRIMseqTools for processing PRIM-seq data. Adaptor sequences were trimmed (Adaptor trimming) and low-quality reads were removed (Quality filtering). The resulting read pairs were mapped to RefSeq genes (Mapping). The read pairs with the two ends mapped to two different genes were retained (Identification of chimeric read pairs) and deduplicated. Non-duplicated chimeric read pairs with one end mapped to the sense strand of a gene and the other end mapped to the antisense strand of a protein-coding gene (RNA/protein end assignment) were used as the input for the Chi-square test (Statistical test).
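
The independence test described in panel (b) can be illustrated with a minimal sketch. It assumes a 2×2 table whose rows are RNA A versus all other RNAs and whose columns are Protein B versus all other proteins; the caption does not spell out the cell layout, so this arrangement, the function name chi_square_2x2, and the argument names x11 to x22 are hypothetical. The sketch uses the Pearson statistic without continuity correction, which may differ from what PRIMseqTools actually computes.

```python
import math


def chi_square_2x2(x11, x12, x21, x22):
    """Pearson chi-square test of independence for a 2x2 table of read counts.

    Assumed layout (not taken from the figure):
        x11: RNA A end,     Protein B end
        x12: RNA A end,     other protein ends
        x21: other RNA end, Protein B end
        x22: other RNA end, other protein ends
    Returns (chi2, p_value) with 1 degree of freedom.
    """
    n = x11 + x12 + x21 + x22
    row1, row2 = x11 + x12, x21 + x22
    col1, col2 = x11 + x21, x12 + x22
    if 0 in (row1, row2, col1, col2):
        raise ValueError("a zero margin makes the test undefined")

    # Closed form of the Pearson statistic for a 2x2 table.
    chi2 = n * (x11 * x22 - x12 * x21) ** 2 / (row1 * row2 * col1 * col2)

    # Survival function of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


if __name__ == "__main__":
    # Made-up counts, for illustration only.
    chi2, p = chi_square_2x2(10, 20, 30, 40)
    print(f"chi2={chi2:.4f}, p={p:.4f}")
```

In a genome-wide setting this test would be repeated for every RNA-protein pair, so the resulting p-values would normally need a multiple-testing correction; the caption does not state which one is applied.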