Supplementary Figure 5: The CrY2H-seq analysis pipeline.
From: CrY2H-seq: a massively multiplexed assay for deep-coverage interactome mapping

(a) Reads are first mapped to a custom genome composed of Arabidopsis TF ORF sequences, the S. cerevisiae genome, Gal4 AD and Gal4 DB domain sequences amplified by the AD and DB primers (primer region), and empty plasmid sequence (not pictured). Overall alignments were as follows: 74.8% Arabidopsis, 16% primer region, 4.8% not aligned, 4.2% yeast, and 0.2% empty plasmid. (b) Paired-end high-quality mapped reads are paired by read IDs and clonal fragments are removed. (c) Fragments are filtered for DNA strandedness to remove reads mapping to only one ORF. Examples of one protein interaction fragment (blue, green, and orange fragment) and a fragment mapping to only one ORF (purple fragment) are shown. Fragments are then filtered by fragment size based on a 300 bp size range (~220 bp to ~520 bp). (d) Fraction of fragments remaining after each filtering step in the analysis pipeline. Average and standard deviation were calculated from the ten replicate screen datasets. (e) Fragments for unique ORF combinations are totaled only if read1 maps to a different ORF than read2. Paired-end reads mapping to three example protein interaction PCR products are shown. (f) A basal fragment cutoff is applied for reasons described in Supplementary Figure 4 and Online Methods. This removes any interaction product with fewer than 3 fragments from the data, as shown by the carry-over of interaction products with more than 2 fragments (peach/lime and blue/green interaction products) and the absence of the blue/red interaction product, which had only 2 fragments. (g) Datasets from each screen are normalized by calculating the median total number of interaction fragments across screens and then calculating a scale factor for each screen based on the fold difference between that screen's total and the median total interaction fragments. Unique interaction fragment counts are then multiplied by the calculated scale factor and rounded down to the nearest integer. An example is shown for screen 8. See Online Methods for more details.
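Steps (e) to (g) can be illustrated with a minimal Python sketch. It is not the authors' pipeline code. All function names and the input layout (each fragment as a `(read1_ORF, read2_ORF)` tuple) are hypothetical. The sketch also assumes that ORF pairs are kept in read1/read2 order, that per-screen totals are taken after the cutoff, and that the scale factor is the median total divided by the screen total; the caption does not state the direction of the fold difference, so see Online Methods for the actual definition.

```python
from collections import Counter
from fractions import Fraction
from math import floor
from statistics import median

# From the caption: interaction products with fewer than 3 fragments are removed.
MIN_FRAGMENTS = 3


def count_interactions(fragments):
    """Step (e): total fragments per ORF pair, only when read1 and read2
    map to different ORFs. Pairs are kept in (read1, read2) order (assumption)."""
    return Counter((r1, r2) for r1, r2 in fragments if r1 != r2)


def apply_cutoff(counts, min_fragments=MIN_FRAGMENTS):
    """Step (f): drop interaction products below the basal fragment cutoff."""
    return {pair: n for pair, n in counts.items() if n >= min_fragments}


def normalize_screens(screens):
    """Step (g): scale each screen toward the median total across screens.

    Assumption: scale factor = median total / screen total.
    Fractions keep the arithmetic exact before rounding down.
    """
    totals = {name: sum(counts.values()) for name, counts in screens.items()}
    median_total = Fraction(median(totals.values()))
    normalized = {}
    for name, counts in screens.items():
        if totals[name] == 0:
            normalized[name] = {}
            continue
        scale = median_total / totals[name]
        normalized[name] = {pair: floor(n * scale) for pair, n in counts.items()}
    return normalized


# Toy data only; these are not values from the study.
screen_a = apply_cutoff(count_interactions(
    [("ORF1", "ORF2")] * 4 + [("ORF3", "ORF3")] * 5 + [("ORF1", "ORF4")] * 2
))
screens = {
    "A": screen_a,
    "B": {("ORF1", "ORF2"): 6, ("ORF5", "ORF6"): 4},
    "C": {("ORF1", "ORF2"): 8},
}
normalized = normalize_screens(screens)
```

In this toy input, the same-ORF fragments (`ORF3`/`ORF3`) are never counted and the two-fragment product (`ORF1`/`ORF4`) is removed by the cutoff, mirroring the purple fragment in panel (c) and the blue/red product in panel (f).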