Extended Data Fig. 2: Summary statistics for the dual PhoQ–PhoP library.
From: Engineering orthogonal signalling pathways reveals the sparse occupancy of sequence space

a, Schematic summary of library design. b, Left, histogram of the read counts for the pre-selection PhoQ–PhoP library. The vast majority of reads are unique, which indicates that the library size is larger than Illumina sequencing coverage and no variants are overrepresented. Right, histogram of the read counts for one replicate of the PhoQ–PhoP library after overnight growth in low Mg²⁺ conditions. c, Counts of cells sorted into each bin, for each replicate and growth condition. d, The cells sorted into each bin were grown overnight, diluted back to mid-exponential phase, shifted to medium with 10 μM Mg²⁺ and their YFP levels were verified by flow cytometry. n = 2 independent biological replicates. e, As in d, but with cells retained in medium with 50 mM Mg²⁺. n = 2 independent biological replicates. f, Scatter plots displaying the correlations between the bin frequencies of individual variant pairs measured in independent replicates. Only 10⁶ data points are shown for clarity. R² values indicate the Pearson correlation coefficients, calculated using all data points. g, As in f, but displaying only the 10,595 variants with sufficient sequencing coverage and fit quality (Methods) to be included in the analysis. h, Scatter plot displaying the Pearson correlation between the YFP fold induction measured by sort-seq and that measured individually by flow cytometry for 32 individual variant pairs. i, Sequence logos summarizing the amino acid frequencies at each position varied in the pre-selected library (top), set of pairs with >20-fold induction (middle) and all native histidine kinase–response regulator pairs (bottom). The residues found at these positions in wild-type PhoQ and PhoP are listed below. j, The FACS gating strategy for isolating single live cells for quantification of YFP expression.