Co-evolution of interacting proteins through non-contacting and non-specific mutations

Abstract

Proteins often accumulate neutral mutations that do not affect current functions but can profoundly influence future mutational possibilities and functions. Understanding such hidden potential has major implications for protein design and evolutionary forecasting but has been limited by a lack of systematic efforts to identify potentiating mutations. Here, through the comprehensive analysis of a bacterial toxin–antitoxin system, we identified all possible single substitutions in the toxin that enable it to tolerate otherwise interface-disrupting mutations in its antitoxin. Strikingly, the majority of enabling mutations in the toxin do not contact the antitoxin and promote tolerance non-specifically to many different antitoxin mutations, despite covariation in homologues occurring primarily between specific pairs of contacting residues across the interface. In addition, the enabling mutations we identified expand future mutational paths that both maintain old toxin–antitoxin interactions and form new ones. These non-specific mutations are missed by widely used covariation and machine learning methods. Identifying such enabling mutations will be critical for ensuring continued binding of therapeutically relevant proteins, such as antibodies, aimed at evolving targets.

Fig. 1: Comprehensive identification of neutral and enabling mutations for the toxin–antitoxin system ParE3–ParD3.
Fig. 2: Deep mutational scanning reveals mutational tolerance and interface-disrupting substitutions in ParE3–ParD3.
Fig. 3: Beneficial, interaction-restoring mutations can be far from the deleterious mutation they rescue.
Fig. 4: Non-specific enabling mutations outnumber specific mutations and can be far from the deleterious mutation as well as the interface.
Fig. 5: Natural sequences and models trained on these provide insufficient information to predict enabling mutations.
Fig. 6: Non-specifically enabling mutations expand mutational paths to maintain old and evolve new interactions.

Acknowledgements

We thank members of the Laub and Marks laboratories, A. Batchelor, C. McClune, J. Ingraham, A. Schoech and I. Cvijovic for helpful discussions. We thank A. Murray, N. Gauthier, T. Okubo, S. Sinai and N. Youssef for feedback on the manuscript and M. Stiffler for sharing protocols before publication. This work was supported by the Howard Hughes Medical Institute (M.T.L.), National Institutes of Health grant no. R01CA260415 (D.S.M.), Chan Zuckerberg Initiative CZI2018-191853 (D.S.M.), Ashford PhD fellowship (D.D.), Boehringer Ingelheim Funds PhD fellowship (D.D.), Fanny and John Hertz Fellowship (E.N.W.), National Institutes of Health NLM training grant no. T15LM007092 (A.G.G.), National Institutes of Health grant no. T32GM007753 (T.-L.V.L.), Jane Coffin Childs Memorial Fund for Medical Research fellowship (B.W.) and National Institutes of Health grant no. K99GM135536 (B.W.).

Author information

Contributions

D.D., D.S.M. and M.T.L. conceived the project and wrote the paper. D.D. designed and performed experiments, analysed data and built the quantitative models. A.G.G. performed covariation analysis for ~350 protein–protein interactions. B.W. helped with library transformations. T.-L.V.L. created the combinatorial antitoxin mutant library. E.N.W. suggested helpful tips on Bayesian modelling. D.S.M. and M.T.L. supervised the project.

Corresponding author

Correspondence to Michael T. Laub.

Ethics declarations

Competing interests

D.S.M. is an advisor for Dyno Therapeutics, Octant, Jura Bio, Tectonic Therapeutics and Genentech and a cofounder of Seismic. The remaining authors declare no competing interests.

Peer review

Peer review information

Nature Ecology & Evolution thanks Hsin-Hung Chou and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Peer reviewer reports are available.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 Orthogonal validation of growth rate inference, structural explanation for antitoxin mutation effects, and covariational signal between toxin–antitoxin ParE3/ParD3.

a, Comparison of growth rates inferred by high-throughput vs. individual growth measurement. X axis error bars indicate ±2× standard deviation derived from n = 10 or n = 11 technical plate reader replicates (see Methods). Y axis error bars indicate 95% posterior highest density interval. The Pearson correlation coefficient (r) is indicated. b, Raw log-read ratio reproducibility between replicates (+1 pseudocount) for all single and double mutants. The Pearson correlation coefficient (r) is indicated. c, Mean mutation effect of residues in the C-terminal α-helix 3 of the ParD3 antitoxin indicates that residues facing the toxin are more susceptible to mutations that disrupt the ParD3–ParE3 interaction, producing negative Δgrowth rate values. d, Mean mutation effect in the N-terminal oligomerization region of the antitoxin highlights residues susceptible to disrupting the ParD3–ParE3 interaction when mutated. Cartoon illustrates arrangement of ParE3–ParD3 octamer observed in the co-crystal structure (PDB: 5CEG). One of the 4 antitoxin monomers is coloured by the mean mutation effect. e, Top 10 toxin–antitoxin covarying residue pairs indicated for reference. f, The 90% precision cutoff yields 29 toxin–antitoxin covarying residue pairs (black in upper-right quadrant), of which 28 pairs fall within toxin–antitoxin interface residues that are < 6 Å minimum atom distance (ochre dots) in the ParE3–ParD3 crystal structure (PDB ID: 5CEG).
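The < 6 Å minimum atom distance criterion in panel f can be illustrated with a minimal sketch. It assumes each residue is supplied as a list of (x, y, z) atom coordinates in ångströms; the function names and the toy coordinates are hypothetical and are not taken from the authors' analysis code.

```python
import math
from itertools import product

def min_atom_distance(residue_a, residue_b):
    """Smallest Euclidean distance between any atom of residue_a and any atom of residue_b."""
    return min(math.dist(a, b) for a, b in product(residue_a, residue_b))

def is_interface_pair(residue_a, residue_b, cutoff=6.0):
    """True if the two residues lie within the cutoff (6 Å, as in panel f)."""
    return min_atom_distance(residue_a, residue_b) < cutoff

# Toy coordinates (hypothetical): two atoms per residue.
toxin_residue = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]
antitoxin_residue = [(4.5, 4.0, 0.0), (9.0, 0.0, 0.0)]

print(min_atom_distance(toxin_residue, antitoxin_residue))
print(is_interface_pair(toxin_residue, antitoxin_residue))
```

Taking the minimum over all atom pairs, rather than a Cα–Cα distance, means that long side chains can bring two residues within the cutoff even when their backbones are further apart.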

Extended Data Fig. 2 Titration of toxin and antitoxin expression levels, and sensitive identification of toxin substitutions which do not disrupt toxicity.

a, Cartoon illustration of the expression system. IPTG induces antitoxin, arabinose induces toxin. b, Growth rate of cells harbouring wild-type toxin ParE3 without antitoxin at different arabinose induction levels in arabinose-titratable E. coli strain BW27783. c, d, Growth rate of cells harbouring wild-type toxin–antitoxin ParE3/ParD3 under different antitoxin induction levels modulated with IPTG and 0.00012% arabinose induction (c) or 0.0008% arabinose induction (d). e, Distribution of ∆growth rates(T*-T) for all toxin single substitutions under different arabinose inducer concentrations, with positive ∆growth rate(T*-T) values indicating loss of toxin function. The set of ‘most toxic’ toxin substitutions (n = 310) is coloured in light blue, the set of ‘toxic’ substitutions (n = 781) is coloured in green (see Methods). Other classes of substitutions are indicated. The dynamic range (difference between 0 and the truncated toxin mutants) shrinks, as expected, for lower expression levels that do not fully inhibit growth with the wild-type toxin, and a higher fraction of mutants show loss of toxicity (higher ∆growth rates) under lower expression conditions. The toxin substitution A28Q is highlighted (dark blue) as an example that shows no growth rate difference relative to wild-type toxin at high expression conditions, but is not as toxic as wild-type toxin at lower expression conditions. f, Schematic illustrating loss of toxicity detection using growth rate measurements in different expression regimes. g, Mean ∆growth rates(T*-T) of residue positions mapped onto the ParE3 toxin structure. Values shown for 0.00012% [arabinose] inducer. h, The mean ∆growth rates(T*-T) of a residue are correlated with the relative solvent accessibility of the residue (Pearson r = −0.66). Values shown for 0.00012% [arabinose] inducer. 
i,j, Distribution of ∆growth rate(T*-T) for all toxin substitutions (black) or top 10 coevolving residue substitutions (purple) in the toxin in the absence of antitoxin (i) or presence of antitoxin (j). Values shown for 0.00012% [arabinose] inducer, and antitoxin is induced with 10 µM IPTG. k, The ∆growth rate(T*-T) values of each substitution at any position along the toxin ParE3. Green highlights the top 10 covarying positions between toxin and antitoxin in natural homologues. Values shown for 0.00012% [arabinose] inducer.

Extended Data Fig. 3 Volcano plot visualizing significant and substantial beneficial toxin variants in different antitoxin backgrounds, and beneficial toxin variants in various antitoxin backgrounds under ‘high’ and ‘low’ antitoxin expression conditions.

a, For each deleterious antitoxin variant background, the mean posterior change in the number of doublings, ∆growth rate(T*/AT* - T/AT*), of the most toxic toxin mutants are plotted vs. their significance (-log10(p(∆growth rate<0))) of deviation from the AT* single mutation. This is based on 10,000 discrete samples of the posterior ∆growth rate(T*/AT* - T/AT*) values inferred from the hierarchical Bayesian inference model (see Methods). Vertical line: +0.5 ∆growth rate, horizontal line: p(∆growth rate>0) = 0.0001. Red indicates significant and substantial beneficial toxin substitution using this cutoff. Experiments performed under ‘high antitoxin’ expression conditions. b, The minimum atom distance from a given deleterious antitoxin residue to each beneficial toxin is plotted vs. ∆growth rate(T*/AT* - T/AT*). Experiments performed under ‘high antitoxin’ expression conditions. c, The minimum atom distance from a given deleterious antitoxin residue to each beneficial toxin is plotted vs. ∆growth rate(T*/AT* - T/AT*). Experiments performed under ‘low antitoxin’ expression conditions. d, Distance vs. ∆growth rate(T*/AT* - T/AT*) of beneficial toxin variants for all deleterious antitoxin variant backgrounds. Experiments performed under ‘low antitoxin’ expression conditions. Values for (b-d) shown for double mutants with ∆growth rate effect size > +0.5 and p(∆growth rate>0) < 0.0001.
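The two cutoffs used in the volcano plot (effect size above +0.5 and a posterior probability below 0.0001) can be sketched as a simple filter over posterior samples. This is an illustration only: the probability is assumed here to be the fraction of posterior samples in which the change is not positive, the function name is hypothetical, and the samples are synthetic draws rather than output of the hierarchical model.

```python
import random
from statistics import fmean

def is_beneficial(posterior_samples, effect_cutoff=0.5, p_cutoff=1e-4):
    """Flag a toxin substitution as a significant and substantial rescue.

    posterior_samples: posterior draws of the change in growth rate
    (T*/AT* minus T/AT*) for one toxin/antitoxin mutant pair.
    """
    mean_effect = fmean(posterior_samples)
    # Assumed definition: fraction of draws with no improvement.
    p_not_positive = sum(s <= 0 for s in posterior_samples) / len(posterior_samples)
    return mean_effect > effect_cutoff and p_not_positive < p_cutoff

# Synthetic stand-ins for 10,000 posterior draws (not real data).
rng = random.Random(0)
clear_rescue = [rng.gauss(1.0, 0.1) for _ in range(10_000)]
noisy_rescue = [rng.gauss(1.0, 1.0) for _ in range(10_000)]
no_effect = [rng.gauss(0.0, 0.1) for _ in range(10_000)]

for name, draws in [("clear", clear_rescue), ("noisy", noisy_rescue), ("none", no_effect)]:
    print(name, is_beneficial(draws))
```

With 10,000 draws, a probability cutoff of 0.0001 is met only when no draw falls at or below zero, so the filter requires both a large mean effect and a tight posterior.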

Extended Data Fig. 4 A non-specific, nonlinear model can explain most of the observed single and double-mutant growth rates.

a, Schematic of nonlinear, non-specific model: double-mutant expected growth rates (brown) are based on the independent (non-specific) sum of underlying toxin and antitoxin mutant effects, passed through a sigmoid function (yellow). b,c, Residuals for nonlinear, non-specific model (b) or linear non-specific model of the same structure without a non-linearity (c) showing unbiased residuals for the nonlinear model, but a complete misfit of the linear model. Model built using ‘high antitoxin’ expression levels. Explained variance (R²) is indicated. Significant and substantially positively (dark green) or negatively (green) deviating mutations are shown in (b) (see Methods). d, Inferred independent toxin single-substitution effects among the set of most toxic toxin mutants demonstrating a tail of independently beneficial toxin variants. Experiment performed under ‘high antitoxin’ expression levels. e,f, Nonlinear independent model fit to growth rates measured under ‘high antitoxin’ (e) or ‘low antitoxin’ (f) expression conditions. The wild-type toxin–antitoxin pair is inferred to be differently close to the sigmoid ‘cliff’ between expression conditions. g, Cartoon illustrating different detection of single-mutant effects depending on expression conditions. h-j, Correlation of inferred single-mutant effects (h), observed single-mutant ∆growth rate(T*/AT* - T/AT) effects (i), and double-mutant deviations of observed from expected growth rates (j) from separate inference under ‘high antitoxin’ (x axis) or ‘low antitoxin’ (y axis) expression conditions.
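The idea in panel a, that independent toxin and antitoxin effects are summed and then passed through a sigmoid, can be sketched in a few lines. The parameterization below is an assumption made for illustration (a logistic curve scaled between hypothetical g_min and g_max, with a hypothetical wild-type offset wt_latent); the fitted model in the paper may use a different functional form and inferred parameters.

```python
import math

def expected_growth(toxin_effect, antitoxin_effect, wt_latent=3.0, g_min=0.0, g_max=1.0):
    """Expected growth rate under an additive latent effect passed through a sigmoid."""
    latent = wt_latent + toxin_effect + antitoxin_effect
    return g_min + (g_max - g_min) / (1.0 + math.exp(-latent))

def deviation(observed, toxin_effect, antitoxin_effect):
    """Observed minus expected growth; large deviations would suggest specific effects."""
    return observed - expected_growth(toxin_effect, antitoxin_effect)

# Hypothetical effects: a mildly favourable toxin substitution (+2)
# and a strongly deleterious antitoxin substitution (-4).
t_eff, at_eff = 2.0, -4.0

gain_in_wild_type = expected_growth(t_eff, 0.0) - expected_growth(0.0, 0.0)
gain_in_mutant = expected_growth(t_eff, at_eff) - expected_growth(0.0, at_eff)

print(round(gain_in_wild_type, 3), round(gain_in_mutant, 3))
```

Because the wild-type pair sits on the plateau of the sigmoid in this toy setting, the same additive toxin effect is nearly invisible in the wild-type background but produces a large gain next to a deleterious antitoxin mutation. That is how a non-specific effect can appear as rescue without any pair-specific term.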

Extended Data Fig. 5 Deviation of observed from expected double-mutant growth rates reveals toxin variants with specific or with only non-specific beneficial effects, and fraction of specific vs. non-specific toxin variants.

a, For each beneficial toxin mutation (indicated above each plot) combined with each antitoxin variant indicated on the x axis, the plot shows the growth rate relative to the wild-type toxin–antitoxin pair (mean posterior ∆growth rate(T*/AT* - T/AT)). Grey dots represent T*/AT*, error bars indicate 95% posterior highest density interval. The ∆growth rate for each antitoxin mutant combined with wild-type toxin (T/AT*) is shown (black dots) along with the ∆growth rate for T*/AT* expected under the non-specific, nonlinear model (green dots). b, Deviation of the observed (dots) from the expected double-mutant growth rates (orange line) highlights classification of specific and non-specific toxin variants. Beneficial toxin substitutions (rows, n = 32) ordered by their range of growth rate deviations across deleterious antitoxin variants as in panel b. c-g, Specific vs. non-specific enabling toxin variants under ‘high’ antitoxin expression for all enabling toxin variants grouped by deleterious antitoxin for the more stringent set of 310 ‘most toxic’ toxins (c) and less stringent set of 781 ‘toxic’ toxins (d). Orange and purple indicate mutant pairs involving non-specific and specific, respectively, rescuing mutations in the toxin. Enabling toxin variants under ‘low’ antitoxin expression at different absolute growth rate cutoffs relative to the wild-type toxin/antitoxin growth rate (e), or grouped by ‘most toxic’ (f) or ‘toxic’ (g) toxin variants. h, Inferred non-specific toxin variant effect vs. minimum atom distance to any antitoxin atom for 21 non-specifically rescuing toxin variants (orange). i, j, For specific and non-specific beneficial toxin mutants, the change in growth rate in a deleterious antitoxin mutant background, ∆growth rate (T*/AT* - T/AT*), is plotted vs. minimum atom distance to the deleterious antitoxin mutation it rescues (i) or any antitoxin atom (j) in the ‘low antitoxin’ expression condition.

Extended Data Fig. 6 Natural sequence statistics, EVcouplings or DeepSequence models are not predictive of beneficial toxin substitution effects.

a, Distribution of number of specific and non-specific beneficial toxin substitutions (purple) vs. all possible toxin variants (grey) observed in natural sequences. b, Frequency distribution of beneficial toxin and deleterious antitoxin mutant pairs in natural sequences, with 29/51 pairs never observed. c-e, Effect size of toxin variant rescue vs. frequency of variant pair in natural sequences (c), conditional frequency of toxin variant given natural sequences containing the particular deleterious antitoxin substitution (d), or enrichment of beneficial toxin variant in natural sequences containing the deleterious antitoxin substitution (e). f-g, EVcouplings model inferred site-wise toxin mutant preferences (h_i) vs. toxin mutant effect inferred in suppressor scan with the Pearson correlation coefficient indicated (f), or EVcouplings pairwise T*/AT* variant preference (J_ij) vs. effect size of beneficial toxin mutation effect in a deleterious antitoxin variant background (g). h, Scatterplot of observed beneficial toxin effect in deleterious antitoxin mutant backgrounds (AT*), vs. EVmutation (top row) or DeepSequence (variational auto-encoder) mutation effect predictions (bottom row). Pearson correlation (r) is indicated. i, Distribution of natural sequence identity fractions across the alignment. Different histograms illustrate fraction mutated for homologues containing the full concatenated toxin and antitoxin (grey), the toxin homologues only (blue), or the antitoxin homologues only (turquoise).
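The three natural-sequence statistics in panels c-e can be sketched as simple counts over a paired alignment. The sketch assumes the alignment is a list of (toxin sequence, antitoxin sequence) tuples with 0-based site indices, and it assumes that enrichment means the conditional frequency divided by the overall frequency of the toxin variant; the caption does not give the exact definition, and the function name and toy sequences are hypothetical.

```python
def pair_statistics(alignment, toxin_site, toxin_aa, antitoxin_site, antitoxin_aa):
    """Pair frequency, conditional frequency and enrichment for one variant pair."""
    if not alignment:
        raise ValueError("alignment must contain at least one paired sequence")
    n = len(alignment)
    has_toxin = [tox[toxin_site] == toxin_aa for tox, _ in alignment]
    has_antitoxin = [anti[antitoxin_site] == antitoxin_aa for _, anti in alignment]
    n_pair = sum(t and a for t, a in zip(has_toxin, has_antitoxin))
    n_toxin, n_antitoxin = sum(has_toxin), sum(has_antitoxin)

    pair_freq = n_pair / n
    # Undefined (None) when the antitoxin substitution never occurs naturally.
    conditional = n_pair / n_antitoxin if n_antitoxin else None
    marginal = n_toxin / n
    enrichment = conditional / marginal if conditional is not None and marginal > 0 else None
    return {"pair_freq": pair_freq, "conditional": conditional, "enrichment": enrichment}

# Toy paired alignment (hypothetical sequences, not natural homologues).
toy_alignment = [("AKL", "DE"), ("AKL", "DG"), ("ARL", "DG"), ("ARL", "DE")]
stats = pair_statistics(toy_alignment, toxin_site=1, toxin_aa="R",
                        antitoxin_site=1, antitoxin_aa="G")
print(stats)
```

An enrichment of 1 in this toy case means the toxin variant is no more common alongside the antitoxin substitution than it is overall.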

Extended Data Fig. 7 Non-specific suppressor toxin ParE3 variants are as or almost as toxic as wild-type ParE3, and reproducibility of antitoxin combinatorial variant log-read ratios.

a, Growth rates of ParE3 non-specific suppressor toxin variants (blue) compared to wild-type toxin ParE3 without antitoxin (black) and wild-type toxin and antitoxin (grey) under fully inhibitory toxin expression conditions (0.00012% [arabinose]) or half-maximal inhibitory expression conditions (0.00006% [arabinose]). Dark lines represent the mean OD600, shaded regions show standard deviation of the replicates (n = 10 or n = 11). b, Raw log-read ratio reproducibility between biological replicates (+1 pseudocount) for the combinatorial antitoxin library (8000 amino acid variants) in different toxin mutant backgrounds. Specific classes of antitoxin mutants, and Pearson correlation coefficients (r) are indicated.

Extended Data Fig. 8 Bayesian hierarchical model.

a, Simplified description of the Bayesian hierarchical model. Pre- and post-selection reads for each codon are drawn from a Poisson distribution. The log-ratios of these Poisson parameters are not fixed between synonymous codons but are instead drawn from a normal distribution, whose mean is the amino acid mutant growth rate of interest. This lets synonymous codons inform each other and the amino acid mutant growth rate without forcing their log-ratios to be identical. b, Full plate diagram description of the hierarchical Bayesian model capturing both replicates. Replicate index i takes values 1 or 2, amino acid index m takes values from 1 to 2040 (20 × 102) for the toxin or 1 to 1840 (20 × 92) for the antitoxin, and codon index n takes values from 1 to 6426 (63 × 102) for the toxin or 1 to 5796 (63 × 92) for the antitoxin. Circles indicate random variables; grey circles represent observed random variables. c, Description of variables, likelihood function and priors used. The likelihood function incorporates maximum entropy distributions for the observed variables, and the priors incorporate computationally tractable, vague priors for the amino acid substitution growth rates. The relative priors on the standard deviation of replicate σ_repn vs. synonymous variant σ_synm reflect our prior belief that replicate experiment noise is larger than synonymous mutant noise. σ_bi and r_scale have improper priors.
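
The generative idea in panel a can be sketched as a forward simulation. This is a simplified illustration under stated assumptions, not the authors' model code: it omits replicate effects and priors, and `sample_poisson`, `simulate_codon_counts`, `sigma_syn` and `pre_rate` are hypothetical names and parameters.

```python
import math
import random

def sample_poisson(rate, rng):
    """Draw one Poisson sample with Knuth's method (suitable for rates well below ~700)."""
    threshold = math.exp(-rate)
    k, product = 0, 1.0
    while True:
        product *= rng.random()
        if product <= threshold:
            return k
        k += 1

def simulate_codon_counts(aa_growth_rate, n_codons, sigma_syn, pre_rate, rng):
    """Simulate (pre, post) read counts for the synonymous codons of one amino acid variant.

    Each codon's log-ratio of Poisson rates is drawn from a normal distribution
    centred on the amino acid growth rate, so synonymous codons are similar
    but not forced to be identical.
    """
    counts = []
    for _ in range(n_codons):
        codon_log_ratio = rng.gauss(aa_growth_rate, sigma_syn)
        post_rate = pre_rate * math.exp(codon_log_ratio)
        counts.append((sample_poisson(pre_rate, rng), sample_poisson(post_rate, rng)))
    return counts

rng = random.Random(0)
# Hypothetical variant: growth rate -1.0 on the log scale, four synonymous codons.
codon_counts = simulate_codon_counts(-1.0, n_codons=4, sigma_syn=0.2, pre_rate=100.0, rng=rng)
```

Inference runs in the opposite direction: given observed counts, the posterior over the normal distribution's mean is the quantity of interest.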

Extended Data Fig. 9 Validation of Bayesian growth rate inference on synthetic datasets.

a, Three different true synthetic growth rate distributions used for simulating pre- and post-selection codon variant read count data. Synthetic growth rate distributions were chosen from observed toxin single-mutant growth rate distributions in 3 different antitoxin backgrounds, spanning the range of distributions observed. b,c, Inferred growth rates using the Bayesian hierarchical model (b) show less bias and incorporate uncertainty estimates compared to the mean log-read ratio summary of pre- and post-selection read counts (+1 pseudocount) (c). Error bars in panel b reflect the 95% highest density posterior intervals, with the measure of centre being the mean posterior growth rate. d, Model uncertainties accurately reflect deviations of inferred growth rates from the true values. Percentage of true synthetic amino acid growth rates falling into a certain highest density interval among all 2040 simulated toxin amino acid variants.
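
The calibration check in panel d can be sketched as follows: for each variant, compute a highest density interval from posterior samples and count how often the true value falls inside it. The sample-based interval estimate (narrowest window containing the requested fraction of sorted samples) and both function names are assumptions for illustration; the caption does not state how the intervals were computed.

```python
import math

def highest_density_interval(samples, mass=0.95):
    """Narrowest interval containing at least `mass` of the samples."""
    ordered = sorted(samples)
    n = len(ordered)
    window = max(1, math.ceil(mass * n))
    # Slide a window of fixed sample count and keep the narrowest one.
    start = min(range(n - window + 1), key=lambda i: ordered[i + window - 1] - ordered[i])
    return ordered[start], ordered[start + window - 1]

def interval_coverage(true_values, posterior_samples, mass=0.95):
    """Fraction of true values that fall inside their variant's highest density interval."""
    hits = 0
    for truth, samples in zip(true_values, posterior_samples):
        low, high = highest_density_interval(samples, mass)
        hits += low <= truth <= high
    return hits / len(true_values)

# Hypothetical posterior samples for two variants, not data from the study.
toy_samples = [0, 10, 11, 12, 30, 50]
coverage = interval_coverage([11, 40], [toy_samples, toy_samples], mass=0.5)
```

For a well-calibrated model, the coverage computed this way should be close to the nominal mass across a range of interval widths.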

Extended Data Fig. 10 Posterior predictive checks show that the Bayesian hierarchical model can capture observed data statistics for both replicate experiments, whereas a non-hierarchical model cannot.

a,b, A non-hierarchical model, in which all synonymous codon variants have the same growth rate (a), cannot explain the observed data. (b) The observed standard deviation of log-read ratios for synonymous wild-type toxin codon variants (red) (n = 278) falls outside of the non-hierarchical model’s expectations (grey). c, The synonymous amino acid mutant standard deviations within a replicate (y axis) are higher than codon mutant standard deviations between replicates (x axis). Light green indicates binned average. d, The Bayesian hierarchical model allows for growth rate variation between synonymous codon mutants by drawing these from a Gaussian distribution. e-g, Observed data statistics fall within the hierarchical Bayesian model’s expected values. (e) The observed standard deviation of synonymous wild-type toxin codon mutant log-read ratios (red) falls within the model-simulated values (stdev(log(c_post1k/c_pre1k)) or stdev(log(c_post2k/c_pre2k)) for biological replicate 1 or 2, respectively; see model code). Compare to panel (b) for the non-hierarchical model. (f) For each codon mutant, the hierarchical Bayesian model allows for simulating pre- and post-selection read counts, and hence log-read ratios (log(c_posti,n/c_prei,n); see ED Fig. 9), using the posterior parameter distribution. For each codon mutant, we calculate the p-value statistic (i.e. the fraction of simulated samples falling below the observed log-read ratio). (g) Distribution of posterior simulated p-values for various statistics, demonstrating that no observed data statistic is biased to fall outside of the posterior simulated statistics.
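
The p-value statistic described in panel f is the fraction of posterior-simulated values that fall below the observed one. A minimal sketch, with hypothetical function names and made-up numbers:

```python
import statistics

def posterior_predictive_p_value(observed_statistic, simulated_statistics):
    """Fraction of simulated statistics falling below the observed statistic."""
    below = sum(1 for s in simulated_statistics if s < observed_statistic)
    return below / len(simulated_statistics)

def synonymous_spread(log_read_ratios):
    """Sample standard deviation of log-read ratios across synonymous codon variants."""
    return statistics.stdev(log_read_ratios)

# Hypothetical observed log-read ratios and model-simulated spreads, not data from the study.
observed_spread = synonymous_spread([-0.2, 0.1, 0.4, -0.1])
simulated_spreads = [0.10, 0.20, 0.30, 0.40]
p_value = posterior_predictive_p_value(observed_spread, simulated_spreads)
```

Values very close to 0 or 1 indicate that the observed statistic lies in the tail of what the model simulates, which is the pattern panel b shows for the non-hierarchical model.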

Supplementary information

Reporting Summary

Peer Review Information

Supplementary Tables

Supplementary Tables: 1, Spatial distances of rescuing toxin substitutions to the antitoxin; 2, Strains created in this study; 3, Primers used in this study.

Supplementary Data

Location of beneficial toxin substitutions on the crystal structure.

About this article

Cite this article

Ding, D., Green, A.G., Wang, B. et al. Co-evolution of interacting proteins through non-contacting and non-specific mutations. Nat Ecol Evol 6, 590–603 (2022). https://doi.org/10.1038/s41559-022-01688-0

  • DOI: https://doi.org/10.1038/s41559-022-01688-0
