Main

Microbial symbioses prevail in biological kingdoms, in which diverse relationships encompass parasitism, commensalism and mutualism1,2. Among them, the most intimate symbioses are found among mutualistic ones, in which the host and the symbiont constitute an integrated biological entity and suffer disadvantages without the partnership3,4. Originally, such microbial symbionts must have had no relationship with their host organisms, plausibly having existed as environmental microorganisms. It is of fundamental evolutionary interest how ordinary free-living microorganisms have become indispensable mutualists. How many and what mutations are required for the evolution of mutualism? How quickly does the evolution of mutualism proceed? To address these questions, experimental evolutionary approaches may provide valuable insights5,6,7,8,9,10,11,12.

Recently, an experimental evolutionary system consisting of an insect, Plautia stali, as host and a bacterium, Escherichia coli, as symbiont was established, which brought about unprecedented insight into an early stage of the evolution of mutualistic symbiosis13,14. The stinkbug P. stali possesses a midgut symbiotic organ full of a specific bacterial symbiont of the genus Pantoea, which is essential for growth and survival of the host insect15,16,17,18. The famous model bacterium E. coli is a component of the mammalian gut microbiome and has no previous relationship with the insect19,20. However, when symbiont-deprived newborn nymphs of P. stali were experimentally inoculated and maintained with a hypermutating E. coli strain, multiple evolutionary lines showed a significantly improved adult emergence rate and body colour within several months to a year, indicating rapid and recurrent evolution of ‘mutualistic’ E. coli13. Analysis of the independently evolved mutualistic E. coli lines revealed that single loss-of-function mutations on cyaA and crp genes that convergently disrupt the carbon catabolite repression (CCR) global transcriptional regulator system, which is involved in bacterial metabolic switching in response to nutritional stresses21,22, are responsible for the mutualistic host phenotypes. These results revealed that elaborate mutualistic symbiosis can evolve very easily and rapidly through a single gene mutation13.

The finding that only a single gene disruption is sufficient for making E. coli an insect mutualist is certainly striking, but it should be noted that disruption of the CCR pathway globally affects the expression levels of over 500 downstream genes encoded on the E. coli genome23. Hence, we expected that the causative genes directly responsible for the improved host phenotypes must be identified among the downstream E. coli genes under CCR regulation, although the nature of such genes was totally unknown. Here we report the identification of a bacterial enzyme gene under CCR regulation whose disruption underpins the evolution of insect–bacterium mutualism not only in the laboratory but also in nature.

Results

No CCR disruption in natural symbionts of P. stali

In Japanese populations of P. stali, six Pantoea-allied symbiotic bacteria, Pantoea spp. A, B, C, D, E and F (abbreviated as Sym A and Sym B for uncultivable ones, and Sym C, Sym D, Sym E and Sym F for cultivable ones), are present15. While all the symbiotic bacteria can support normal growth and reproduction of P. stali, genome sequencing revealed that their genomes consistently retain the intact CCR pathway genes cyaA and crp (Supplementary Table 1a). These observations strongly suggested that the CCR disruption, which was observed in the laboratory evolution of mutualism with hypermutating E. coli13, is not involved in the evolution of mutualistic symbionts in natural populations of P. stali.

Survey of downstream E. coli genes affected by CCR disruption

Considering that disruption of the CCR global transcriptional regulator system generally affects expression levels of hundreds of genes encoded on bacterial genomes23, it seemed plausible that some genes downstream of the CCR pathway may actually be responsible for the mutualistic phenotypes of the CCR-disruptive E. coli mutants. In this context, we focused on 58 CCR-regulated genes that were identified to be commonly down- or upregulated in two independent mutualistic E. coli evolutionary lines, CmL05 and GmL07, identified in our previous study13. These genes consisted of a diverse array of functional genes such as transporter genes for non-glucose sugars, carbohydrate metabolism genes, quorum sensing genes, extracellular matrix production genes, transcription factor genes and others (Extended Data Fig. 1).

Elevated tryptophan levels after the evolution of mutualism as well as CCR disruption in E. coli

To gain insight into metabolic aspects of the mutualistic evolutionary and mutant E. coli lines, we conducted comparative transcriptomic and metabolomic analyses of the E. coli-infected P. stali. A promising clue came from quantitative analysis of free amino acids in P. stali infected with the CCR-disruptive E. coli mutants ΔcyaA and Δcrp, in which a specific essential amino acid, tryptophan, showed over ten times higher levels in the haemolymph and symbiotic organ compared with the insects infected with the wild-type control E. coli strain ΔintS (Fig. 1a,b and Extended Data Fig. 2). Of the 58 candidate genes, 2 genes were related to tryptophan: tnaA encoding tryptophanase and tnaB encoding a component of tryptophan transporter (Fig. 1c,d and Extended Data Fig. 1). We obtained deletion mutants of these genes, ΔtnaA and ΔtnaB, and inoculated the E. coli mutants into P. stali. Tryptophan levels in the haemolymph and symbiotic organ were significantly elevated in the ΔtnaA-infected insects but not in the ΔtnaB-infected insects (Fig. 1a,b and Extended Data Fig. 2). The elevated tryptophan levels in the ΔtnaA-infected insects were comparable to (1) those in insects infected with the CCR-disruptive mutants ΔcyaA and Δcrp, (2) those in insects infected with the mutualistic evolved E. coli strain CmL05 (ref. 13) and (3) those in normal symbiotic insects with the natural Pantoea symbiont Sym A (Fig. 1a,b, and Extended Data Figs. 2 and 3a,b). A tryptophanase assay confirmed that not only ΔtnaA but also ΔcyaA, Δcrp and CmL05 lost the tryptophanase activity whereas ΔintS and ΔtnaB were tryptophanase positive (Extended Data Fig. 3c–e).

Fig. 1: Improved phenotypes of P. stali infected with the tryptophanase-disrupted ΔtnaA mutant of E. coli.

a,b, Effects of knockout mutants of E. coli, ΔcyaA, Δcrp, ΔtnaA and ΔtnaB, on tryptophan levels in haemolymph (a) and the symbiotic organ (b). c,d, Transcriptomic data on the downregulation of the tnaA gene (c) and tnaB gene (d) after the evolution of mutualism in the evolutionary E. coli lines CmL05 and GmL07 (Extended Data Fig. 1). e,f, Effects of knockout mutants of E. coli, ΔcyaA, Δcrp, ΔtnaA and ΔtnaB, on adult emergence rates (e) and body colour (f). Note that Sym A, the natural symbiont of P. stali15, serves as a mutualistic positive control, whereas ΔintS E. coli represents a non-beneficial negative control13. For the box plots, centre lines, limits and dots show medians, first and third quartiles, and data points, respectively. Different alphabetical letters (a, b, c and d) indicate statistically significant differences (pairwise Wilcoxon rank–sum test with Hommel’s correction: P < 0.05, two sided). Biological replicate numbers are indicated on the graphs. g, External appearance of the adult insects infected with Sym A, ΔcyaA, Δcrp, ΔtnaA, ΔtnaB and ΔintS obtained in the study. CPM, counts per million; Trp, tryptophan.


Improved host phenotypes induced by tryptophanase disruption in E. coli

The ΔtnaA-infected insects showed significantly higher adult emergence rates and remarkably greenish body colour compared with the ΔintS-infected control insects, which were comparable to those of the ΔcyaA- and Δcrp-infected insects, and also comparable to those of the normal symbiotic insects with the natural Pantoea symbiont Sym A (Fig. 1e–g and Extended Data Fig. 4a,b). However, the ΔtnaA-infected insects showed no significant improvement in body size compared with the ΔintS-infected control insects, which was also the case for the ΔcyaA- and Δcrp-infected insects (Extended Data Fig. 4c,d). When a functional tnaA gene was introduced into the ΔtnaA E. coli strain (Supplementary Fig. 1a), inoculation of the recombinant ΔtnaA::tnaA E. coli strain resulted in reduced haemolymphal tryptophan and attenuation of the improved host performance (Extended Data Fig. 5). When the tnaA gene in the ΔcyaA E. coli strain was constitutively expressed by introduction of an ectopic promoter sequence (Supplementary Fig. 1b), inoculation of the recombinant ΔcyaA Pconst-tnaA E. coli strain also resulted in lower haemolymphal tryptophan and cancellation of the improved host performance (Extended Data Fig. 5). These results revealed that (1) disruption of the tryptophanase gene in E. coli, which is under CCR regulation, significantly improves the survival and body colour of infected P. stali, (2) the improved host performance due to infection with CCR-disruptive mutants, ΔcyaA and Δcrp, is attributable to downregulation of the CCR downstream target gene tnaA and (3) tryptophanase disruption is a pivotal mechanism that underpins the evolution of P. stali–E. coli mutualism.

Why host performance improves through tryptophanase disruption in E. coli

Extended Data Fig. 6 summarizes the results showing the molecular mechanisms involved in the laboratory evolution of P. stali–E. coli mutualism. Before the evolution of mutualism, bacterial CCR operates, tryptophanase is expressed, tryptophan is broken down and the host suffers poor performance (Extended Data Fig. 6a). After the evolution of mutualism, bacterial CCR is disrupted, tryptophanase is suppressed, tryptophan accumulates and the host shows good performance (Extended Data Fig. 6b). Why does tryptophanase disruption in symbiotic E. coli result in improved performance of the host P. stali? Considering that tryptophanase converts tryptophan into indole, pyruvate and ammonium24, we conceived two plausible hypotheses, which are not necessarily mutually exclusive. The first hypothesis is that tryptophanase disruption suppresses toxic indole production and thereby improves host fitness, on the grounds that perturbation of tryptophan metabolism mediated by gut microbiota towards the indole pathway tends to be linked to pathology and disease25. The second hypothesis is that tryptophanase disruption results in the accumulation of the essential amino acid tryptophan and thereby contributes to host fitness, given that symbiont-mediated provisioning of tryptophan is important for diverse plant-sucking insects26.

Effects of indole and tryptophan feeding

To test these hypotheses, we administered different concentrations of indole and tryptophan to P. stali nymphs infected with the tryptophanase-deficient ΔtnaA E. coli and those infected with the control ΔintS E. coli via drinking water. As indole doses were elevated, adult emergence rates declined in both the ΔtnaA-infected insects and the ΔintS-infected insects, with the level of decline less conspicuous in the ΔtnaA-infected insects than in the ΔintS-infected insects (Fig. 2a,b). These results favoured the notion that indole accumulation is detrimental to the growth and survival of P. stali. As tryptophan doses were elevated, adult emergence rates were not affected in the ΔtnaA-infected insects (Fig. 2c) but were suppressed in the ΔintS-infected insects (Fig. 2d). These results seemed unexpected at a glance considering that tryptophan is an essential amino acid. However, it should be noted that the laboratory insects were provided with highly nutritious food (raw peanuts); tryptophan feeding may thus lead to excessive tryptophan intake, and the tryptophanase-positive ΔintS E. coli may convert the excess tryptophan into toxic indole. Quantification of tryptophan and indole in haemolymph samples of these experimental insects revealed that (1) the ΔtnaA-infected insects showed little haemolymphal indole; (2) by contrast, the ΔintS-infected insects showed significantly higher levels of haemolymphal indole; (3) in the ΔintS-infected insects, indole and tryptophan feeding tended to result in elevated levels of haemolymphal indole; and (4) in the ΔintS-infected insects, tryptophan levels were consistently low (Fig. 2e,f). These results accounted for the observations that not only indole feeding but also tryptophan feeding resulted in negative fitness consequences preferentially in the ΔintS-infected insects (Fig. 2a–d). 
Metabolomic analysis confirmed that higher haemolymphal tryptophan levels and lower haemolymphal indole levels were observed in the insects infected with the CCR-deficient mutant ΔcyaA and the evolved E. coli strain CmL05G13, compared with the ΔintS-infected insects (Extended Data Fig. 7a–c). It was also shown that some tryptophan-derived metabolites, such as 5-hydroxytryptamine, kynurenine, 3-hydroxykynurenine, indole-3-acetic acid and indole-3-carboxylic acid, tended to show higher haemolymphal levels in the insects infected with the CCR-deficient mutant and evolutionary E. coli strains, ΔcyaA and CmL05G13, compared with the ΔintS-infected insects (Extended Data Fig. 7d–k).

Fig. 2: Effects of oral administration of indole and tryptophan on P. stali infected with tryptophanase-disrupted and control E. coli.

a,b, Effects of indole feeding via drinking water on growth and survival of P. stali infected with tryptophanase-disrupted ΔtnaA E. coli (a) and control ΔintS E. coli (b). c,d, Effects of tryptophan feeding via drinking water on growth and survival of P. stali infected with tryptophanase-disrupted ΔtnaA E. coli (c) and control ΔintS E. coli (d). e,f, Effects of indole and tryptophan feeding via drinking water on haemolymphal indole levels (e) and tryptophan levels (f) of P. stali infected with tryptophanase-disrupted ΔtnaA E. coli and control ΔintS E. coli. For box plots, centre lines, limits and dots show medians, first and third quartiles, and data points, respectively. Different alphabetical letters (a, b, c and d) indicate statistically significant differences (pairwise Wilcoxon rank–sum test with Hommel’s correction: P < 0.05, two sided). Biological replicate numbers are indicated on the graphs.


Effects of tryptophan overproduction by E. coli

In addition to the feeding experiments, we examined the effects of upregulated tryptophan production by genetically manipulated E. coli. When a tryptophan-overproducing E. coli mutant, ΔtrpR, in which the trp operon repressor gene trpR is disrupted27, was inoculated into P. stali, the ΔtrpR-infected insects showed significantly improved adult emergence rates and body colour compared with the control ΔintS-infected insects, although the levels of improvement were not comparable to those of the ΔtnaA-infected insects (Extended Data Fig. 8a,b). Notably, despite the tryptophan overproduction by ΔtrpR E. coli, the ΔtrpR-infected insects did not show elevated tryptophan levels in haemolymph (Extended Data Fig. 8c,d). It seems plausible, although speculative, that the tryptophan production by ΔtrpR E. coli is at such a level that the host insects promptly use up the limited essential amino acid for their growth and development. These results corroborated the notion that E. coli-derived tryptophan contributes to the improvement of host fitness.

Absence of the tryptophanase gene in natural symbiotic bacteria of stinkbugs

Given that tryptophanase disruption makes E. coli mutualistic to P. stali in the laboratory, it is of interest whether natural symbiotic bacteria of stinkbugs retain the tnaA gene or not. First, we inspected six genomes of natural Pantoea-allied symbionts Sym A, B, C, D, E and F of P. stali, in which no tnaA gene was found (Fig. 3 and Supplementary Table 1a). Next, we determined seven genomes of Pantoea-allied Sym C isolated from seven additional Ryukyu Island populations of P. stali, from which no tnaA gene was detected (Fig. 3 and Supplementary Table 1a). Next, we determined five genomes of Pantoea-allied Sym C of other stinkbugs collected at Ryukyu Islands, namely, three local isolates from Axiagastus rosmarus, one isolate from Lampromicra miyakona and one isolate from Scutellera amethystina, all of which encoded no tnaA gene (Fig. 3 and Supplementary Table 1a). Finally, using an inoculation and screening procedure with symbiont-free newborn nymphs of P. stali15, we screened and isolated environmental bacteria capable of supporting growth of P. stali from soil samples collected at five Ryukyu Islands, namely six isolates from Ishigaki Island, two isolates from Okinawa Island, two isolates from Miyako Island, three isolates from Yonaguni Island and four isolates from Tokunoshima Island. All the environmental bacterial isolates potentially symbiotic to P. stali were phylogenetically placed in the genus Pantoea and devoid of the tnaA gene in their genomes (Fig. 3 and Supplementary Table 1a). Enzymatic assay confirmed that the natural symbiotic bacteria of P. stali as well as the environmental bacterial isolates potentially symbiotic to P. stali consistently lack tryptophanase activity (Extended Data Fig. 9a–d). Inoculation of these bacterial isolates into symbiont-free newborn nymphs of P. stali verified that they can support growth and survival of the host stinkbugs, with the stinkbug-derived isolates tending to induce better host performance than the soil-derived isolates (Extended Data Fig. 10). These results suggested that the lack of the tnaA gene may be related to the ability of Pantoea-allied bacteria to establish symbiosis with P. stali.

Fig. 3: Molecular phylogenetic relationship of natural and potential mutualistic symbionts of P. stali and other stinkbugs.

The maximum likelihood phylogeny is inferred from amino acid sequences of 106 concatenated essential single-core genes (35,647 aligned amino acid sites). Statistical support values for each clade are shown at the node in the order of maximum likelihood and Bayesian analyses. Collection localities are shown in the map of the mainland and Ryukyu Islands of Japan. The colours of the bacterial taxon labels and the squares on the map correspond to the symbiont categories depicted at the bottom left (Supplementary Table 1a). The presence or absence of the tnaA gene is shown beside the taxon labels.


Pantoea ananatis with the tryptophanase gene are incapable of symbiosis with P. stali

Of 105 Pantoea genomes retrieved from the GenBank database, 78 genomes lacked the tnaA gene while 27 genomes retained the tnaA gene, with the majority, 19 genomes, affiliated to P. ananatis (Supplementary Table 1b). We obtained 4 strains of P. ananatis from culture collections, which were all confirmed to be tnaA and tryptophanase positive (Extended Data Fig. 9a–e). When they were inoculated into symbiont-free newborn nymphs of P. stali, few adult insects emerged (Extended Data Fig. 9f), indicating that the tnaA-carrying P. ananatis strains are incapable of establishing symbiosis with P. stali.

Tryptophanase disruption improved the symbiotic performance of P. ananatis

Is the inability of P. ananatis to establish symbiosis with P. stali relevant to the tryptophanase gene on the bacterial genome? We generated a knockout mutant of P. ananatis JCM6986 by homologous recombination targeting the tnaA gene (Fig. 4a). PCR detection confirmed deletion of the tnaA gene (Fig. 4b), and an enzymatic assay verified the loss of tryptophanase activity in the mutant (Fig. 4c). When the ΔtnaA mutant of P. ananatis was inoculated into symbiont-free newborn nymphs of P. stali, the nymphal survival and adult emergence rate significantly improved (Fig. 4d,e), although the adult emergence rate remained below 10% on average (Fig. 4e). These results indicated that the non-symbiotic Pantoea strain becomes mutualistic to P. stali, albeit only partially, by loss-of-function mutation of the tryptophanase gene.

Fig. 4: Knockout of tryptophanase gene tnaA in P. ananatis and effects on its capability for symbiosis with P. stali.

a, Knockout scheme of the tnaA gene by homologous recombination. b, PCR check of homologous recombination. c, Enzymatic assay of tryptophanase disruption. d, Effects on the survival curve. Line points, limits and dots show means, standard deviations and data points, respectively. Statistical analysis was conducted on the 36th day data using Welch’s two-sample t-test (P = 3.73 × 10⁻⁴, two sided). e, Effects on the adult emergence rate. For box plots, centre lines, limits and dots show medians, first and third quartiles, and data points, respectively. Statistical analysis was conducted using a Wilcoxon rank–sum test (P = 0.00041, two sided). Biological replicate numbers are indicated on the graphs. FRT, flippase recognition target; NC, negative control; WT, wild-type P. ananatis.


Natural Pantoea symbiont reduced symbiotic performance when transformed with functional tryptophanase

Finally, we artificially introduced a functional tna operon of P. ananatis into the genome of Sym F, a natural symbiont of P. stali that is cultivable and able to support host growth and survival15, by homologous recombination targeting the presumably non-functional transposon-related gene intB (Fig. 5a and Supplementary Fig. 2). The intB::tnaAB symbiont transformant showed significant tryptophanase activity (Fig. 5b), verifying that the introduced tna operon is functioning in the transformed symbiont strain. When symbiont-free newborn nymphs of P. stali were inoculated with the tryptophanase-producing intB::tnaAB symbiont, their adult emergence rate and body colour were negatively affected compared with those inoculated with the control ΔintB symbiont strain (Fig. 5c–f). In these insects, infection with the tryptophanase-producing intB::tnaAB symbiont resulted in lower tryptophan levels and higher indole levels than infection with the control ΔintB symbiont strain (Fig. 5g,h). These results indicated that the absence of the tryptophanase gene may contribute to the mutualistic properties of the natural symbiont of P. stali.

Fig. 5: Transformation of natural symbiont Sym F with the functional tna operon and effects on its capability for symbiosis with P. stali.

a, Introduction scheme of the tna operon by homologous recombination. b, Enzymatic assay of the expression and functioning of the introduced tna operon. c, Effects on the adult emergence rate. d, Effects on adult body colour. e, Effects on adult female body size. f, Effects on adult male body size. g, Effects on tryptophan levels in haemolymph. h, Effects on indole levels in haemolymph. For box plots, centre lines, limits and dots show medians, first and third quartiles, and data points, respectively. Statistical analysis was conducted by two-sided Wilcoxon rank–sum test (P values are shown on the graphs). Biological replicate numbers are indicated on the graphs.


Discussion

Our previous study showed that a single gene mutation, ΔcyaA or Δcrp, disrupting the bacterial CCR pathway makes E. coli mutualistic to P. stali13, which led to the notion that elaborate mutualistic symbiosis can evolve more easily and rapidly than conventionally envisioned. On account of the diverse phenotypic changes observed with the mutualistic E. coli strains and mutants13, we expected that multiple genes downstream of the CCR pathway may be involved in the evolution of P. stali–E. coli mutualism. Unexpectedly, however, we found that a single enzyme gene, tnaA, encoding tryptophanase, which is under CCR regulation, is the major effect gene whose disruption is sufficient for establishing P. stali–E. coli mutualism. This finding further corroborates the notion that elaborate mutualistic symbiosis can evolve easily and rapidly by a single gene mutation.

Plausibly, tryptophanase disruption contributes to host fitness via reduction of toxic indole and via accumulation of the potentially limited essential amino acid tryptophan. It should be noted that diverse stinkbugs rely on their gut symbiotic bacteria to provide essential amino acids and vitamins28,29,30,31, and the E. coli genome encodes the genes needed for synthesis of all these nutrients32. In plant-sucking aphids, the essential bacterial symbiont Buchnera encodes and amplifies synthetic genes for tryptophan on a plasmid33,34. Detailed physiological studies, for example those using a nutritionally defined artificial diet developed for aphids35,36, are needed for further understanding of the insect–bacterium nutritional interactions and interdependency.

Our genomic and functional investigations revealed that the loss of the tryptophanase gene not only underpins the laboratory evolution of P. stali–E. coli mutualism but, plausibly, is also involved in the evolution of bacterial mutualists of the genus Pantoea that have recurrently occurred in natural populations of P. stali and other stinkbugs15,37,38,39. Of course, a variety of genetic changes of both partners must have contributed to the establishment and maintenance of the stinkbug–bacterium mutualistic symbioses in nature, only a part of which may be attributable to the loss of the tnaA gene on the symbiont side. On account of the consistent absence of the tnaA gene among the diverse stinkbug symbiont genomes, which range from cultivable ones with large genome sizes to uncultivable ones with reduced genome sizes (Supplementary Table 1a,c), we hypothesize, although speculatively, that tryptophanase disruption may have facilitated the evolution of stinkbug–bacterium mutualism. Tryptophanase-deficient environmental Pantoea strains may have been predisposed to the establishment of symbiosis with stinkbugs. Alternatively, tryptophanase disruption may tend to occur at an early stage of the stinkbug–bacterium symbiosis in the course of symbiont genome degeneration. In either case, it seems plausible that tryptophanase disruption acts as a pivotal mutation on the symbiont side that facilitates, canalizes and stabilizes the relationship towards mutualism.

By contrast, while loss-of-function mutations of cyaA and crp, which disrupt CCR regulation, were identified as responsible for the evolution of P. stali–E. coli mutualism in the laboratory13, most of the Pantoea-allied natural symbiotic bacteria associated with P. stali and other stinkbugs, particularly those whose genomes are not so reduced, retain the cyaA and crp genes in their genomes (Supplementary Table 1a,c). These observations suggest that, in nature, disruption of the CCR pathway is generally not involved in the evolution of gut bacterial mutualists that are indispensable for the plant-sucking stinkbugs40,41,42. Considering that CCR regulation is important for bacterial adaptation to fluctuating environments by switching the main carbon source from a depleted one to an abundant one21,22, it is conceivable, although speculative, that CCR disruption can evolve under stable environments such as laboratory conditions, but it may be generally detrimental for bacteria that are thriving under fluctuating natural environments. These observations provide an important lesson that symbiotic evolution in the laboratory does not necessarily reflect symbiotic evolution in nature.

Among diverse life forms, the tnaA gene is found in various Gram-negative bacteria, whereas it is less common in Gram-positive bacteria, archaea and eukaryotes43. In particular, tnaA is most commonly detected in Gammaproteobacteria, to which E. coli, Pantoea spp. and many insect symbionts belong44. How widely tnaA disruption is relevant to the evolution of mutualism in diverse host–microorganism symbiotic associations is currently elusive and deserves future studies.

Using the model system for the experimental evolution of symbiosis between P. stali and E. coli, we have shown that even a single bacterial gene mutation can facilitate the evolution of mutualism. In the real world, however, the processes and mechanisms of the evolution of mutualism must be much more complex, entailing multiple genes and mutations of both the host and symbiont. Integrative approaches to the evolution of mutualism conducted in this study, in which experimental evolution in the laboratory and the natural diversity of symbiosis are jointly investigated, will lead to a deeper understanding of how elaborate mutualistic symbioses have been established and maintained.

Methods

Insect samples, bacterial strains and primers used in this study

Stinkbug samples and their symbiotic bacteria used in this study are listed in Supplementary Table 1a. Genome data of Pantoea isolates and stinkbug symbionts were retrieved from DNA databases (Supplementary Table 1b,c). P. ananatis isolates JCM6986, JCM14682 and JCM15056 were obtained from the Japan Collection of Microorganisms, while strain AJ13355 was provided by Ajinomoto. E. coli strains and mutants used in this study are listed in Supplementary Table 1d. The PCR primers used in this study are listed in Supplementary Table 1e.

Insect rearing, symbiont sterilization and bacterial inoculation

For most experiments, a laboratory strain of P. stali was used. The insects were reared in clean plastic or paper containers and fed with sterilized peanuts and sterilized water supplemented with 0.05% ascorbic acid in climate chambers at 25 ± 1 °C under a long day regime of 16 h light and 8 h dark as described45. To prepare symbiont-deprived newborn nymphs, collected egg masses were soaked in 4% formaldehyde for 20 min, kept twice in sterilized water for 10 min each, air-dried in a clean bench and placed in sterile plastic Petri dishes with cotton balls. The Petri dishes were kept in an incubator at 25 °C, where symbiont-free newborn nymphs emerged. Bacteria were cultured in liquid LB medium and diluted to OD600 = 0.1. The diluted culture medium (around 1.5 ml) was applied to cotton balls in each Petri dish, through which the symbiont-free newborn nymphs orally acquired the bacterial suspension. After sucking bacteria-containing water, the first instar nymphs moulted to second instar within 4–5 days without feeding, to which several pieces of sterilized peanuts and a 1.5-ml tube of sterilized water containing 0.05% ascorbic acid were introduced. Then, 3–4 days later, the mature second instar nymphs were transferred to a new rearing cage consisting of a paper container, a plastic lid with a large hole for ventilation, draining mesh for preventing insect escape, a 25-ml bottle of sterilized water containing 0.05% ascorbic acid and sterilized peanuts. This rearing cage system, which was renewed every week, was devised to stably maintain the insect colonies in good condition for an extended period. The emerged adult insects were sexed, counted, kept in a refrigerator overnight and image scanned from their dorsal side using a scanner (EPSON GT-X980) 6 weeks after egg collection. On the basis of the scanned images, the body colour and body size of the insects were measured using the image analysing software Natsumushi v.1.10 (ref. 46).
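The dilution of the inoculum to OD600 = 0.1 can be sketched as a simple proportional calculation. The following Python snippet is only an illustration, not the procedure used in the study: it assumes that OD600 scales linearly with cell density over the working range, that the culture is diluted with sterile liquid (the diluent is not specified in the text), and it uses a hypothetical measured OD600 of 2.0. The function name `dilution_volumes` is introduced here for illustration.

```python
def dilution_volumes(measured_od, target_od=0.1, final_volume_ml=1.5):
    """Return (culture_ml, diluent_ml) needed to reach target_od.

    Assumes OD600 is proportional to cell density (C1 * V1 = C2 * V2).
    The defaults follow the text: OD600 = 0.1 and about 1.5 ml per dish.
    """
    if measured_od < target_od:
        raise ValueError("culture is already below the target OD600")
    culture_ml = final_volume_ml * target_od / measured_od
    diluent_ml = final_volume_ml - culture_ml
    return culture_ml, diluent_ml


# Hypothetical overnight culture with a measured OD600 of 2.0.
culture_ml, diluent_ml = dilution_volumes(measured_od=2.0)
print(f"culture: {culture_ml:.3f} ml, diluent: {diluent_ml:.3f} ml")
```

In practice, dense cultures are often outside the linear range of a spectrophotometer, so the OD600 of the diluted suspension would normally be re-measured rather than taken from the calculation alone.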

Analysis of amino acids, tryptophan-derived metabolites and indole

Each haemolymph sample was collected using a glass capillary (1 µl, Drummond) from the neck of an ice-anaesthetized adult insect, suspended in 100 µl of 80% (v/v) methanol and stored at –80 °C until use. Each symbiotic midgut sample dissected from an adult insect was homogenized in 100 µl of 80% methanol and stored under the same conditions. After homoarginine, homophenylalanine, [15N]-tryptophan and 6-hydroxyindole were added as internal standards, each sample was centrifuged, and an aliquot of the supernatant was subjected to liquid chromatography and mass spectrometry analysis of amino acids, tryptophan-derived metabolites and indole. The detection and quantification of these metabolites were performed using a liquid chromatography and mass spectrometry system (Waters, ACQUITY UPLC H-class and Xevo G2-XS qTOF) with an electrospray ionization source. Amino acid composition was measured after propyl-chloroformate derivatization as previously described18,31. For measurement of tryptophan and related compounds, the sample aliquots were concentrated under N2 flow and resuspended in 0.05% (v/v) formic acid. Then, they were separated on a column (Waters, BEH C18, 1.7 µm, 2 mm × 100 mm) with a gradient elution of 0.05% formic acid and methanol. Each compound was selectively measured in positive multiple reaction monitoring mode. As indole is not efficiently ionized under electrospray ionization, we derivatized it with p-dimethylaminocinnamaldehyde as described47. The derivatized indole was separated on the same analytical column and quantified in positive multiple reaction monitoring mode.

Feeding experiments with tryptophan and indole

L-tryptophan (FUJIFILM Wako Pure Chemical Corporation) and indole (Tokyo Chemical Industry) were dissolved and serially diluted in sterilized water containing 0.05% ascorbic acid. For tryptophan, concentrations of 1.14 × 10^0, 10^−1, 10^−2 and 10^−3 mg ml^−1 were prepared. For indole, concentrations of 0.5 × 10^0, 10^−1, 10^−2 and 10^−3 mg ml^−1 were prepared. The experimental insects were reared with sterilized peanuts and supplemented water as described, but the supplemented water was renewed every 3 or 4 days to minimize the deterioration of the supplemented reagents. The emerged adult insects were counted and subjected to measurements of haemolymphal tryptophan and indole levels 6 weeks after egg collection.
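
For reference, the two concentration series can be written out explicitly. The sketch below reads the listed values as the top concentration multiplied by ten to the powers 0 to −3 (a tenfold serial dilution); the function name is hypothetical.

```python
def tenfold_series(top_mg_per_ml: float, steps: int = 4) -> list[float]:
    """Concentrations (mg/ml) of a tenfold serial dilution, highest first."""
    return [top_mg_per_ml * 10.0 ** (-k) for k in range(steps)]


# Top concentrations taken from the text; four tenfold steps each.
tryptophan_series = tenfold_series(1.14)
indole_series = tenfold_series(0.5)

if __name__ == "__main__":
    for name, series in (("tryptophan", tryptophan_series),
                         ("indole", indole_series)):
        print(name, ", ".join(f"{c:.5g} mg/ml" for c in series))
```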

Tryptophanase activity assay

A qualitative assessment of tryptophanase activity was conducted essentially as described48. Each bacterial strain was cultured in 3 ml of LB or M9-based liquid medium at 25 °C with shaking at 180 rpm for 24 h or 48 h. Then, 100 μl of Kovács indole reagent (Sigma-Aldrich) was added to the bacterial culture, and a reddish colour indicated the presence of indole (Extended Data Fig. 3c). A quantitative assessment of tryptophanase activity during bacterial growth was conducted essentially as described49. Each bacterial strain was cultured in LB or M9 liquid medium at 25 °C with shaking at 180 rpm overnight, diluted with LB liquid medium to OD600 = 0.1 and dispensed to 24 test tubes as 1-ml aliquots. The test tubes were incubated at 25 °C with shaking at 180 rpm, from which 3 samples were taken every hour and subjected to measurement of OD600. Then, the samples were centrifuged at room temperature at 10,000 rpm for 3 min, and the supernatants were subjected to indole quantification using an Indole Assay Kit (MAK326, Sigma-Aldrich).

Complementation of the tnaA gene in the ΔtnaA E. coli strain

The kanamycin resistance gene (KmR) was deleted from the E. coli tnaA::KmR strain using flippase (FLP) recombinase-mediated excision (Supplementary Table 1d). To restore tnaA function, a tnaA–Tn5 fusion construct was generated. The tnaA coding region was amplified by PCR using the primers BW25113_tnaA_rescue_F and BW25113_tnaA_rescue_R, while the KmR gene was amplified using the primers BW25113_tnaA_rescue_nptII_F and BW25113_tnaA_rescue_nptII_R (Supplementary Table 1e). The assembled tnaA–KmR cassette was re-amplified using the primers BW25113_tnaA_rescue_F and BW25113_tnaA_rescue_nptII_R (Supplementary Table 1e), and introduced into the BW25113 ΔtnaA strain via electroporation using λ Red recombination. The resulting strain was designated as BW25113 ΔtnaA::tnaA (Supplementary Table 1d). Successful integration of the construct was confirmed by PCR using the primers BW25113_tnaA-nptII_check_F and BW25113_tnaA-nptII_check_R (Supplementary Table 1e).

Construction of the ΔcyaA P_const–tnaA E. coli strain

To decouple tnaA expression from cAMP-CRP regulation, the native promoter and 5′ untranslated region of the tna operon were replaced by a synthetic constitutive promoter J23119. This promoter lacks the CRP-binding site, allowing transcription independent of intracellular cAMP levels and glucose concentration (Supplementary Fig. 1b). The kanamycin resistance gene was deleted from the cyaA::KmR E. coli strain using FLP recombinase-mediated excision (Supplementary Table 1d). The kanamycin resistance cassette containing the J23119 promoter was provided by N. Obana (University of Tsukuba) and amplified by PCR using the primers Pconst-tnaA_nptII-J23119_F and Pconst-tnaA_nptII-J23119_R (Supplementary Table 1e). The assembled fragment was introduced into the BW25113 ΔcyaA E. coli strain via electroporation using λ Red recombination. The final strain, BW25113 ΔcyaA Pconst-tnaA, was verified by PCR using the primers Pconst-tnaA_check_F and Pconst-tnaA_check_R (Supplementary Table 1e).

Knockout of the tnaA gene of P. ananatis

The presence of the tnaA gene in P. ananatis isolates was confirmed by PCR using the specific primers PA_tnaA_125F and PA_tnaA_1357R (Supplementary Table 1e). To knock out the tnaA gene in P. ananatis strain JCM6986, the primers PA_JCM6986_ΔtnaA_F and PA_JCM6986_ΔtnaA_R (Supplementary Table 1e) were used to PCR-amplify a kanamycin resistance gene (KmR) region from the E. coli intS mutant. The PCR product was purified using a QIAquick PCR Purification Kit (Qiagen). Transformation of P. ananatis JCM6986 with the plasmid pRed/ET (Gene Bridges) was performed using a Gene Pulser/MicroPulser (Bio-Rad Laboratories). Subsequently, the tnaA gene in the P. ananatis genome was replaced by the KmR cassette50. The successful insertion of the KmR cassette was confirmed by the acquisition of kanamycin resistance in P. ananatis, and the loss of the pRed/ET plasmid was verified by the absence of tetracycline resistance. The deletion of the tnaA gene was further confirmed by specific PCR amplification using the primers PA_JCM6986_tnaA_check_F and PA_JCM6986_tnaA_check_R (Supplementary Table 1e). The resultant strain was designated as P. ananatis JCM6986 ΔtnaA (Supplementary Table 1d).

Transformation and expression of the tna operon in Pantoea sp. F

Given that deletion of the tnaC gene has been reported to result in constitutive expression of tnaAB genes51, we first disrupted the tnaC gene in P. ananatis JCM6986 by inserting a KmR region amplified by PCR using the primers PA_JCM6986_ΔtnaC_F and PA_JCM6986_ΔtnaC-RUT_R from the E. coli intS mutant (Supplementary Table 1e) into the tnaC locus (Supplementary Fig. 2). The successful deletion of tnaC was verified by specific PCR amplification with the primers PA_JCM6986_tnaC_check_F and PA_JCM6986_tnaC-RUT_check_R (Supplementary Table 1e). Next, the mutated tna operon from the resultant strain P. ananatis JCM6986 tnaC-RUT::KmR (Supplementary Fig. 2) was amplified using the primers SymF_intB/JCM6986_tnaAB_F and SymF_intB/JCM6986_tnaAB_R (Supplementary Table 1e). The intB gene of Pantoea sp. Plst-Sym F (Supplementary Table 1a) was then replaced by the mutated tna operon from the PCR fragment, resulting in the construction of the Plst-Sym F intB::tna operon tnaC-RUT::KmR (Supplementary Fig. 2). Finally, the KmR cassette was excised using FLP recombinase expressed from the plasmid pFLP3. The loss of pFLP3 was facilitated by the sacB-based suicide gene system52, thereby creating the final strain, Plst-Sym F intB::tnaAB (Supplementary Table 1d and Supplementary Fig. 2). The successful insertion of tnaAB was verified by specific PCR amplification with the primers SymF_intB_check_F and SymF_intB_check_R (Supplementary Table 1e). As a control strain in the infection experiments, the strain Plst-Sym F ΔintB (Supplementary Table 1d) was constructed by the same protocol as for the strain Plst-Sym F intB::tnaAB except that the primers SymF_ΔintB_F and SymF_ΔintB_R (Supplementary Table 1e) were used to obtain a KmR region. The successful deletion of intB was verified by specific PCR amplification with the primers SymF_intB_check_F and SymF_intB_check_R (Supplementary Table 1e).

Genome sequencing and analysis

DNA samples of the uncultivable symbionts A-Plst-Sym A and B-Plst-Sym B (Supplementary Table 1a) were extracted from the symbiotic organs dissected from adult insects of P. stali. DNA samples of the cultivable bacteria were extracted from overnight cultures in LB liquid medium at 25 °C. The extraction of DNA from the materials was conducted using DNeasy Blood and Tissue Kits (Qiagen). The DNA samples were subjected to library preparation and sequencing using either the PacBio RSII or PacBio Sequel sequencing system (Pacific Biosciences of California). For the symbionts A-Plst-Sym A, C-Lami-ISGK-165 and C-Scam-OKNW-431 (Supplementary Table 1a), the DNA samples were additionally sequenced using the MinION system in combination with Ligation Sequencing Kit V14 and R10.4.1 flow cells (Oxford Nanopore Technologies). Base calling of Nanopore long reads was conducted using Guppy v.6.5.7+ca6d6af (Oxford Nanopore Technologies) with minimap2 v.2.24-r1122 (ref. 53). De novo assembly was performed using Flye v.2.9.1 with default settings and an estimated genome size of 5.0 Mb (ref. 54). When libraries were sequenced with the PacBio RSII or Sequel system, raw PacBio reads were mapped to the draft assemblies via BLASR v.5.3.3 (ref. 55). The assembled sequences were then polished using the original PacBio reads with Arrow v.2.3.2 (ref. 56). The chromosomal genomes of A-Plst-Sym A, C-Lami-ISGK-165 and C-Scam-OKNW-431 were not assembled into one circular contig owing to the presence of long segmental repeats among different chromosomal regions. For these three strains, de novo assemblies of the ONT reads with Flye were used for gap closing. The draft assemblies were polished with two rounds of medaka (v.1.4.1; https://github.com/nanoporetech/medaka). To reveal assembly errors, the original sequencing reads were mapped to the completed circular genomes using BWA v.0.7.17 (ref. 57). Possible sequence errors were identified from these mapped data using bam-readcount v.1.0.1 (ref. 58) and then manually inspected using IGV v.2.16.2 (ref. 59). Genome annotation was performed using DFAST v.1.2.20 (ref. 60).

Molecular phylogenetic analysis

The genome sequences of the Pantoea symbionts, environmental Pantoea strains and allied bacteria were annotated using DFAST v.1.2.20. Then, the proteome sets were analysed with publicly available ones using bcgTree v.1.2.0 (ref. 61), which automatically extracted 107 essential single-copy core genes from amino acid sequences of the whole genome data. Ambiguously aligned regions were trimmed using Gblocks v.0.9.1b with manual inspection62. In total, 106 gene alignments (1 gene, rpmH, was excluded owing to missing data) were concatenated and a partitioning file was generated to mark the boundaries of each gene. Corresponding amino acid substitution models were estimated by the Bayesian information criterion using ModelTest-NG v.0.2.0 (ref. 63). The maximum likelihood analysis was conducted using RAxML-NG v.0.9.0 (ref. 64). Bootstrap values were obtained with 1,000 resamples. Bayesian inference analysis was conducted using MrBayes v.3.2.7a (ref. 65) with 10 million generations of Markov chain Monte Carlo runs with sampling every 100 generations. The first 25% of the samples were discarded as burn-in, and the remaining trees were used to calculate posterior probabilities. The stationarity of the runs was assessed using Tracer v.1.7.2 (ref. 66).
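
The concatenation step and the accompanying partitioning file can be illustrated with a minimal sketch. It is not the pipeline used here: the function name, the toy gene names geneA and geneB, the toy sequences and the model assignments are all hypothetical, and partition lines are written in the "model, name = start-end" form accepted by RAxML-NG. A gene lacking any taxon is skipped, mirroring the exclusion of rpmH.

```python
def concatenate(alignments: dict[str, dict[str, str]],
                models: dict[str, str]) -> tuple[dict[str, str], list[str]]:
    """Concatenate per-gene alignments and build partition lines.

    alignments: gene -> {taxon -> aligned amino acid sequence}
    models:     gene -> substitution model name (e.g. from model selection)
    """
    taxa = sorted(set().union(*(aln.keys() for aln in alignments.values())))
    pieces: dict[str, list[str]] = {taxon: [] for taxon in taxa}
    partitions: list[str] = []
    start = 1
    for gene in sorted(alignments):
        aln = alignments[gene]
        if set(aln) != set(taxa):
            continue  # gene with missing data is excluded
        lengths = {len(seq) for seq in aln.values()}
        if len(lengths) != 1:
            raise ValueError(f"{gene}: sequences differ in length")
        end = start + lengths.pop() - 1
        for taxon in taxa:
            pieces[taxon].append(aln[taxon])
        partitions.append(f"{models[gene]}, {gene} = {start}-{end}")
        start = end + 1
    matrix = {taxon: "".join(parts) for taxon, parts in pieces.items()}
    return matrix, partitions


# Toy input: two complete genes and one gene missing a taxon.
toy = {
    "geneA": {"t1": "MKV", "t2": "MRV"},
    "geneB": {"t1": "AG", "t2": "A-"},
    "rpmH": {"t1": "MK"},
}
toy_models = {"geneA": "LG+G4", "geneB": "WAG", "rpmH": "LG"}
matrix, partitions = concatenate(toy, toy_models)
```

Tracking a running 1-based start coordinate keeps the partition boundaries consistent with the concatenated matrix even when genes are dropped.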

Transcriptomic analyses

Total RNA was extracted from homogenates of the symbiotic organ using RNAiso (Takara Bio) in combination with the RNeasy Mini Kit (Qiagen). Ribosomal RNAs of both insect and bacterial origin were removed from the total RNA using the Ribo‑Zero Gold rRNA Removal Kit (Epidemiology; Illumina). The rRNA‑depleted RNA samples were used to construct paired‑end sequencing libraries with either the SureSelect Strand‑Specific RNA Library Prep Kit (Agilent Technologies) or the TruSeq RNA Library Prep Kit v2 (Illumina). Libraries were sequenced on an Illumina HiSeq 3000 or HiSeq X platform.

Raw sequencing reads were quality trimmed and mapped to the E. coli BW25113 reference genome (accession number NZ_CP009273), and gene-level read counts were obtained using CLC Genomics Workbench v.10.0 (Qiagen). Normalization of read counts and differential gene expression analyses were performed using edgeR v.3.32.1 (ref. 67).
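
To illustrate what library-size normalization of gene-level counts involves, the sketch below computes counts per million (CPM). This is a deliberately simplified stand-in and not the edgeR procedure, which additionally estimates scaling factors between libraries; the function name and the toy counts are hypothetical.

```python
def counts_per_million(counts: dict[str, list[int]]) -> dict[str, list[float]]:
    """Scale each sample's gene counts by its library size (column sum)."""
    genes = list(counts)
    n_samples = len(counts[genes[0]])
    lib_sizes = [sum(counts[g][j] for g in genes) for j in range(n_samples)]
    if any(size == 0 for size in lib_sizes):
        raise ValueError("a sample has zero total counts")
    return {g: [1e6 * counts[g][j] / lib_sizes[j] for j in range(n_samples)]
            for g in genes}


# Toy counts for two genes in two samples (values are illustrative only).
toy_counts = {"tnaA": [10, 0], "crp": [30, 100]}
toy_cpm = counts_per_million(toy_counts)
```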

Statistics and reproducibility

We statistically compared the effects of experimental treatments on the host phenotypes, including adult emergence rates, colour hues, and indole and tryptophan contents, using the Wilcoxon rank-sum test on account of their non-Gaussian distributions. For multiple comparisons, P values were adjusted using Hommel’s method. Exact P values are provided in the source data files. All statistical analyses were conducted using R version 4.4.0 (ref. 68). The number of replicates for each experiment is indicated in the respective figures. No statistical method was used to predetermine sample size. No data were excluded from the analyses. The experiments were not randomized. The investigators were not blinded to allocation during the experiments and outcome assessment.
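
The logic of the rank-sum comparison can be shown with a small standard-library sketch. It computes a two-sided exact permutation P value from midranks, which is practical only for small samples. It is not the R routine used for the analyses, which may apply a normal approximation when ties are present, and the Hommel adjustment is not included. Function names and the toy data are hypothetical.

```python
from itertools import combinations


def midranks(values: list[float]) -> list[float]:
    """Rank values from 1, giving tied values their average rank."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = average
        i = j + 1
    return ranks


def rank_sum_exact(x: list[float], y: list[float]) -> tuple[float, float]:
    """Return (rank sum of x, two-sided exact permutation P value)."""
    pooled = list(x) + list(y)
    ranks = midranks(pooled)
    n, total = len(x), len(pooled)
    observed = sum(ranks[:n])
    expected = n * (total + 1) / 2
    deviation = abs(observed - expected)
    hits = count = 0
    # Enumerate every way of assigning n of the pooled ranks to group x.
    for idx in combinations(range(total), n):
        count += 1
        if abs(sum(ranks[i] for i in idx) - expected) >= deviation - 1e-12:
            hits += 1
    return observed, hits / count


# Toy data: two completely separated groups of three observations.
w, p = rank_sum_exact([1.0, 2.0, 3.0], [4.0, 5.0, 6.0])
```

With three observations per group, only 2 of the 20 possible rank assignments are as extreme as complete separation, so the smallest attainable two-sided P value is 0.1.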

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.