Introduction

Teneurins are a family of evolutionarily conserved proteins found across bilaterian metazoa and certain choanoflagellate species1,2,3,4,5. They are type II transmembrane receptors that evolved from an unknown bacterial precursor through horizontal gene transfer2,3,6,7,8,9,10. In mammals, Teneurins are prominently expressed in the central nervous system and are crucial for neuronal development4,11,12,13,14,15. In humans, Teneurin homologs have been implicated in diseases including sensory and motor dysfunctions, neurodevelopmental and psychiatric disorders, and cancers16,17,18,19,20,21,22,23.

Teneurins consist of a small intracellular domain, a single transmembrane helix and a large extracellular region comprising eight epidermal growth factor-like repeats (EGF1-8), a cysteine-rich region, a transthyretin-like (TTR) domain, the characteristic Teneurin “superfold” and a C-terminal region containing an antibiotic-binding-like domain (ABD) and an HNH DNAse domain (Tox-GHH). The “superfold” is composed of a specialised fibronectin domain (FN-plug), a six-bladed NCL-1, HT2A, and Lin-41 (NHL) beta-propeller domain, and a large, anti-clockwise spiraling tyrosine-aspartate (YD) repeat domain shell3,4,10,24,25,26,27,28 (Fig. 1a). Bioinformatic and structural studies of Teneurins suggest that the superfold shares homology with related bacterial rearrangement hotspot (RHS)/YD-repeat containing proteins1,3,24,29,30,31,32,33,34,35,36,37. These proteins are components of bacterial toxic effectors that facilitate host invasion, pathogenesis or alternatively compete or defend against predator species32,38,39,40,41.

Fig. 1: TLPs are found across the bacterial kingdom.

a Teneurin domain organisation. ICD (intracellular domain), TM (transmembrane), EGF (Epidermal growth factor), TTR (transthyretin-like), FN (Fibronectin type 3), NHL (NCL-1, HT2A, and Lin-41 domain), YD-shell (Tyrosine-Aspartate repeat domain), ABD (Antibiotic binding domain), Tox-GHH (Toxin-like HNH DNAse domain). The cartoon representation of the Teneurin extracellular domain was based on the chicken Teneurin 2 (PDB id: 6FB324). b Maximum likelihood phylogenetic tree based on an alignment of the Teneurin/TLP superfold. Species names are color-coded according to the phylum: Bacillota (magenta), Actinomycetota (purple), Pseudomonadota (blue), Myxococcota (yellow), Thermodesulfobacteriota (orange), Acidobacteriota (dark purple), Choanozoa (red), Metazoa (light red). Bacterial genomes that encode multiple TLP genes are highlighted in bold. The presence of a TIP (Teneurin insertion protein) domain is indicated by a solid blue square. Sequences with a predicted signal peptide are colored according to the predicted trafficking machinery: SEC translocon SEC/SPI (red), SEC/SPII (blue), TAT translocon TAT/SPI (yellow), other (light gray). c Bacterial TLP domain organisation, colours in analogy to (a). Unlike Teneurins, bacterial TLPs lack an intracellular domain and contain additional variable adhesion domains. The TIP domain is inserted between the TTR and FN-plug domains. The C-terminal (CTD) domain is highly variable. d Predicted enzymatic functions of TLP C-termini. We used AlphaFold49 and structural homology searches to infer functionality.

Phylogenetic analyses suggest Teneurin genes arose following the fusion of a prokaryotic proteinaceous toxin containing a RHS/YD-repeat domain (the Teneurin “superfold” and flanking domains) with a eukaryotic transmembrane protein gene1,42,43. Following structural characterisation of the vertebrate Teneurins encoded by these genes, a widespread, uncharacterized family of Teneurin-like proteins (TLPs) was discovered in bacteria3,24. As bacteria do not have a nervous system, bacterial TLPs must have functions beyond those currently known for the metazoan homologs. These important developments raised new questions about the function(s) of bacterial TLPs and their evolutionary relationship to metazoan Teneurins.

Here, we identify a small but phylogenetically diverse group of bacteria that encode TLP genes in the PubMLST multispecies database. Structural analysis demonstrates that they possess Teneurin-like structural features while also revealing a highly variable C-terminal toxin module that is reminiscent of bacterial RHS/YD-repeat proteins29. We also confirm the presence of associated immunity genes, suggesting that bacterial TLPs are a distinct subtype of bacterial polymorphic toxins30. Our data show that bacterial TLPs are an accessory platform for the delivery of toxins in bacterial competition systems, a biological role that has been lost in metazoans where instead they mediate essential cell adhesion and signaling functions.

Results

Bacterial TLPs are found in a small group of species, yet widespread across the bacterial kingdom

We performed a comprehensive homology search to identify TLP genes in the PubMLST multispecies database (pubmlst.org/species-id)44, using the profile Hidden Markov Models (HMM) constructed for each of the three domains which constitute the Teneurin superfold (FN-plug, NHL, YD-shell). The search returned 139 TLP positive genomes (Supplementary data 1 and 2) spread across Gram-negative and Gram-positive bacterial groups in six phyla: Pseudomonadota, Myxococcota, Thermodesulfobacteriota, Acidobacteriota, Bacillota, and Actinomycetota (Fig. 1b, Supplementary Fig. 1a). Very few species within each phylum (less than 3%) encode a TLP gene, apart from Myxococcota, of which >20% encode at least one TLP gene (Supplementary Fig. 1b). Our phylogenetic analysis suggests that TLP sequences within the same phyla are generally conserved. Most bacterial species encode one copy of the TLP gene. However, some genomes carry up to four copies, as do mammalian genomes8. At least some of these additional copies have likely been acquired by independent gene transfer, rather than gene duplication, as they are most closely related to TLPs found in other species from the same family (Fig. 1b). The eukaryotic Teneurins cluster most closely with TLPs from the Paenibacillaceae family, which form a distinct clade (Fig. 1b).

Analysis of these TLP sequences (Supplementary data 3) shows that the superfold domain organisation remains invariant across phyla, while the flanking N-terminal and C-terminal regions are variable. Diverse adhesion domains serve as N-terminal modules (Supplementary Fig. 1c) that presumably govern how TLPs are displayed on the cell surface. We found that 96% of the TLP N-termini contain a Teneurin insertion protein (TIP) domain, which is absent in the metazoan Teneurins (Fig. 1a, b). Based on putative signal sequences, the Gram-negative TLPs, such as those within the Myxococcota phylum, are predicted to be targeted to the SEC/SPII secretion system. These TLPs also contain predicted lipoprotein domains that could facilitate attachment to the membrane surface. In contrast, putative signal sequences found in TLPs from Gram-positive bacteria are predicted to use the SEC/SPI secretion system and contain predicted S-layer or cell wall binding domains that would anchor them to the cell surface (Fig. 1b, Supplementary Fig. 1c)45,46. Protein translocation in bacteria is mediated by a variety of secretion systems47,48, and the absence of a predicted signal peptide in some TLP sequences suggests they may utilize alternative mechanisms for secretion.

Following on from the superfold, the C-terminal domain (CTD) is typically much smaller than the N-terminal region, averaging ~15 kDa, and is highly polymorphic (Supplementary Fig. 1). The hypervariability within the different CTD sequences, and lack of functional annotation, led us to perform structural predictions using AlphaFold49. We searched for structural homologs using the DALI and Foldseek programs50,51 to derive structural homology information for those models that were predicted with high confidence scores. This analysis revealed structural similarities to small proteases, inhibitors, peptidases, hydrolases, Hedgehog/INTein (HINT) toxins, nucleases, pseudo-uridine synthases, ADP-ribosyl transferases, and outer membrane proteins (OMPs) (Fig. 1d). The wide variety of enzymatic functions associated with these domains, and the similarity to toxin proteins known to be utilized in bacterial competition or host interaction suggest that these domains may function as cytotoxic effector molecules. Our results suggest that TLP genes are present in a small yet widely distributed group of bacteria, representing a distinct class of polymorphic toxin systems. Moreover, TLP genes are particularly associated with bacteria that exhibit complex social behaviors suggesting that they may contribute to intercellular cooperation.

The bacterial TLP structure is homologous to its metazoan counterparts

Previously, we identified a TLP gene3,24 within the genome of Bacillus subtilis CW14 (now re-named Bacillus inaquosorum52,53) (Fig. 2a). To provide insight into bacterial TLP molecular architecture, we expressed, purified and determined cryo-EM structures of the protein encoded by the gene (BiTLP). Using a construct that lacks the predicted N-terminal bacterial adhesion domains (residues 1-397) (Supplementary Table 1, Supplementary Fig. 2a–e, Supplementary Fig. 3a, b) we determined a map with an average resolution of 2.1 Å. We also determined the structure using a full length BiTLP construct (BiTLPFL, Supplementary Table 1, Supplementary Fig. 4a–e), which has a lower average resolution. As the additional adhesin domains were not resolved in maps of BiTLPFL, the analysis here focuses on the higher resolution map. The BiTLP model was built de novo by modifying a partially correct model generated with AlphaFold2 (Supplementary Fig. 2f)49. The resulting model reveals a similar organization to the metazoan homologs, with the RHS/YD-shell consisting of three layers of antiparallel beta sheets rotating anticlockwise along the central axis (Fig. 2b, c). The N-terminal side of the shell is sealed with the FN-plug, and the NHL domain is positioned at an angle of approximately 60° from the axis of the shell (Fig. 2c). A closer view of the map shows that the map density stops abruptly following residue L2144 (Supplementary Fig. 3c). Sequence analysis of this region suggests the presence of a PxxxxDPxG motif which is characteristic of the conserved RHS core domain cleavage site (Fig. 2a)54. In analogy to the bipartite DPxG-X18-DPxG motif found in other RHS proteins, where the two aspartic acids both play a role in catalysis, we found two DPxG motifs twenty residues apart from each other. In some species, the glycine residue for the first motif is substituted by a leucine or an arginine (Supplementary Fig. 5a). 
The structure of the catalytic aspartic acid motif is conserved and is located proximal to a structurally conserved arginine which has previously been shown to be essential for catalysis29 (Fig. 2d, Supplementary Fig. 5b). Indeed, mutation of this arginine reduces autoproteolytic activity (Supplementary Fig. 5c), consistent with what had been established previously for the RHS/YD-repeat-containing BC subunit of bacterial ABC toxins29. In mammalian Teneurins the motif is absent (Supplementary Fig. 3d). An additional chain break was observed within the FN-plug domain, in a loop located deep within the YD-shell domain, between residues E722 and S724 (Fig. 2d, Supplementary Fig. 3b, e). In the BiTLPFL structure, this chain break could be unequivocally identified as occurring between G723 and S724 (Supplementary Fig. 5d), as also confirmed by N-terminal sequencing (Supplementary data 4). Sequence analysis of this region suggests that S724 is highly conserved among TLP family members (Supplementary Fig. 3f). Autocatalytic cleavage upstream of the shell domain has been observed in other RHS/YD-repeat-containing toxins where cleavage is important for the release of the toxin34,35,37,55,56. Close inspection of the cryo-EM map density indicates a small extra density which could indicate the presence of a metal ion in this area (Supplementary Fig. 5e).

Fig. 2: The Bacillus inaquosorum TLP structure is homologous to mammalian Teneurin.

a Schematic representation of BiTLP domain organisation. Bacterial Immunoglobulin-like domain, BIG (yellow), TTR (dark blue), TIP (aqua blue), FN-plug (forest green), NHL (lime green), YD-shell (orange), CTD (light grey). Putative autoproteolytic cleavage sites are indicated with scissors. The RHS motif is highlighted (yellow letters). The construct used for structural studies (residues 398-2237) is outlined by a grey line. b The cryogenic electron microscopy density map is colored according to the domain it represents, FN-plug (forest green), NHL (lime green), YD-shell (orange), CTD (light grey). c Cartoon representation of the BiTLP model, colored according to (b). Black arrows indicate autoproteolytic cleavage sites. The N-terminal chain end is indicated with a blue arrow. d Zoomed views. Top: the cleavage site within the RHS motif, containing the aspartyl protease catalytic residues D2116 and D2140 and the conserved R2103. Bottom: the cleavage site in the FN-plug domain is highlighted by a black arrow. The residues flanking the cleavage site are indicated (E722 and S724). C719 forms a covalent bond with C2216, linking the N-terminal fragment to the C-terminal fragment. e Cartoon representation of chicken Teneurin 2 (PDB id:6FB324), colored in analogy to the BiTLP domain organisation in (b). The C-terminal region downstream of the YD-shell is shown as a surface representation (light grey). f As (e), but showing BiTLP, in the same cartoon/surface representation and equivalent colour scheme. g Overlay of the C-terminal regions of chicken Teneurin 2 (dark grey) and BiTLP (light grey). h Propidium accumulation in E. coli cells expressing the BiTLP CTD with or without an N-terminal secretion signal peptide was imaged using HiLo microscopy. Scale bar = 5 µm. i Quantification of propidium accumulation in E. coli cells, presented as box plot showing the median (central line), the first to the third quartile (box limit) and the minima and maxima (whisker). Data represent an average of 3 independent repeats (n = 3). Two-way ANOVA with Tukey’s multiple comparisons test was performed; adjusted p-value ****<0.0001, ns=non-significant.

Despite their presence during sample preparation, densities for the domains located upstream of the FN-plug were not observed in our maps (Fig. 2b) and they were therefore excluded from the final model. However, we observed substantial additional density (~6640 Å³) within the YD-shell (Supplementary Fig. 3g). Tracing this density revealed part of the CTD. Of the 92 amino acids (residues 2145-2237) in this domain, all but the regions 2166-2171, 2190-2194 and 2219-2237 are accounted for in our model (Supplementary Fig. 3a, g). Most of the traced CTD nestles closely to the inner surface of the YD-shell. Surprisingly, C2216 forms a disulfide bond with C719, covalently linking it to the FN-plug (Fig. 2d). In contrast to the published metazoan Teneurin structures, where an uncleaved C-terminus leads through an opening in the YD-shell to form the ABD and Tox-GHH domains outside the shell (Fig. 2e)4,10,24,25,26,27,28, there is no evidence of density outside the YD-shell and no opening in the YD-shell through which the C-terminus could exit (Fig. 2b, f). Indeed, the BiTLP CTD shares no structural or sequence homology with the Teneurin CTD (Fig. 2e–g). Bacterial TLP CTDs are relatively short in sequence (Supplementary data 5), averaging around 15 kDa. This size constraint may represent an evolutionary limitation imposed by the RHS/YD-shell inner cavity, which has a volume of approximately 44,000 Å³ in BiTLP. Taken together, we conclude that the CTD, while cleaved at the conserved RHS core domain cleavage site, remains entirely hidden by the YD-shell in the BiTLP example. It may remain associated with the N-terminal domains following release from the shell via the C2216-C719 disulfide.
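The size argument above can be illustrated with a rough back-of-the-envelope estimate. The sketch below is not part of the original analysis: it assumes an average partial specific volume of about 0.73 cm³/g for folded proteins (roughly 1.21 Å³ per dalton, a textbook value rather than one measured here), and the function names are hypothetical. Under that assumption, a 15 kDa CTD would occupy well under half of the approximately 44,000 Å³ cavity.

```python
# Assumption (not from this study): folded proteins occupy ~1.21 cubic
# angstroms per dalton, from a partial specific volume of ~0.73 cm^3/g.
A3_PER_DALTON = 1.21

# Cavity volume reported in the text for the BiTLP RHS/YD-shell.
CAVITY_VOLUME_A3 = 44_000.0


def protein_volume_a3(mass_kda: float) -> float:
    """Estimate the volume (in cubic angstroms) of a protein of given mass."""
    return mass_kda * 1000.0 * A3_PER_DALTON


def fits_in_cavity(mass_kda: float, cavity_a3: float = CAVITY_VOLUME_A3) -> bool:
    """Return True if the estimated protein volume does not exceed the cavity.

    This ignores shape, hydration and packing, so it is only an upper bound
    on what could fit.
    """
    return protein_volume_a3(mass_kda) <= cavity_a3


if __name__ == "__main__":
    for mass in (15.0, 40.0):
        print(mass, round(protein_volume_a3(mass)), fits_in_cavity(mass))
```

Because packing and shape are ignored, the estimate only indicates that a CTD of about 15 kDa is comfortably compatible with the cavity, whereas a domain substantially larger than roughly 35 kDa would not be.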

The C-termini of TLPs harbour diverse toxic functions

The presence of autoproteolytic sites in BiTLP is reminiscent of those seen in other bacterial RHS and YD-repeat proteins which typically function as toxins29,35,36,37,40,55,56,57,58. In particular, the presence of two cleavage sites and the possibility that the N- and C-terminal fragments liberated by these events may remain associated following proteolysis suggest an evolutionary connection to RHS effectors that function as toxins in concert with bacterial type VI secretion systems. These toxins also encapsulate cleaved CTDs, which represent a variety of cytotoxic enzymes59,60,61. We therefore investigated whether BiTLP harbours a toxin in its C-terminal region. Sequence analysis of the BiTLP CTD fragment (BiTLPCTD) suggests that the fragment is rich in hydrophobic residues, with a predicted transmembrane region between residues 2165 and 2185 (Supplementary Fig. 3a). We used Coarse-Grain Molecular Dynamics (MD) to simulate the BiTLPCTD in the presence of a model membrane composed of 25% phosphatidylglycerol and 75% phosphatidylethanolamine, in analogy to the E. coli inner membrane. Ten independent simulations of 3 µs duration showed rapid association of BiTLPCTD with the membrane (Supplementary Fig. 6a). Simulation using multiple copies of BiTLPCTD leads to membrane deformation (Supplementary Fig. 6b).

We expressed BiTLPCTD with an N-terminal secretion signal peptide (sp-BiTLPCTD) in E. coli and found that expression of this protein inhibited cell culture growth (Supplementary Fig. 6c). Expression of BiTLPCTD without a signal peptide did not inhibit growth (Supplementary Fig. 6c). The observation of membrane deformation in our MD simulation encouraged us to investigate whether the CTDs affect membrane integrity. We treated cells expressing either BiTLPCTD or sp-BiTLPCTD with propidium iodide stain which does not penetrate intact cell membranes. Cells expressing sp-BiTLPCTD were permeable to the stain (Fig. 2h, i). These results, and the diverse enzymatic functions we predicted for different TLP CTDs led us to hypothesize that bacterial TLPs could utilize a variety of mechanisms to modify their targets (Fig. 3a). To investigate this further, we selected nine representative TLP CTDs from different organisms, for which structures could be predicted and functions consequently inferred (Supplementary Fig. 6d). Despite having no knowledge of the physiological targets for these putative toxins, we observed that three of the chosen CTDs inhibited E. coli cell growth in our liquid cultures as well as on soft agar (Fig. 3b). We focused on the CTD of Methylosarcina fibrata TLP (MfTLPCTD) for which AlphaFold confidently predicted an ADP-ribosyl transferase (ART) fold (Fig. 3c). ART proteins convert NAD+ to nicotinamide to transfer ADP-ribosyl groups onto target proteins and can act as virulence factors, e.g. in diphtheria and cholera toxins62,63. The ARTs in these toxins depend on catalytic HYE and RSE motifs in their active sites, respectively64. MfTLPCTD contains a HSE motif (H2319, S2364, E2405), which could constitute the catalytic triad (Fig. 3c). In agreement with this hypothesis, mutation of the conserved glutamate residue (E2405) to alanine restored cell growth in E. coli (Fig. 3d). 
As expected for an ART-dependent toxin, expression of wild-type MfTLPCTD, but not the mutant, led to depletion of cellular NAD+ (Fig. 3e). In agreement with a different mechanism for growth inhibition, the sp-BiTLPCTD did not deplete NAD+. Taken together, these results demonstrate that different TLP CTDs encased within the RHS/YD-shells are cytotoxic to bacteria through different mechanisms.

Fig. 3: Bacterial TLPs harbour diverse enzymatic functions.

a Schematic representation of bacterial TLP, where the YD-shell carries the putative ‘toxic’ CTD associated with different functions. Representative CTDs, selected for toxicity assays, are shown with their predicted functions: OMP (Outer membrane proteins): Paenibacillus oryzisoli, Desulfosudis oleivorans; ART (ADP ribosyl transferase): M. fibrata; HINT toxin: Hyalangium ochraceum, Acanthopleuribacter pedis; nuclease: Methylocaldum sp., Methylovolum psychrotolerans sph1; protease: Solimonas sp., Corallococcus exiguus AB032A; peptidase: Ghiorsea bivora. b Toxicity assay in E. coli cells expressing different bacterial TLP CTDs as shown in (a). Left: Growth in liquid LB medium of E. coli Top10 cells carrying a pBAD vector control (empty vector) or plasmids directing the expression of each CTD. Optical density at 600 nm was measured every hour for 6 hours following induction with 2% L-arabinose. Points show mean ± SEM, n = 3 replicates. Curves are colored according to the corresponding phylum: Bacillota (magenta), Pseudomonadota (blue), Myxococcota (yellow), Thermodesulfobacteriota (orange) and Acidobacteriota (dark purple). Right: Bacterial growth of serially diluted cells on soft agar after overnight incubation at 37 °C. To induce gene expression, 2% L-arabinose was added to the medium. c Predicted AlphaFold models of B. inaquosorum and M. fibrata CTDs. Each model is color-coded based on the AlphaFold confidence score. A closer view of the M. fibrata catalytic triad (H2319, S2364, E2405) responsible for NADase activity is indicated, with the conserved glutamate residue shown in red. The strands are numbered according to the conventional ART fold annotation. d Toxicity assay in E. coli Top10 expressing either the M. fibrata CTD or the catalytically inactive mutant (E2405A) on soft agar as in panel b. e Intracellular NAD+ levels measured in E. coli Top10 cells expressing the TLP CTDs of B. inaquosorum or M. fibrata (wild type or the inactive mutant E2405A). 
NAD+ levels were measured 1 h post-induction. Luminescence signal was recorded for 2 h and averaged over 3 repeats. Data represent a relative percentage normalised to the empty vector. Bars show mean ± SD, n = 3 independent experiments. One-way ANOVA with Tukey’s multiple comparisons test was performed; adjusted p-values ****<0.0001, *<0.01.

Bacterial genomes encode TLP-associated immunity proteins

Given that the superfold is conserved across bacterial species, we hypothesized that it could perhaps confer host immunity to the C-terminal toxin by trapping it inside the RHS/YD-shell. To test this hypothesis, we engineered chimeric constructs consisting of the BiTLP superfold fused to the CTD from M. fibrata, Acanthopleuribacter pedis or Methylocaldum sp. We found that E. coli cells were sensitive to the expression of these chimeric constructs (Supplementary Fig. 7a), in disagreement with the idea that the RHS/YD-shell confers a protective function for the host cell, at least in these experiments. Further analysis of TLP-encoding operons revealed that an additional gene is found immediately downstream of almost all TLP open reading frames (Fig. 4a, b, Supplementary Fig. 7b, Supplementary Data 6), reminiscent of the arrangement seen in toxin-antitoxin systems60. This led us to hypothesize that these highly diverse genes could act as specific immunity proteins to protect the TLP expressing cell from the C-terminal toxin by directly binding to it, as seen in other bacterial polymorphic toxin systems60. Co-immunoprecipitation experiments supported this hypothesis, showing that the M. fibrata immunity protein associates with the MfTLPCTD (Fig. 4c). Co-expression of this putative immunity protein in the NAD+ depletion assay confirms that the ART activity of MfTLPCTD is inhibited by its presence (Fig. 4d). Co-expression also restored E. coli growth on soft agar (Fig. 4e). Structural predictions using AlphaFold249 further support a direct interaction between these ‘immunity proteins’ and the corresponding TLP C-termini found in different species (Fig. 4b, Supplementary Fig. 7c). We therefore conclude that different bacterial TLPs are co-expressed with matching immunity proteins encoded in the same operon.

Fig. 4: Bacterial TLPs are encoded as an effector/immunity pair.

a Illustration of the presence of a matching immunity gene that neutralises the CTD ‘toxic’ activity. b Top: Analysis of the M. fibrata genome reveals a second, small gene in the genomic neighbourhood of TLP. The TLP gene (blue) is encoded on the reverse strand, followed by the cognate immunity gene (magenta). Surrounding genes are colored grey. Bottom: Predicted AlphaFold model of the M. fibrata CTD/immunity heterodimer shown as ribbon and/or surface. c Immunoblot following pull down assay confirms the predicted interaction shown in panel b. M. fibrata CTD was co-expressed with its immunity gene in E. coli Top10. The CTD was fused with an N-terminal twinstrep-HA tag to allow immobilisation on streptavidin-agarose beads, while the immunity protein, tagged with a Flag tag, served as the prey. Both protein bands were detected using anti-HA and anti-Flag antibodies. d Intracellular NAD+ levels measured in E. coli Top10 cells expressing M. fibrata CTD, the immunity gene or both. NAD+ levels were measured 1 h post-induction. Luminescence signal was recorded for 2 h and averaged over 3 repeats. Data represent a relative percentage normalised to the CTD/Immunity complex. Bars show mean ± SD, n = 3 independent experiments. One-way ANOVA with Tukey’s multiple comparisons test was performed; adjusted p-value ****<0.0001, ns=non-significant. e Co-expression of the matching immunity gene restores E. coli growth. In a toxicity assay using E. coli Top10, we expressed the M. fibrata constructs described above (d). Bacterial cell cultures were serially diluted on soft agar and incubated overnight at 37 °C, with protein expression induced by the addition of arabinose.

Discussion

Teneurins are best understood in mammalian nervous systems, where they function as cell signalling receptors. However, less is known about their prokaryotic ancestry. Here we show that their genes emerged from a bacterial precursor that functioned as a toxin prior to the evolution of nervous systems1,42,43,65 (Fig. 5a). Bacteria commonly live in highly dense communities where cell-to-cell interactions are essential for survival and adaptation. Consequently, they have developed numerous strategies to either cooperate with, or compete against each other, including the evolution of toxin systems that respond to cell-cell interactions. The results we presented here for TLPs are characteristic of polymorphic toxins30,60,66, which are one of the most complex and dominant bacterial conflict systems67. We found TLPs particularly highly represented in Myxococcaceae of which many such as Corallococcus display a predatory lifestyle, killing and consuming a wide range of prey through the secretion of antimicrobial substances68,69. It is therefore likely that TLPs function in bacterial warfare, where immunity genes are crucial for kin and non-kin recognition (Fig. 5b). Of the polymorphic toxins, TLPs share structural similarities with RHS/YD proteins. These are often associated with the type VI secretion system and deployed by Gram negative species to kill or inhibit the growth of neighbouring cells35,36,37,38,39,55,56,58. In another example, the tripartite ABC toxin complex requires a type 10 secretion system and targets insect host cells57,70,71. Like the bacterial RHS/YD toxins, TLPs package their functional toxins within a protein shell. However, in the absence of any obvious delivery machinery, the mechanism via which they are delivered to target cells and released from the shell remains to be elucidated. In addition to their toxin function, RHS/YD proteins play roles in the social behaviour of bacteria, such as in S motility72. 
In analogy, we found TLP positive genomes predominantly in Paenibacillaceae and Myxococcaceae, families that exhibit complex social behaviours73,74. This leaves open the possibility that bacterial TLPs function in cell-to-cell communication within microbial communities.

Fig. 5: Teneurins emerge from a bacterial toxin precursor through horizontal gene transfer.

a Schematic tree depicting the Teneurin gene transfer event (orange arrow) from an unknown bacterial precursor to the choanoflagellates-animals clade. The ancestral gene ‘bacterial TLP’ may have functioned as a toxin but was later co-opted for cell adhesion in bilaterian animals. Non-bilaterian animals refer to sponges, placozoa, ctenophora and cnidaria. A black cross represents the absence of Teneurin genes. b Hypothetical mechanism of action of the bacterial TLP family. Upon synthesis, TLPs are possibly exposed on the cell surface, enabling CTD trafficking. Released CTDs, triggered by a yet-to-be-discovered mechanism, are represented as hexagons, with colours representing the various enzymatic functions. Dashed arrows indicate potential action sites within another bacterial cell or on an unknown target. Solid squares represent immunity proteins. c In analogy to (b), Bilaterian Teneurins mediate cell-to-cell adhesion in a receptor-ligand interaction. They bind other Teneurin molecules in a homophilic interaction and/or the adhesion G protein-coupled receptor Latrophilin in a heterophilic manner (PDB id:6SKA4). Bilaterian Teneurins also have an intracellular region involved in signalling.

Our phylogenetic analysis was based on the superfold sequence and suggests that Paenibacillaceae TLPs are the closest known relatives of eukaryotic Teneurins. In contrast to Teneurins, which contain a C-terminal DNase fold, Paenibacillaceae TLPs contain a predicted outer membrane protein in their CTDs. It is possible that the acquisition of the Teneurin superfold and the CTD were evolutionarily decoupled24,54, or that the original prokaryotic ancestor is now extinct. The presence of a Teneurin gene in choanoflagellate species suggests that it was acquired early in metazoan evolution65,75. Choanoflagellates are free-living organisms that prey on bacteria as a food source76,77. Therefore, eukaryotic Teneurins likely arose from an independent horizontal gene transfer as a consequence of this predatory relationship76,78. Whether the gene uptake event was driven by their function as toxins, e.g. to kill bacterial prey, or as cell adhesion molecules, e.g. facilitating interaction with bacterial cells with the aim of capturing them78, is yet to be investigated.

There is evidence that isolated metazoan Teneurin CTDs could possess nuclease activity and regulate cell apoptosis31. However, Teneurins are best known for their functions as cell surface signaling receptors that mediate crucial cell-to-cell communication via homophilic10,27,79 and heterophilic interactions, such as with the adhesion G-protein-coupled receptor Latrophilin4,15,26,80 (Fig. 5c). Moreover, homodimerization of some RHS/YD proteins, e.g. RhsP and RhsA found respectively in Vibrio and Pseudomonas species, has been reported35,36, although the functional significance of dimerization in these systems remains to be explored. While the evidence we present here for cell surface localization of TLPs is limited to functional homologies inferred from bioinformatic sequence analyses, localization of bacterial TLPs to the cell surface may also allow them to engage in cell-to-cell contact and signalling functions. Eukaryotic tissues, such as the nervous system, and bacterial quorum sensing share some similarities in how cell-cell communication is achieved81. In biofilm communities, bacteria communicate through ion-based signals which resemble electrical signals used by neurons82. While such communication mechanisms may be redundant or accessory for bacteria, in agreement with the widespread but low incidence of TLP genes in our database, co-opting these receptors must have been essential for the evolution of animals, where Teneurins are conserved in all organisms with a centralized nervous system. Taken together, this suggests that the acquisition of the Teneurin gene represents an important event in the evolution of complex multicellular life.

Methods

Teneurin-like protein identification using hidden Markov model searches

The sequence region of protein WP_088111228.1 corresponding to the Teneurin-like protein (TLP) superfold (residues 673 to 2139) was extracted from the NCBI Protein database (https://www.ncbi.nlm.nih.gov/protein) and searched against the non-redundant protein sequence database (NR) using the NCBI BLAST website (16th Dec 2022).

The top 90 matching sequence regions with at least 36% sequence identity and 99% query coverage were extracted and aligned using MAFFT (v7.49083). The resulting multiple sequence alignment was split into 3 separate alignments corresponding to the 3 TLP superfold domains: FN-plug (region 673-928), NHL (929-1255) and YD shell (1256-2139). A single Hidden Markov Model (HMM) was constructed for each domain using hmmbuild (HMMER software package, v3.1b1, http://hmmer.org) and the three models were combined into a single HMM library.
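
The splitting step can be illustrated with a minimal Python sketch. It assumes that the reference row of the alignment is the 673-2139 region of WP_088111228.1, so that full-length residue numbers can be mapped to alignment columns by counting non-gap characters. The function names, the `ref_offset` parameter and the toy alignment are hypothetical and are not part of the published pipeline.

```python
# Domain boundaries in full-length residue numbering, as stated in the text.
DOMAINS = {"FN-plug": (673, 928), "NHL": (929, 1255), "YD-shell": (1256, 2139)}


def column_for_residue(aligned_ref, residue_index):
    """Return the 0-based alignment column of the n-th (1-based) ungapped residue."""
    seen = 0
    for col, char in enumerate(aligned_ref):
        if char != "-":
            seen += 1
            if seen == residue_index:
                return col
    raise ValueError("residue index lies beyond the reference sequence")


def split_alignment(alignment, ref_name, domains, ref_offset=1):
    """Cut an alignment (dict: name -> gapped sequence) into one block per domain.

    ref_offset is the residue number of the first residue of the reference row
    (673 for the superfold region described above).
    """
    ref = alignment[ref_name]
    blocks = {}
    for name, (start, end) in domains.items():
        first = column_for_residue(ref, start - ref_offset + 1)
        last = column_for_residue(ref, end - ref_offset + 1)
        blocks[name] = {seq_id: seq[first:last + 1] for seq_id, seq in alignment.items()}
    return blocks


# Toy example with invented sequences; the reference row starts at residue 10.
toy = {"ref": "AC-DEFG-H", "other": "ACTDE-GTH"}
toy_blocks = split_alignment(toy, "ref", {"d1": (10, 12), "d2": (13, 16)}, ref_offset=10)
```

With a real alignment, the call would be `split_alignment(msa, reference_id, DOMAINS, ref_offset=673)`, and each returned block could then be written to its own file as input for hmmbuild.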

A dataset of 481,806 bacterial genomes, comprising 16,838 different species, was identified as present in the PubMLST Multi-species database44 on 18th January 2023 (https://pubmlst.org/species-id). Each genome was downloaded from the database and subjected to a six-frame translation (EMBOSS transeq, v6.6.0.0), and the resulting protein sequences were scanned against the HMM library using hmmscan (HMMER software84). All operations were performed on a Dell PowerEdge R815 Server with 512 GB of RAM and 64 CPU cores.
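
For readers unfamiliar with six-frame translation, the idea can be sketched in a few lines of Python. This is an illustration only, not the EMBOSS transeq implementation: it uses the standard codon assignments, marks codons containing ambiguous bases as `X`, and labels the reverse-strand frames by their offset on the reverse complement, which may differ from the frame numbering that transeq reports.

```python
from itertools import product

BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, built from the conventional TCAG codon ordering.
CODON_TABLE = {"".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of an upper-case DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna):
    """Translate complete codons only; unknown codons become 'X', stops become '*'."""
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3))


def six_frame_translation(dna):
    """Return a dict mapping frame labels (+1..+3, -1..-3) to protein strings."""
    dna = dna.upper()
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames


frames = six_frame_translation("ATGGCCTAA")
```

Scanning all six frames avoids any dependence on prior gene annotation, at the cost of producing many spurious open reading frames that the downstream HMM thresholds have to filter out.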

Sequence matches with an E-value of 1e-10 or less and at least 30% domain overlap were shortlisted for manual inspection. In total, 143 genomes from 93 different species were identified as containing the 3 TLP superfold domains, adjacent to each other in the same reading frame and in the correct domain order.
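
The shortlisting logic can be expressed as a small Python sketch. It assumes that each hit has already been parsed into a dictionary with the hypothetical keys `domain`, `evalue`, `overlap` (fraction of the domain model covered), `frame` and `start` (position in the translated sequence); these names, and the use of consecutive hits as a proxy for adjacency, are illustrative assumptions rather than the procedure used in the study, which also involved manual inspection.

```python
from collections import defaultdict

DOMAIN_ORDER = ("FN-plug", "NHL", "YD-shell")


def passes_thresholds(hit, max_evalue=1e-10, min_overlap=0.30):
    """Apply the E-value and domain-overlap cut-offs stated in the text."""
    return hit["evalue"] <= max_evalue and hit["overlap"] >= min_overlap


def frames_with_superfold(hits):
    """Return the reading frames carrying FN-plug, NHL and YD-shell hits in that order."""
    by_frame = defaultdict(list)
    for hit in hits:
        if passes_thresholds(hit):
            by_frame[hit["frame"]].append(hit)

    matching = []
    for frame, frame_hits in by_frame.items():
        ordered = [h["domain"] for h in sorted(frame_hits, key=lambda h: h["start"])]
        # Require three consecutive hits in the expected N- to C-terminal order.
        for i in range(len(ordered) - 2):
            if tuple(ordered[i:i + 3]) == DOMAIN_ORDER:
                matching.append(frame)
                break
    return sorted(matching)


# Invented hits for one genome: only frame +1 satisfies every criterion.
example_hits = [
    {"domain": "FN-plug", "evalue": 1e-40, "overlap": 0.9, "frame": "+1", "start": 10},
    {"domain": "NHL", "evalue": 1e-55, "overlap": 0.8, "frame": "+1", "start": 270},
    {"domain": "YD-shell", "evalue": 1e-120, "overlap": 0.95, "frame": "+1", "start": 600},
    {"domain": "NHL", "evalue": 1e-30, "overlap": 0.7, "frame": "-2", "start": 5},
    {"domain": "FN-plug", "evalue": 1e-30, "overlap": 0.7, "frame": "-2", "start": 400},
    {"domain": "YD-shell", "evalue": 1e-30, "overlap": 0.7, "frame": "-2", "start": 800},
    {"domain": "FN-plug", "evalue": 1e-20, "overlap": 0.6, "frame": "+3", "start": 10},
    {"domain": "NHL", "evalue": 1e-20, "overlap": 0.6, "frame": "+3", "start": 300},
    {"domain": "YD-shell", "evalue": 1e-5, "overlap": 0.6, "frame": "+3", "start": 700},
]
selected_frames = frames_with_superfold(example_hits)
```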

Sequence conservation and phylogenetic tree inference

Each TLP sequence was retrieved from the homology search described above. A multiple sequence alignment (MSA) of all full-length bacterial TLP sequences and selected metazoan representatives was generated using the MUSCLE program in the Jalview package85,86. Sequences corresponding to Chitiniphilus eburneus, Olavius algarvensis associated proteobacterium Delta3 (OalgA1CA) and Mizugakiibacter sediminis were removed from the final analysis because they were either partial or did not align correctly with the other TLP sequences.

The conserved region corresponding to the superfold, comprising the FN-plug, NHL and YD-shell domains, was isolated and further cleaned using trimAl with the default preset on the NGPhylogeny.fr online server87,88,89,90. The cleaned trimAl output was then used as the query for tree inference on the IQ-TREE server with the default preset and 1000 ultrafast bootstrap replicates90. The maximum likelihood consensus tree was then uploaded to the iTOL server for visualization and annotation91.

Bacterial TLP N-terminal domain annotation and secretion pathway prediction

Full-length protein sequences were subjected to signal peptide prediction using the SignalP 5.0 server92. The CD-Search and InterProScan servers were used to search for domain homology and to annotate domain function93,94,95.

Structure prediction and comparison with available structures on the PDB database

Structure predictions were performed using AlphaFold2 in a local version of ColabFold49,96. A complete MSA sampling using the default database was performed over 3 cycles, generating 5 structures for each bacterial TLP CTD sequence. The CTD corresponds to the region downstream of the RHS cleavage site ‘DPxG’ motif. Within each set of structures, the one with the highest pLDDT score was chosen for structural homology searches and functional inference using the DALI or Foldseek servers50,51.
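
The two selection steps in this paragraph can be sketched in Python as follows. The sketch makes two assumptions that are not stated in the text: the CTD is taken to start immediately after the last DPxG match in the sequence, and models are ranked by their mean per-residue pLDDT. The function names and the example sequence are invented for illustration.

```python
import re

# 'x' in DPxG stands for any residue.
DPXG_MOTIF = re.compile(r"DP.G")


def extract_ctd(sequence):
    """Return the region downstream of the last DPxG motif, or None if absent."""
    matches = list(DPXG_MOTIF.finditer(sequence))
    if not matches:
        return None
    return sequence[matches[-1].end():]


def best_model(plddt_by_model):
    """Return (model name, mean pLDDT) for the model with the highest mean pLDDT."""
    means = {name: sum(values) / len(values) for name, values in plddt_by_model.items()}
    best = max(means, key=means.get)
    return best, means[best]


ctd = extract_ctd("MKTAYDPIGRRLLW")          # invented sequence
choice = best_model({"model_1": [90.0, 80.0], "model_2": [95.0, 85.0]})
```

Because a four-residue motif can also occur by chance, any boundary found this way would still need to be checked against the domain annotation before use.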

Vector and cloning

All cDNA used for the project was commercially synthesised by GenScript. The Bacillus inaquosorum construct (residues 398-2237, BiTLP) and its R2103A mutant (BiTLPR2103A) were subcloned into a petMCN vector carrying an N-terminal His tag and a TEV protease site97. In addition, BiTLP and BiTLPR2103A carry an N-terminal StrepII tag, a central FLAG tag (within the RHS-associated core) and a C-terminal HA tag. All bacterial TLP CTDs were cloned into a pBAD vector fused with an N-terminal twin-strep-HA tag98. BiTLPFL (residues 1-2237) was commercially synthesised and cloned by Azenta into a pProExHta vector containing an N-terminal His tag. A list of all plasmids and primers used in this study is provided in Supplementary Table 2.

Protein expression and purification

The BiTLP construct was transformed into E. coli BL21. Bacterial cells carrying the vector were selected for growth in TB media supplemented with 100 μg/ml ampicillin and protein expression was induced with 0.5 mM IPTG at 18 °C for 14 to 16 h. Bacterial pellets were harvested by centrifugation for 30 min at 4000 rpm at 4 °C. Bacterial cells were resuspended in lysis buffer (50 mM Tris pH 7.5, 300 mM NaCl, 5 mM imidazole, 0.1% Triton X-100, 5 mM mercaptoethanol). Cell lysis was performed by sonication (amplitude: 45, 30 s x 5). The crude lysate was then clarified by centrifugation for 1 h at 30,000 g at 4 °C. The supernatant was loaded onto a prepacked His-Trap column (GE). BiTLP protein was eluted with wash buffer containing 500 mM imidazole. The protein was incubated overnight with TEV protease at 4 °C. Following treatment, the sample was reloaded onto a pre-equilibrated His-Trap column (GE). The flowthrough containing the cleaved sample was concentrated and loaded onto a Superose 6 10/300 column (GE). Protein fractions were analysed by SDS-PAGE on a 4-12% Bis-Tris gel and protein concentration was measured using a Nanodrop at A280nm.

The BiTLPFL construct was transformed into E. coli LOBSTR cells and selected on LB medium supplemented with 100 μg/mL ampicillin. Cells were harvested by centrifugation for 30 min at 4000 × g at 4 °C. The bacterial pellets were resuspended in lysis buffer (50 mM Tris-HCl pH 7.5; 500 mM NaCl; 10 mM imidazole; 10% glycerol; 1 mM TCEP-HCl) and lysed using a constant flow cell disruptor. The crude lysate was clarified by centrifugation (27,000 g; 30 min; 4 °C). The clarified lysate was passed through 0.45 and 0.2 µm syringe filters and subsequently applied to a prepacked HisTrap™ HP column (Cytiva). BiTLPFL was eluted stepwise using wash buffer containing 20 to 500 mM imidazole. Fractions containing BiTLPFL were concentrated and then loaded onto a Superdex™ 200 Increase 10/300 column (Cytiva), and fractions corresponding to BiTLPFL were pooled, concentrated to 0.2 mg/ml and stored at –80 °C.

Cryo-EM sample preparation and data collection

3 μl of purified BiTLP at 0.5 mg/ml was applied onto a plasma-cleaned holey carbon grid (Quantifoil® R 1.2/1.3, 300 copper mesh, Agar Scientific; Cat#AGS143-2). The grids were blotted for 3.5 s (blot force = 5) at 20 °C with 100% humidity, then plunged into liquid nitrogen-cooled liquid ethane using a Vitrobot Mark IV (Thermo Fisher Scientific). Data were collected on a Titan Krios (Thermo Fisher Scientific) equipped with a K3 Summit direct electron detector at a nominal magnification of 105,000x with a calibrated pixel size of 0.83 Å/pixel. Image stacks were acquired with an accumulated dose of 38.3 e⁻/Å² fractionated over 40 frames and a defocus range of −0.8 to −2.2 μm.

For BiTLPFL, 2 μl of purified sample at 0.2 mg/ml was applied onto a glow-discharged holey carbon grid (Quantifoil® R 1.2/1.3, 300 copper mesh). Following a 3 s incubation time, the grid was blotted for 10 s and then plunged into liquid ethane using the EMGP2 Leica system operating at 4 °C and 95% humidity. Data for BiTLPFL were collected using a CryoARM 300 (JEOL) equipped with an in-column Omega energy filter and a K3 direct electron detector (Gatan). Movies of BiTLPFL were acquired using SerialEM v3.199 in super resolution and CDS modes at a nominal magnification of 100,000x corresponding to a calibrated pixel size of 0.2432 Å/pixel. Image stacks were acquired with an accumulated dose of 40 e⁻/Å² fractionated over 40 frames and a defocus range of −0.5 to −2.5 μm.

Cryo-EM data processing, model building and validation

For BiTLP, preprocessing, including motion correction, CTF parameter estimation, particle picking and extraction, was carried out using SIMPLE 3.0100. Further processing was carried out in cryoSPARC101. Junk particles were discarded through iterative rounds of 2D classification. Five ab initio models were generated. Particles were further 3D classified through heterogeneous refinement. The best map was further refined using homogeneous refinement. Following particle polishing in RELION102, particles were re-imported into cryoSPARC101 and, after two rounds of 2D classification, the map was further refined by homogeneous refinement. The final map, which included a total of 3,010,268 particles, reached an average resolution of 2.06 Å based on the FSC = 0.143 criterion as estimated in cryoSPARC101.
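
As a reminder of what the FSC = 0.143 criterion means in practice, the following Python sketch reads the resolution off a Fourier shell correlation curve by linear interpolation at the first crossing of the threshold. It is a generic illustration with invented numbers and a hypothetical function name; it does not reproduce the masking and correction steps that cryoSPARC applies when reporting resolution.

```python
def resolution_at_threshold(frequencies, fsc_values, threshold=0.143):
    """Return the resolution (in Å) at which the FSC curve first drops below threshold.

    frequencies: spatial frequencies in 1/Å, in increasing order.
    fsc_values:  FSC value for each frequency shell.
    Returns None if the curve never crosses the threshold.
    """
    for i in range(1, len(fsc_values)):
        above, below = fsc_values[i - 1], fsc_values[i]
        if above >= threshold > below:
            # Linear interpolation between the two shells that bracket the crossing.
            fraction = (above - threshold) / (above - below)
            crossing = frequencies[i - 1] + fraction * (frequencies[i] - frequencies[i - 1])
            return 1.0 / crossing
    return None


# Invented FSC curve: the crossing lies halfway between 0.4 and 0.5 1/Å.
toy_resolution = resolution_at_threshold(
    [0.1, 0.2, 0.3, 0.4, 0.5],
    [1.0, 0.9, 0.5, 0.243, 0.043],
)
```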

The final map was sharpened by applying a uniform, inverse B-factor of −50 Å², following which an AlphaFold model of BiTLP was docked into the sharpened map using the ‘dock in map’ tool in PHENIX103. The fitted model was subjected to iterative rounds of manual refinement in COOT104 and real space refinement in PHENIX103. The final model was validated using MolProbity105 within PHENIX103.

Data processing for BiTLPFL was performed in cryoSPARC101. Iterative rounds of 2D classification were conducted to remove junk particles. 2D class rebalancing was then performed and the resultant particles were used to generate an ab initio model. Heterogeneous refinement was then performed, and the highest quality map was selected for further refinement using homogeneous refinement. This map was then used as the reference map for further rounds of 3D classification in RELION102, using particles exported from cryoSPARC101. Improvement of the β-propeller density was observed using RELION102. The corresponding particles and map were re-imported into cryoSPARC for homogeneous refinement, yielding a final map at 3.74 Å resolution. The map was sharpened using DeepEMhancer to improve map resolvability106.

An AlphaFold model of BiTLPFL was then generated and fitted into the map using the Namdinator server107. Manual model building was then performed using COOT104, followed by molecular dynamics flexible fitting using ISOLDE (a ChimeraX plugin)108. Real space refinement and validation were then performed in PHENIX103.

All figures were prepared using UCSF ChimeraX v.1.8109.

Western blot

Samples were separated on a 4-12% Bis-Tris NuPAGE gel and transferred to a nitrocellulose membrane by electrophoresis. The membrane was blocked with 3% (w/v) BSA in PBS buffer supplemented with 0.01% (v/v) Tween-20 for 30 min at room temperature. The membrane was then sequentially incubated with primary and HRP-conjugated secondary antibodies prior to signal detection with ECL reagents (RPN2106, VWR). Depending on the tag of interest, commercial monoclonal anti-HA (Sigma-Aldrich, H3663-200UL), anti-StrepII (IBA, 2-1507-001) or anti-FLAG (Sigma-Aldrich, F1804-50UG) antibodies were used at 1:1000 dilution. The secondary anti-mouse IgG HRP antibody (ThermoFisher, 31430) was used at 1:10000 dilution.

Molecular dynamics simulations

The structure of the Bacillus inaquosorum CTD used for molecular dynamics simulation was generated using AlphaFold249. The protein was mapped to a coarse-grained representation using Martinize2110. To study membrane association, a system containing one protein positioned in the solvent above a POPG:POPE (3:1) membrane was built. All systems were built using the insane111 Python script. To study membrane deformation, we constructed a system containing four proteins positioned proximal to the membrane, based on the final snapshot of a system containing one protein.

All simulations were performed with GROMACS 2020.5 and the MARTINI3 force field112. The Bussi–Donadio–Parrinello (V-rescale)113 thermostat was used to control the temperature. During equilibration, the Berendsen barostat6 (τp = 3 ps) was used, while the Parrinello-Rahman barostat114 (τp = 12 ps) was used for production runs. In all cases, the barostat was semi-isotropic, the reference pressure was set to 1 bar and the compressibility to 3.0 × 10⁻⁴ bar⁻¹. Electrostatic and Lennard-Jones interactions were cut off at 1.1 nm using the reaction-field115 method and the potential-shift Verlet method, respectively. Bonds between beads were constrained to equilibrium values using the LINCS algorithm116.

After a short minimization of 5000 steps, the systems were equilibrated at 300 K in the NPT ensemble for 200 ps, using a 20 fs timestep, with the proteins harmonically restrained. Production runs were then performed in the NPT ensemble. Ten runs of 3 µs were performed for the single-protein system, while a production run of 10 µs was set up for the four-protein system.

Distance analysis was performed as previously described in Jackson et al.117. Analysis of the membrane deformation was performed using a previously published script118. Briefly, the script segments the membrane into patches of defined size and computes their local normals. The averaged dot product between the local normals and the z axis was then used to quantify the deformation of the membrane.
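
The deformation metric can be illustrated with a short Python sketch. This is not the script cited above: it swaps in a simple finite-difference estimate of the local normals from a regular grid of patch heights, and the function names and grid spacing are hypothetical. A perfectly flat membrane gives a value of 1, and the value decreases as the membrane bends away from the xy plane.

```python
import math


def patch_normals(heights, spacing):
    """Estimate unit normals for the interior patches of a regular height grid.

    heights[i][j] is the mean membrane height (z) of patch (i, j);
    spacing is the patch size along x and y, in the same length unit.
    """
    normals = []
    for i in range(1, len(heights) - 1):
        for j in range(1, len(heights[0]) - 1):
            # Central differences approximate the local slope along x and y.
            dz_dx = (heights[i + 1][j] - heights[i - 1][j]) / (2 * spacing)
            dz_dy = (heights[i][j + 1] - heights[i][j - 1]) / (2 * spacing)
            length = math.sqrt(dz_dx ** 2 + dz_dy ** 2 + 1.0)
            normals.append((-dz_dx / length, -dz_dy / length, 1.0 / length))
    return normals


def mean_alignment_with_z(normals):
    """Average dot product between unit normals and the z axis (0, 0, 1)."""
    return sum(normal[2] for normal in normals) / len(normals)


flat = [[0.0] * 3 for _ in range(3)]
tilted = [[float(i)] * 3 for i in range(3)]   # height rises by 1 per patch along x
flat_score = mean_alignment_with_z(patch_normals(flat, spacing=1.0))
tilted_score = mean_alignment_with_z(patch_normals(tilted, spacing=1.0))
```

In this toy case a uniformly tilted plane also lowers the score, so for real trajectories the metric is only meaningful when the membrane as a whole remains aligned with the xy plane, as is usual for a periodic bilayer.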

Propidium iodide staining

Bacterial cells expressing either the empty vector, BiTLPCTD or sp-BiTLPCTD were incubated with propidium iodide (stock 1 mg/ml) at room temperature for 10 min. Cells were then centrifuged at 5000 g for 3 min and pellets were resuspended in fresh M9 media. 5 µl of the cell resuspension was applied onto an agar pad and air dried under sterile conditions prior to sealing with a coverslip. Cells were imaged using the Oxford Nanoimager-S with a 100x/1.49 oil immersion objective lens and a pixel size of 117 nm. Each image was a composite of 200 frames, each with 100 ms exposure, and cells were imaged at an angle of 49°. Images were analysed using the MicrobeJ119 plugin for ImageJ120, and plotted using GraphPad Prism (version 9 for MacOS, GraphPad Software, San Diego, California USA, https://www.graphpad.com/).

Toxicity assay

Each CTD construct was transformed into E. coli Top10 cells. An overnight LB pre-culture was used to inoculate 20 mL of LB medium, and protein expression was induced when the OD₆₀₀ reached 0.5-0.6 by adding 2% L-arabinose. OD₆₀₀ was measured every hour for 6 h post-induction. Bacterial cells were then harvested by centrifugation and stored for small-scale affinity pulldown on streptavidin agarose beads. For each construct, an overnight culture was serially diluted, and 3 µL of each dilution was spotted onto a soft agar plate supplemented with 2% arabinose and 100 µg/mL ampicillin, then incubated overnight at 37 °C.

NAD glo assay

Relative intracellular NAD+ levels were quantified in cell lysates using the NAD/NADH-Glo bioluminescence assay according to the manufacturer’s instructions (Promega). NAD bioluminescence was recorded on a CLARIOstar Plus (BMG Labtech) microplate reader for 2 h. Statistical analysis using a one-way ANOVA with Tukey’s post-hoc test was performed using GraphPad Prism (version 9 for MacOS, GraphPad Software, San Diego, California USA, https://www.graphpad.com/).

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.