Introduction

Bacillus thuringiensis (Bt) is a Gram-positive, spore-forming bacterium that occurs among others in soils, on leaf surfaces, and in the gut of caterpillars. During sporulation, many Bt strains produce crystalline protein condensates or parasporal bodies with insecticidal activity, known as Cry toxins. These toxins target a diverse range of insect species, primarily of the order Lepidoptera (butterflies and moths), Diptera (flies and mosquitoes), Coleoptera (beetles and weevils), and hymenopterans (wasps and bees)1,2. When susceptible larvae ingest the Bt toxin crystals, digestive enzymes present in their highly alkaline gut activate the Cry proteins. Binding to specific receptors on the membranes of mid-gut epithelial cells results in pore formation, followed by paralysis and/or rupture of the digestive tract3. In contrast to poisonous insecticides that target the nervous system, Bt acts by starving infected larvae, as they will stop feeding within hours after ingestion. Organisms lacking the appropriate receptors in their gut are not affected by the Cry protein, making each Cry toxin only effective against a specific group of insects4.

Spores and parasporal bodies produced by Bt have been used in agriculture to control insect pests since the 1920s5. Today, numerous Bt strains producing different toxins are commercially available (e.g., trade names such as Monterey Bt, Milky spore, Mosquito dunks, DiPel, and Thuricide). These Bt-based products are regarded as environment-friendly biopesticides because of their high species specificity, with little or no effect on humans, wildlife, and pollinators2,6,7. However, whether applied as a spray or, less often, as granules, cry toxins biodegrade quickly in sunlight, hence, most formulations persist on foliage less than a week following application, and reapplication is often necessary under heavy insect pressure.

Bt has the potential to form biofilms—complex microbial ecosystems embedded in a self-produced matrix of extracellular polymeric substance—that are considered to be a key factor for survival when bacteria face environmental, chemical, physical, and mechanical stresses8,9,10. Within the biofilm, Bt spores are engulfed by a proteinaceous parasporal sacculus known as the exosporium11. The surface of the exosporium is adorned by a dense array of BclAs (bacterial collagen-like protein of Anthracis), collectively referred to as the hairy nap12. From this nap, fibrous structures termed endospore appendages (ENAs) emanate, contributing to the unique architecture and functionality of the biofilm13,14.

In this contribution, we sought to further understand the molecular architecture of a B. thuringiensis subsp. Kurstaki (Btk) spore biofilm. Using cryo-electron transmission electron microscopy (cryoEM) we show that Btk spores are decorated with S-ENA fibers13 and an unknown type of pili of 5 nm diameter, referred to as F(fibrillar)-type endospore appendages (F-ENA), that are tethered to the exosporium and decorated with a 3 nm diameter tip fibrillum (ruffle) at their spore-distal terminus. We solved the helical fiber ultrastructure and showed that the major subunit is a 10 kDa protein that belongs to the DUF4183 family. Using an integrative structural biology approach combining cryoEM, AlphaFold3 (AF3) modeling, and X-ray crystallography, we identify two collagen-like proteins that are involved in F-type fiber biogenesis. We elucidate the assembly mechanism by unraveling the molecular mode of docking to the exosporium and the mode of fibrillum decoration. Phylogenetic analysis demonstrates that F-ENA is widespread in the class of Bacilli and is, to a lesser extent, also found in the class of Clostridia. Moreover, we identify a second and third subgroup of F-like ENAs in the Paenibacillus genus and in the class Clostridia, and solve the cryoEM structures of representative members of each subgroup.

Results

Btk spores are decorated with F- and S-type ENAs

To gain insight into the molecular architecture of a Btk spore biofilms, a confluent macrocolony of sporulated Btk was investigated in negative stain transmission electron microscopy (nsTEM). Btk spores were observed to be decorated with two types of ENAs: S-ENA, with a 10 nm diameter and a typical staggered 2D projection image, recently described for B. paranthracis13 and a second, unidentified ENA measuring 5 nm in diameter (Fig. 1 and Supporting Fig. 1). Given the increased flexibility of these unknown ENAs compared to the more rigid S- and L-type ENAs13,14, we hitherto refer to them as ‘fibrillar endospore appendages’ or in short F-ENA. Btk spores are enclosed by an exosporium. High-magnification imaging of the hairy nap (BclA layer) shows that F-ENAs are anchored to the exosporium base layer and protrude through the hairy nap (Fig. 1b). At the spore’s distal terminus, F-ENAs are decorated with a 3 nm diameter flexible tip fibrillum or ruffle, which is characterized by a 206 ± 17 nm (n = 21; mean; standard deviation) long stalk region ending in a distinctive globular tip or head-group (Fig. 1c). We further demonstrate that another widely used Bt strain, i.e., Bt subsp. Berliner (Btb), is also abundantly decorated with F-ENAs that terminate into a tip fibrillum (Fig. 1d). Interestingly, the F-ENA ruffles of Btb are shorter than the Btk F-type ruffles, measuring 82 ± 6 nm in length (n = 11; mean; standard deviation). The presence of ruffles and the observed differences in length are reminiscent of the termini of S- and L-ENAs found on B. paranthracis, which terminate into multiple ruffles (e.g., 3–4) or a singular ruffle, respectively13,14. The presence of ruffles on the S-ENA fibers of Btk is also confirmed in this study. Using nsTEM, we resolved multiple 125 ± 6 nm (n = 12; mean; standard deviation; 1–3 ruffles per S-ENA fiber) long ruffles on Btk S-ENA (Supplementary Fig. 1).

Fig. 1: nsTEM observations of Btk and Btb spores, revealing F-ENA fibers.
figure 1

a Low magnification micrograph of Btk decorated with previously identified S- and newly discovered F-ENA. b High magnification micrograph of F-ENA (diameter = 5 nm) anchored to the exosporium of Btk, permeating through the hairy nap layer. c, d The spore distal F-ENA terminus is decorated with a flexible 3 nm diameter tip fibrillum (ruffle) that terminates into a globular head group. For Btk (c), the ruffle measures ±200 nm, whereas for Btb (d), the ruffle is 85 nm long. e Micrograph of recombinantly produced Btk F-ENA fibers purified from the cytoplasm of E. coli. The images in (ae) were derived from three biological replicates.

CryoEM structure of F-ENA and S-ENA of Btk

Next, we aimed to characterize F-ENAs using cryoEM. To achieve this, we developed a protocol to shear off the F-ENA fibers from the spore surface. The resulting ENA suspension was plunge-frozen, and a cryoEM dataset was acquired using a cryoARM 300 microscope (Fig. 2a, b). Using standard helical processing in cryoSPARC, we obtained reconstructed cryoEM volumes with resolutions of 2.7 Å (based on the FSC 0.143 criterion) and 4.1 Å for F- and S-ENA, respectively (Fig. 2c, d and Supplementary Fig. 2). Helical parameters for F-ENA were estimated from the 2D class averages (Fig. 2b), while the rise and twist values of 7A02 (protein data bank (PDB)) were used as inputs for S-ENA. The refined helical parameters were determined to be a twist of 42.1° and a rise of 33.7 Å for F-Ena, and a twist of −31.1° and a rise of 3.14 Å for S-ENA. Inspection of the F-ENA volume revealed a trimeric symmetry, and thus processing was continued by applying C3 symmetry along the z-axis (Fig. 2c). The resulting volume displayed a clear, continuous density of the protein backbone, but did not permit precise de novo assignment of all sidechain entities. To facilitate unambiguous protein identification, the purified fiber suspension was subjected to trypsin digestion, followed by liquid chromatography tandem mass spectrometry (LC-MS-MS). The initial list of 276 detected proteins (Supporting Data 1) was filtered according to protein size. Using a de novo-built main chain model generated by Modelangelo, we estimated that the major F-ENA subunit would be approximately 100 amino acids (aa) in length, and hence only proteins detected in the LC-MS-MS analysis with lengths between 70 and 130 aa were retained for further analysis. From the 38 retained sequences, we retrieved the corresponding AlphaFold2 (AF2) models from the AlphaFold Protein Structure Database. Pairwise structural alignments, performed using FoldSeek with a de novo F-ENA model as a template, led to the unambiguous identification of WP_001121647.1 as the major subunit of F-type fibers in Btk. WP_001121647.1, hereafter called F-ENA, is a 96 aa protein that is classified by Pfam as a member of the Domain of Unknown function 4183 (DUF4183; PF13799). To confirm the ability of WP_001121647.1 to self-assemble into F-ENA fibers, we recombinantly expressed the protein in the cytoplasm of E. coli. nsTEM analysis of the insoluble fraction of the lysate revealed the presence of 5 nm diameter flexible fibrils that were indistinguishable from ex vivo F-ENA fibers (Fig. 1e).

Fig. 2: CryoEM analysis of F-ENA fibers found on the surface of BtK spores.
figure 2

a 60k magnification cryoEM micrograph of an F-ENA fiber purified from Btk. b 2D class average of extracted F-ENA fiber segments. c Reconstructed cryoEM volume of F-ENA (color coded per fiber segment, which corresponds to an F-ENA trimer). d Reconstructed cryoEM volume and corresponding molecular model of S-ENA. e Cartoon representation of a single F-ENA protomer, which belongs to the DUF 4183 family characterized by two juxtaposed beta-sheets interconnected by a series of loops, and an N-terminal extension (Ntc) that serves as the docking mechanism for incoming F-ENA trimers to connect to the fiber terminus. f Consensus logo of the N-terminal hook derived from a multiple sequence alignment of the top 500 blast hits using WP_001121647.1 as the query (g) molecular model of F-ENA and top-view image of the F-ENA pilus (h) Zoom-in of the Ntc of segment i docked into the receiving hydrophobic groove of segment i-1. i Lateral stacking at the trimeric interface through a predominantly polar interaction with an interface area of 5680 Å2, involving seven identified hydrogen bonds. j Electrostatic (left) and hydrophobic (right) surface coloring. The image in (a) is a representative image from two cryoEM datasets that were collected from two biological replicates.

The Btk F-ENA fold comprises two juxtaposed β-sheets composed of strands A, C, E, B, and D, respectively, along with a short N-terminal extension (Fig. 2e). Using the WP_001121647.1 sequence as input, an atomic model of the helical architecture of F-type fibers was constructed using Modelangelo, followed by real-space refinement in Coot and Phenix (Fig. 2c). F-ENA subunits assemble into trimeric units that stack axially with a 42° rotation (Fig. 2c, g). Lateral stacking at the trimeric interface occurs through a predominantly polar interaction with an interface area of 5680 Å2, involving seven identified hydrogen bonds, mainly mediated by sidechains (Fig. 2i). Axial stacking is principally facilitated by the docking of an N-terminal seven residue extension, referred to as the N-terminal hook, into a cleft defined by the interface between two neighboring F-ENA protomers in the trimer configuration (Fig. 2e, h). The interaction between the N-terminal hook of subunit i and protomers of the preceding fiber segment i-1 is mainly composed of hydrophobic contacts, where residues P2, I3 and I4 dock into a hydrophobic groove lined by small hydrophobic sidechains, including L17, V55, I77, L83, I87, and I89 (Fig. 2h). These N-terminal hook residues are conserved, as judged from the consensus logo derived from a multiple sequence alignment of the top 500 blast hits using WP_001121647.1 as the query (Fig. 2f). Guided by the cryoEM structure, we identify the following signature motif for the F-ENA N-terminal hook region: M-P-Ψ12-$-P-Ψ3 where Ψ1 and Ψ2 represent small hydrophobic residues (I, V), Ψ3 tends to be a larger hydrophobic, aromatic residue (F, Y) and $ is a surface-exposed polar residue (K, Q). Examination of the surface electrostatics of the F-ENA trimer (https://server.poissonboltzmann.org/) reveals a net negative charge of −6 at pH 7.0 (PDB2PQR) (Supplementary Fig. 3). Notably, residues D81 and D84 form a negative patch on the surface of the spore distal terminus, repeated threefold due to the C3 symmetry (Supplementary Fig. 3). In contrast, at the spore proximal F-ENA terminus, a surface-exposed arginine (R11) forms a salt-bridge with D84 of the preceding segment i-1 (Supplementary Fig. 3d). Using the dipole moment server (https://dipole.proteopedia.org/), we calculated an overall molecular dipole moment for the F-ENA trimer of 772 ± 0.55 D (Supplementary Fig. 3b), which is aligned with the helical axis (Supplementary Fig. 3c). This suggests that new F-ENA trimers that are diffusing towards the fiber tip may become pre-oriented via electrostatic interactions (driven by the alignment of the dipole moment) and subsequently latch onto the fiber tip via coordinated docking of the three N-terminal hook segments into the corresponding hydrophobic clefts. Despite the small interaction interface, the F-ENA fibers are relatively stable. Testament to that stability is the fact that the fibers were washed numerous times in miliQ water prior to cryoEM analysis, showing no clear signs of depolymerization.

The limited resolution (4.1 Å) of the S-ENA cryoEM volume prevented us from unambiguously identifying the constituent subunits of the S-ENA Btk fiber (Fig. 2d). Recently, we showed that ex vivo purified S-ENA from Bacillus paranthracis are composed of two different subunits, Ena1A and Ena1B13. Considering that the relative stoichiometry of Ena1A and Ena1B (which share 39.4% sequence identity) was unknown, and both subunits are likely randomly distributed along the length of the fiber, the reconstructed EM-volume represents a weighted average of both sequences, stopping us from building an atomic model for the ex vivo S-type fibers13. To address this issue, we solved the structure of recombinant Ena1B fibers (Supplementary Figs. 4 and 5). Anticipating similar ambiguities during model building of ex vivo Btk S-ENA, we expressed recombinant homo-polymeric S-ENA in the cytoplasm of E. coli using WP_001277547.1 (hitherto referred to as Ena2A), which shares 37.9% sequence identity with the solved Ena1B structure (PDB:7A02; Supplementary Fig. 4a). After purifying S-ENA from the E. coli lysate, we collected a cryoEM dataset and solved the Ena2A structure to 2.7 Å resolution (Fig. 2d and Supplementary Fig. 5). The helical parameters for Ena2A were refined to a twist of −31.03° and a rise of 3.17 Å, which is subtly different to the rise and twist of Ena1B, i.e., −32.35° and 3.44 Å respectively (Supplementary Fig. 4c). At the protomer level, the pruned and unpruned atomic root mean squared deviations (RMSD) between Ena2A and Ena1B are 0.9 Å and 2.6 Å, with the most significant structural differences located in the N-terminal connector (residues 8–15) (Supplementary Fig. 4b). The positioning of the double cysteine motif (C12C13) is slightly closer to the Ena2A fold compared to Ena1B. This is the result of the shorter linker region to strand B (Ena2A: P14D15/Ena1B: A14N15G16Q17), which translates into the observed reduction in rise. Similarly to Ena1B, we could not resolve the first 8 residues, suggesting that this region is flexible or disordered; and its role in fiber biogenesis remains unclear. Despite the low sequence identity, there is remarkable structural conservation between the two structures (Supplementary Fig. 4). We attribute this structural preservation to the mode of protomer coupling, i.e., β-sheet augmentation. Pairwise docking of open-ended sheets entails the formation of a hydrogen bonding network between the main chains of both protomers and can therefore reasonably be expected to be insensitive to the protein sequence, provided the main fold and overall surface characteristics (such as charge and hydrophilicity) are conserved.

Two collagen-like proteins couple F-ENA to the exosporium and a tip fibrillum

Inspection of the local genomic context of the f-ena gene (locus: BTK_RS13480 in NZ_CP004870) showed that the upstream neighboring gene (BTK_RS13475) is predicted to encode a 327-residue protein (M1Q906), hitherto referred to as F-anchor (nomenclature explained below; Fig. 3a). An InterProScan15 analysis of the primary sequence identified three distinct domains: an N-terminal exosporium leader peptide (residues 9–37), a collagen triple helix repeat region (residues 30–241, comprising 70 GXY triplet repeats) and a C-terminal DUF4183 domain (residues 252–323). An exosporium leader sequence was first identified in Bcl16 and has been shown to be responsible for the tethering of the BclA N-terminus to the exosporium via docking to ExsF, forming the hairy nap17. The presence of an exosporium leader sequence on the N-terminus of F-Anchor fits with the experimentally observed localization of F-ENA on the surface of the exosporium (Fig. 1b).

Fig. 3: Structural organization of the F-ENA.
figure 3

a Genetic locus of the genes associated with the biogenesis of F-ENA; HP, hypothetical protein. b X-ray crystal structure of ExsF (pink) in complex with a peptide (F-Anchor14-28) corresponding to residues 14–28 of F-Anchor (green). c Zoom-in of the ExsF/F-Anchor14–28 interaction. d AlphaFold3 (AF3) prediction of the F-ENA/F-Anchor complex—a truncated F-Anchor spanning residues 140–276 was used for clarity purposes. e Schematic representation of the F-type pili tethered at the spore-proximal end to the exosporium via a tripartite F-ENA/F-Anchor/ExsF complex, and decorated at the spore-distal terminus with the tip-fibrillum F-BclA. f X-ray crystal structure of the C-terminal head group (F-BclA727-863) of F-BclA, g AF3 prediction of the F-ENA/F-BclA1–46 complex with zoom-in of the docking interface.

To test whether the putative exosporium leader sequence of F-Anchor would indeed form a complex with ExsF, we co-crystallized recombinantly produced and purified ExsF, which was tagged with an N-terminal 6xHis tag, with a synthetically produced peptide spanning residues 14–35 of F-Anchor (NH2-IEPENIGPTFSALPPIYIPTG-COOH; annotated as FA14–35) (Supplementary Figs. 6 and 7). The crystal structure of the ExsF/FA14–35 complex was solved to 1.92 Å resolution (R/Rfree of 18.4/21.6) and reveals three FA14–35 copies that are docked into a cleft at the trimeric interface between two neighboring ExsF copies (Fig. 3b and Supplementary Fig. 7a). The interaction between FA14–35 and ExsF involves both polar and apolar contacts, with no salt bridges detected. Three ExsF interacting regions can be identified in FA14–35 (Supplementary Fig. 7a, b). In region 1, residues I14, P16, and I19 tether the N-terminus of the peptide to a hydrophobic patch on the surface of ExsF (Supplementary Fig. 7a). Within region 2, we identified 14 hydrogen bonds between FA14–35 and ExsF, involving both sidechain-sidechain and sidechain-mainchain contacts. Notably, P16, I1, 9, and P21 are hydrogen bonded to R97, T22 interacts with R157, and S24 forms H-bonds with D87 and S85 of ExsF (Fig. 3c). In addition, F23 fits tightly into a hydrophobic pocket lined by I151, the side-chain of Q149 (Cγ), A95, and the Cβ-Cδ region of R97. Towards the C-terminal end of FA14–35 (region 3), residues L26 and I29 make hydrophobic contacts with I15, P93 and V91, while Y30 forms multiple hydrogen bonds with N119 and A92 (Fig. 3c). Although the N-terminal sequence of F-Anchor is strongly conserved (Supplementary Figs. 7b and 8a), the role of the first 13 residues remains unclear. For BclA, it has been shown that residues 1–19 are proteolytically cleaved off16 and do not contribute to binding to ExsF. The pairwise sequence alignment between the N-termini of BclA and F-Anchor shows clear sequence similarity (Supplementary Fig. 7c), suggesting that they may share similar modes of binding to ExsF. Whether residues 1–13 from F-Anchor also get proteolytically removed is not known. We also looked at the consensus motif for the general exosporium leader peptide, which we derived from the hmm profile of NCBifam entry TIGR03720 (Supplementary Fig. 7d). Conservation in the motif is highest at positions 12–17. These positions map onto the region that forms the greatest number of H-bonds with ExsF. We also note strong conservation of the proline residues at positions 9, 14, 17, 18, 20, and 21 (Supplementary Fig. 7d). Based on the ExsF/FA14–35 crystal structure, we can rationalize this conservation. These positions correspond to kinks in the peptide backbone that are required to accommodate the meandering mainchain path imposed by the shape of the cleft defined by the trimeric ExsF interface (Supplementary Fig. 7a).

F-Anchor shares many similarities with BclA. Both are collagen-like proteins, and both couple to the exosporium of Bt. The key distinction, however, lies in their respective C-terminal domains (CTDs). BclA has a BclA-C domain (Pfam: PF18573; TNF/C1q superfamily: SSF49842), whereas F-Anchor possesses a C-terminal DUF4183 domain. This divergence suggests that F-Anchor will likely serve a different function compared to BclA. Although the pairwise sequence identity of the DUF4183 domain of F-anchor to F-ENA is only 23%, we reasoned that F-anchor and F-ENA might form a hetero-hexameric complex. AlphaFold318 (AF3) predictions support this hypothesis, showing a high-confidence dimer of trimers (3 F-ENA/3 F-anchor: ipTM = 0.89, pTM = 0.91) that mimics the homotypic interactions between F-ENA ring segments in the F-type pilus (Fig. 3d). In the AF3 model, the N-terminal hook of F-ENA docks into a solvent-exposed hydrophobic cleft on the surface of the F-anchor trimer (Fig. 3d, insert). The pruned RMSD between the DUF4183 domains of the F-ENA trimer and the F-anchor trimer is 0.8 Å, indicating a high degree of structural similarity. When considering all atom pairs (non-pruned), the RMSD increases to 3.5 Å, reflecting minor variations likely due to differences in sidechain conformations and flexibility. This close structural agreement supports the potential for hetero-hexameric complex formation.

These data strongly suggest that F-Anchor functions as a critical bridging moiety that tethers F-ENA to the exosporium of Btk by displaying an F-ENA-like CTD on a semi-flexible display platform (i.e., the collagen stalk). We hypothesize that F-Anchor also serves as a starting point for F-ENA self-assembly, in that the F-Anchor-CTD will capture incoming F-ENA trimers through conserved docking, to preferentially initiate fiber formation onto the exosporium. To ensure efficient F-ENA capture, F-Anchor must remain sterically accessible despite being integrated into the hairy nap. The collagen region of BclA spans residues 29–161, corresponding to 40 triplet repeats. Based on AF3 predictions, the expected length of a collagen-helix segment is approximately 2.8 Å/residue. For F-Anchor and BclA, this translates into a collagen-helix length of 59 nm and 37 nm, respectively. Using nsTEM, we measured the width of the Btk hairy nap layer to be 30 nm, which is in correspondence with the expected dimensions of BclA, assuming full extension of the collagen stalk. F-Anchor can therefore be expected to protrude through the hairy nap, making it fully accessible for binding with F-ENA. It was shown that polymorphism in the collagen-like region of BclA leads to variation in the width of the hairy nap layer11. This raises the question of whether similar polymorphism in the collagen region of BclA could be coupled to potential polymorphism in F-Anchor. Hence, to test whether the observed difference in length between F-Anchor and BclA of Btk is a conserved feature for all F-ENA harboring strains, we downloaded all the NCBI bacterial genomes in the Bacillus/Clostridium group (taxid 1239, assembly level complete) and used hmmersearch (E-value < 1E-40) to identify F-ENA and BclA homolog pairs. In Supplementary Fig. 8 we plot the lengths of the respective collagen stalks (expressed in number of aa), which shows that the stalk region of F-Anchor is, on average, longer than that of BclA. This suggests that the collagen stalk polymorphism of both proteins could indeed be coupled, with a bias toward CLS regions that are longer in F-Anchor than in BclAs. Looking at the locus of f-ena and f-anchor, we find that the gene synteny is not conserved (Supplementary Data 1) across the Bacillales order of the Firmicutes phylum.

In addition to F-Anchor, we identified another gene possibly involved in F-ENA biogenesis: BTK_RS34675. This gene also encodes a putative collagen-like protein with a C-terminal C1q domain (Fig. 3a, e, f, g) and is referred to as F-BclA (see below). Interproscan does not identify an N-terminal exosporium leader sequence in F-BclA, indicating it will likely not be coupled to the exosporium. A pairwise-sequence alignment of the F-BclA N-terminus (F-NTD) with the F-ENA N-terminal hook sequence reveals a putative F-ENA docking motif in F-BclA (starting at residue 14: IILPH) that follows the Ψ12-$-P-F pattern (observed in F-ENA as IIQPF) responsible for F-ENA/F-ENA docking. Based on this analysis, we reasoned that F-BclA might encode for the tip fibrillum of F-ENA. Indeed, AF3 predicts a heterohexameric F-ENA/F-BclA complex that mimics the homotypic F-ENA interactions (Fig. 3g). Specifically, the putative F-BclA docking motif (IILPH) is modeled by AF3 to bind at the same location as the F-ENA N-terminal hook.

Moreover, the collagen region of F-BclA comprises 701 residues, which translates into an extended triple-collagen helix of 196 nm in length. This fits with the experimentally measured value of 206 ± 17 nm for the F-ENA of Btk. We can perform a similar length analysis for the ruffle of Btb. Using blast, we identify the proteins F-ENA, F-Anchor, and F-BclA of the Berliner strain to be WP_001121647.1 (C4B13_RS12845; 100% seq. id to Btk), WP_014482026.1 (C4B13_RS12840; 74.6% seq. id to Btk), and WP_142388267.1 (C4B13_RS36340; 94.4% seq. id. to F-BCLA-CTD of Btk). WP_142388267.1 has approximately 91 GXY triplet repeats (273aa), which is predicted to correspond to an elongated collagen-like helix of 77 nm. Experimentally, we measured a ruffle length of 82 ± 6 nm. This measurement includes the dimensions of the ruffle-tip (which is 4 nm). If we subtract this value, the corrected experimental length of the collagen stalk of F-BclA of Btb is 78 nm, which is in good correspondence with the theoretical value of 77 nm derived from the number of collagen triplets.

Next, we recombinantly expressed, purified, and crystallized the CTD of F-BclA (F-BCLA-CTD; Fig. 3f; Supplementary Fig. 9). Using X-ray crystallography, we solved the structure of F-BCLA-CTD to 2.29 Å resolution (R/Rfree = 22.5/27.7). F-BCLA-CTD forms a trimeric complex of all-β domains with a TNF-like jelly fold topology, reminiscent of the CTD of BclA (BclA-C, PDB entry 1WCK) (Fig. 3f, g). The RMSD between F-BCLA-CTD and BclA-C is 0.7 Å, with a sequence identity of 27.5%.

BclAs from B. cereus and B. anthracis have been shown to be extensively O-linked glycosylated in a domain-specific manner19. In these proteins, the collagen regions are modified with species-specific short O-glycans (anthrose, cereose), and the BclA-C domains are substituted by species-specific polysaccharide O-linked glycans. Although the F-BCLA-CTD presented here is not glycosylated as a result of recombinant expression in E. coli, we anticipate that F-BclA will likely be glycosylated in its native context. Firstly, two genes upstream of F-BclA lie the gene BTK_RS13455, which encodes for a putative glycosyltransferase (Fig. 3a). Secondly, F-BCLA-CTD contains a remarkable 78 out of 185 (42%) solvent-exposed residues that are serine or threonine. Finally, the collagen region of F-BclA (residues 19–720) consists of 326 threonine or serine residues, accounting for 45% of the collagen stalk (Supplementary Fig. 10). We also analyzed the conservation of F-BclA by performing BLAST searches using either F-NTD or F-BCLA-CTD as query sequences. Both approaches lead to very similar results, producing a dataset of collagen-like sequences in which both the N- and CTDs are highly conserved (see Supplementary Fig. 11). Note that searches using F-NTD as a query could have identified putative F-ENA ruffle genes with a different CTD, but no such sequences were detected using this approach.

Taken altogether, a picture emerges of F-ENA fibers that are coupled to the exosporium via a dedicated anchoring complex (ExsF/F-Anchor/F-ENA) on the spore proximal side, and decorated with a dedicated, conserved collagen-like (putatively glycosylated) tip fibrillum (F-BclA) on the spore distal side that extends into the surrounding environment (Fig. 3e).

F-ENA contributes to spore clustering

We recently showed that S- and L-type endospore appendages in B. paranthracis contribute to spore-spore interactions that lead to the formation of spore clusters20. In the current study, we tested whether the endospore appendages of B. thuringiensis are also involved in the auto-aggregation of spores. nsTEM micrographs of freshly prepared Btk spore solutions deposited onto a Cu-mesh grid revealed the presence of loosely packed spore clusters wherein spores typically do not engage in direct lateral contacts via the spore body or the exosporium (Fig. 4a). Higher magnification imaging revealed a dense network of F- and S-ENA-fibers pervading the spaces in between adjacent spores (Fig. 4b). No such clusters were observed for Btk spores that are deficient in both F- and S-ENA (Fig. 4c). It is tempting to speculate at this point that ENA is mediating direct spore contacts by binding to a (specific) surface receptor on neighboring spores. However, it is important to stress that the observed globular spore clusters could also have formed during the EM grid preparation process (i.e., as a drying artifact), and may not accurately reflect the spore assembly state in solution. In addition to globular spore clusters, however, we also observed linear trains of spores spaced at semi-regular intervals (seemingly) connected by S-type ENAs (Fig. 4d). Further high-magnification imaging also revealed F-ENA fibers bridging the interstitial regions between neighboring spores (Fig. 4e). We argue that such linear spore trains, which are in an extended state, likely formed as a result of viscous drag during the sample blotting process and are indeed indicative of inter-spore clustering and contacts in the liquid phase.

Fig. 4: ENA-mediated aggregation of Btk spores.
figure 4

a, b nsTEM image of a globular Btk spore cluster at consecutively higher magnification (white rectangle denotes zoom-in area). F- and S-type ENAs populate the regions in between the spores; c nsTEM image of a Btk S-F- spore sample; d Linear spore train of Btk spores; e S- and F-type fibers making lateral contacts across neighboring Btk spores; f Time dependence of the location of the developing sedimentation front of Bt407 wild type spores (BT S + F+ (WT)) or spores of strains deficient in either S-type (BT SF+), F-type (BT S+F) or both ENA types (BT SF (Bald)); and g mean rate of sedimentation (µm s1) of Bt407 WT or Ena mutant spores. The error bars in (f, g) correspond to the standard deviation based on three biological replicates. Source data are provided as a Source Data file.

To probe for the relative contributions of S- and F-type ENAs in spore clustering, we set out to generate deletion strains deficient in either S-type (Bt SF+), F-type (Bt S+F), or both ENA types (Bt SF). However, we experienced difficulties in producing ENA-knockouts in Btk, and had to resort to knockout production in an alternative strain, i.e., B. thuringiensis 407 (taxid: 527021; ASM16149v1). The Bt407 strain is closely related to Btk and produces F- and S-ENA fibers with high similarity (Ena1A: 76.99%, Ena1B: 85.37%, and F-ENA: 100% sequence identity). All Bt407 deletion strains and the WT Bt407 strain were sporulated on solid media (LB agar) through nutrient starvation, harvested, resuspended in Mili-Q water, and stored under stagnant conditions. Time-lapse photography was used to monitor the sedimentation kinetics over a 175 h period by tracking the position of the interface that develops between the spore-dense and dilute region (Fig. 4f). We see a marked increase in the average rate of sedimentation for Bt S+F compared to WT and Bt S-F+. This observation would be in agreement with F-ENA agglutinating suspended spores into a loosely contacting network. Notably, we did not record an additional increase in sedimentation rate for the double knockout (SF) compared to Bt S+F, suggesting F-ENA takes a dominant role in the agglutinating activity of the Bt407 extra-sporal matrix.

Paenibacillacea spores are decorated with a subclass of F-ENA-like pili

To understand the phylogenetic distribution of F-ENA, we searched for DUF4183-containing genes across the B. cereus group sensu lato using the Btyper database21. Out of the 5976 genomes searched, 5647 (94.4%) harbor one or more DUF4183-containing genes (Supplementary Fig. 12a). Looking more broadly across the Bacillota phylum (taxid: 1239), we examined 9342 complete Genbank genomes, of which 783 (8.3%) hold one or more DUF4183-containing genes (Supplementary Fig. 12b). These F-ENA carrying genomes cluster into the Bacillacea family, containing the B. cereus group hits, Lysinibacillus, Geobacillus, Priestia, and Anoxybacillus, as well as the Paenibacillaceae family, which includes hits in Paenibacillus, Brevibacillus and Cohnella. A limited number of hits were also obtained in the Clostridia class.

Looking at the Paenibacillaceae hits in further detail, we identified F-ENA homologs with N- and C-terminal extensions to the DUF4183 domain. A0A1I1C8X4 (136aa; 22.7% sequence identity to F-ENA; hereafter referred to as F-ENA-2) is one such example, for which AF3 predicts extensions to the A and E β-strands (Supplementary Fig. 12b). To test whether this distant homolog still self-assembles into F-ENA-like fibers, we recombinantly expressed and purified F-ENA-2 in the cytoplasm of E. coli. nsTEM micrographs of the insoluble E. coli fraction after cell lysis confirmed the presence of 3 nm diameter fibrils with a regularly spaced beads-on-a-string appearance, and which exist either as single fibers (Fig. 5a, b) or are grouped into parallel bundles (Fig. 5c, d). We collected a cryoEM dataset and solved the F-ENA-2 pilus structure to a global resolution of 4.1 Å, with a helical rise and twist of 95.25 Å and 67.50°, respectively (Fig. 5e). Due to the limited resolution, we post-processed the reconstructed volume with EMready22 and performed rigid-body docking using an AF3 F-ENA-2 trimer as the starting model (Supplementary Fig. 13 and Fig. 5f). Similar to F-ENA, F-ENA-2 fibers are helical ultrastructures with global C3 symmetry, where each fiber segment (asymmetric unit) consists of an F-ENA-2 trimer that couples to the next segment via docking of the N-terminal hooks (Fig. 5i, g). Like F-ENA, the F-ENA-2 N-terminal hook is composed of the first seven residues (MPVIKPV…) in a conserved M-P-Ψ1-Ψ2-$-P-Ψ3 motif, which adheres to the hydrophobic groove on the head domain of a preceding segment. The latter is composed of a globular DUF4183 region that is structurally equivalent to the DUF4183 fold of F-ENA (pruned RMSD = 1.0 Å). The N- and C-terminal extensions of F-ENA-2 form a toroidal β-barrel extension or neck composed of 6 antiparallel strands that position the N-terminal hooks 110 Å away from the head domains (Fig. 5f, h). This configuration results in the characteristic dotted pearl on strings appearance observed in cryoEM and nsTEM micrographs (Fig. 5a–d and Supplementary Fig. 13b). Despite the low sequence identity to F-ENA, F-ENA-2 is characterized by a similar bipolar distribution of surface charges (Fig. 5j), leading to a dipole moment of 2561D, with the negative pole situated at the head group and directed towards the positive N-terminal hook region (Supplementary Fig. 14). Similarly to F-ENA, three salt bridges reinforce inter-segment connections between the head groups of segment i and the N-terminal hooks of segment i + 1, specifically between K5 and D101.

Fig. 5: CryoEM of recombinant (rec) Paenibacillaceae F-ENA-2 fibers.
figure 5

a CryoEM micrograph of rec-F-ENA-2 with corresponding 2D class average in (b); c nsTEM micrograph of rec-F-ENA-2 with corresponding 2D class averages in (d); e reconstructed cryoEM volume with (f) the corresponding molecular model shown in cartoon representation; g top-view slice-through along the C3 axis; h zoom-in of the beta-cylinder extension; i docking site of the N-terminal hook; j hydrophobic (left) and electrostatics (right) surface coloring; k structural overlay of the DUF4183 domains of F-ENA and F-ENA-2; l genetic loci of F-ENA-2 for two members of the Paenabillacaea group; and m AlphaFold3 (AF3) model of the F-ENA-2 and F-Anchor-2 complex. The images in (a, c) were derived from two biological replicates.

Looking at the genomic context of f-ena-2 (Refseq locus BM235_RS28245 in NZ_FOKE01000015.1 of Chonella sp. OV330), we identified the gene BM235_RS32880, which encodes WP_218161362.1, a putative DUF4183-domain containing protein (Fig. 5l). Inspection of the primary sequence reveals that this protein contains collagen repeats. AF3 predicts a putative hetero-hexameric complex between F-ENA-2 and WP_218161362.1 that is similar to the predicted complex of F-ENA and F-Anchor (Fig. 5m). Based on this observation, we hypothesize that WP_218161362.1 may serve a similar role in coupling the F-ENA-2 pili to the spore, and for this reason, we refer to WP_218161362.1 as F-Anchor-2 (23% sequence identity to F-Anchor). Contrary to Btk, we did not identify a putative ruffle gene for F-ENA-2.

It is interesting to note that recombinant F-ENA-2 fibers can form intercalated super bundles, wherein the fibers stack laterally in an anti-parallel fashion (Fig. 5c, d and Supplementary Fig. 15). To test whether the F-ENA-2 fiber isoform can also be found in an in vivo spore setting, we screened various Paenibacillus strains and observed F-ENA-2-like fibers on one of the Paenibacillus sp. isolates from our collection (Supplementary Fig. 16). EM imaging showed an extrasporal matrix (ESM) composed of a high density network of fibers that were seen to decorate the spore body of Paenibacillus sp. spores. The corresponding beads-on-a-string 2D class averages confirm that these fibers are indeed F-ENA-2-like. Although some spores have an exosporal sacculus with a clear crystalline hexagonal pattern (with unit cell dimensions that match the lattice of ExsY sheets23), many spores do not have a pronounced exosporium. Interestingly, ENAs seem to directly emerge from the surface of the spore body. Since no exosporium leader sequence motif was detected in the F-Anchor-2 sequence, we hypothesize that the spore coupling mechanism for F-ENA-2 could be different from the ExsF-based coupling seen in B. thuringiensis.

F-ENA fibers are produced by anaerobic bacteria of the class Clostridia

To explore the distribution of the F-ENA fibers in the Bacillota phylum, we performed homology searches for the DUF4183 domain-containing proteins in the non-redundant protein sequence database. The results showed that F-ENA homologs are widely distributed in both the Paenibacillaceae and Bacillaceae families (class Bacilli), and even within the anaerobic bacteria of the class Clostridia (Supplementary Fig. 12). By analyzing F-ENA homologs in various Clostridia strains, we show that the mechanism of F-ENA anchoring does indeed seem to vary. For example, in Clostridium aceticum strain DSM 1496, the major F-ENA subunit (locus: CACET_c28680; protein: AKL96313.1; hereafter referred to as F-ENA-3) is flanked by CACET_c28690 (protein: AKL96314.1; hereafter referred to as F-Anchor-3) (Supplementary Fig. 17) which is composed of one N-terminal DUF4183 domain, four DUF11 domains, one CsxA-like domain, and two additional DUF4183 domains. AF3 predicts a hetero-hexameric complex between 3 copies of F-ENA-3 (Supplementary Fig. 17a), one copy of F-Anchor-3, and two copies of CsxA (i.e., the major subunit of the Clostridium exosporium). As expected, the F-ENA-3 trimer is predicted to dock onto the triad of DUF4183 domains of F-Anchor-3, while the CsxA-like domain of F-Anchor-3 is predicted to form a heterotrimer with the two CsxA copies. Based on this, we hypothesize that the F-ENA-3 pilus is tethered to the C. aceticum exosporium via integration of F-Anchor-3 into the CsxA crystal lattice. It is interesting to note that the number of DUF11 domains in F-Anchor-3 tends to vary from zero to four across different homologs within the Clostridia class (Supplementary Fig. 17c).

Consistent with this prediction, we observed F-ENA-3 type fibers using cryo-EM on the surface of an anaerobic bacterium of the family Anaerovoracaceae within the class Clostridia (Fig. 6 and Supplementary Fig. 18). The fibers, several microns in length, were present on the surface of the spores and exhibited a repeating beads-on-a-string pattern characteristic of F-ENA filaments observed in Bacillus and Paenibacillus spores (Fig. 6a, d, e). The averaged power spectrum of aligned raw particles was indexed to estimate all potential helical symmetries (Supplementary Fig. 18), which were further tested in helical reconstruction. The correct symmetry was determined to be C3, with a rise of 71.2 Å and a twist of 74.4°. The final reconstruction, with helical symmetry applied, reached a resolution of approximately 4.1 Å, as judged by map-to-map FSC. Cα tracing at this resolution was straightforward, and the sequence of the building block was predicted using ModelAngelo to be MDF2656102 (hereafter referred to as F-ENA-3a), and the identity of the corresponding gene was confirmed by PCR.

Fig. 6: CryoEM of F-ENA-3a fibers of an uncultivated member of the family Anaerovoracaceae within the class Clostridia.
figure 6

a Reconstructed cryoEM volume with b the corresponding top and c side view of the molecular model shown in cartoon representation; d cryoEM micrograph of F-ENA-3 fibers with corresponding 2D class average in (e); f zoom-in of the beta-cylinder extension; g docking site of the N-terminal hook; and h unmodelled densities corresponding to putative O-glycosylation sites.

F-ENA-3a exhibits a relatively high sequence identity of 66.9% to F-ENA-3 from C. aceticum, but only moderate sequence identities to F-ENA and F-ENA-2, at 21.5% and 29.1%, respectively. Despite the moderate sequence identities among these three experimentally determined F-ENA fibers, all three share a C3 symmetry and similar packing of the DUF4183 domain trimer (Fig. 6b, c). The variation in helical symmetry is primarily due to the differences in the lengths of the N- and C-termini that form spacers between the beads composed of the DUF4183 domain trimers (Figs. 5f and 6c). In particular, the shortest spacer of 34 Å is present in F-ENA, the largest of 95 Å in F-ENA-2, and F-ENA-3a falls in the middle with 71 Å. Within this extended region, composed of a six-stranded β-barrel in F-ENA-3a, a similar level of hydrogen bonding was observed (Fig. 6f) compared to the other two F-ENAs. Furthermore, F-ENA-3a also possesses an inserted N-terminal hook that tightly binds to the next layer of the DUF4183 trimers (Fig. 6g). Interestingly, and in contrast to the other two F-ENAs, extra cryo-EM densities, putatively representing post-translational modifications, were observed on eight residues per subunit. These residues, five serines and three threonines, are likely to correspond to O-linked glycosylation sites.

In summary, F-ENA fibers appear to be ubiquitous among members of the Bacillota, characterized by a conserved DUF4183 fold and N-terminal hook, yet displaying variability in the surface biophysical properties and inter-domain distances between DUF4183 domains.

Discussion

Bacteria have a remarkable ability to populate and colonize a myriad of interfaces and surfaces. Under favorable growth conditions, they develop large, actively growing communities that—as resources gradually deplete—transform into structured biofilms, consisting of sessile cells and a complex, multi-component extracellular matrix (ECM) that encases and reinforces the biofilm24. Within the biofilm, there is often a small population of persister cells that are either fully metabolically dormant or exhibit selective inactivation of genes, which enables them to survive highly unfavorable, otherwise lethal conditions such as nutrient deprivation, antibiotic stress, or exposure to detergents25. This specialized survival strategy guarantees the persistence of the population when conditions become favorable again. Members of the Firmicutes phylum have, in a sense, perfected the concept of persister cells through the development of endospores, a highly specialized dormant cell subtype that is optimized for survival and dissemination. The cell’s ability to produce endospores creates an additional layer of complexity with respect to biofilm formation and allows for fine-tuning of the relative abundances of vegetative cells and endospores in response to environmental stimuli. The impact of the cell-to-spore ratio on biofilm stability, structure, resilience, and adhesive properties is still poorly understood. Reports on biofilms of B. subtilis (Bs)—the most extensively studied species within the Firmicutes phylum- tend to predominantly focus on cells and the ECM, which for Bs is composed of extracellular DNA, exopolysaccharides, TasA protein fibers, and BslA26. Nevertheless, it is clear that spores can also be found within Bs biofilms27. For B. cereus (Bc), depending on the strain and culture conditions, spores can even constitute up to 90% of the total biofilm population28. Biofilms dominated by spores upend the conventional perspective on biofilm formation and call into question the broader applicability of lessons learned from Bs biofilms to those formed by Bc and Bt29. Recent cryoEM studies of Bc spores show that these are decorated with proteinaceous appendages not found on vegetative cells13,14. Here, we expand the family of ENAs and find that they can give rise to a dense extrasporal matrix distinct in composition from the ECM surrounding biofilms dominated by vegetative cells.

In the conditions tested in the current work, biofilms of B. thuringiensis serovar kurstaki were predominantly (94%) composed of spores and toxin crystals, with only a small minority of cells being present. We observed the presence of an ESM that was dominated by spore-attached ENAs. Using cryoEM, we resolved the structure of the S-type variant, which was found to be structurally closely related to the S-type ENAs observed on B. paranthracis (Bp) NVH 0075-95. Moreover, we identified a previously unknown ENA subtype: flexible, 5 nm diameter ENAs (F-ENA) coupled to the exosporium of Btk. The identity of the F-ENA protomers was determined through cryo-identification. F-ENAs are composed of small (10 kDa) DUF4183 proteins that trimerize and stack axially into a helical ultrastructure. We showed that F-ENAs are (i) tethered to the exosporium via a collagen-like protein, F-Anchor, which docks onto ExsF via an N-terminal exosporium leader sequence, and (ii) serve as an assembly platform for F-ENA formation by displaying a C-terminal DUF4183 domain on a long collagen triple helix that pierces through the hairy nap layer (Fig. 7). At the spore distal terminus, F-ENAs are decorated with a flexible tip fibrillum that consists of the collagen-like protein F-BclA. Although F-BclA has a variable number of collagen triplet repeats, its C-terminal C1q-like domain is highly conserved throughout the B. cereus group.

Fig. 7: Composite molecular model of F-ENA found across the B. cereus sensu lato group.
figure 7

At the spore proximal terminus, F-ENA is docked onto the exosporium basal layer via the F-ENA/F-Anchor/ExsF ternary complex. At the spore distal terminus, the F-ENA tip fibrillum is composed of the collagen-like protein F-BclA.

Our findings suggest that F-ENA contributes to the clustering of Btk spores. In a previous contribution, we showed that spore aggregation in Bp was dependent on the L-ENA tip fibrillum protein L-BclA, which is localized on the termini of both S- and L-ENA20. We hypothesize that F-BclA could have adopted a role similar to L-BclA, in that it might bind to the surface of the spore to mediate aggregation, either via targeting a specific receptor or via aspecific contacts. The coupling mechanism of F-BclA onto F-ENA mimics the homotypic axial interactions that are specific to the F-ENA pilus. Consequently, F-BclA binding is expected to be limited to F-ENA pili, i.e., with no promiscuous binding to different ENA subtypes. Although the exact identity of the S-ENA ruffle protein(s) remains unclear, a HMMER search using L-BclA as a query against the Btk genome yields AGE80398.1. This is a collagen-like protein with (i) a C-terminal C1q domain that has 47.89% sequence identity to the C1q domain of L-BclA, and (ii) a cysteine-rich (5 cys) N-terminus that carries a signature CC-motif. A CC motif is also found in the S- and L-ENA protomers, as well as the N-terminal connector of L-BclA. Notably, this protein does not contain an exosporium-leader sequence. Whether or not AGE80398.1 constitutes the tip fibrillum of S-ENA, will be the subject of further study.

Regardless of the exact molecular mechanism of binding, one could ask why Btk spores have developed independent, seemingly redundant methods to self-aggregate. The widespread distribution of S and F-ENA operons across the B. cereus group and the strong conservation of F-BclA suggest spore aggregation could be a conserved trait of Bc group spores. Functional redundancy in the form of the production of various ENA subtypes, each decorated with distinct tip fibrillae, could be a logical adaptation to the evolutionary pressure to maintain the ability to aggregate. Moreover, S-ENA and F-ENA may play similar or different roles in cellular clustering. Taking the S- and L-ENA fibers of Bp NVH 0075-95 as an example, we found that S-ENA fibers are much longer (multiple micrometers) than the L-ENA fibers (50–100 nm in length), making them more likely to maintain long-range spore contacts. Their structure enables them to grapple spores that lie beyond the first sphere of nearest neighbors within a spore cluster. In contrast, L-ENAs can only make direct spore contacts owing to their limited length. If the F- and S-ENAs of Btk exhibit a similar division of labor remains unclear. And how the differences in molecular architecture, tensile strength, and flexibility of F-, L-, and S-ENA translate to the macroscopic material properties of the biofilm remains to be studied.

It is remarkable that Bs spores are devoid of ENAs. Spore clustering could provide a fitness advantage in numerous defensive scenarios, such as protection against predatory pressures (e.g., ameba grazing), UV radiation (unlikely for the rhizosphere habitat of Bs, but potentially relevant in dispersion), macrophage phagocytosis, etc. Having said that, in a cell-rich biofilm setting, spore clustering may not be a requirement, because the ECM, in particular TasA fibers secreted by the cells30, will also encase the spores embedded within the biofilm, thereby abrogating the necessity for direct inter-spore coupling. For biofilms that are dominated by spores with minimal or no ECM present, however, ENAs likely take up a compensatory functional role in lieu of the ECM. Secondly, many of the species in the B. cereus group sensu lato exhibit pathogenic lifestyles. ENA-mediated spore clustering could factor into offensive strategies by facilitating the formation of spore aggregate sizes that approach or exceed the minimal infectious dose for a given host. If indeed proven to be the case, S- and F-ENA could become valuable targets to increase the virulence of Bt and with it the efficacy of Bt-based biopesticides.

Including the newly identified F-ENA, the number of characterized ENA-subtypes stands at three – a number that is expected to increase as research continues. S-, L-, and F-ENAs share little sequence identity, are all architecturally distinct, and are tethered to different spore surfaces: S-ENAs are associated with the spore coat13, while L- and F-ENAs are anchored to the exosporium14). The common denominator across this group of ENAs is that they all terminate in a collagen-like tip fibrillum. This suggests that the main biological function of ENAs is to serve as a structural platform to display C1q-like domains at their distal ends, which have presumed adhesive properties. The F-ENA-2 type ENAs found in the Paenibacillacea described in this contribution may deviate from that pattern in that we did not observe any tip fibrillae via nsTEM, nor did we identify putative ruffle candidates through genetic screening. However, we did observe antiparallel, lateral co-alignment of F-ENA-2 fibers in nsTEM. Importantly, we did not observe such bundling in cryoEM, suggesting that F-ENA-2 bundling predominantly occurs under dehydrated conditions. In natural settings, spores can be exposed to desiccation events, and hence, ENA bundling might be a response to dehydration. (Anti-parallel) fiber bundling due to lateral interactions, as seen with F-ENA-2, has been observed for TasA in B. subtilis31, archaeal bundling pili (ABP) in Pyrobaculum calidifontis32, and more recently in ABP-like fibers in Pyrodictium abyssi (ABPx)33. In these cases, the bundling likely contributes to the stability of their respective ECMs. TasA, ABP, and ABPx are all donor-strand-exchanged fibers. Although F-ENA-2 falls outside that structural category, it may have evolved to achieve a similar goal, i.e., to bridge contacts between spores, either through entropic entanglement or direct fiber-fiber coupling.

Methods

Biofilm formation

Btk and Paenibacillus sp. were sporulated via nutrient starvation on solid LB agar medium. For this, an overnight LB culture of either strain was spread on an LB agar plate and grown to confluency over a period of 72 h at 30 °C.

nsTEM

nsTEM imaging of bacterial biofilms was done using formvar/carbon-coated copper grids (Electron Microscopy Sciences) with a 400-hole mesh. The grids were glow-discharged (ELMO; Agar Scientific) with 4 mA plasma current for 45 s. A section of a confluent microcolony was scraped from an LB agar plate and resuspended in Milli-Q. Three microliters of the bacterial spore suspension were applied onto the glow-discharged grids and left to adsorb for 1 min. The solution was dry blotted, followed by three washes with 15 μl Milli-Q. Next, 15 μl drops of 2% (w/v) uranyl acetate were applied three times for 10 s, 2 s, and 1 min, respectively, with a blotting step in between each application. The excess uranyl acetate was then dry blotted with Whatman type 1 paper. All grids were screened with a 120 kV JEOL 1400 microscope equipped with LaB6 filament and a TVIPS F416 CCD camera.

Ex vivo isolation of S-type and F-ENA from Btk

To dislodge the ENA fibers from the Btk spore surface, 500 µl of the bacterial spore suspension was subjected to sonication on ice using a Qsonica Q500 sonicator equipped with a 1/16” microtip probe using the following settings: 15/30 s on/off, 2 min total time, 30% amplitude. Following sonication, the spore suspension was loaded onto a 1.8 mL 60% (w/v) sucrose cushion and centrifuged at 20,000 rcf for 1 h. The cleared supernatant was carefully pipetted off the sucrose layer and subjected to 3 rounds of washing in Mili-Q via repeated centrifugation (1 h, 20,000 rcf) and resuspension of the pellet. nsTEM imaging of the resuspended pellet confirmed the presence and enrichment of the S- and F-type ENA fibers.

Cloning of ENA2A, F-ENA-2, F-BclA-CTD and ExsF

All genes were cloned in the pASK-IBA3+ vector via Gibson assembly (New England Biolabs). The genes for recombinant expression of ENA2A (WP_001277547.1), F-ENA-2 (A0A1I1C8X4) and the CTD of F-BclA (F-BclA-CTD; BTK_RS34675 residues 721–863) were ordered as double-stranded synthetic DNA constructs (gblocks Gene Fragment, IDT) with the appropriate overhangs, while the exsF gene, in which the first 23 residues were replaced by a 6xHis tag, was amplified from the B. paranthracis genome.

Gibson assembly mixtures were transformed into E. coli DH5a cells and plated on LB agar plates supplemented with 100 µg ml−1 ampicillin. Positive clones were identified via colony-PCR followed by Sanger sequencing (Mix2seq, Eurofins).

Recombinant production of S-ENA and F-ENA-2 fibers

Recombinant S-ENA and F-ENA-2 fibers were produced by heterologous expression in the cytoplasm of Escherichia coli. A single colony of DH5α(pASK-IBA3+-ena2A) or DH5α(pASK-IBA3+-f-ena2A) was grown overnight in LB supplemented with 100 µg/ml ampicillin. The overnight preculture was diluted 50× fold in fresh LB supplemented with 100 µg/ml ampicillin and grown to exponential phase at 37 °C, shaking at 180 rpm. Recombinant expression was induced by adding 200 µg/l final anhydrotetracycline at an OD600nm of 0.8 and incubated at 20 °C, 180 rpm for 3 h. Cells were pelleted at 4000 rcf and stored at −20 °C until further use.

To isolate S-ENA fibers from the cytoplasm of E.coli, stored pellets were resuspended in 50 mM Tris-HCl pH 7.5, 250 mM NaCl, 10 mM ethylenediaminetetraacetic acid (EDTA), 0.5 mg/ml Henn egg white lysozyme (HEWL) (Merck) and continuously stirred at 37 °C for 1 h. Next, sodium dodecyl sulfate (SDS) was added to the lysate to a final concentration of 2% (w/v), and the suspension was heated to 95 °C for 1 h. The cleared lysate was centrifuged for 1 h at 35,000 rcf, and the pellet was resuspended in Mili-Q. This process was repeated threefold to remove residual SDS and obtain a pure suspension of S-ENA fibers. To isolate F-ENA-2 fibers from the cytoplasm of E.coli, stored pellets were resuspended in 50 mM Tris-HCl pH 7.5, 250 mM NaCl, 10 mM EDTA, 0.5 mg/ml HEWL, 1% (w/v) n-dodecyl-β-D-maltopyranoside (DDM), and continuously stirred at 37 °C for 1 h. The lysate was centrifuged for 1 h at 35,000 rcf, and the pellet was resuspended in Mili-Q. This process was repeated threefold to remove residual DDM and obtain a pure suspension of F-ENA fibers.

Recombinant production and purification of ExsF and F-BclA-CTD

Recombinant ExsF and F-BclA-CTD were produced by heterologous expression in the cytoplasm of Escherichia coli. A single colony of DH5α(pASK-IBA3+-exsF) or DH5α(pASK-IBA3+-f-bclA-ctd) was grown overnight in LB supplemented with 100 µg/ml ampicillin. The overnight preculture was diluted 50× fold in fresh LB supplemented with 100 µg ml1 ampicillin and grown to exponential phase at 37 °C, shaking at 180 rpm. Recombinant expression was induced by adding 200 µg l1 final anhydrotetracycline at an OD600nm of 0.8 and incubating at 20 °C, 180 rpm for 3 h. Cells were pelleted at 4000 rcf and stored at −20 °C until further use.

Stored cell pellets were thawed and resuspended in 100 mM Tris pH 7.5, 500 mM NaCl, 10 mM EDTA, 0.5 mg/ml HEWL, Complete Protease Inhibitor Cocktail (Merck), incubated for 30 min at 37 °C, and passed three times through a Microfluidics LM10 Microfluidizer operated at 20 kpsi. The lysate was centrifuged for 45 min at 35,000 rcf to pellet remaining cell debris. The cleared supernatant was supplemented with 20 mM imidazole, pH 8.0, final concentration, and loaded onto a 5 mL Histrap High Performance pre-packed column (Cytiva) that was equilibrated with 5 column volumes (CV) of 100 mM Tris pH 7.5, 500 mM NaCl, and 20 mM imidazole (Buffer A). Following sample loading, the column was washed with 10 CV of Buffer A, and gradient (flow rate 2.5 ml/min; 50% A/B gradient over 30 min) eluted with 100 mM Tris pH 7.5, 150 mM NaCl, and 500 mM imidazole (Buffer B) whilst fractionating the eluate. SDS-PAGE analysis was used to monitor sample purity. Pure F-BclA-CTD fractions were pooled and dialyzed against 25 mM Tris pH 7.45, 500 mM NaCl using Spectra/por 3 Dialysis Membrane, MWCO 3500 Spectrum, and concentrated using a 10 kDa MWCO Amicon Ultra Centrifugal Filter, to a final concentration of 18.4 mg/ml. The protein concentration was determined based on absorption at 280 nm using a NanoDrop One UV-Vis spectrophotometer, an extinction coefficient of 5960 M−1 cm−1, and a molecular weight of 16583.22 Dalton.

Crystallization, X-ray data collection, processing, and structure determination

Protein crystallization screens were set up using the sitting drop vapor diffusion method and a Mosquito nanoliter-dispensing crystallization robot (SPTlabtech). For F-BclA-CTD, plate-like crystals appeared within one week in condition C2 of the JCSG-plus HT-96 crystallization screen (Molecular Dimensions) containing 1 M LiCl, 0.1 M citrate, pH 4.0, 20% (w/v) PEG 6000. ExsF/F-Anchor14–35 crystallized in 5 % v/v 7T-mate pH 7, 0.1 M sodium cacodylate pH 5.3, 15 % v/v PEG smear broad and 10 % v/v ethylene glycol (BCS screen). Crystallization was further optimized by manual setup of replicated crystallization conditions in hanging drop geometry using a Greiner Bio-One ComboPlate 24-well Protein Crystallization Plate, Pre-greased, and crystals appeared within one week. The crystallization buffer was supplemented with 10% glycerol, and crystals were mounted in nylon loops and flash-cooled in liquid nitrogen. X-ray diffraction data were collected at 100 K using the Beamline Proxima 2 (wavelength = 0. 9801 Å) at the Soleil synchrotron (Gif-sur-Yvette, France). For F-BclA-CTD, diffractograms were processed using Autoproc34 and Staraniso35 at 2.29 Å in P321 with unit-cell dimensions a = b = 112.88 Å, c = 111.71 Å, a = b = 90.0°, and g = 120.0°. ExsF/F-Anchor14–35 diffracted to 1.92 Å and was processed using AutoProc in P 21. Both crystal structures were determined by molecular replacement using Phaser from the Phenix suite36 and using the AlphaFold237 predictions of the F-BclA-CTD or ExsF as search models. The structures were refined through iterative cycles of manual model building with COOT38 and reciprocal space refinement with phenix.refine to R values of Rwork/Rfree of 0.18/0.24 for F-BclA-CTD and 0.18/0.21 for ExsF/F-Anchor14–35, respectively. The crystallographic statistics are shown in Supplementary Table 1. Structural Figures were generated with ChimeraX39,40. Atomic coordinates and structure factors have been deposited in the41 PDB under the accession codes 9H38 (F-BclA-CTD) and 9H3D (ExsF/F-Anchor14–35).

Cryo-electron transmission microscopy

Quantifoil® holey Cu 400 mesh grids with 2-µm holes and 1-µm spacing were glow discharged in vacuum using a plasma current of 5 mA for 1 min (ELMO; Agar Scientific). For cryo-plunging, 3 µl of a fiber suspension was applied on freshly glow-discharged grids at 100% humidity and room temperature in a Gatan CP3 cryo-plunger. After 10 s of absorption, the grid was machine-blotted with Whatman grade 2 filter paper for 3.5 s from both sides and plunged frozen into liquid ethane at −176 °C. Grids were stored in liquid nitrogen until further use. In total, three datasets were collected: one for the ex vivo ENA fibers purified from Btk (dataset 1: 27,272 movies), one for recombinant Ena2A (dataset 2: 4348), and one for recombinant F-ENA-2 fibers (dataset 3: 8715 movies). High-resolution cryoEM movies were recorded using SerialEM 3.0.850 on a JEOL CRYO ARM 300 microscope equipped with an omega energy filter (operated at a slit width of 12 eV). The movies were captured with a K3 direct electron detector run in counting mode at a magnification of 60 K with a calibrated pixel size of 0.70 Å/pixel, and a total exposure of 60 e/Å2 over 60 frames.

Movies were imported into cryoSPARC v4.6.242 for further processing. Movies were motion-corrected using Patch Motion Correction, and defocus values were determined using Patch CTF. Exposures were curated, and segments were picked using the filament tracer and extracted at 4× binning. After several rounds of 2D classification, initial estimates of the helical rise and twist values were obtained based on the 2D class averages (F-ENA; F-ENA-2) or were taken from the S-ENA structure (7A02), and were used in subsequent helical refinement jobs. Next, particles were reextracted at native unbinned resolution (Ena2A: 0.695 Å/pixel) or twofold binned (F-ENA/F-ENA-2: 1.39 Å/pixel) and used as input for a second round of helical refinement, followed by local and global CTF refinement, and helical refinement. In the end, 196,540, 388,452, and 286,873 fibril segments were used for F-ENA, Ena2A, and F-ENA-2, respectively. The resulting high-resolution volumes and particle stacks were used for reference-based motion correction after particle duplicate removal, followed by a final helical refinement. For Ena2A, the volume was initially refined with a helical twist of 31.03°. The map handedness was later determined to be left-handed, and the volume was flipped along the z-axis, yielding the final reported twist of −31.03°. Initial atomic models were built using either ModelAngelo43 without providing an input sequence for F-ENA (build_no_seq), or by providing input sequences for recombinant Ena2A. For F-ENA-2, an amber-relaxed AF2 model was generated of an F-ENA-2 hexamer using ColabFold42. Modelangelo generated models of F-ENA and Ena2A were real-space refined in Coot.9.8.9538 and phenix.refine (default settings with NCS constraints. The AF2 F-ENA-2 model was fitted into the reconstructed helical volume, and real-space refined in phenix.refine with tight restraints (target bonds rmsd: 0.05; Target angles rmsd: 0.5). Figures were created using ChimeraX 1.8.9239,40. Map and model statistics are found in Supplementary Table 1.

Cryo-EM and modeling of F-type-3 ENA fibers from the class Clostridia

Initially, spore-like architecture and endospore appendages were observed in an anaerobic, Gram-negative culture of Syntrophus aciditrophicus (DSM 26646), purchased from the DSMZ. Since recent advancements in cryo-EM and deep learning modeling tools now routinely allow for direct protein identification from cryo-EM maps at resolutions of 4 Å or better43, we proceeded directly to high-resolution cryo-EM imaging. This putative endospore culture (~4.5 μl) was applied to glow-discharged, lacey carbon grids and plunge-frozen using an EM GP Plunge Freezer (Leica). Cryo-EM data were collected on a 300 keV Titan Krios equipped with a K3 camera (University of Virginia) at a pixel size of 1.08 Å/pixel and a total dose of approximately 50 e/Ų. Patch motion correction and CTF estimation were performed using cryoSPARC44. Particles were then automatically selected using the “Filament Tracer” tool. These particles were subjected to multiple rounds of 2D classification to remove low-quality particles. Following this, 70,487 particles remained in the F-ENA-3 dataset, with a shift of 65 pixels between adjacent boxes. Possible helical symmetries were calculated from an averaged power spectrum generated from aligned, raw particles. Subsequently, 3D reconstruction was performed using “Helical Refinement,” testing all potential helical symmetries. The resolution of the final reconstruction was estimated using map:map FSC, model:map FSC, and d99. The final volume was sharpened with EMready22, and the corresponding statistics are listed in Supplementary Table 1.

The final helical reconstruction of the F-ENA-3 fiber achieved a resolution of 4.1 Å, as estimated by map:map FSC. To identify the protein sequence, ModelAngelo43 was used to build the protein backbone and predict amino acid sequences based on side-chain densities. A BLASTP search using this sequence against all proteins in GenBank yielded a top hit of MDF2656102, a protein encoded by an uncultivated member of the family Anaerovoracaceae within the class Clostridia, with 62% sequence identity. The second-best match had only 48% identity. To validate the appendage sequence, we designed several primer pairs targeting regions near F-ENA-3 and successfully amplified the gene from the spore genome, confirming that MDF2656102 is indeed the correct sequence. For the final model building, an AlphaFold prediction of MDF2656102 was docked into the cryo-EM map. The single subunit was manually refined in Coot38, the helical symmetry was applied to the model, and the resulting helical model was real-space refined in PHENIX36. Refinement statistics for F-ENA-3 are presented in Supplementary Table 1.

Generation of deletion strains

Appendage-depleted mutants were constructed as described in Pradhan et al.13. using markerless gene replacement45 using Bt407 as a background strain. Gene replacement constructs, containing the start and stop codons of the target genes flanked by upstream and downstream homologous sequences were amplified with chromosomal DNA as the PCR template. The homologous sequences were ligated into pMAD vectors and cloned into One Shot TOP10 chemically competent E. coli (Invitrogen, Thermo Fisher Scientific, USA) for plasmid amplification. Verification of the construct was performed through sequencing (LightRun, Eurofins, Luxembourg). The construct was passed through One Shot INV110 E. coli (Thermo Fisher Scientific) to demethylate the vector before transformation into electrocompetent Bt407 cells. Colonies containing the chromosomally integrated pMAD construct were identified by PCR and subsequently transformed with a demethylated pBKJ223 plasmid. The pBKJ223 plasmid encodes an enzyme that introduces a double-stranded DNA break in the integrated pMAD plasmid, leading to its excision from the chromosome and leaving behind a markerless gene deletion. Verification of the deletion was conducted using PCR with primers targeting regions upstream and downstream of the gene of interest.

Determination of the spore vs cell ratio

A confluent macrocolony of Btk grown on agar was resuspended in miliQ water to a final OD600 of 34. 5 µl of the suspension was deposited onto a microscope slide, sealed with a coverslip, and imaged using an inverted microscope Leica DMi8 equipped with a DFC7000 GT camera using a 100× oil objective in phase-contrast mode. A total of 50+ image tiles were collected using LasX and stitched together into a single tiff file covering 374 × 238 cm2. The total number of spores and cells was determined using Fiji46, yielding 2878 spores and 179 cells, leading to a spore fraction of 94%.

Sedimentation assays

For sedimentation analysis, spores of Bt407 and isogenic appendage-depleted mutants were prepared by streaking freshly cultured cells onto LB agar plates and incubating at 30 °C for two weeks or until sporulation reached ~98%, determined by phase-contrast (PH) microscopy. Mature spores were scraped off the agar, washed three times in distilled water, and resuspended in sterile distilled water. Purity was verified via PH microscopy.

Spore suspensions (OD600 = 10) were mixed by vortexing for 15 s, placed in borosilicate tubes (14 mm × 130 mm, DWK Life Sciences), and left to settle. For each strain, a total of three spore batches were tested. A SONY ILCE-5100 camera with a SELP1650 lens captured images at 0 h, 3 h, 5 h, and 10 h, then every 24 h until full sedimentation. Images were analyzed in ImageJ 1.54 f. Pixel intensity profiles of sedimenting spores were normalized to the highest intensity and converted to 200 datapoints using a randomized algorithm in RStudio (Version 2023.12.1). The profiles were plotted over time, with datapoints crossing 50% intensity extracted and plotted as a function of time.

Phylogenetic analysis

To probe for the presence of F-ENA across the Firmicutes phylum, all publicly available Genbank genomes (9342 genomes) belonging to the Bacillus/Clostridium group (taxid 1239; assembly-level complete) were downloaded from the NCBI database. Homo- and orthologs of F-ENA were searched in all assemblies with hmmsearch47 (881 genomes) using a hidden Markov model that was generated with hmmbuild using a manually curated multiple sequence alignment of F-ENA sequences obtained from a blastp48 search using WP_001121647.1 as a query. The inclusion threshold for hmmsearch was set to an E-value of 1e-7. The phylogenetic tree of the Firmicutes phylum was generated using phyloTv2 (https://phylot.biobyte.de/) using the NCBI taxonomy IDs of the corresponding assemblies as node identifiers, and imported into iTOL49 for visualization using the online server (https://itol.embl.de/).

For the more detailed search of F-ENA homo- and orthologs across the B. cereus group sensu lato, the BtyperDB v1 database (5976 genomes) was downloaded from https://www.btyper.app/, and hmmsearch was run against all assemblies with an E-value threshold of 1e-7 (5647 genomes). The phylogenetic tree described in Ramnath et al.21 and the corresponding metadata table were kindly provided by Laura M. Caroll. Tree figures were generated using https://microreact.org/50.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.