SymProFold: Structural prediction of symmetrical biological assemblies

Buhlheller, Christoph; Sagmeister, Theo; Grininger, Christoph; Gubensäk, Nina; Sleytr, Uwe B.; Usón, Isabel; Pavkov-Keller, Tea

doi:10.1038/s41467-024-52138-3

Article
Open access
Published: 18 September 2024

Nature Communications volume 15, Article number: 8152 (2024)

Abstract

Symmetry in nature often emerges from self-assembly processes and serves a wide range of functions. Cell surface layers (S-layers) form symmetrical lattices on many bacterial and archaeal cells, playing essential roles such as facilitating cell adhesion, evading the immune system, and protecting against environmental stress. However, the experimental structural characterization of these S-layers is challenging due to their self-assembly properties and high sequence variability. In this study, we introduce the SymProFold pipeline, which utilizes the high accuracy of AlphaFold-Multimer predictions to derive symmetrical assemblies from protein sequences, specifically focusing on two-dimensional S-layer arrays and spherical viral capsids. The pipeline tests all known symmetry operations observed in these systems (p1, p2, p3, p4, and p6) and identifies the most likely symmetry for the assembly. The predicted models were validated using available experimental data at the cellular level, and additional crystal structures were obtained to confirm the symmetry and interfaces of several SymProFold assemblies. Overall, the SymProFold pipeline enables the determination of symmetric protein assemblies linked to critical functions, thereby opening possibilities for exploring functionalities and designing targeted applications in diverse fields such as nanotechnology, biotechnology, medicine, and materials and environmental sciences.

Introduction

Symmetry patterns are a fundamental and widespread feature, spanning from the microscopic to the macroscopic scale. At the molecular level, the driving force of this symmetry is self-assembly, where individual components autonomously organize into larger, ordered structures driven by their interactions. Many proteins have symmetrical structures, which are essential for their function¹. Examples of this symmetrical organization include the hexagonal arrangement of transmembrane chemotaxis receptors, the icosahedral capsid proteins of viruses, and, most prominently, the regular repetitive arrays of S-layer proteins^2,3,4,5,6.

Surface layers (S-layers) are porous 2-dimensional crystalline protein arrays covering the cell envelopes of many eubacterial and archaeal strains^5,6. The S-layer is built from one or more (glyco)-protein subunits, S-layer proteins (SLPs), that self-assemble into a highly flexible and dynamic lattice through an entropy-driven process, allowing for structural adaptation in response to changing environmental conditions^{4,6,7,8,9,10,11}. So far, S-layers have been shown to serve various functions, including cell stability, adhesion, molecular sieving, and osmotic stress adaptation^6,12,13,14. Still, each S-layer’s exact functionality and assembly properties remain unclear. Additionally, S-layers represent a distinct structural basis for generating complex supramolecular assemblies with considerable application potential in (nano)biotechnology, biomimetics, biomedicine, and synthetic biology^6,15. The ability to self-assemble also plays a crucial role for viral capsid proteins, which form envelopes encapsulating the viral nucleic acid¹⁶. Successful assembly and disassembly of the proteinaceous viral coat is crucial in virus replication and infection¹⁷.

Classical structure determination methods are often insufficient for obtaining atomic resolution insights into fully assembled S-layers due to their supramolecular assembly properties. We recently demonstrated that a fully assembled model from individual domain fragments can be obtained for the SlpA S-layer protein of Lactobacillus acidophilus¹⁸. However, determining the structure of fully assembled S-layers remains challenging, and so far, atomic-level assemblies have been solved for only a few species (Supplementary Table 1). The rapid development and growing accuracy of structure prediction programs such as RoseTTAFold¹⁹, AlphaFold2²⁰, and AlphaFold-Multimer²¹ are bridging the gap between missing experimental structures and protein interactions. Here, we present Symmetry Protein Fold (SymProFold), a pipeline for predicting fully assembled proteins with a defined symmetry and unit cell parameters. In contrast to other methods, it uses only sequence information as input and requires no prior knowledge of symmetry or oligomerization state.

AlphaFold has been extensively used to predict individual virus proteins^22,23, and a method that combines AlphaFold monomer predictions with symmetric all-atom docking simulations to predict cubic complexes is available²⁴. Several methods are currently available for the prediction of large oligomeric complexes, but they require additional input besides the sequence to ensure a reliable output: depending on the method, either stoichiometry information^25,26,27 or symmetry group information²⁸ is needed. Schweke et al. present a pipeline for the calculation of cyclic homo-oligomers using sequence information and the program AnAnaS^29,30,31 for the identification of symmetries, concentrating on cyclic symmetries and dihedral or cubic groups.

In this work, we present SymProFold, a method specifically designed for the prediction of supramolecular 2D assemblies, as found in S-layers and virus capsids. We predict the assemblies of 19 S-layers from bacteria and archaea as well as a viral capsid, none of which were part of the AlphaFold-Multimer training set and whose structures had not previously been solved (Supplementary Table 2). The generated SymProFold models were validated using experimental data both at the cellular and atomic levels, when available. The predictions provide comprehensive insight into the scarcely understood assembly of S-layers at an atomic level and reveal exciting features of S-layers. Considering the central function of S-layers in microorganisms, these features will be crucial for clarifying S-layer functionality and enabling the design of S-layers for use in a broad range of applications.

Results

Workflow of SymProFold

The underlying idea behind SymProFold is to combine oligomer predictions with the general symmetry patterns found in nature, as observed for S-layers, for generating a model of fully assembled layers. An overview of the general workflow is shown in Fig. 1 and is described in detail in the Methods section.

SymProFold uses only the sequence information of the protein of interest; no prior knowledge of the symmetry or size of the unit cell of the assembly is needed. The protein analysis process begins with the definition of protein domains, which are defined either manually or via the tool ‘Domain_Separator’ (Supplementary Method 1). Subsequently, full-length and truncated variants of the protein, called subchains, are created according to specified domains (Supplementary Method 2, Supplementary Table 3). These subchains are then used for oligomer predictions, forming different symmetric complexes (dimers to hexamers), which are evaluated for rotational symmetry. Subchains are needed for three main reasons: large systems (>3000–4000 amino acids) exceed the computing power of the hardware used; one symmetry center is strongly favored and hinders reliable prediction of the second symmetry center; or an assembly is impossible due to twisted arrangements of full-length models (Supplementary Table 4). The symmetry complexes are then filtered based on (i) rotational symmetry, (ii) a weighted model confidence score (0.8*ipTM+0.2*pTM, further referred to as ipTM+pTM score) of at least 0.2, (iii) the number of clashes, and (iv) no unusually high fraction of intermolecular β-strands (Supplementary Method 3). An example output of the filtering and clustering of symmetry complexes from the A. salmonicida S-layer, which exhibits p4 symmetry, is shown in Fig. 2. Prediction scores suggest the presence of two rotational symmetry axes (A, B), each exhibiting a 4-fold symmetry. A complete overview of the top-scored symmetry complexes and clustering of our proposed models is given in Supplementary Fig. 2, and all individual plots for each prediction are shown in Supplementary Figs. 3–20.
Compared to full-length predictions, subchain predictions can result in higher ipTM+pTM scores because they omit domains connected via flexible linkers, whose positional uncertainty lowers the scores.

**Fig. 2: Clustering of predicted symmetry complexes for a p4 S-layer from *A. salmonicida*.**

Assessment of prediction quality

Finally, the symmetry complexes are clustered by their binding interfaces, and potential repetitive assemblies are tested by superposition and scored with the aim of identifying the most probable representation of a fully assembled S-layer. For SLPs listed in Table 1, high-scoring SymProFold S-layer models were obtained (Fig. 3), corresponding well with the experimentally determined S-layer parameters. The structures shown are representative of the calculated ensemble of SymProFold models exhibiting high-quality scores (see “Methods – Parameter extraction”). The primitive unit cell of each S-layer is shown in Supplementary Fig. 29. In general, the symmetry complex which forms the rotational symmetry axis A had a slightly higher median ipTM+pTM score of 0.80 than axis B with a median ipTM+pTM of 0.74. Both axes combined result in a median ipTM+pTM score of 0.77, indicating good prediction results (Supplementary Method 6, Supplementary Fig. 22, Supplementary Table 5). The number of effective sequences (Neff)³² in the MSA for the individual subchains was calculated using NEFFy³³; Neff did not correlate with a successful outcome. Subchains with an integrated Neff below 5 can still result in an assembled layer (Supplementary Table 7). For p1 S-layers of EA1 from Bacillus anthracis and the main SLP from Bacillus licheniformis, an augmented set of single-domain subchains and heterodimer predictions with the full-length protein were used as described in the Methods section and Supplementary Method 7 (Supplementary Figs. 23, 24).

Table 1 Comparison of lattice constants determined by prediction and experimentally

**Fig. 3: Top view of the calculated S-layers.**

Validation of models with experimental data

We selected a broad range of S-layer proteins from Gram-positive and Gram-negative bacteria and from archaea, and validated the calculated assembly models against published experimental data (Table 1, Supplementary Tables 1–2).

The unit cell parameters extracted by SymProFold of the assembled models correlate with literature values (Table 1). If available, we also compared the experimental microscopy data with our models (Fig. 4). Overall, the domain arrangement and assembly of our predictions agree well with the published data, showing average differences of 5% regarding cell parameters (Table 1). The SymProFold predicted S-layer assemblies reveal detailed structural insights where experimental studies provide limited and ambiguous information. For S-layers from Vibrio aerogenes, Paenibacillus naphtalenovorans, Pyrococcus abyssi, Methanococcus voltae, Thermococcus camini, Thermococcus thioreducens, and Phocaeicola vulgatus, no experimental data on the unit cell parameters, structure, or symmetry are reported. Still, SymProFold predictions of the mentioned S-layers exhibit high output scores, indicating the possible symmetry and structural architecture of these S-layers (Fig. 3).

**Fig. 4: Comparison of experimental data with predicted assemblies.**

High-resolution experimental data confirming the proposed assemblies at the atomic level are very scarce. Based on our predicted models of Viridibacillus arvi and Methanococcus voltae, we designed constructs (see Methods section) containing only the domain responsible for the formation of either the 4-fold or 2-fold axis. We successfully obtained crystal structures for both (PDB 9FS9; PDB 9FSA) and compared them to the predicted models (Fig. 4G, H). Aligning the crystal structures with the SymProFold models revealed high similarity, with RMSD values of 0.65 Å for V. arvi and 1.38 Å for M. voltae, further confirming our predicted assemblies. Recently, high-resolution structures for EA1 of Bacillus anthracis³⁴ and the SLP from Nitrosopumilus maritimus³⁵ (not included in our initial dataset) were reported. Both structures agree well with our SymProFold models (Supplementary Figs. 25–27).

Exploring distinct features of predicted S-layer assemblies

To date, the structures of only a few SLPs have been reported (Supplementary Table 1). The assembly data acquired for S-layers whose structure was not yet known provides a foundation for the functional analysis of distinct S-layers, as well as for delineating distinct features within this protein family (Supplementary Fig. 28). As observed for D. mucosus, pore sizes can be enormous (Supplementary Fig. 28I), indicating that in this individual case, the flexibility of the layer is more critical than the barrier function. Most of the other predicted S-layer assemblies form a tight network, acting as an efficient filter selectively allowing molecules to pass in and out. Moreover, depending on the environmental conditions^36,37,38 either flexibility or mechanical stability of the S-layer is of greater importance. The length of the structural elements linking the surface exposed region to the membrane provides additional valuable information about the thickness of the periplasmic-like space in bacteria and archaea (Supplementary Fig. 28I).

The prediction of the C. glutamicum S-layer reveals a single-domain α-helical protein with p6 symmetry (Supplementary Fig. 28E). The C-terminus ends in a predicted transmembrane helix resembling the anchoring mechanism of archaeal S-layers (Supplementary Fig. 28F,G). The SymProFold predictions of S-layers from P. abyssi, M. voltae, T. camini, T. thioreducens, and M. vannielii, even though different in sequence, size, and domain structure, show a domain arrangement analogous to that of the crystal structure of M. acetivorans SLP (Supplementary Fig. 27) and reveal a dimeric anchor, which has not been described previously (Supplementary Fig. 28A–D). For the assembly prediction of S-layers from B. brevis and V. arvi (Slp1), an additional region above the core, responsible for the self-assembly, is present. The domains and residues in this layer might be important for interaction with the environment, including the host. In addition, the fully assembled S-layer gives valuable information about the orientation of the layer and surface properties such as the electrostatic potential, revealing potential functions (Supplementary Fig. 30).

Prediction of viral capsids

In the case of viral capsids, a curved surface favors a 5-fold rotational symmetry (Supplementary Fig. 21). SymProFold predicted an icosahedral viral capsid from Odonata-associated circular virus 21 (T = 1) belonging to the Smacoviridae³⁹. Analogous to the regular SymProFold workflow, two symmetry complexes were calculated and superimposed into one fully assembled tile (Fig. 5A–C). The curvature is present in the calculated tile (Fig. 5D, E); copies of the tile, when manually superimposed on each other, generate the full viral capsid (Fig. 5F). This model would have a diameter of ca. 23 nm. There are no experimental data for the diameter of Smacoviridae, but close members of the same phylum Cressdnaviricota show a diameter of 17–20 nm (Nanoviridae) and 20–22 nm (Genomoviridae)⁴⁰, which correlates with our proposed model. Compared to the median ipTM+pTM scores (Supplementary Fig. 22) of our benchmark cases (axis A: 0.80, axis B: 0.74), the virus example shows slightly lower ipTM+pTM values (axis A: 0.72, axis B: 0.55), which are still within the typical range of the predicted models.

**Fig. 5: Prediction of a viral capsid of *Odonata-associated circular virus 21*.**

Prediction of symmetrical oligomers with only one symmetry axis

SymProFold was further tested on proteins that do not form symmetrical 2D arrays yet are known, through literature or existing crystal structure data, to possess a rotational symmetry axis. The circadian clock protein KaiC (UniProt: Q8GGL1) forms symmetrical ring-shaped hexamers⁴¹. SymProFold can reliably predict the 6-fold axis with high model confidence scores. Another 2-fold axis was found with extremely low scores just above the ipTM+pTM cutoff value of 0.2 (Supplementary Fig. 31). At the superposition step, it was not possible to generate an assembly in a plane without obtaining severe clashes. Therefore, it was automatically filtered out. The crystal structure of YabJ (PDB 5Y6U⁴²) forms a homotrimer (Supplementary Fig. 32A). SymProFold identified a 3-fold symmetry complex with a high ipTM+pTM of 0.96, which is higher than the median score of the benchmark cases, probably due to the structure’s inclusion in the AlphaFold-Multimer training set (Supplementary Fig. 33). A 4-fold symmetry complex was predicted with much lower scores, showing the same main binding interfaces as the 3-fold prediction, and therefore was clustered to the same axis. No additional symmetry axis was predicted; therefore, a superposition of two different axes is not possible. As a third example, we tested the sequence of N9 neuraminidase from the Influenza A virus. The crystal structure (PDB 6MCX⁴³) shows a symmetrical 4-fold axis (Supplementary Fig. 32B). For this protein, SymProFold predicts a single cluster with a 4-fold rotational symmetry axis (Supplementary Fig. 34). Since there is only one axis of rotational symmetry, spanning a 2D layer is not possible. In all three test cases, SymProFold stopped, and no false-positive 2D layer was generated for proteins with just one rotational axis.

Discussion

SymProFold can predict the structural organization of higher-order symmetrical assemblies, such as S-layers, as they occur in their native, assembled state at the cell surface. As only sequence information is required as an input, this approach enables numerous avenues for investigating symmetrical formations, even those that pose significant experimental challenges, such as high toxicity, pathogenicity, extreme growth conditions, and substantial costs for the experimental investigation. Furthermore, the self-assembly property and significant sequence variation of S-layers present a challenge for structural characterization using techniques such as X-ray crystallography and electron microscopy.

Because SLPs vary considerably in size, with some exceeding 1000 amino acids⁶, predictions of tetramers or hexamers can easily exceed the computing power needed to obtain highly confident models with AlphaFold. Nevertheless, the rapid improvements of AlphaFold in predicting large protein complexes will increase the size limit of SymProFold calculations. Calculated models can potentially be improved by manual pre-processing of input sequences according to prior knowledge. Regions like signal sequences or domains not involved in the assembly, such as cell wall anchors, can be removed before the calculation, thereby reducing the size of the input sequence. The SymProFold method has several restrictions that lead to early termination of the pipeline when sequences of non-assembling proteins are used. The SymProFold pipeline terminates if at least one of the following cases is true: the axis tilt is above 45° (SymProFold supports 2-, 3-, 4- or 6-fold rotational symmetry), the gap between the predicted symmetry complexes is too large (and they cannot be connected), the deviation between the calculated lattice constant (unit cell) and the model (by superposition) is too large, or a symmetry axis complex does not have a rigid folding unit.

In case SymProFold fails to detect more than one strong intermolecular interaction, the construction of a fully assembled layer is not possible and results in a partially assembled model. This limitation causes a loss of information regarding weak but critical interactions, which may be required for a complete assembly. S-layers and viral capsids, which consist of multiple proteins, also represent a challenge for SymProFold predictions. In such cases, prior knowledge may be needed as the identification of the interactions important for the initiation of the assembly could be difficult via the automatic pipeline. Nevertheless, manual intervention and/or an extension of the software could overcome this limitation and enable the prediction of assemblies composed of more than one protein. Further adaptations in SymProFold are also needed for viral capsid automated predictions to address potential challenges like pseudosymmetry, higher triangulation number, and variations in assembly curvature.

Many biological functions of S-layers depend on the completeness of the cell coverage as well as the structural and physicochemical repetitive uniformity, down to the subnanometer scale. Obtained structural data of assembled S-layers reveal properties of the exposed surface regions (Supplementary Fig. 29), internal pores, and anchoring domains essential for cell attachment and highlight interaction interfaces within the layer. Understanding these aspects is crucial for elucidating the specific function of a particular S-layer, depending on the microorganism. This is especially important in pathogenic bacteria since S-layers play a role in surface adhesion and in interactions with the host’s immune system. Furthermore, numerous studies on the in vivo and in vitro morphogenesis of S-layers demonstrated that lattice growth on growing cells is a highly dynamic process^{6,7,11,12,44,45,46,47}. Approximately 500 subunits per second must be synthesized at high growth rates, translocated to the cell surface, and incorporated in a defined orientation to the existing lattice while maintaining an equilibrium of the lowest free energy^{6,46,47,48,49}. The adaptable lattice of S-layers on growing and dividing bacterial and archaeal cells represents an advanced evolutionary stage in morphogenesis⁴⁶. At these specific sites, bonds must swiftly open and re-form. These dynamics are also present in the assembly of the capsid during the viral life cycle, including disassembly upon infection and reassembly during viral packaging. To gain insights into these dynamic processes, the SymProFold predicted assemblies can serve as a starting point to identify the required interactions within the lattice, perform molecular dynamics studies, and investigate binding events with ligands or receptors. S-layers often represent the outermost layer; therefore, environmental conditions could impact their structure and functionality.
Using the predicted assemblies in combination with computational methods, the influence of, e.g., pH changes on the S-layer could be investigated.

Structural insights into the S-layer assembly offer a vast opportunity for developing technologies that can mimic the distinct properties of these versatile and adaptable structures. Rational engineering of artificial structures with different properties could include the design of materials with advanced self-assembly properties or the creation of improved drug delivery systems and biosensors⁵⁰.

Since the S-layers of pathogens serve as favorable regions for drug targeting^36,51, understanding of anchoring mechanisms and/or interactions within the self-assembled S-layer provides the ideal basis for rational drug design and enables the development of strategies for weakening S-layer protective function. Furthermore, structural information now allows for analyzing the antifouling properties of S-layer lattices⁶ and offers a foundation for mimicking these structures in polymer technologies⁵². The precise biophysical characterization of the pores allows for custom alterations of S-layer permeability, which was previously possible only through chemical modifications¹². S-layers have also shown great potential as a platform for drug delivery because of their biocompatibility, stability, and regular pore structure. Lipidic nanoformulations such as liposomes, solid lipid nanoparticles (SLNs), or emulsomes as drug delivery systems show better resistance to oxidative stress and membrane damage when covered with a crystalline S-layer^53,54. Uptake studies with emulsomes coated with S-layer proteins did not show significant cytotoxicity toward human liver carcinoma cells (HepG2). The capacity to recrystallize S-layer proteins in lipidic formulations allows extremely precise targeted delivery of specific antibodies in high concentrations^55,56 or drug-loaded particles, especially poorly water-soluble targets such as antimicrobial peptides or easily degradable biologicals such as enzymes used for enzyme replacement therapy^{6,53,55,55,57}. Not yet resolved functionalities, such as cellular targeting and generation of fusion proteins with specific properties⁵⁸, can be introduced. The S-layers of bacteria and archaea in the human microbiome are perfect candidates for these biotechnological applications.
Altogether, this opens a promising chapter in rational engineering and adaptation of many S-layers for applications in medicine and diagnostics, like treating fungal and viral infections, dermal conditions, cancer, immune deficiency, or rare genetic disorders.

Methods

Pre-filtering of possible prediction candidates

SymProFold uses sequence files in fasta format as input and first checks their predictability by performing homodimer predictions with AlphaFold-Multimer²¹. Homodimer predictions are scored according to the ipTM+pTM score, a weighted model confidence score for complex predictions, which ranges from 0 to 1 and is calculated using 80% of the ipTM score and 20% of the pTM score as described by Evans et al. 2022²¹. Only homodimer predictions with a minimal ipTM+pTM score of 0.3 are further processed, excluding proteins that are unlikely to dimerize. The sequences of 19 annotated SLPs, 3 non-SLPs, and the viral capsid protein from Odonata-associated circular virus 21 were used as input. Table 1 includes a list of proteins for which SymProFold S-layer predictions were performed. We used the annotated S-layer protein for each species, as reported in the UniProt database, for all of the presented S-layer predictions and calculations.
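The weighted score and the pre-filter described above can be sketched in a few lines of Python. This is a minimal illustration, not the pipeline's own code: the function names and the constant name are hypothetical, while the 0.8/0.2 weights and the 0.3 cutoff are taken from the text.

```python
# Hypothetical helper names; only the weights and the cutoff come from the text.
HOMODIMER_CUTOFF = 0.3  # minimal ipTM+pTM score for the homodimer pre-filter


def weighted_confidence(iptm: float, ptm: float) -> float:
    """Weighted model confidence: 80% ipTM plus 20% pTM, both in [0, 1]."""
    return 0.8 * iptm + 0.2 * ptm


def passes_prefilter(iptm: float, ptm: float,
                     cutoff: float = HOMODIMER_CUTOFF) -> bool:
    """Keep a homodimer prediction only if its weighted score reaches the cutoff."""
    return weighted_confidence(iptm, ptm) >= cutoff
```

Because ipTM carries 80% of the weight, a prediction with a confident fold (high pTM) but a poor interface (low ipTM) is still rejected.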

Identification of domains and subchains

As a next step, domains of the protein need to be defined in a fasta file with domains separated by line breaks. Domain boundaries can be set manually, or an automated domain identification can be used. Both start with predicting the full-length protein structure as a monomer using AlphaFold. For automated domain identification, we created a ‘Domain_Separator’ tool that can be included in the SymProFold pipeline to prepare the fasta file. Domain_Separator is a Python script that uses a coordinate file as input, identifies structural domains, and creates a fasta file with domain sequences separated by line breaks. Additionally, a ChimeraX file with separately colored domains can be generated. In an iterative process, small domain subsections are merged until they describe a complete domain. The initial domain subsections are chain ranges created through local crosslinking of secondary structures. Using relations between contact areas, surface areas, and subsection sequence lengths, neighboring domain subsections are then iteratively merged into larger domain subsections. This iterative process ends when no further merging occurs. In a postprocessing step, for each linker between domain sections, a cropping point is identified that minimizes the contacts between both domains. An example of Domain_Separator usage is given in Supplementary Method 1 and Supplementary Fig. 1. The generated fasta file can be used as direct input for SymProFold. Both methods lead to comparable results for identifying domains in our presented examples.

Once the domains are defined in the fasta file, a set of five different subchains of the protein is generated. These subchains represent truncations of the protein according to a scheme described in Supplementary Table 3, which includes the full-length sequence, a subchain without the N-terminus, a subchain without the C-terminus, the first third of the domains, and the last third of the domains. Minimum or maximum lengths were defined (Supplementary Table 3) to mitigate the effects of very large or small domains.
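As an illustration of the truncation scheme, the following Python sketch builds the five subchains from an ordered list of domain sequences. It rests on several assumptions that are not confirmed by the text: each sequence line of the fasta file is read as one domain, "without the N-terminus" and "without the C-terminus" are interpreted as dropping the first and last domain, a third is rounded down to whole domains, and the length limits of Supplementary Table 3 are not applied. All function names are hypothetical.

```python
from typing import Dict, List


def parse_domain_fasta(text: str) -> List[str]:
    """Read domains from a fasta string, assuming one domain per sequence line."""
    return [line.strip() for line in text.splitlines()
            if line.strip() and not line.startswith(">")]


def make_subchains(domains: List[str]) -> Dict[str, str]:
    """Build five subchains (sketch; length limits are not modelled)."""
    n = len(domains)
    if n < 2:
        raise ValueError("at least two domains are needed for truncations")
    third = max(1, n // 3)  # assumption: a third covers at least one domain
    return {
        "full_length": "".join(domains),
        "no_n_terminus": "".join(domains[1:]),   # assumption: drop first domain
        "no_c_terminus": "".join(domains[:-1]),  # assumption: drop last domain
        "first_third": "".join(domains[:third]),
        "last_third": "".join(domains[-third:]),
    }
```

For a six-domain protein, this yields the full chain, two five-domain truncations, and two two-domain fragments from either end.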

Prediction of oligomer sets and filtering

For each subchain, the algorithm starts different oligomer predictions (dimers, trimers, tetramers, and hexamers). Five models are calculated for each prediction, and the resulting symmetry complexes are evaluated and further processed.

All predictions are evaluated to check if there is a rotational symmetry axis that can align all the subchain components within the oligomer prediction to each other, satisfying the symmetry requirement. To this end, the algorithm checks whether the occurring angles between the monomers correspond to a 2-, 3-, 4- or 6-fold rotational symmetry and whether the associated rotational symmetry axis is uniform within a tolerance range (max. of 5 Å deviation for each monomer). Models with an ipTM+pTM score of ≥0.20 and existing interfaces between the monomers are treated as symmetry complexes. The order of the rotational symmetry axis (k-fold) of each complex is deduced from its symmetry angle (2-, 3-, 4-, or 6-fold rotational symmetry axis). For example, a trimer prediction of a given subchain of an SLP that forms a p6 S-layer could result in a 3-fold axis (∆φ = 120°) or 6-fold axis (∆φ = 60°). In the latter case, the 6-fold symmetry axis is described by only three predicted subunits instead of six. The symmetry complexes are further filtered by criteria to discard predictions of low quality. With respect to the complete protein chain, relaxed models with at least 3.0 clashes per 100 aa (and 60.0 per 100 aa for unrelaxed) are excluded. Additionally, in all possible sequence sections of length 200 aa, only 6.0 clashes per 100 aa for relaxed models (and 120.0 per 100 aa for unrelaxed) are allowed. The number of clashes is calculated using the corresponding method in ChimeraX⁵⁹. Models in which a part of the protein chain passes incorrectly through the neighboring molecule are excluded by filtering out models with unusually high fractions of intermolecular β-strands.
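The deduction of the symmetry order from the rotation angle, and the whole-chain clash filter, can be sketched as follows. This is an illustrative Python sketch with hypothetical function names. The angular tolerance of 5° is an assumption chosen for the example (the text specifies only a positional tolerance of 5 Å for the axis), and the additional 200-residue window criterion is not implemented.

```python
from typing import Optional

SUPPORTED_ORDERS = (2, 3, 4, 6)  # rotational symmetry orders named in the text


def symmetry_order(delta_phi_deg: float, tol_deg: float = 5.0) -> Optional[int]:
    """Return k if the angle between neighbouring monomers matches 360/k.

    tol_deg is a hypothetical angular tolerance, not a value from the text.
    """
    angle = abs(delta_phi_deg) % 360.0
    for k in SUPPORTED_ORDERS:
        if abs(angle - 360.0 / k) <= tol_deg:
            return k
    return None  # e.g. 72 degrees (5-fold) is not a supported layer symmetry


def passes_clash_filter(n_clashes: int, n_residues: int,
                        relaxed: bool = True) -> bool:
    """Whole-chain criterion: exclude models at or above the clash limit."""
    limit = 3.0 if relaxed else 60.0  # clashes per 100 residues, from the text
    return 100.0 * n_clashes / n_residues < limit
```

A trimer prediction whose subunits are related by 60° is thus assigned a 6-fold axis even though only three of the six positions are occupied, which matches the p6 example given above.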

Clustering via interfaces

The symmetry complexes are clustered according to the agreement between their binding interfaces, which are compared via interface matrices (distograms). Each cluster can be viewed as a candidate for a rotational symmetry axis of the assembled S-layer. Each symmetry complex is represented by a node in a network graph, and nodes are connected by edges. Every edge weight is proportional to a correlation coefficient between the interface matrices of the two symmetry complexes (nodes) connected by it. Greater agreement between both interface matrices leads to a larger correlation coefficient (see Supplementary Method 4 for details). This network graph is partitioned using the Louvain⁶⁰ method.
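To make the idea of interface-based clustering concrete, the sketch below compares interface matrices by their Pearson correlation and groups complexes whose matrices agree. It is a deliberately simplified stand-in: instead of Louvain partitioning of a weighted graph, it thresholds the correlation and takes connected components. The threshold of 0.5, the function names, and the requirement that all matrices share one shape are assumptions made for this example.

```python
from itertools import combinations
from math import sqrt
from typing import Dict, List, Sequence, Set

Matrix = List[List[float]]


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient; 0.0 if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy) if vx > 0 and vy > 0 else 0.0


def cluster_by_interface(matrices: Dict[str, Matrix],
                         min_corr: float = 0.5) -> List[Set[str]]:
    """Group complexes with correlated interface matrices (not Louvain)."""
    flat = {name: [v for row in m for v in row] for name, m in matrices.items()}
    neighbours: Dict[str, Set[str]] = {name: set() for name in flat}
    for a, b in combinations(flat, 2):
        if pearson(flat[a], flat[b]) >= min_corr:  # hypothetical edge threshold
            neighbours[a].add(b)
            neighbours[b].add(a)
    clusters: List[Set[str]] = []
    seen: Set[str] = set()
    for start in flat:
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:  # depth-first walk over one connected component
            node = stack.pop()
            if node not in component:
                component.add(node)
                stack.extend(neighbours[node] - component)
        seen |= component
        clusters.append(component)
    return clusters
```

Each returned set plays the role of one cluster, that is, one candidate rotational symmetry axis. A modularity-based method such as Louvain additionally uses the edge weights themselves rather than a hard cutoff.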

Typically, each cluster contains differently cropped subchains, resulting in the same symmetry axis. For each cluster, high ipTM+pTM scores and, where present, several different molecule counts leading to the same order of symmetry indicate the order (fold) of the respective rotational symmetry axis.

Superposition of symmetry complexes

Pairs of symmetry complexes from two different clusters (symmetry axes) are tested for their ability to form a repetitive 2D assembly by superposition of overlapping regions. The overlapping region must comprise at least one domain. The pairs of symmetry complexes to be tested are selected according to the highest scores within a cluster. If no clear result is found, more pairs of symmetry complexes with lower ipTM+pTM scores from different clusters can be tested. From a pair of symmetry complexes (A, B), the symmetry complex with the higher order of rotational symmetry (A) is aligned along the z direction and then complemented by the corresponding number of copies of the other complex (B) by overlapping superposition. The symmetry complexes are superposed using the matchmaker method of ChimeraX⁵⁹. Subsequently, the rotational symmetry axes of the symmetry complexes B are aligned along the z direction using the linkers between complexes A and B as pivot points. The symmetry complexes B are then complemented by copies of symmetry complex A by overlapping superposition, and the rotational symmetry axes are aligned likewise.
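Aligning a rotational symmetry axis along the z direction is a purely geometric step. In SymProFold this is handled through ChimeraX; the following stand-alone sketch only illustrates the underlying rotation using Rodrigues' formula. The function names are hypothetical and not part of SymProFold or ChimeraX.

```python
import math


def rotation_to_z(axis):
    """Return a 3x3 rotation matrix (nested lists) that maps `axis` onto +z.

    Uses Rodrigues' formula R = I + K + K^2 / (1 + c), where K is the
    cross-product matrix of v = axis x z and c = axis . z.
    """
    norm = math.sqrt(sum(a * a for a in axis))
    ax, ay, az = (a / norm for a in axis)
    c = az                      # cosine of the angle to the z axis
    if c < -1.0 + 1e-12:        # antiparallel case: rotate by 180 degrees about x
        return [[1.0, 0.0, 0.0], [0.0, -1.0, 0.0], [0.0, 0.0, -1.0]]
    vx, vy = ay, -ax            # v = axis x (0, 0, 1); its z component is 0
    f = 1.0 / (1.0 + c)
    return [
        [1.0 - f * vy * vy, f * vx * vy, vy],
        [f * vx * vy, 1.0 - f * vx * vx, -vx],
        [-vy, vx, 1.0 - f * (vx * vx + vy * vy)],
    ]


def apply_rotation(matrix, point):
    """Multiply a 3x3 matrix with a 3D point."""
    return [sum(matrix[i][j] * point[j] for j in range(3)) for i in range(3)]


# A tilted axis along (1, 1, 1) ends up on the z axis with unchanged length.
aligned = apply_rotation(rotation_to_z((1.0, 1.0, 1.0)), (1.0, 1.0, 1.0))
```

Applying the same matrix to all atom coordinates of a symmetry complex would orient its axis along z; rotating about a pivot point additionally requires translating that point to the origin first.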

Parameter extraction

The superposition of the assembled S-layer is scored based on the average intermolecular clashes per residue and the bending score of the assembly. Assembly bending is assessed by the axial tilt between the rotational symmetry axes of symmetry complexes A and B. The bending score is the RMSD of the angles between the axes of the central symmetry complex A and the nearest neighbor symmetry complex B. The score is normalized so that an angle of 0 rad (0°) corresponds to a bending score of 0, and an angle of π/2 rad (90°) corresponds to a bending score of 1. Addition of the average clashes per residue (Eq. (1)) and the bending score (Eq. (2)) results in the combined quality score (Eq. (3)).

$$\mathrm{score}_{\mathrm{clash}}=\frac{N_{\mathrm{clashes}}}{N_{\mathrm{res}}}$$

(1)

$$\mathrm{score}_{\mathrm{bend}}=\frac{2}{\pi}\sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(\varphi_{i}-\varphi_{0}\right)^{2}}$$

(2)

$$\mathrm{score}_{\mathrm{quality}}=\mathrm{score}_{\mathrm{clash}}+\mathrm{score}_{\mathrm{bend}}$$

(3)
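Equations (1) to (3) translate directly into a few lines of Python. This is a minimal sketch with hypothetical function names; it assumes that the reference angle φ₀ is 0 rad (parallel axes), which is consistent with the stated normalization but is not spelled out in the text.

```python
import math


def clash_score(n_clashes, n_residues):
    """Eq. (1): average number of intermolecular clashes per residue."""
    return n_clashes / n_residues


def bend_score(angles_rad, reference_rad=0.0):
    """Eq. (2): RMSD of the axis tilt angles, scaled so that pi/2 gives 1.

    reference_rad plays the role of phi_0 and is assumed to be 0 rad.
    """
    n = len(angles_rad)
    rmsd = math.sqrt(sum((phi - reference_rad) ** 2 for phi in angles_rad) / n)
    return 2.0 / math.pi * rmsd


def quality_score(n_clashes, n_residues, angles_rad):
    """Eq. (3): sum of clash score and bending score (lower is better)."""
    return clash_score(n_clashes, n_residues) + bend_score(angles_rad)
```

For a perfectly flat assembly (all tilt angles 0) the quality score reduces to the clash score alone.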

Unit cell parameters are determined using averaged distances calculated from vector differences between the noncentral symmetry complexes A and their central symmetry mate. The ChimeraX command clashes is used to calculate the clash score (Supplementary Method 3).
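The averaging of vector differences can be sketched as follows. The input format (2D axis positions in Å) and both function names are assumptions for illustration; the lattice angle calculation is an added convenience that the text does not describe explicitly.

```python
import math


def lattice_constant(center, mates):
    """Mean length (Å) of the vector differences between each mate and the center."""
    return sum(math.dist(center, m) for m in mates) / len(mates)


def lattice_angle_deg(center, mate_a, mate_b):
    """Angle (degrees) between the vectors center->mate_a and center->mate_b."""
    ux, uy = mate_a[0] - center[0], mate_a[1] - center[1]
    vx, vy = mate_b[0] - center[0], mate_b[1] - center[1]
    cos_gamma = (ux * vx + uy * vy) / (math.hypot(ux, uy) * math.hypot(vx, vy))
    # Clamp to [-1, 1] to guard against rounding errors before acos.
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_gamma))))


# Toy example: a central complex A with four slightly irregular neighbors.
center = (0.0, 0.0)
mates = [(100.0, 0.0), (0.0, 102.0), (-98.0, 0.0), (0.0, -100.0)]
a = lattice_constant(center, mates)
gamma = lattice_angle_deg(center, mates[0], mates[1])
```

Averaging over all neighbors smooths out small deviations that arise from the superposition of predicted models.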

Assembly of S-layer unit cell

A mathematically exact unit cell is created using the determined unit cell parameters. To obtain the best possible model, the quality score can be optimized by varying the pairs of symmetry complexes. The final output is an mmCIF file containing the primitive unit cell, the unit cell parameters, and the symmetry operations needed to generate the symmetry mates of a fully assembled S-layer. Furthermore, files from previous steps of SymProFold (subchain definition, symmetry complexes (SymPlot), and interface network graphs) are available in the output folder.
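How unit cell parameters and symmetry operations generate symmetry mates can be illustrated with a small sketch. It assumes a k-fold axis along z at the origin and lattice vectors defined by a, b and γ; the function names are hypothetical, and the sketch does not read or write mmCIF files.

```python
import math


def rotate_z(point, angle_deg):
    """Rotate a 3D point about the z axis by angle_deg."""
    x, y, z = point
    t = math.radians(angle_deg)
    return (x * math.cos(t) - y * math.sin(t), x * math.sin(t) + y * math.cos(t), z)


def symmetry_mates(point, order, a, b, gamma_deg, cells=1):
    """Copies of `point` generated by a k-fold axis and lattice translations.

    order: order k of the rotational axis at the origin.
    a, b, gamma_deg: 2D unit cell parameters (Å, Å, degrees).
    cells: number of neighboring cells to generate in each direction.
    """
    g = math.radians(gamma_deg)
    vec_a = (a, 0.0)
    vec_b = (b * math.cos(g), b * math.sin(g))
    mates = []
    for k in range(order):
        x, y, z = rotate_z(point, 360.0 * k / order)
        for i in range(-cells, cells + 1):
            for j in range(-cells, cells + 1):
                mates.append((x + i * vec_a[0] + j * vec_b[0],
                              y + i * vec_a[1] + j * vec_b[1],
                              z))
    return mates


# Toy p4-like example: 4-fold axis, square cell of 100 Å, 3 x 3 cells.
mates = symmetry_mates((10.0, 0.0, 0.0), order=4, a=100.0, b=100.0, gamma_deg=90.0)
```

In practice the same operations would be applied to every atom of the primitive unit cell rather than to a single point.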

Prediction of p1 S-layer

If SymProFold terminates during the workflow and cannot create a model of a primitive unit cell, this might be due to a p1 symmetry of the S-layer. Within a p1 S-layer, no rotational symmetry axes of order 2 or higher occur; therefore, SymProFold needs some adjustments to predict the assembly successfully. If a p1 symmetry is known or assumed, an alternative SymProFold method for p1 S-layers can be applied manually. In the p1 pipeline, no set of subchains is generated; instead, a full-length prediction is split, either manually or with Domain_Separator, into all possible single domains. Sets of heterodimers of the full-length protein and each single domain are calculated. If two or more strong interactions are found, the full-length model is mapped to the individual domains, and the fully assembled layer is generated in both directions. Automated superposition of axis objects can still be applied to generate the fully assembled S-layer.

Prediction of the viral capsid

As a test case, we selected the viral capsid from Odonata-associated circular virus 21 (UniProt: A0A0B4UH63). Allowing for a 5-fold rotational symmetry axis in the SymProFold workflow enables, in principle, the prediction of some viral capsids. For this, oligomer predictions are performed for dimers, trimers, tetramers, pentamers, and hexamers. In our test case, the predictions are made for only one subchain. In the filtering step, the oligomer predictions are tested for the presence of rotational symmetry axes that can map subchains in the oligomer prediction onto one another and fulfill rotational symmetry operations. To this end, the algorithm checks whether the angles between the monomers correspond to 2-, 3-, 4-, 5- or 6-fold rotational symmetries and whether the associated rotational symmetry axes are uniform within a tolerance range. Since the surface of viral capsids is not a flat 2D plane, the rotational symmetry axes of symmetry complexes B are not aligned parallel to that of symmetry complex A. SymProFold currently does not support the automated assembly of viral capsids, but the results from the best-scoring rotational symmetry axes can be manually built into an orientation that results in a fully assembled capsid.

Resources

For SymProFold, we present an implementation in Python 3, which uses function libraries of ChimeraX 1.6⁵⁹. The Domain_Separator tool is written in Python and uses ChimeraX function libraries and DSSP⁶¹ to determine secondary structures. Contacts between residues are determined using the ChimeraX function contacts, and solvent-excluded surface areas are calculated with the functions surface and measure. Domain_Separator uses the Python library NetworkX.

Computing requirements

An AlphaFold-Multimer 2.3 installation in standard configuration with full databases and v3 model weights was used for all complex calculations. The default value of max. 20 recycling iterations was used. The predictions were calculated at the VSC-5 Vienna Scientific Cluster (Vienna, Austria) on a GPU (NVIDIA A100, 40 GB of VRAM).

Protein expression

The codon-optimized sequences (E. coli) for Varv_VI (Viridibacillus arvi; amino acids 765–844) and Mvol_anchor (Methanococcus voltae; amino acids 24–75 and 484–576 connected with a GGS linker) were purchased precloned in pET24 (BioCat GmbH, Heidelberg, Germany). Plasmid (5 µg) was dissolved in 50 µl water, and 0.5 µl was transformed into One Shot BL21 Star (DE3) Chemically Competent E. coli (Thermo Fisher Scientific, Waltham, MA, USA) and plated on LB agar containing 35 µg/ml kanamycin. Single colonies were picked for overnight cultures (ONC) in LB broth containing 35 µg/ml kanamycin. Main cultures in 100 ml LB broth containing kanamycin were inoculated 1/100 with ONC and grown at 37 °C to an OD600 between 0.4 and 0.7. Expression was induced by the addition of 0.5 mM IPTG, the temperature was reduced to 20 °C, and expression was carried out overnight. Expression cultures were harvested by centrifugation (2800 g, 20 min, 4 °C). The supernatant was discarded, and the pellet was frozen at −20 °C until further use.

Protein purification

Expression pellets were thawed and resuspended in 20 ml lysis buffer (50 mM HEPES pH 7.5, 300 mM NaCl) and sonicated for cell disruption (Bandelin Sonoplus sonicator at 80%, 5 cycles, 5 min, on ice). The supernatant containing the soluble protein was filtered (Rotilabo syringe filter, PVDF, pore size 0.45 µm, Carl Roth GmbH and Co., Karlsruhe, Germany) before loading samples onto an ÄKTA pure system from GE Healthcare (Chicago, United States). The HisTrap FF affinity column (5 ml, GE Healthcare) was equilibrated with the lysis buffer described above. The column was washed with 20 column volumes of 5% elution buffer (50 mM HEPES pH 7.5, 300 mM NaCl, 500 mM imidazole). The proteins were eluted with 50% elution buffer, concentrated using Amicon Ultra centrifugal filters (Millipore, Merck KGaA, Darmstadt, Germany) to a volume of 500 µl and subjected to gel filtration in SEC buffer (25 mM HEPES pH 7.5, 150 mM NaCl) on a Superdex 200 Increase 10/300 column (Cytiva, Marlborough, MA, USA). Peak fractions were pooled and concentrated for crystallization.

Crystallization

Crystallization experiments were performed by sitting-drop vapor diffusion in Swissci UVXPO 3 Lens crystallization plates (High Wycombe, United Kingdom). Pipetting was carried out with an Oryx 8 robot (Douglas Instruments, United Kingdom) with 35 µl of condition solution in the reservoir and drops of 0.3 µl protein, at concentrations ranging from 10 mg/ml to 22 mg/ml, mixed with 0.3 µl screening solution (JCSG+ eco screen, Molecular Dimensions, Calibre Scientific, Rotherham, UK).

Data collection and processing

Crystals were frozen in liquid nitrogen, and data collection of all crystals was performed at 100 K. Crystal screening and data collection were carried out with an in-house dual port system made up of a MetalJet X-ray source (Excillum, Kista, Sweden), a D8 Venture X-ray diffractometer (Bruker, Billerica, USA) and a Photon III detector (Bruker, Billerica, USA). Data processing was performed with DIALS 3.8⁶², and data reduction with pointless and aimless⁶³. For the dataset collected for Methanococcus voltae, an anisotropic cutoff was applied with the Staraniso webserver⁶⁴.

Structure solution, refinement, and analysis

Both structures were solved by molecular replacement using Phaser⁶⁵ with the predictions presented in this manuscript (monomeric fragments of the respective amino acid ranges) as templates. Refinement was performed with Refmac⁶⁶ and phenix.refine⁶⁷. Structures were deposited in the RCSB Protein Data Bank with the PDB codes 9FS9 and 9FSA. A table containing the data collection and refinement statistics is available in Supplementary Method 8 and Supplementary Table 6.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.

Data availability

The generated crystal structures are available at the RCSB Protein Data Bank under accession codes 9FS9 and 9FSA. All presented SymProFold models are available under https://github.com/symprofold including a detailed tutorial. Previously published crystal structures mentioned in the paper include: 5Y6U and 6MCX. UniProt entries used in this work are Q2VRQ3, P35823, A0A1M7YYM3, A0A1M5ZCF8, A0A7G2D8K1, A0A5P3AY64, A6URZ5, A0A0Q2M111, A0A0K2Z0V7, P22258, P06546, E8R795, Q6TL21, I3XTG6, Q50833, Q0VJW4, Q9V0N3 and A0A0U2M877. Source data are provided with this paper as a Source Data file.

Code availability

The source code of SymProFold and Domain_Separator, an installation guide, a tutorial, and the models calculated with SymProFold are available at the GitHub repository https://github.com/symprofold. The source code is also available at https://doi.org/10.5281/zenodo.13327126.

References

Ahnert, S. E., Marsh, J. A., Hernandez, H., Robinson, C. V. & Teichmann, S. A. Principles of assembly reveal a periodic table of protein complexes. Science 350, aaa2245 (2015).
Levy, E. D., Pereira-Leal, J. B., Chothia, C. & Teichmann, S. A. 3D complex: a structural classification of protein complexes. PLoS Comput Biol. 2, e155 (2006).
Muok, A. R. et al. Atypical chemoreceptor arrays accommodate high membrane curvature. Nat. Commun. 11, 5763 (2020).
Pum, D., Breitwieser, A. & Sleytr, U. B. Patterns in nature—S-layer lattices of bacterial and archaeal cells. Cryst. (Basel) 11, 869 (2021).
Messner, P., Schäffer, C., Egelseer, E.-M. & Sleytr, U. B. Occurrence, Structure, Chemistry, Genetics, Morphogenesis, and Functions of S-Layers. in Prokaryotic Cell Wall Compounds 53–109 (Springer Berlin Heidelberg, Berlin, Heidelberg, 2010). https://doi.org/10.1007/978-3-642-05062-6_2.
Sleytr, U. B., Schuster, B., Egelseer, E. M. & Pum, D. S-layers: principles and applications. FEMS Microbiol Rev. 38, 823–864 (2014).
Pum, D., Toca-Herrera, J. L. & Sleytr, U. B. S-Layer protein self-assembly. Int. J. Mol. Sci. 14, 2484–2501 (2013).
Stel, B., Cometto, F., Rad, B., De Yoreo, J. J. & Lingenfelder, M. Dynamically resolved self-assembly of S-layer proteins on solid surfaces. Chem. Commun. 54, 10264–10267 (2018).
Rad, B. et al. Ion-specific control of the self-assembly dynamics of a nanostructured protein lattice. ACS Nano 9, 180–190 (2015).
Bharat, T. A. M., von Kügelgen, A. & Alva, V. Molecular Logic of Prokaryotic Surface Layer Structures. Trends Microbiol. 29, 405–415 (2021).
Herrmann, J. et al. A bacterial surface layer protein exploits multistep crystallization for rapid self-assembly. Proc. Natl Acad. Sci. 117, 388–394 (2020).
Fagan, R. P. & Fairweather, N. F. Biogenesis and functions of bacterial S-layers. Nat. Rev. Microbiol. 12, 211–222 (2014).
Stuknyte, M. et al. Lactobacillus helveticus MIMLh5-specific antibodies for detection of S-layer protein in Grana Padano protected-designation-of-origin cheese. Appl Environ. Microbiol 80, 694–703 (2014).
Schuster, B. & Sleytr, U. B. S-layer ultrafiltration membranes. Membr. (Basel) 11, 275 (2021).
Schuster, B. & Sleytr, U. B. Nanotechnology with S-layer Proteins. in Methods Mol Biol. 195–218 https://doi.org/10.1007/978-1-4939-9869-2_12 (2020).
Johnson, J. E. & Olson, A. J. Icosahedral virus structures and the protein data bank. J. Biol. Chem. 296, 100554 (2021).
Kim, H., Ko, C., Lee, J.-Y. & Kim, M. Current progress in the development of hepatitis B virus capsid assembly modulators: chemical structure, mode-of-action and efficacy. Molecules 26, 7420 (2021).
Sagmeister, T. et al. The molecular architecture of Lactobacillus S-layer: Assembly and attachment to teichoic acids. Proceedings of the National Academy of Sciences 121, (2024).
Baek, M. et al. Accurate prediction of protein structures and interactions using a three-track neural network. Science 373, 871–876 (2021).
Jumper, J. et al. Highly accurate protein structure prediction with AlphaFold. Nature 596, 583–589 (2021).
Evans, R. et al. Protein complex prediction with AlphaFold-Multimer. bioRxiv 2021.10.04.463034 https://doi.org/10.1101/2021.10.04.463034 (2022).
Gutnik, D., Evseev, P., Miroshnikov, K. & Shneider, M. Using AlphaFold predictions in viral research. Curr. Issues Mol. Biol. 45, 3705–3732 (2023).
Yang, Z., Zeng, X., Zhao, Y. & Chen, R. AlphaFold2 and its applications in the fields of biology and medicine. Signal Transduct. Target Ther. 8, 115 (2023).
Jeppesen, M. & André, I. Accurate prediction of protein assembly structure by combining AlphaFold and symmetrical docking. Nat. Commun. 14, 8283 (2023).
Shor, B. & Schneidman-Duhovny, D. CombFold: predicting structures of large protein assemblies using a combinatorial assembly algorithm and AlphaFold2. Nat. Methods 21, 477–487 (2024).
Gao, M., Nakajima An, D., Parks, J. M. & Skolnick, J. AF2Complex predicts direct physical interactions in multimeric proteins with deep learning. Nat. Commun. 13, 1744 (2022).
Bryant, P., Pozzati, G. & Elofsson, A. Improved prediction of protein-protein interactions using AlphaFold2. Nat. Commun. 13, 1265 (2022).
Li, Z. et al. Uni-Fold Symmetry: Harnessing Symmetry in Folding Large Protein Complexes. https://doi.org/10.1101/2022.08.30.505833 (2022).
Schweke, H. et al. An atlas of protein homo-oligomerization across domains of life. Cell 187, 999–1010.e15 (2024).
Pagès, G. & Grudinin, S. Analytical symmetry detection in protein assemblies. II. Dihedral and cubic symmetries. J. Struct. Biol. 203, 185–194 (2018).
Pagès, G., Kinzina, E. & Grudinin, S. Analytical symmetry detection in protein assemblies. I. Cyclic symmetries. J. Struct. Biol. 203, 142–148 (2018).
Wu, T., Hou, J., Adhikari, B. & Cheng, J. Analysis of several key factors influencing deep learning-based inter-residue contact prediction. Bioinformatics 36, 1091–1098 (2020).
Haghani, M. NEFFy: NEFF Calculator and MSA File Converter. https://github.com/Maryam-Haghani/Neffy (2024).
Sogues, A. et al. Structure and function of the EA1 surface layer of Bacillus anthracis. Nat. Commun. 14, 7051 (2023).
von Kügelgen, A. et al. Membraneless channels sieve cations in ammonia-oxidizing marine archaea. Nature 630, 230–236 (2024).
Fioravanti, A., Mathelie-Guinlet, M., Dufrêne, Y. F. & Remaut, H. The Bacillus anthracis S-layer is an exoskeleton-like structure that imparts mechanical and osmotic stabilization to the cell wall. PNAS Nexus 1, (2022).
Pandur, Ž. & Stopar, D. Evolution of mechanical stability from lipid layers to complex bacterial envelope structures. In 207–251 https://doi.org/10.1016/bs.abl.2020.09.005 (2021).
Engelhardt, H. Are S-layers exoskeletons? the basic function of protein surface layers revisited. J. Struct. Biol. 160, 115–124 (2007).
Varsani, A. & Krupovic, M. Smacoviridae: a new family of animal-associated single-stranded DNA viruses. Arch. Virol. 163, 2005–2015 (2018).
Krupovic, M. et al. Cressdnaviricota: a Virus Phylum Unifying Seven Families of Rep-Encoding Viruses with Single-Stranded, Circular DNA Genomes. J Virol 94, (2020).
Mori, T. et al. Circadian clock protein KaiC forms ATP-dependent hexameric rings and binds DNA. Proc. Natl Acad. Sci. 99, 17203–17208 (2002).
Fujimoto, Z., Hong, L. T. T., Kishine, N., Suzuki, N. & Kimura, K. Tetramer formation of Bacillus subtilis YabJ protein that belongs to YjgF/YER057c/UK114 family. Biosci. Biotechnol. Biochem 85, 297–306 (2021).
Streltsov, V. A., Schmidt, P. M. & McKimm-Breschkin, J. L. Structure of an Influenza A virus N9 neuraminidase with a tetrabrachion-domain stalk. Acta Crystallogr F. Struct. Biol. Commun. 75, 89–97 (2019).
Boutonnet, C. et al. Dynamic Profile of S-Layer Proteins Controls Surface Properties of Emetic Bacillus cereus AH187 Strain. Front Microbiol 13, (2022).
Comerci, C. J. et al. Topologically-guided continuous protein crystallization controls bacterial surface layer self-assembly. Nat. Commun. 10, 2731 (2019).
Pum, D., Messner, P. & Sleytr, U. B. Role of the S layer in morphogenesis and cell division of the archaebacterium Methanocorpusculum sinense. J. Bacteriol. 173, 6865–6873 (1991).
Sleytr, U. B. & Plohberger, R. The Dynamic Process of Assembly of Two-Dimensional Arrays of Macromolecules on Bacteria Cell Walls. in 36–47 https://doi.org/10.1007/978-3-642-67688-8_5 (1980).
Sleytr, U. B. & Glauert, A. M. Analysis of regular arrays of subunits on bacterial surfaces; evidence for a dynamic process of assembly. J. Ultrastruct. Res 50, 103–116 (1975).
Sleytr, U. B. Heterologous reattachment of regular arrays of glycoproteins on bacterial surfaces. Nature 257, 400–402 (1975).
Schuster, B. S-layer protein-based biosensors. Biosens. (Basel) 8, 40 (2018).
Missiakas, D. & Schneewind, O. Assembly and Function of the Bacillus anthracis S-Layer. Annu. Rev. Microbiol. 71, 79–98 (2017).
Picher, M. M. et al. Nanobiotechnology advanced antifouling surfaces for the continuous electrochemical monitoring of glucose in whole blood using a lab-on-a-chip. Lab Chip 13, 1780 (2013).
Schuster, B. & Sleytr, U. B. Biomimetic interfaces based on S-layer proteins, lipid membranes and functional biomolecules. J. R. Soc. Interface 11, 20140232 (2014).
Ucisik, M., Sleytr, U. & Schuster, B. Emulsomes meet S-layer proteins: an emerging targeted drug delivery system. Curr. Pharm. Biotechnol. 16, 392–405 (2015).
Preiner, J. et al. IgGs are made for walking on bacterial and viral surfaces. Nat. Commun. 5, 4394 (2014).
Pérez-Herrero, E. & Fernández-Medarde, A. Advanced targeted therapies in cancer: Drug nanocarriers, the future of chemotherapy. Eur. J. Pharm. Biopharm. 93, 52–79 (2015).
Howorka, S. Rationally engineering natural protein assemblies in nanobiotechnology. Curr. Opin. Biotechnol. 22, 485–491 (2011).
Ilk, N., Egelseer, E. M. & Sleytr, U. B. S-layer fusion proteins—construction principles and applications. Curr. Opin. Biotechnol. 22, 824–831 (2011).
Pettersen, E. F. et al. UCSF ChimeraX: Structure visualization for researchers, educators, and developers. Protein Sci. 30, 70–82 (2021).
Blondel, V. D., Guillaume, J.-L., Lambiotte, R. & Lefebvre, E. Fast unfolding of communities in large networks. https://doi.org/10.1088/1742-5468/2008/10/P10008 (2008).
Kabsch, W. & Sander, C. Dictionary of protein secondary structure: pattern recognition of hydrogen-bonded and geometrical features. Biopolymers 22, 2577–2637 (1983).
Winter, G. et al. DIALS: implementation and evaluation of a new integration package. Acta Crystallogr D. Struct. Biol. 74, 85–97 (2018).
Evans, P. Scaling and assessment of data quality. Acta Crystallogr D. Biol. Crystallogr 62, 72–82 (2006).
Tickle, I. et al. The STARANISO Server. https://staraniso.globalphasing.org/cgi-bin/staraniso.cgi (2024).
McCoy, A. J. et al. Phaser crystallographic software. J. Appl Crystallogr 40, 658–674 (2007).
Murshudov, G. N. et al. REFMAC 5 for the refinement of macromolecular crystal structures. Acta Crystallogr D. Biol. Crystallogr 67, 355–367 (2011).
Afonine, P. V. et al. Towards automated crystallographic structure refinement with phenix.refine. Acta Crystallogr D. Biol. Crystallogr 68, 352–367 (2012).
Chateau, A., Van der Verren, S. E., Remaut, H. & Fioravanti, A. The Bacillus anthracis cell envelope: composition, physiological role, and clinical relevance. Microorganisms 8, 1864 (2020).
Wildhaber, I., Santarius, U. & Baumeister, W. Three-dimensional structure of the surface protein of Desulfurococcus mobilis. J. Bacteriol. 169, 5563–5568 (1987).
Dooley, J. S., Engelhardt, H., Baumeister, W., Kay, W. W. & Trust, T. J. Three-dimensional structure of an open form of the surface layer from the fish pathogen Aeromonas salmonicida. J. Bacteriol. 171, 190–197 (1989).
Stewart, M., Beveridge, T. J. & Trust, T. J. Two patterns in the Aeromonas salmonicida A-layer may reflect a structural transformation that alters permeability. J. Bacteriol. 166, 120–127 (1986).
Engelhardt, H., Saxton, W. O. & Baumeister, W. Three-dimensional structure of the tetragonal surface layer of Sporosarcina ureae. J. Bacteriol. 168, 309–317 (1986).
Kadurugamuwa, J. L. et al. S-layered Aneurinibacillus and Bacillus spp. are susceptible to the lytic action of Pseudomonas aeruginosa membrane vesicles. J. Bacteriol. 180, 2306–2311 (1998).
Scheuring, S. et al. Charting and unzipping the surface layer of Corynebacterium glutamicum with the atomic force microscope. Mol. Microbiol 44, 675–684 (2002).
Arbing, M. A. et al. Structure of the surface layer of the methanogenic archaean Methanosarcina acetivorans. Proc. Natl Acad. Sci. USA 109, 11812–11817 (2012).
Tsuboi, A. et al. In vitro reconstitution of a hexagonal array with a surface layer protein synthesized by Bacillus subtilis harboring the surface layer protein gene from Bacillus brevis 47. J. Bacteriol. 171, 6747–6752 (1989).
Lupas, A. et al. Domain structure of the Acetogenium kivui surface layer revealed by electron crystallography and sequence analysis. J. Bacteriol. 176, 1224–1233 (1994).
Couture-Tosi, E. et al. Structural analysis and evidence for dynamic emergence of Bacillus anthracis S-layer networks. J. Bacteriol. 184, 6448–6456 (2002).

Acknowledgements

The authors acknowledge the financial support by the University of Graz. Computational results presented have been achieved using the Vienna Scientific Cluster (VSC) (project Nr 71272; T.P.K.). Additionally, funding by the Austrian Science Fund FWF doc.fund Biomolecular Structure and Interactions (grant doi:10.55776/DOC130), Land Steiermark, the City of Graz and Doctoral Academy Graz (BioMolStruct consortium) was received for C.G. and T.P.K. T.S. thanks doc.fund project Molecular Metabolism (grant doi:10.55776/DOC50). N.G. thanks FWF (grant doi:10.55776/T1239) and Land Steiermark (project number: PN 3046) for financial support. I.U. appreciates support by Ministry of Science and Innovation / Spanish State Research Agency / European Regional Development Fund / European Union (Grants PGC2018-101370-B-I00, PID2021-128751NB-I00) and support from Science and Technology Facilities Council (CCP4-ARCIMBOLDO_LOW).

Author information

These authors contributed equally: Christoph Buhlheller, Theo Sagmeister.

Authors and Affiliations

Institute of Molecular Biosciences, University of Graz, Graz, Austria
Christoph Buhlheller, Theo Sagmeister, Christoph Grininger, Nina Gubensäk & Tea Pavkov-Keller
Medical University of Graz, Graz, Austria
Christoph Buhlheller
Institute of Nanobiotechnology, University of Natural Resources and Life Sciences Vienna, Vienna, Austria
Uwe B. Sleytr
Structural Biology Unit, Institute of Molecular Biology of Barcelona, Spanish National Research Council, Barcelona, Spain
Isabel Usón
ICREA, Institució Catalana de Recerca i Estudis Avançats, Barcelona, Spain
Isabel Usón
Field of Excellence BioHealth, University of Graz, Graz, Austria
Tea Pavkov-Keller
BioTechMed-Graz, University of Graz, Graz, Austria
Tea Pavkov-Keller

Authors

Christoph Buhlheller
Theo Sagmeister
Christoph Grininger
Nina Gubensäk
Uwe B. Sleytr
Isabel Usón
Tea Pavkov-Keller

Contributions

Conceptualization: C.B., T.S., C.G., T.P.K. Methodology: C.B., T.S., C.G., N.G., U.B.S., I.U., T.P.K. Investigation: T.S., C.B., C.G., N.G., I.U., T.P.K. Visualization: T.S., C.B., C.G. Funding acquisition: N.G., I.U., T.P.K. Project administration: T.P.K. Supervision: U.B.S., I.U., T.P.K. Writing – original draft: C.B., T.S., C.G., N.G., T.P.K. Writing – review & editing: C.B., T.S., C.G., N.G., U.B.S., I.U., T.P.K.

Corresponding author

Correspondence to Tea Pavkov-Keller.

Ethics declarations

Competing interests

The authors declare no competing interests.

Ethics

All listed authors contributed to the manuscript, and roles and responsibilities were agreed upon. This research does not involve humans or animals.

Peer review

Peer review information

Nature Communications thanks the anonymous reviewers for their contribution to the peer review of this work. A peer review file is available.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Information

Peer Review File

Reporting Summary

Source data

Source Data

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Buhlheller, C., Sagmeister, T., Grininger, C. et al. SymProFold: Structural prediction of symmetrical biological assemblies. Nat Commun 15, 8152 (2024). https://doi.org/10.1038/s41467-024-52138-3

Received: 22 November 2023
Accepted: 28 August 2024
Published: 18 September 2024
DOI: https://doi.org/10.1038/s41467-024-52138-3