Introduction

It is becoming increasingly clear that a deep understanding of the molecular mechanisms involved in viral infection is required for the effective rational design of antiviral agents and vaccine immunogens. In the case of SARS-CoV-2, special attention has been given to the entry process and the role that the Spike (S) protein plays in it1. The SARS-CoV-2 S protein is a trimeric class I fusion glycoprotein that mediates target cell recognition and membrane fusion2,3,4. During post-translational maturation, the full-length protein is cleaved into two subunits, S1 and S2. S1 contains the receptor-binding domain (RBD) and facilitates attachment to the angiotensin-converting enzyme 2 (ACE2) on the host cell surface. S2 drives the fusion of the viral and host membranes and encompasses the fusion peptide, two heptad repeat regions, and a C-terminal transmembrane domain (TMD).

The transmembrane domain (TMD) is frequently regarded as a passive anchor that attaches the protein to the viral envelope. However, recent studies on the structure, dynamics, and biogenesis of integral membrane proteins demonstrate that TMDs and adjacent membrane-proximal regions perform essential functions beyond membrane anchoring5,6,7. Notably, advances in the understanding of viral fusion protein TMDs have highlighted their critical involvement in membrane fusion and multiple stages of the viral lifecycle8,9.

Broer et al. demonstrated that substituting the N-terminal membrane-proximal region, the TMD, and the endodomain of the SARS-CoV-1 S protein with the corresponding domains of the vesicular stomatitis virus (VSV) glycoprotein greatly affects the protein’s function10. A few years later, a mutagenesis-based study, once again using VSV pseudo-particles, revealed that substituting Gly1201, Ala1204, Val1210, and Leu1216 with Lys residues in the core hydrophobic region of the SARS-CoV-1 S protein resulted in reduced infectivity11, likely by altering the hydrophobic nature of the TMD.

The SARS-CoV-2 S protein TMD has been linked with trimerization. A chimera including the RBD followed by the TMD was identified as a trimer in which the RBD was properly folded, suggesting that the TMD acts as an oligomerization domain12. However, there is some controversy regarding the oligomerization motif within the TMD. Arbely et al. identified a GxxxG motif (a sequence known to promote oligomerization of TMDs13) within the TMD of the SARS-CoV-1 S protein and demonstrated its importance in the trimerization of the protein14. Shortly after, Corver et al.’s results indicated that these residues were not important for the trimerization of the protein or viral entry15. Recently, it has been suggested that a new motif based on a hydrophobic zipper mediates the trimerization of the SARS-CoV-2 S protein TMD16.

Several structures of the S protein have been solved17,18,19. However, only one includes the TMD19. In this structure, the S protein was found in a post-fusion state where the fusion peptide (FP) forms a hairpin-like wedge spanning the entire lipid bilayer, and the TMD wraps around it. Two contact points between the FP and the TMD were described, involving Phe1220 and Leu1234 on the TMD.

This study investigates the functional role of the SARS-CoV-2 S protein TMD during viral entry by introducing targeted variations within its hydrophobic core and assessing their effects on protein function. The analysis also examines the TMD’s capacity to mediate oligomerization and identifies the structural and sequence determinants involved. The findings indicate that the TMD is essential for S protein function beyond membrane anchoring. Sequence and structural features critical for viral entry are distributed throughout the TMD, potentially accounting for its strong sequence conservation and conflicting reports in the literature. The data further demonstrate that the orientation of regions flanking the TMD influences particle entry. Additionally, the TMD forms homo-oligomers via a motif enriched in small residues such as glycine and alanine, as confirmed by computational modeling.

Results

Pseudotyped-VSV particles for the analysis of viral entry

The SARS-CoV-2 S protein contains a single TMD in the C-terminal region of the protein (Fig. 1A). It spans 23 amino acids from Trp1214 to Cys1236 with a predicted ΔGapp of −3.4 kcal/mol for its insertion into the membrane, according to the ΔG Prediction Server20,21. The predicted TMD includes an aromatic-rich region (Trp1214-Tyr-Ile-Trp1217) in its N-terminal end and a Cys pair (Cys1235–Cys1236) in its C-terminal end. However, the structure of the S trimer in its post-fusion state19 shows a helical secondary structure from Leu1218 to Leu1234. Although the aromatic region and the C-terminal Cys-rich section on the TMD most likely interact with the membrane either as an interfacial membrane proximal region19,22 or through palmitoyl post-translational modifications23,24,25,26, in the present study, we focus exclusively on the helical hydrophobic core of the TMD, that is, from Leu1218 to Leu1234. The sequence of all the mutations included in this paper and their insertion potential can be found in Fig. 1B.

Fig. 1: Schematic representation of the SARS-CoV-2 spike protein and its TMD.

A Schematic representation of SARS-CoV-2 Spike protein domain architecture. The position of key domains is highlighted: NTD N-terminal domain, RBD receptor binding domain, CTD1 C-terminal domain 1, CTD2 C-terminal domain 2, S1/S2, S1/S2 cleavage site, S2′, S2′ cleavage site, FP fusion peptide, HR1 heptad repeat 1, HR2 heptad repeat 2, TM transmembrane domain, CT cytoplasmic tail. B Sequence of the SARS-CoV-2 S protein TMD and the mutations included in this study. The sequence of the SARS-CoV-2 S protein from Glu1206 to Thr1273 is shown. Membrane-associated regions, including an aromatic-rich section and a cysteine-rich domain, are highlighted with light letters on a gray background. The TMD hydrophobic core is shown with dark letters on a pink background. The TMD, as predicted by the ΔG Prediction Server (WT), and the variations included in this work are highlighted. A heat map (right of the sequence) shows the predicted ΔGapp for each TMD mutant.

To study the role of the SARS-CoV-2 S protein TMD during viral entry, we utilized pseudotyped vesicular stomatitis virus (VSV) particles coated with different S protein mutants (Fig. 2A)10,15,27,28. Briefly, HEK293T cells were transfected with plasmids encoding the different S mutants and subsequently infected with a recombinant VSV in which the glycoprotein gene has been replaced with GFP (VSVΔG-GFP). Next, the infectivity of the resulting particles was assayed on VeroE6 cells expressing the TMPRSS2 cofactor by counting GFP-positive cells. VSVΔG-GFP particles were pseudotyped with the native SARS-CoV-2 Wuhan-Hu-1 S protein as a positive control and reference value (WT). No infection was observed when the TMD of the SARS-CoV-2 S protein was eliminated (ΔTMD) or was substituted by a scrambled version of the TMD (SCRBL) where the composition of the residues included in the TMD was preserved but their position was randomly varied, confirming the importance of the SARS-CoV-2 S protein TMD in the entry process and indicating a role beyond membrane anchoring (Fig. 2B). Expression of the chimeras included in this work can be found in Supplementary Fig. 1.

Fig. 2: VSVΔG-GFP pseudotyped virus assay.

A Generation of VSVΔG-GFP pseudotyped particles. HEK293T cells were transfected with the corresponding SARS-CoV-2 S protein mutant and infected with a VSV lacking the G gene and expressing GFP (VSVΔG-GFP). The resulting virus, incorporating the S mutants on its surface (VSVΔG-GFP -S), was recovered from the media and used to infect Vero-TMPRSS2 cells. Viral infectivity was then analyzed by counting GFP-positive cells. B–D Focus forming units (FFU) measured after infection with VSVΔG-GFP pseudotyped with different mutants of the SARS-CoV-2 S protein TMD. All samples were normalized to the wild-type SARS-CoV-2 S protein (WT). The p-values (<0.05) for individual one-sample t-tests (vs. WT) are indicated above each bar. B FFU from VSVΔG-GFP particles decorated with the wild-type SARS-CoV-2 S protein (WT), the SARS-CoV-2 S protein where TMD was eliminated (ΔTMD), and a scrambled version of the SARS-CoV-2 S protein TMD (SCRBL). C FFU from pseudotyped VSVΔG-GFP with the SARS-CoV-2 S protein with substitutions of amino acid stretches, Phe1220 to Gly1223, Leu1224 to Ile1227, and Val1228 to Thr1231, with alanine (Phe1220-Gly1223A, Leu1224-Ile1227A, and Val1228-Thr1231A, respectively). Substitutions of amino acids Phe1220 to Gly1223 to leucine (Phe1220 to Gly1223L) were also included. D FFU from pseudotyped VSVΔG-GFP with the SARS-CoV-2 S protein, including alanine insertions at positions 1221, 1226, 1228, 1230, and 1232 (InsA1221, InsA1226, InsA1228, InsA1230, and InsA1232), tryptophan insertion at position 1221 (InsW1221), and methionine insertion at position 1230 (InsM1230). Four independent biological replicates were performed.
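The normalization to WT and the one-sample t-test mentioned in this legend can be illustrated with a short calculation. The sketch below is not the analysis script used in the study: the FFU counts are hypothetical, the per-replicate pairing of mutant and WT counts is an assumption, and only the t statistic is computed, which is then compared with the tabulated two-tailed critical value for three degrees of freedom (four replicates).

```python
import math
import statistics

def one_sample_t(values, reference=1.0):
    """Return (t statistic, degrees of freedom) for H0: mean(values) == reference."""
    n = len(values)
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample standard deviation (n - 1 denominator)
    t = (mean - reference) / (sd / math.sqrt(n))
    return t, n - 1

# Hypothetical FFU counts, NOT data from this study: four replicates,
# each mutant count paired with the WT count from the same experiment (assumed design).
wt_ffu = [200, 180, 220, 210]
mutant_ffu = [60, 45, 77, 63]

# Express every replicate as a fraction of its WT reference (WT = 1.0).
normalized = [m / w for m, w in zip(mutant_ffu, wt_ffu)]
t_stat, dof = one_sample_t(normalized, reference=1.0)

T_CRIT_DF3 = 3.182  # two-tailed critical t value for alpha = 0.05 and 3 degrees of freedom
print(f"mean = {statistics.mean(normalized):.2f} of WT, t = {t_stat:.2f}, df = {dof}")
print("significant at 0.05:", abs(t_stat) > T_CRIT_DF3)
```

Testing against a fixed reference of 1.0 is what makes this a one-sample test: after normalization the WT has no variance of its own, so each mutant is compared with a constant rather than with a second sample.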

Stretches of four amino acids, corresponding to one helical turn, within the SARS-CoV-2 S protein TMD beginning at Phe1220 were substituted with alanine residues (Fig. 2C). Substitution of the Phe1220-Gly1223 and Leu1224-Ile1227 segments (F1220-G1223A and L1224-I1227A, respectively) significantly reduced the infectivity of pseudotyped VSVΔG-GFP particles. In contrast, substitution of Val1228-Thr1231 (V1228-T1231A) with alanines did not decrease viral entry; a non-significant increase was observed. To control for potential bias from alanine substitutions, the Phe1220-Gly1223 segment was also replaced with leucine residues (F1220-G1223L), yielding similar results.
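The stretch substitutions described above, and the single-residue insertions introduced later, can be written down explicitly from the wild-type TMD sequence. The following is a minimal sketch, assuming the 23-residue TMD sequence quoted in this paper numbered from Trp1214; the helper names `substitute_stretch` and `insert_residue` are illustrative and do not correspond to any tool used in the study.

```python
WT_TMD = "WYIWLGFIAGLIAIVMVTIMLCC"  # predicted TMD, Trp1214 to Cys1236
TMD_START = 1214                     # residue number of the first Trp

def substitute_stretch(seq, first, last, residue, start=TMD_START):
    """Replace residues first..last (inclusive, protein numbering) with one residue type."""
    i, j = first - start, last - start + 1
    return seq[:i] + residue * (j - i) + seq[j:]

def insert_residue(seq, position, residue, start=TMD_START):
    """Insert a residue so that it occupies `position`; downstream residues shift by one."""
    i = position - start
    return seq[:i] + residue + seq[i:]

variants = {
    "WT": WT_TMD,
    "F1220-G1223A": substitute_stretch(WT_TMD, 1220, 1223, "A"),
    "L1224-I1227A": substitute_stretch(WT_TMD, 1224, 1227, "A"),
    "V1228-T1231A": substitute_stretch(WT_TMD, 1228, 1231, "A"),
    "F1220-G1223L": substitute_stretch(WT_TMD, 1220, 1223, "L"),
    "Ins-A1221": insert_residue(WT_TMD, 1221, "A"),  # Ala placed between Phe1220 and Ile1221
}

for name, seq in variants.items():
    print(f"{name:>13}  {seq}  ({len(seq)} residues)")
```

Substitutions keep the segment at 23 residues, whereas an insertion lengthens it to 24 and shifts every downstream residue by one position.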

These findings indicate that sequence and structural determinants relevant for viral entry are distributed throughout the TMD, with a greater contribution from the N-terminal region. Notably, none of the modifications tested significantly altered the predicted membrane insertion potential (ΔGapp) of the SARS-CoV-2 S protein TMD (Fig. 1B).

To further analyze the role of the SARS-CoV-2 S protein TMD in the viral entry process, we inserted alanine residues at key positions of the hydrophobic segment (Fig. 2D). Inserting a residue not only increases the length of the TMD but also changes the relative orientation of the residues positioned before and after the insertion point due to the α-helical nature of this segment29. We first inserted an alanine between Phe1220 and Ile1221 (Ins-A1221), which alters the relative orientation of the aromatic membrane-proximal region and the TMD. Additionally, Ins-A1221 disrupts the G1219xxxG1223 motif found in the TMD14, forcing a 100° rotation of Gly1219 relative to Gly1223 (Supplementary Fig. 2A vs. B). As shown in Fig. 2D, Ins-A1221 greatly reduced the infectivity of pseudo-typed VSVΔG-GFP particles, suggesting that the relative orientation of the aromatic stretch vs. the TMD, the GxxxG interaction motif, or both are important for particle entry.
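The geometric consequence of a single-residue insertion can be checked with simple arithmetic. The sketch below assumes an ideal α-helix with 3.6 residues per turn (100° per residue); real transmembrane helices deviate from this value, so the numbers are approximate, and the function names are illustrative.

```python
DEG_PER_RESIDUE = 100.0  # ideal alpha-helix: 360 degrees / 3.6 residues per turn

def angular_offset(pos_a, pos_b):
    """Angle (0-360) of residue pos_b relative to pos_a on an ideal helical wheel."""
    return ((pos_b - pos_a) * DEG_PER_RESIDUE) % 360.0

def face_separation(pos_a, pos_b):
    """Smallest angle (0-180) between two residues around the helix axis."""
    off = angular_offset(pos_a, pos_b)
    return min(off, 360.0 - off)

# Wild type: Gly1219 and Gly1223 are four residues apart (GxxxG).
wt_sep = face_separation(1219, 1223)

# After an insertion at position 1221, the original Gly1223 lies five residues
# downstream of Gly1219, i.e. one extra 100-degree step around the helix.
ins_sep = face_separation(1219, 1223 + 1)

print(wt_sep, ins_sep)  # 40.0 140.0
```

In the wild type the two glycines lie 40° apart, on the same face of the helix; the insertion moves them 140° apart, so they no longer share a face.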

Insertion of an alanine residue in positions 1226 or 1228 (Ins-A1226 and Ins-A1228) negatively affected viral entry, indicating that the relative orientation of the N-terminal and C-terminal sections of the TMD might be important for infectivity (Fig. 2D). These two insertion points were selected with the work of Fu and Chou16 in mind, who proposed a trimerization hydrophobic zipper involving residues 1221, 1225, 1229, and 1233. Thus, the observed reduction in viral entry could also result from disrupting the oligomerization motif they described.

Insertions at positions 1230 and 1232 (Ins-A1230 and Ins-A1232) also disturb the hydrophobic trimerization motif proposed by Fu and Chou (Supplementary Fig. 2), but reduced particle entry to a lesser extent than Ins-A1226 or Ins-A1228 (Fig. 2D). Of these two insertions, Ins-A1230 reduced pseudo-typed VSVΔG-GFP entry more than Ins-A1232, suggesting that the relative orientation of the N-terminal and C-terminal sections of the TMD is more important for the protein function than the trimerization hydrophobic zipper. Note that while the authors16 included leucine residues at positions 1229 and 1233, we incorporated the methionine residues found in the SARS-CoV-2 S protein consensus sequence (UniProt: P0DTC2). Once again, to ensure that the observed results were not biased by the choice of Ala residues, we inserted a Trp at position 1221 (InsW1221) and a Met at position 1230 (InsM1230). These insertions yielded results similar to those of the corresponding Ala insertions (Fig. 2D).

A closer look at the SARS-CoV-2 TMD reveals a highly hydrophobic surface composed of Leu1218, Ile1221, Leu1224, Ile1225, Val1228, and Ile1232 and an opposing surface rich in small residues, including Gly1219, Ala1222, Gly1223, and Ala1226, which could participate in TMD-TMD interactions15,27,28,29,30 (Fig. 3A). To assess the importance of these regions, and of the GxxxG motif and the hydrophobic zipper as oligomerization motifs, we selectively mutated key residues in the SARS-CoV-2 S protein. First, we replaced Gly1223 with isoleucine (G1223I), as the equivalent substitution is sufficient to break the dimerization of the Glycophorin A (GpA) TMD30,31. Counting GFP-positive cells after pseudo-typed VSVΔG-GFP entry revealed that the G1223I mutation is sufficient to reduce viral entry (Fig. 3B). Substituting both Gly1219 and Gly1223 with isoleucine (G1219I G1223I) further decreased pseudo-typed VSVΔG-GFP infectivity. Alternatively, we replaced Ile1221, Ile1225, or Met1229 from the hydrophobic surface with tyrosine (I1221Y, I1225Y, and M1229Y), changes that were previously suggested to modify TMD homo-oligomerization16. While the I1221Y substitution slightly reduced viral entry, the I1225Y and M1229Y substitutions did not (Fig. 3B).
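The two helix faces named in this paragraph can be located on an idealized helical wheel. The sketch below assumes 100° per residue and places Leu1218 at 0°, an arbitrary choice; it is an illustration of the geometry rather than the procedure used to build Fig. 3A, and the circular-mean helper is introduced here for convenience.

```python
import math

CORE_START = 1218
CORE = "LGFIAGLIAIVMVTIML"  # hydrophobic core, Leu1218 to Leu1234
DEG_PER_RESIDUE = 100.0     # ideal alpha-helix (assumption)

def wheel_angle(position):
    """Angular position on the wheel, with Leu1218 arbitrarily placed at 0 degrees."""
    return ((position - CORE_START) * DEG_PER_RESIDUE) % 360.0

def circular_mean(angles_deg):
    """Mean direction of a set of angles, in degrees (0-360)."""
    x = sum(math.cos(math.radians(a)) for a in angles_deg)
    y = sum(math.sin(math.radians(a)) for a in angles_deg)
    return math.degrees(math.atan2(y, x)) % 360.0

small = [1219, 1222, 1223, 1226]                    # Gly/Ala face named in the text
hydrophobic = [1218, 1221, 1224, 1225, 1228, 1232]  # bulky hydrophobic face named in the text

for label, face in (("small", small), ("hydrophobic", hydrophobic)):
    for pos in face:
        print(f"{label:>11}  {CORE[pos - CORE_START]}{pos}  {wheel_angle(pos):5.1f}")

small_center = circular_mean([wheel_angle(p) for p in small])
hydro_center = circular_mean([wheel_angle(p) for p in hydrophobic])
diff = abs(small_center - hydro_center)
separation = min(diff, 360.0 - diff)
print(f"face centres: {small_center:.1f} vs {hydro_center:.1f}, separation {separation:.1f}")
```

Under these assumptions the four small residues fall within a 100° arc and the six hydrophobic residues within a 120° arc, with the two face centres roughly 140° apart. This is consistent with the description of the two surfaces as opposing.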

Fig. 3: Analysis of the TMD hydrophobic and small residue-rich surfaces.

A Structural model of the SARS-CoV-2 S protein TMD. AlphaFold3 model of residues 1218-1234 (top) and helical wheel projection (bottom) of the same sequence. The surface of the helix containing small side chain residues is highlighted in orange, while the opposite surface containing highly hydrophobic residues is highlighted in purple. B FFU from pseudotyped VSVΔG-GFP with the SARS-CoV-2 S protein bearing the following point mutations: Gly1223 to Ile; Gly1219 to Ile and Gly1223 to Ile; Ile1221 to Tyr; Ile1225 to Tyr; Met1229 to Tyr; Ala1222 to Ile and Gly1223 to Ile; Ala1222 to Tyr and Gly1223 to Tyr; Leu1224 to Tyr; and Leu1224 to Tyr and Val1228 to Tyr (G1223I, G1219I G1223I, I1221Y, I1225Y, M1229Y, A1222I G1223I, A1222Y G1223Y, L1224Y, and L1224Y V1228Y, respectively). C Analysis of the Leu1218 and Ile1232 (L1218I I1232L) and Gly1219 and Ala1226 (G1219A A1226G) residue swaps. The single mutations associated with these residue swaps at Leu1218, Ile1232, Gly1219, and Ala1226 (L1218I, I1232L, G1219A, and A1226G, respectively) were also analyzed. Four independent biological replicates were performed.

Then, we replaced Ala1222 and Gly1223 with isoleucine (A1222I G1223I) and measured the impact on pseudo-typed VSVΔG-GFP infectivity. Our results revealed that the A1222I G1223I mutant is less able to promote viral entry than the previously tested G1223I mutant (Fig. 3B). When Ala1222 and Gly1223 were replaced with tyrosine (A1222Y G1223Y), a larger and less hydrophobic amino acid, we observed a greater reduction in VSVΔG-GFP infectivity. Similarly, substituting Leu1224 with tyrosine (L1224Y) on the hydrophobic surface also reduced VSVΔG-GFP entry. Double substitution of Leu1224 and Val1228 with tyrosine (L1224Y V1228Y) diminished VSVΔG-GFP infectivity further. To further test our hypothesis, we swapped Leu1218 and Ile1232 (L1218I I1232L) on the one hand and Gly1219 and Ala1226 (G1219A A1226G) on the other (Fig. 3C). Neither of these changes nor the single mutations associated with these residue swaps negatively affected VSVΔG-GFP entry. Therefore, according to our results, both the small residues and the bulky hydrophobic residues aligned on opposing faces of the helix (Fig. 3A) are relevant for the entry process, probably because they participate in protein-protein interactions or in TMD-lipid interactions. Protein levels for all the chimeras were probed by Western blotting to avoid misinterpretation of the data (Supplementary Fig. 1).

To be incorporated into VSVΔG-GFP pseudo-particles, the SARS-CoV-2 S protein must be located at the plasma membrane. Thus, we investigated whether the cases in which no VSVΔG-GFP pseudo-particle infectivity was observed were the consequence of the absence of the SARS-CoV-2 S protein from the cell surface. We determined whether the SARS-CoV-2 S protein was located at the plasma membrane by surface staining with an anti-RBD antibody followed by flow cytometry analysis (Fig. 4). All SARS-CoV-2 S protein mutants that showed decreased VSVΔG-GFP infectivity were included in this assay. Additionally, the WT S protein was used as a positive control, while the ΔTMD was included as a negative control (Fig. 4A). Our results indicate that all SARS-CoV-2 S protein mutants were present at the plasma membrane at similar levels except for Ins-A1228 (Fig. 4B and Supplementary Fig. 3A), thus validating the results of the VSVΔG-GFP pseudo-particle assay for the remaining mutants. The absence of Ins-A1228 at the cell surface suggests that the SARS-CoV-2 S protein TMD might play a role in protein trafficking.

Fig. 4: Surface expression of Spike mutants.

A Spike surface expression of WT and ΔTMD controls was analyzed by flow cytometry. HEK293T cells were transfected with plasmids encoding the corresponding S protein alongside a constitutively GFP-expressing plasmid. After 48 h, cells were stained with an anti-RBD antibody. Levels of the surface SARS-CoV-2 S protein (red) and GFP (green) were analyzed by flow cytometry. B Relative surface expression of all Spike mutants tested in this assay. GFP was used as a transfection control. The figure shows the ratio of SARS-CoV-2 S protein signal to GFP signal, expressed as a percentage. Samples were normalized to the SARS-CoV-2 S protein WT. The significant p-values (<0.05) for individual one-sample t-tests vs. WT are indicated above each bar. Three independent biological replicates were performed.

BiMuC assay for the analysis of membrane fusion

The role of the SARS-CoV-2 S protein TMD during the viral and cellular membrane fusion process was corroborated with a syncytia-based assay known as BiMuC32,33. Briefly, the Venus fluorescent protein (VFP) can be split into two fragments, VN and VC, neither of which is fluorescent. However, when these two fragments are fused to a pair of interacting proteins that bring them together, such as cJun and bFos, the structure of the VFP is reconstituted and the fluorescence is recovered. The VN-cJun and VC-bFos chimeras were expressed in separate cell pools together with the viral machinery required for membrane fusion. Therefore, the two chimeras would only meet in the event of membrane fusion (Fig. 5A). The complete SARS-CoV-2 S WT protein was used as a positive control and reference value. The ΔTMD and SCRBL were included as negative controls (Fig. 5C). Measurements of the area of the syncytia can be found in Supplementary Fig. 3B.

Fig. 5: Analysis of SARS-CoV-2 S protein membrane fusion properties by BiMuC.

A Schematic representation of the BiMuC assay. The Venus fluorescent protein (VFP) can be split into two fragments, VN and VC, neither of which is fluorescent. These two fragments were fused to cJun and bFos (VN-cJun and VC-bFos) and transfected into HEK293T cells in separate cell pools. Cells were co-transfected with the viral machinery required for membrane fusion, the corresponding SARS-CoV-2 S protein, and the ACE2 receptor. Additionally, both cell pools were transfected with Renilla luciferase to normalize the fluorescence values. Only functional SARS-CoV-2 S proteins facilitated S-ACE2 recognition and membrane fusion, allowing VFP reconstitution and fluorescence. B Representative images of the assay where cells have been transfected with the SARS-CoV-2 S protein (WT) or the SARS-CoV-2 S protein bearing a scrambled version of the TMD (SCRBL). Scale bar = 100 µm. C–E The fluorescence/luciferase signal ratio (GFP/Luc ratio) was measured for the SARS-CoV-2 S protein mutants tested in this assay. Samples were normalized to the SARS-CoV-2 S protein WT. Significant p-values (<0.05) for individual one-sample t-tests are indicated above each bar. C GFP/Luc ratio for the SARS-CoV-2 S protein (WT), the SARS-CoV-2 S protein where the TMD was eliminated (ΔTMD), and a chimera bearing a scrambled version of the SARS-CoV-2 S protein TMD (SCRBL). As a negative control, we also included cells that did not incorporate the SARS-CoV-2 S protein (ΔS). D GFP/Luc ratio for the SARS-CoV-2 S protein with substitutions of four-amino-acid stretches, Phe1220 to Gly1223 and Val1228 to Thr1231, with alanine (Phe1220-Gly1223A and Val1228-Thr1231A), and the SARS-CoV-2 S protein including alanine insertions at positions 1221, 1228, and 1232 (InsA1221, InsA1228, and InsA1232).
E GFP/Luc ratio for the SARS-CoV-2 S protein bearing the following point mutations: Gly1219 to Ile and Gly1223 to Ile, Ile1221 to Tyr, Ile1225 to Tyr, Met1229 to Tyr, Ala1222 to Tyr and Gly1223 to Tyr, and Leu1224 to Tyr and Val1228 to Tyr (G1219I G1223I, I1221Y, I1225Y, M1229Y, A1222Y G1223Y, and L1224Y V1228Y, respectively). All panels include the WT values used to normalize the data (dotted line) and the SCRBL samples. F Correlation of FFU measured in the VSVΔG-GFP infection assay against values of reconstituted GFP/Luc ratio in the BiMuC fusion assay. Colors show samples that have lower signals than WT values in both systems (pink background), higher in both systems (green background), or different between the two assays (white background). At least three independent biological replicates were performed.

Next, we selected some of the previously described mutants of the S protein and tested them using the BiMuC methodology. Substituting the Phe1220-Gly1223 helical turn with alanines (F1220-G1223A) diminished syncytia formation, though not significantly (Fig. 5D). In contrast, V1228-T1231A did not reduce syncytia formation (Fig. 5D). The alanine insertion Ins-A1221 altered the observed syncytia-derived fluorescence (Fig. 5D), whereas only a small, non-significant reduction in syncytia formation was observed for Ins-A1232. We also tested the influence of point mutations on the ability of the SARS-CoV-2 S protein to induce syncytia formation (Fig. 5E). In this case, the G1219I G1223I, I1221Y, and M1229Y substitutions did not perturb SARS-CoV-2 S protein function. However, I1225Y, A1222Y G1223Y, and L1224Y V1228Y reduced the formation of syncytia. Altogether, the BiMuC-derived results correlate with those of the pseudo-typed VSVΔG-GFP assay34 (Fig. 5F).
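The GFP/Luc normalization and the three-way comparison of the two assays described in the legend of Fig. 5 can be expressed compactly. The sketch below uses hypothetical readings and hypothetical mutant labels, not data from this study; the function names are illustrative, and the handling of values exactly equal to WT is an arbitrary choice.

```python
def normalized_ratio(gfp, luc, wt_gfp, wt_luc):
    """GFP/Luc ratio of a sample expressed relative to the WT GFP/Luc ratio."""
    return (gfp / luc) / (wt_gfp / wt_luc)

def classify(ffu_rel, fusion_rel, wt=1.0):
    """Assign a mutant to one of the three groups used in Fig. 5F.

    Both inputs are WT-normalized (WT = 1.0). Values exactly equal to WT
    fall into the 'different between assays' group (arbitrary convention).
    """
    if ffu_rel < wt and fusion_rel < wt:
        return "lower in both"
    if ffu_rel > wt and fusion_rel > wt:
        return "higher in both"
    return "different between assays"

# Hypothetical raw readings (arbitrary units), NOT data from this study.
wt_gfp, wt_luc = 8000.0, 4000.0
fusion_a = normalized_ratio(1000.0, 5000.0, wt_gfp, wt_luc)  # weak fusion signal
fusion_b = normalized_ratio(9000.0, 3000.0, wt_gfp, wt_luc)  # fusion signal above WT

# Hypothetical WT-normalized FFU values paired with the fusion readouts above.
print("mutant_A:", classify(0.3, fusion_a))
print("mutant_B:", classify(0.4, fusion_b))
```

Dividing by the luciferase signal before normalizing to WT corrects for differences in transfection efficiency between wells, so that a low GFP reading is not mistaken for a fusion defect.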

Analysis of TMD oligomerization

Some of the modifications included in the BiMuC and the pseudo-typed VSVΔG-GFP assays could alter the previously described homo-oligomerization of the SARS-CoV-2 TMD12,16,35. To directly analyze the potential oligomerization of the SARS-CoV-2 TMD, we used a bimolecular fluorescence complementation (BiFC) approach adapted to study intramembrane interactions36,37,38,39. Briefly, the tested TMDs were fused to a split VFP, either to its N-terminal fragment (VN) or to its C-terminal fragment (VC). If the tested TMDs interact, they bring the VN and VC fragments together, facilitating the reconstitution of the VFP structure and causing fluorescence (Fig. 6A). The TMD of GpA was used as a positive control and normalization value across experimental replicates. The TMD of Tomm20, a monomeric hydrophobic segment, was used as a negative control38,40.

Fig. 6: Analysis of TMD oligomerization by BiFC assay.

A Reconstitution of the Venus fluorescent protein (VFP) mediated by the SARS-CoV-2 S protein TMD oligomerization. Different S TMD mutants were fused to both the N- and C-terminal sections of the VFP (VN and VC, respectively) and expressed in HEK293T cells together with Renilla luciferase. If the SARS-CoV-2 S protein TMD can oligomerize, the VFP will be reconstituted. B GFP/Luc ratios measured for different chimeras bearing the Glycophorin A TMD (GpA), E. coli Lep H2, the wild-type SARS-CoV-2 S protein TMD (WT), and the SARS-CoV-2 S protein TMD with the following point mutations: Gly1219 to Ile and Gly1223 to Ile, Gly1219 to Tyr and Gly1223 to Tyr, Ala1222 to Tyr and Gly1223 to Tyr, Ala1222 to Tyr, Gly1223 to Ile, Ile1221 to Tyr, Ile1225 to Tyr, Met1229 to Tyr, and Leu1224 to Tyr and Val1228 to Tyr (G1219I G1223I, G1219Y G1223Y, A1222Y G1223Y, A1222Y, G1223I, I1221Y, I1225Y, M1229Y, and L1224Y V1228Y, respectively). All values are normalized against the GpA homo-oligomer. The significant p-values (<0.05) for individual one-sample t-tests are indicated above each bar. At least three independent biological replicates were performed.

Using this BiFC assay, we tested the SARS-CoV-2 TMD. Our results indicate that this TMD is sufficient for inducing oligomerization (Fig. 6B). Next, we focused on the role of the surface with small residues in the oligomerization of the TMD. The G1219I G1223I substitution did not decrease the oligomerization potential of the SARS-CoV-2 S protein TMD; however, G1219Y G1223Y did, a result that correlates with the outcome of the BiMuC assay. The A1222Y G1223Y double substitution further reduced oligomerization. Interestingly, the substitution of only Ala1222 or Gly1223 with tyrosine (A1222Y and G1223Y) did not alter the oligomerization capabilities of the SARS-CoV-2 TMD, suggesting a synergistic effect of these two residues.

We also tested the role of the bulky hydrophobic face of the SARS-CoV-2 S protein TMD helix in oligomerization. The single substitutions I1221Y, I1225Y, or M1229Y did not alter oligomerization. Furthermore, the double substitution L1224Y V1228Y, which affects residues that proved important for VSVΔG-GFP entry and for membrane fusion in the BiMuC assay, did not decrease the observed fluorescence, suggesting that the hydrophobic surface does not participate in the oligomerization of the SARS-CoV-2 S protein TMD.

We modeled the potential SARS-CoV-2 S protein TMD homo-trimer using TMHOP41. TMHOP uses Rosetta symmetric all-atom ab initio fold-and-dock simulations in an implicit membrane environment to predict low-energy conformations based on the empirical measurement of amino acid insertion propensities into the E. coli inner membrane42. These conformations are clustered based on energy and structural properties (Supplementary Fig. 4A), and a representative model of each cluster is shown (Fig. 7A). In each model, Gly1219, Ala1222, Gly1223, and Ala1226 are located on the interaction surface of the potential trimer, consistent with our experimental results. We also modeled the potential TMD trimer using AlphaFold343, a recent evolution of the AlphaFold2 architecture and training procedure, which is capable of predicting the joint structure of complexes. Once again, the predicted models present, in most cases, Gly1219, Ala1222, Gly1223, and Ala1226 at the interior of a TMD homo-trimer (Fig. 7B and Supplementary Fig. 4). The AlphaFold3 server enabled us to model larger sequences, including the full-length S protein. When the TMD region was modeled in the context of the full-length protein, accuracy was low, and thus the models were discarded (Supplementary Fig. 5). However, the server modeled with sufficient confidence the TMD flanked by the N-terminal aromatic-rich region and the Cys-rich C-terminal end. The retrieved models for this section included, once again, Gly1219, Ala1222, Gly1223, and Ala1226 on the internal surface of a TMD trimer (Fig. 7C and Supplementary Fig. 4).

Fig. 7: SARS-CoV-2 S protein TMD trimer models.

A and B SARS-CoV-2 S protein TMD trimer modeled by the TMHOP server (A) and by the AlphaFold3 server (B). The top five models are shown. The input sequence was 1214WYIWLGFIAGLIAIVMVTIMLCC1236. Models show the TMD hydrophobic core, residues 1218-1234, in pink. The secondary structure, as well as the surface of the peptide, is shown. Gly1219, Ala1222, Gly1223, and Ala1226 are highlighted in red on the secondary structure representation. C AlphaFold3 server model of the C-terminal region of the SARS-CoV-2 S protein, including the aromatic-rich section, the TMD, and the cysteine-rich domain. The top five models are shown. Models include just the TMD hydrophobic core, residues 1218-1234, shown in pink. The secondary structure, as well as the surface of the TMD hydrophobic core, is shown. Gly1219, Ala1222, Gly1223, and Ala1226 are highlighted in red.

Discussion

If the SARS-CoV-2 S protein TMD acted only as a passive membrane anchor, it should tolerate extensive sequence variation provided hydrophobicity and length were maintained. However, a closer look at the genomes of all SARS-CoV-2 variants reveals that no mutation has been found in the S protein TMD, and few variations are found between SARS-CoV-2, SARS-CoV-1, and MERS-CoV (Supplementary Fig. 5A), while the TMDs of other SARS-CoV-2 proteins present multiple mutations (Supplementary Fig. 5B). These findings reinforce the view that the S protein TMD is not merely a passive anchor, but a critical structural element.

Within the S protein TMD, we identified at least two functionally distinct subdomains: a highly hydrophobic surface that most likely works in coordination with the lipid milieu to stabilize insertion and promote membrane fusion, and a surface rich in small residues (alanine and glycine) which might enable tight helix–helix packing necessary for TMD homo-oligomerization and subsequently S protein trimerization.

In the non-polar membrane environment, the hydrophobic effect is responsible for insertion into the membrane and does not usually contribute to the tertiary and quaternary structure44. Furthermore, the nature of the residues that facilitate membrane insertion impedes salt bridges and hydrogen bonds between TMDs45. In this context, the tertiary and quaternary structure of TMDs is dictated by a delicate balance of the remaining low-energy forces, where van der Waals interactions play a crucial role46. The nature of van der Waals forces requires a large contact area between the associating protein segments. Amino acids with small side chains, such as glycine and alanine, facilitate the helix-helix contact, increasing van der Waals forces. Substitution of alanines and glycines by bulky residues generally diminishes the helix-helix contact area and breaks TMD-TMD interactions. Our results indicate that Gly1219, Ala1222, Gly1223, and Ala1226 are key residues for the formation of a SARS-CoV-2 S protein TMD homo-trimer and thus SARS-CoV-2 infectivity. Furthermore, the nature of these residues and the fact that their substitution by large amino acids diminishes infectivity and oligomer formation suggest that this intramembrane homo-trimer is held by van der Waals forces.

However, the structure of the full-length S protein in a post-fusion conformation did not reveal the TMD in a trimeric disposition19. In this structure, the TMDs surround the FPs and make contact with them through the Leu1234 and Phe1220 residues, which were not involved in either of the subdomains described in our work. Given evidence for multiple oligomeric states of the spike47, and simulations showing the TMD is dynamic35 (monomeric, dimeric, trimeric), it is plausible that different TMD conformations are adopted at distinct stages of spike biogenesis and fusion. This conformational variety may be required for viral fusion. In the early stages of SARS-CoV-2 S protein biogenesis, the TMD-TMD interactions responsible for the TMD trimer could facilitate the formation of a full-length protein trimer together with other motifs8, participate in the post-translational processing of the S protein48, or modulate its trafficking49. Next, the low-energy nature of the forces holding the TMD trimer together would allow the transition between the different oligomeric states of the protein required during the complex membrane fusion process. The loss of infectivity observed in our assays underscores that the SARS-CoV-2 S protein's ability to reorganize its TMD during fusion is tightly linked to successful infection. In summary, our results indicate that intact helix–helix interactions and lipid-protein interactions in the TMD are required for fusion.

Insertions within the hydrophobic core, including those that preserve the glycine/alanine motif, reduced infectivity. These alterations likely reorient the TMD and adjacent domains, potentially preventing essential conformational states of the spike protein or obstructing the large-scale rearrangements required for membrane fusion.

Going forward, it will be important to obtain high-resolution structures of the full-length S protein embedded in a lipid membrane during distinct stages of the membrane fusion process to see exactly how the helices pack and how mutations perturb that arrangement. Systematic mutagenesis combined with molecular dynamics simulations could map the critical contact residues and helix orientations. It will also be critical to assess how lipid modifications, such as palmitoylation, tune TMD stability. As new variants of SARS-CoV-2 emerge, monitoring for any changes in the TMD will be prudent, since even small substitutions there can reshape the S protein stability and membrane fusion properties.

Our work goes beyond understanding viral entry. Recently, TMDs and their interactions have been targeted for the modulation of protein function and the development of new therapeutic approaches40,50. Given the importance of the TMD for the SARS-CoV-2 S protein packing, stability, and function, perturbing the TMD intramembrane interactions could be an avenue worth exploring.

Methods

Plasmid constructs

The sequence encoding the SARS-CoV-2 S protein (Addgene #164433) was a gift from Alejandro Balazs. All subsequent mutants were obtained with the Pfu Plus! mutagenesis kit (EurX, Gdańsk, Poland) according to the manufacturer’s instructions. A list of primers used in the study is included in the Supplementary Data. For the generation of BiFC chimeric plasmids, the sequence encoding the TMD was PCR amplified and subcloned to generate a fusion with the N-terminal (Nt) or C-terminal (Ct) fragment of the Venus Fluorescent Protein (VN, VC, respectively: Addgene #27097 and #22011, a gift from Chang-Deng H). The TMD was cloned at the C-terminus of each VFP fragment. These constructs were generated using the In-Fusion HD cloning kit (Takara, Japan), also according to the manufacturer’s instructions. Sequences were verified by sequencing the plasmid DNA at Macrogen (Seoul, South Korea).

Human cells and culture conditions

Human embryonic kidney (HEK293T) cells (from the American Type Culture Collection, Manassas, VA, USA) were maintained in Dulbecco’s modified Eagle’s medium (DMEM, Gibco) supplemented with 10% fetal bovine serum, 1% penicillin–streptomycin (Sigma Aldrich, St. Louis, MO, USA), and 1% MEM non-essential amino acids. Cells were maintained at 37 °C and 5% CO2 conditions. VeroE6-TMPRSS2 cells were obtained from the JCRB Cell Bank (catalog JCRB1819). Cells were cultured in DMEM with either 10% or 2% FBS for maintenance or infection, respectively. Selection of VeroE6-TMPRSS2 was performed using media containing 1 mg/mL G418 (Cat. no. A1720, Sigma).

Transient DNA transfections

HEK 293T cells were seeded in 6, 12 or 24-well plates and grown in DMEM (Gibco) supplemented with 10% fetal bovine serum (FBS, Gibco) the day before transfection. For the pseudo-typed VSVΔG-GFP assay, transfections were carried out with calcium phosphate. Briefly, DNA was mixed with CaCl2 125 mM in HBS buffer (NaCl 140 mM, KCl 5 mM, Na2HPO4 750 μM, dextrose 6 mM, HEPES 25 mM pH 7.05), incubated for 15 min and added to the HEK293 cells. For the rest of the transfections, DMEM medium was mixed with plasmid DNA (up to a total of 4 μg DNA for six-well, 2 μg for 12-well plates) and transfected into HEK293T cells, adding 3 μl of PEI 1 mg/ml and 100 μl of DMEM per μg of DNA. For the surface expression analysis, 2–3 × 10^6 HEK293T cells were transfected in six-well plates with the corresponding 1 μg Spike plasmid and 1 μg of a GFP encoding plasmid. In the BiMuC assay, 2 × 10^6 HEK 293T cells in six-well plates were transfected either with 1 μg of S and 1 μg of VN-Jun or with 1 μg ACE2 and 1 μg VC-Fos, plus 100 ng pRL-CMV. Transfection mixtures were added dropwise onto cells, and the media were changed after 24 h. For BiFC chimeras, 1 μg of cMyc-VN-TMD and 1 μg HA-VC-TMD were transfected into 2 × 10^6 HEK 293T cells in 12-well plates. A plasmid expressing Renilla luciferase under the CMV promoter (pRL-CMV) (50 ng) was also added to the mix to normalize the signal. For western blot (WB) analysis, 2 μg of Spike plasmid was transfected into 6-well plates containing 6 × 10^6 HEK 293T cells.
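The PEI mix described above scales linearly with the DNA mass (3 μl of PEI at 1 mg/ml and 100 μl of DMEM per μg of DNA). A minimal sketch of that arithmetic follows; the function name and the dictionary layout are illustrative and not part of the original protocol.

```python
def pei_mix(dna_ug, pei_ul_per_ug=3.0, dmem_ul_per_ug=100.0):
    """Return PEI (1 mg/ml stock) and DMEM volumes in microlitres for a DNA mass in micrograms.

    The per-microgram ratios are the ones stated in the text; the helper
    itself is a hypothetical convenience function.
    """
    if dna_ug <= 0:
        raise ValueError("DNA mass must be positive")
    return {
        "dna_ug": dna_ug,
        "pei_ul": dna_ug * pei_ul_per_ug,
        "dmem_ul": dna_ug * dmem_ul_per_ug,
    }


# Upper DNA limits stated in the text for each plate format.
six_well = pei_mix(4.0)
twelve_well = pei_mix(2.0)
print(six_well)
print(twelve_well)
```

With a 1 mg/ml PEI stock, 3 μl per μg of DNA corresponds to a 3:1 PEI:DNA mass ratio.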

Pseudo-typed VSVΔG-GFP assay

Pseudo-typed VSVΔG-GFP particles were produced by transfecting HEK 293T cells in a 24-well plate with a plasmid encoding the indicated S gene, using lipofectamine 2000 (ThermoFisher Scientific) with 750 ng DNA and 1.875 μL of lipofectamine per well. After 24 h, cells were infected at a multiplicity of infection of three with VSV lacking the VSV-G protein and encoding both GFP and firefly luciferase34 that was previously pseudotyped with the VSV-G protein to yield infectious particles. Following infection, a monoclonal antibody targeting VSV-G (a kind gift of Rafael Sanjuan, University of Valencia, Spain) was added to neutralize any remaining VSV-G-bearing viruses, and infected cells were incubated for 18 h. The supernatant was collected, clarified and frozen at −80 °C. The viral titer (focus forming units [FFU] per mL) was obtained by serial dilution on VeroE6-TMPRSS2 cells in 96-well plates and counting GFP-expressing cells after 24 h using a live-cell microscope (Incucyte SX5, Sartorius).
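As an illustration of how a titer in FFU per mL can be derived from a foci count at a given dilution, the sketch below uses the standard relation titer = foci × dilution factor / inoculum volume. The inoculum volume, the dilution, and the count in the example are hypothetical values, since the text does not report them.

```python
def titer_ffu_per_ml(foci, dilution_factor, inoculum_ul):
    """Convert a foci count in one well into FFU per mL.

    foci            -- number of GFP-positive foci counted in the well
    dilution_factor -- fold dilution of the stock in that well (e.g. 1000 for 1:1000)
    inoculum_ul     -- volume of diluted virus added to the well, in microlitres
                       (hypothetical parameter; not stated in the text)
    """
    if foci < 0 or dilution_factor <= 0 or inoculum_ul <= 0:
        raise ValueError("count must be non-negative; dilution and volume must be positive")
    # Multiply by 1000 to convert the per-microlitre figure to per-millilitre.
    return foci * dilution_factor * 1000 / inoculum_ul


# Illustrative numbers only: 42 foci at a 1:1000 dilution with 100 microlitres of inoculum.
example_titer = titer_ffu_per_ml(foci=42, dilution_factor=1000, inoculum_ul=100)
print(example_titer)
```

In practice, wells with too few or too many foci are usually excluded before averaging across dilutions; the thresholds for that are a laboratory choice and are not specified here.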

Flow cytometry

Two days post-transfection, cells were washed once with PBS and detached by pipetting in 500 μl of FACS buffer (0.1% sodium azide, 0.5% BSA in PBS). Then, 200 μl of cells were collected by centrifugation at 1500 rpm for 5 min at 4 °C and stained with anti-SARS-CoV-2 RBD antibody (Invitrogen, PA5-114529). After centrifugation, cells were incubated with Goat anti-Rabbit IgG (H + L) AlexaFluor 568 (ThermoFisher, A11011). Antibodies were diluted to 1:1000 in FACS buffer, and 100 μl per sample was used for resuspending cells, followed by a 30 min incubation on ice in both cases. After three final washes with FACS buffer, surface SARS-CoV-2 spike levels were analyzed using an LSRFortessa flow cytometer (BD Biosciences). Typically, 10,000 live cells were selected based on morphology and acquired. SARS-CoV-2 spike levels were assessed by gating AlexaFluor 568 (excitation at 561 nm, collection at 586/15 nm) positive cells within the gate of GFP positive cells (excitation at 488 nm, collection at 530/30 nm). Analysis was carried out using BD FACSDiva 8.0.2 and FlowJo X software.
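The nested gating logic (AlexaFluor 568 positive cells within the GFP positive gate) can be sketched as a simple two-step filter. This is not the FlowJo workflow used in the study: the event representation and both thresholds are hypothetical, and real gates would be drawn from untransfected and unstained controls.

```python
def spike_positive_fraction(events, gfp_threshold, af568_threshold):
    """Fraction of GFP-positive events that are also AlexaFluor 568 positive.

    events -- iterable of (gfp_intensity, af568_intensity) pairs for live cells
    Both thresholds are hypothetical gate boundaries.
    """
    # First gate: transfected (GFP-positive) cells.
    gfp_positive = [e for e in events if e[0] > gfp_threshold]
    if not gfp_positive:
        raise ValueError("no events fall inside the GFP-positive gate")
    # Second gate, applied only inside the first: surface spike (AF568-positive) cells.
    double_positive = [e for e in gfp_positive if e[1] > af568_threshold]
    return len(double_positive) / len(gfp_positive)


# Illustrative events only (arbitrary intensity units).
demo_events = [(500, 900), (800, 50), (20, 700), (650, 400), (10, 5)]
demo_fraction = spike_positive_fraction(demo_events, gfp_threshold=100, af568_threshold=200)
print(round(demo_fraction, 3))
```

Restricting the second gate to GFP-positive events means untransfected cells do not dilute the surface-expression readout.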

BiMuC assay

One day post-transfection, the media was discarded and the cells rinsed with PBS. Then, 1 mL DMEM was added to each well, and one well expressing S and VN-Jun was pooled with another well expressing hACE2 and VC-Fos. Cells were counted, and 500 μl containing 3 × 10^5 cells were seeded into 4 wells of a 24-well plate. The next day, media was discarded in three of the wells, cells rinsed with PBS, collected in 100 μl PBS, and seeded into 96-well black plates for fluorescence reading. Fluorescence was measured in a VictorX multi-plate reader (PerkinElmer, Waltham, MA, USA). The luminescence was measured using the Renilla Luciferase Assay kit from Sigma following the manufacturer's protocol on 96-well white cell-culture plates. At least three independent experiments were done. The ratio between VFP and luciferase was registered as the relative fluorescence units (RFU) for each well. Each of those three wells was counted as a technical replicate, and the mean of the three values was taken as a biological replicate. RFU values were then normalized against the wild-type condition. The extra well was used for imaging syncytia in a fluorescence microscope.
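The RFU calculation described above (fluorescence/luminescence ratio per well, mean of three technical wells, normalization to wild type) can be sketched as follows. The function names and all readings are illustrative, and the wild type is scaled to 100, which is assumed here from the hypothetical value of 100 used in the one-sample t-test.

```python
from statistics import mean


def biological_replicate_rfu(wells):
    """Mean fluorescence/luminescence ratio over the technical wells of one experiment.

    wells -- list of (vfp_fluorescence, renilla_luminescence) readings
    """
    ratios = [fluorescence / luminescence for fluorescence, luminescence in wells]
    return mean(ratios)


def normalize_to_wild_type(sample_rfu, wild_type_rfu):
    """Express a sample RFU as a percentage of the wild-type RFU (wild type = 100)."""
    return 100.0 * sample_rfu / wild_type_rfu


# Illustrative plate-reader values only, three technical wells per condition.
wt_wells = [(2000, 100), (2200, 100), (1800, 100)]
mutant_wells = [(1000, 100), (900, 100), (1100, 100)]

wt_rfu = biological_replicate_rfu(wt_wells)
mutant_rfu = biological_replicate_rfu(mutant_wells)
mutant_percent = normalize_to_wild_type(mutant_rfu, wt_rfu)
print(wt_rfu, mutant_rfu, mutant_percent)
```

Each call to `biological_replicate_rfu` yields one biological replicate, so at least three such values per condition would feed into the statistics.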

Fluorescence imaging

For each condition, one well of a 24-well plate was used for imaging 24 h after co-culture of S and hACE2 expressing cells. Images were acquired with a Zeiss Axiovert 5 fluorescence microscope, both with transmitted light and GFP channels with a ×10 objective. Both images were overlaid, and the GFP area was quantified using Fiji51.
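The text does not describe the exact Fiji procedure, so the following is only a sketch of one common way to quantify a fluorescent area: threshold the GFP channel and count the pixels above the threshold. The threshold value and the toy image are hypothetical.

```python
def gfp_area_fraction(image, threshold):
    """Fraction of pixels whose GFP intensity exceeds a threshold.

    image     -- 2D list of pixel intensities (rows of numbers)
    threshold -- hypothetical intensity cut-off separating signal from background
    """
    total_pixels = sum(len(row) for row in image)
    if total_pixels == 0:
        raise ValueError("image contains no pixels")
    positive_pixels = sum(1 for row in image for pixel in row if pixel > threshold)
    return positive_pixels / total_pixels


# Toy 3 x 4 image with three bright pixels (arbitrary units).
toy_image = [
    [5, 7, 200, 6],
    [4, 180, 190, 8],
    [3, 6, 5, 7],
]
toy_fraction = gfp_area_fraction(toy_image, threshold=100)
print(toy_fraction)
```

Multiplying the fraction by the calibrated field area would give an absolute GFP area, provided the pixel size is known.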

Bimolecular fluorescent complementation (BiFC) assay

Two days after transfection, cells were washed with PBS and collected for fluorescence and luciferase measurements in a Victor X3 plate reader. For the Renilla luciferase readings used as signal normalization, we used the Renilla Luciferase Glow Assay Kit (Pierce, ThermoFisher) according to the manufacturer’s protocol. In each experiment, the fluorescence/luminescence ratio obtained with the GpA-GpA homodimer was used for normalization. All experiments were done at least in triplicate.

Protein expression and western blotting

Two days after transfection, the media were aspirated, cells rinsed with PBS and lysed in 300 μl of radioimmunoprecipitation assay (RIPA) buffer [150 mM NaCl, 50 mM Tris–HCl (pH 8), 1% NP-40, 0.5% sodium deoxycholate, 0.4% SDS, and 1 mM EDTA] supplemented with cOmplete EDTA-free protease inhibitors (Roche). Lysates were sonicated with three pulses of 1 s in a VCX-500 Vibra Cell sonicator (Sonics) following addition of 5X sample buffer (final concentration 62.5 mM Tris–HCl [pH 6.8], 2% sodium dodecyl sulfate [SDS], 0.01% bromophenol blue, 10% glycerol and 5% β-mercaptoethanol). Protein samples were boiled for 4 min at 95 °C, subjected to 10% SDS-polyacrylamide gel electrophoresis (PAGE), and transferred to nitrocellulose membranes (Cytiva). Membranes were blocked for 30 min at room temperature in Tris-buffered saline supplemented with 0.05% Tween 20 (TBS-T) containing 5% non-fat dry milk and later incubated with primary antibodies diluted in the same buffer at 4 °C overnight. Antibodies used in this study were SARS-CoV-2 Spike S1 (Invitrogen PA5-81795), rabbit anti c-Myc (Sigma PLA0001) and mouse anti-GAPDH (Santa Cruz sc-47724). Then, membranes were washed with TBS-T and incubated with goat anti-rabbit or sheep anti-mouse IgG horseradish peroxidase conjugate (Sigma DC02L or GE Healthcare NXA931, respectively) for 1 h at room temperature and washed again. All antibodies were used at a 1:10,000 dilution in TBS-T with 5% non-fat dry milk. Detection of immunoreactive proteins was carried out using the enhanced chemiluminescence reaction (Amersham ECL Prime, Cytiva) and detected by the Image Quant LAS 4000 Mini (GE Healthcare).

Protein structure predictions

Structures of the Spike protein TMD were predicted using either the AlphaFold 343 or the TMHOP41 web server. In the case of AlphaFold, WT and Ala-insertion mutant TMD sequences were used as input and three copies were modeled in each case, thus obtaining a trimeric structure. WT sequences were modeled using two different input sequences: one was more restricted to the TMD region (WYIWLGFIAGLIAIVMVTIMLCC) and the other included the seven residues preceding the TMD and the cytosolic domain (EQYIKWP-WYIWLGFIAGLIAIVMVTIMLCC-MTSCCSCLKGCCSCGSCCKFDEDDSEPVLKGVKLHYT). In parallel, a model for the full-length S protein was also predicted, without including any post-translational modifications (PTMs). All five output structural models and their PAE graphs were exported and inspected. For TMHOP (Trans-membrane Homo Oligomer Predictor; https://TMHOP.weizmann.ac.il), the WT TMD sequence (WYIWLGFIAGLIAIVMVTIMLCC) was used as input, and the five lowest energy models were selected for inspection. All structural models were visually inspected using UCSF ChimeraX software52.
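To make the two wild-type inputs concrete, the sketch below assembles them from the sequences given above and checks that the residues named in the Discussion (Gly1219, Ala1222, Gly1223, Ala1226) fall where expected. The numbering offset (first TMD residue at position 1214) is an assumption inferred from Gly1219 being the sixth residue of the core sequence, and the job record is a hypothetical layout, not the input schema of either web server.

```python
# Sequences copied from the text.
CORE_TMD = "WYIWLGFIAGLIAIVMVTIMLCC"
N_FLANK = "EQYIKWP"
C_TAIL = "MTSCCSCLKGCCSCGSCCKFDEDDSEPVLKGVKLHYT"
EXTENDED = N_FLANK + CORE_TMD + C_TAIL

# ASSUMPTION: the first residue of CORE_TMD is numbered 1214 in the S protein,
# inferred from Gly1219 being the sixth residue of the core sequence.
TMD_START = 1214

STANDARD_AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")


def residue_at(position, sequence=CORE_TMD, start=TMD_START):
    """Return the one-letter residue at a given S protein position."""
    index = position - start
    if not 0 <= index < len(sequence):
        raise IndexError(f"position {position} lies outside the sequence")
    return sequence[index]


def homo_oligomer_job(name, sequence, copies=3):
    """Build a hypothetical job record with identical chains (three for a trimer)."""
    unknown = set(sequence) - STANDARD_AMINO_ACIDS
    if unknown:
        raise ValueError(f"non-standard residues in input: {sorted(unknown)}")
    return {"name": name, "chains": [sequence] * copies}


jobs = [
    homo_oligomer_job("wt_core_tmd", CORE_TMD),
    homo_oligomer_job("wt_extended_tmd", EXTENDED),
]
for job in jobs:
    print(job["name"], len(job["chains"]), "chains of", len(job["chains"][0]), "residues")
```

The same record builder could hold an Ala-insertion mutant once the insertion site is specified; the insertion positions are not given in this section, so none is constructed here.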

Sequence alignments

Alignments of coronavirus Spike sequences were performed using T-Coffee53. Aligned sequences were then exported and viewed in Jalview54, and residues were color-coded using the ClustalX color map.

Statistics and reproducibility

Unpaired t-tests assuming a Gaussian distribution and equal standard deviation (SD) for all conditions were applied, and p-values were two-tailed. A one-sample t-test was used for wild-type-normalized conditions, comparing each mean with a hypothetical value of 100. Statistical analyses were performed using GraphPad Prism v.9, and P values < 0.05 were considered significant. All experiments were performed in at least three independent biological replicates.
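For readers who want to reproduce the two tests outside GraphPad Prism, the sketch below computes the one-sample t statistic against a hypothetical value of 100 and the equal-variance unpaired t statistic, using only the Python standard library. The two-tailed p-value is obtained here by numerically integrating the Student t density, which is a choice made for self-containment and not a description of Prism's internals. All data values are illustrative.

```python
import math
from statistics import mean, stdev


def one_sample_t(values, hypothetical=100.0):
    """t statistic and degrees of freedom for a one-sample t-test."""
    n = len(values)
    t = (mean(values) - hypothetical) / (stdev(values) / math.sqrt(n))
    return t, n - 1


def unpaired_t_equal_sd(a, b):
    """t statistic and degrees of freedom for an unpaired t-test with pooled variance."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * stdev(a) ** 2 + (nb - 1) * stdev(b) ** 2) / (na + nb - 2)
    t = (mean(a) - mean(b)) / math.sqrt(pooled_var * (1 / na + 1 / nb))
    return t, na + nb - 2


def two_tailed_p(t, df, steps=20000):
    """Two-tailed p-value via Simpson integration of the t density over [0, |t|]."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))

    def density(x):
        return coef * (1 + x * x / df) ** (-(df + 1) / 2)

    upper = abs(t)
    h = upper / steps  # steps is even, as Simpson's rule requires
    total = density(0.0) + density(upper)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * density(i * h)
    central_area = total * h / 3
    return max(0.0, 1.0 - 2.0 * central_area)


# Illustrative wild-type-normalized values for one mutant (three biological replicates).
t_one, df_one = one_sample_t([50, 60, 40], hypothetical=100.0)
p_one = two_tailed_p(t_one, df_one)

# Illustrative raw values for two conditions.
t_two, df_two = unpaired_t_equal_sd([1, 2, 3], [4, 5, 6])
p_two = two_tailed_p(t_two, df_two)
print(t_one, df_one, p_one)
print(t_two, df_two, p_two)
```

A result would be called significant under the criterion above when the returned p-value is below 0.05.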

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.