Introduction

Serum anti-SARS-CoV-2 neutralizing antibody (nAb) titer and serum anti-Spike binding antibody (bAb) concentration are supported as correlates of protection (CoPs)1,2 against symptomatic SARS-CoV-2 infection3. However, the small numbers of severe COVID-19 cases in phase 3 COVID-19 vaccine efficacy (VE) trials have hindered characterization of CoPs against severe COVID-19, the most important outcome to prevent.

The ENSEMBLE trial was a randomized, placebo-controlled phase 3 trial of single-dose Ad26.COV2.S vaccine. A total of 44,325 participants were randomized 1:1 to receive Ad26.COV2.S or placebo on Day 1 (D1), with serum samples taken on D1 and D29 for antibody measurement (Supplementary Fig. 1). Results of the primary4 and final5 safety and efficacy analyses have been published. We previously showed that D29 50% inhibitory dilution neutralizing antibody titer (nAb-ID50), anti-Spike bAb concentration (Spike IgG), and anti-receptor binding domain bAb concentration (RBD IgG) were inverse correlates of risk (CoRs) of moderate to severe-critical COVID-19 through 83 days post-vaccination6. Correlate of protection (CoP) analyses provided strongest evidence for nAb-ID50 as a CoP6.

Here we applied an identical approach using final data from the double-blind phase to assess the same antibody markers as CoRs and CoPs against severe-critical COVID-19 starting 7 days post-D29 through 220 days post-vaccination, during which overall VE against severe-critical COVID-19 was 73.1% [95% confidence interval (CI) 58.7%, 84.1%]. We also assessed the same markers as correlates of moderate COVID-19 and of the primary endpoint in Sadoff et al.5, moderate to severe-critical COVID-19, through 220 days, whereas all previous correlates analyses restricted to 83 days follow-up6. Overall VE against the moderate endpoint and against the primary endpoint starting 7 days post-D29 was 41.3% (28.6%, 51.3%) and 48.6% (38.6%, 57.0%), respectively. We focus on results for D29 nAb-ID50, and summarize results for D29 bAbs in the main text, with details in Supplementary Information. We repeated all analyses restricting to Latin America, South Africa, and the United States, except severe-critical COVID-19 could not be studied for the latter two regions due to too few events (Supplementary Table 1).

Results

The correlates analyses used the final analysis database5, with data cut-off July 9, 2021. The moderate, severe-critical, and moderate to severe-critical COVID-19 endpoints were defined as in Sadoff et al.5, with minor differences as described in Methods. Correlates analyses were performed in per-protocol baseline SARS-CoV-2 seronegative participants, excluding participants with evidence of SARS-CoV-2 infection up to 6 days post-D29. Cases were participants with the relevant disease endpoint (onset both ≥ 28 days post-vaccination and ≥7 days post-D29) through to the cut-off date. Non-case vaccine recipients were sampled into the immunogenicity subcohort with no evidence of SARS-CoV-2 infection to the end of the correlates study period: 220 days post D1 (all regions, Latin America) or 140 days post D1 (South Africa, United States) but not later than the cut-off date.

Using a case-cohort design, participants were randomly sampled into an immunogenicity subcohort for D1 and D29 antibody measurements [see the Statistical Analysis Plan (SAP) for the previous ENSEMBLE correlates analyses6]. D1 and D29 antibodies were also measured from all moderate to severe-critical COVID-19 vaccine breakthrough cases (Supplementary Fig. 1). Supplementary Table 1 lists numbers of participants included in analyses; Supplementary Fig. 2 shows the study flowchart. Supplementary Table 2 provides demographic and clinical information of subcohort members (839 vaccine and 91 placebo recipients), and Supplementary Tables 35 provide region-specific breakdowns.

The SARS-CoV-2 variants causing the severe-critical cases varied over time and by region (Fig. 1, Supplementary Fig. 3). In Latin America, the most prevalent variants were Reference, Gamma, and Mu, causing 7, 9, and 4 of 23 cases in the vaccine arm and 29, 24, and 18 of 89 cases in the placebo arm, respectively. [As in Sadoff et al.5, “Reference” refers to the index strain (GenBank accession number: MN908947.3) harboring the D614G point mutation.] Most severe-critical cases in the United States were Reference (2 of 4 cases in the vaccine arm, 16 of 20 cases in the placebo arm), and all in South Africa were Beta (14 placebo, 2 vaccine). Half (21) of the 42 severe-critical vaccine breakthrough cases had between 10 and 12 symptoms (Supplementary Tables 6, 7).

Fig. 1: SARS-CoV-2 variants causing the severe-critical COVID-19 endpoints.
figure 1

Variants are shown by calendar date of severe-critical COVID-19 occurrence and are broken out by geographic region (a, Latin America; b, USA; c, South Africa) and treatment assignment. Endpoint counts do not require having D1 and D29 antibody marker data (see the flowchart provided as Supplementary Fig. 2). As in Sadoff et al.5, “Reference” refers to the index strain (GenBank accession number: MN908947.3) harboring the D614G point mutation.

The proportion of vaccine recipients with quantifiable D29 nAb-ID50 titer [lower limit of quantitation = 4.8975 International Units (IU)/ml (IU50/ml)] was lowest in severe-critical cases (30.9%), intermediate in moderate cases (39.7%), and highest in non-cases (46.1%) (Fig. 2a). Geometric mean nAb-ID50 titers were 4.28 IU50/ml (95% CI: 3.15, 5.82), 5.02 IU50/ml (4.48, 5.63), and 6.06 IU50/ml (5.50, 6.67) in severe-critical cases, moderate cases, and non-cases, yielding a severe-critical case:non-case ratio of 0.71 (0.51, 0.98) and a moderate case:non-case ratio of 0.83 (0.71, 0.96) (Table 1). D29 bAb response frequencies and levels were also lower in cases than in non-cases for both endpoints (Fig. 2b, c; Table 1).

Fig. 2: D29 antibody marker level by COVID-19 outcome status (moderate COVID-19 case, severe-critical COVID-19 case, or non-case).
figure 2

a 50% inhibitory dilution neutralizing antibody (nAb-ID50) titer, (b) anti-Spike IgG concentration, and (c) anti-RBD IgG concentration. Data points are from baseline SARS-CoV-2 seronegative per-protocol vaccine recipients. Violin plots contain interior box plots with upper and lower horizontal edges the 25th and 75th percentiles of antibody level and middle line the 50th percentile, and vertical bars the distance from the 25th (or 75th) percentile of antibody level and the minimum (or maximum) antibody level within the 25th (or 75th) percentile of antibody level minus (or plus) 1.5 times the interquartile range. Each side shows a rotated probability density (estimated by a kernel density estimator with a default Gaussian kernel) of the data. Positive response frequencies (Freq.) computed with inverse probability of sampling weighting. Positive response definitions: Spike IgG, IgG>10.8424 BAU/ml; RBD IgG, IgG>14.0858 BAU/ml. ULoQ: Spike IgG, 238.1165 BAU/ml; RBD IgG, 172.5755 BAU/ml. Positive response for nAb-ID50: D1 nAb-ID50 titer <LLOQ (LLOQ = 4.8975 IU50/ml) with detectable D29 nAb-ID50 ( ≥ LLOQ), or D1 nAb-ID50 > LLOQ with at least a fourfold increase in D29 nAb-ID50. ULoQ: ID50, 844.7208 IU50/ml. Moderate cases are baseline SARS-CoV-2 seronegative per-protocol vaccine recipients with the moderate COVID-19 endpoint (moderate COVID-19 with onset both ≥ 7 days post D29 and ≥28 days post-vaccination) up to 181 days post-D29 but not past data cut (July 9, 2021). Severe-critical cases are baseline SARS-CoV-2 seronegative per-protocol vaccine recipients with the severe-critical COVID-19 endpoint (severe-critical COVID-19 with onset both ≥ 7 days post-D29 and ≥28 days post-vaccination) up to 170 days post-D29 but not past data cut (July 9, 2021). Non-cases are baseline seronegative per-protocol vaccine recipients sampled into the immunogenicity subcohort with no evidence of SARS-CoV-2 infection up to the end of the correlates study period, which is up to 181 days post-D29 but not past data cut (July 9, 2021). BAU binding antibody units, IU international units, LLoQ lower limit of quantitation, Pos.Cut positivity cut-off, ULoQ upper limit of quantitation. Source data are provided as a Source Data file.

Table 1 D29 antibody marker response frequencies and geometric means by COVID-19 outcome status, for all geographic regions pooled as well as separately by geographic region

In vaccine recipients, D29 nAb-ID50 and bAb levels correlated inversely with severe-critical COVID-19 risk and with moderate COVID-19 risk. The hazard ratio (HR) of severe-critical COVID-19 for the High vs. Low nAb-ID50 tertiles was 0.21 (95% CI: 0.07, 0.67), with family-wise error rate (FWER) multiplicity-adjusted p value of 0.087 for a different hazard rate across the three tertile subgroups (Fig. 3e). For moderate COVID-19, the High vs. Low HR was 0.43 (0.25, 0.75), with FWER p = 0.052. Inverse correlations, less strong compared to those seen for nAb-ID50, were observed for the bAb markers with both COVID-19 endpoints (Fig. 3).

Fig. 3: Severe-critical COVID-19 risk and moderate COVID-19 risk by D29 antibody marker tertile.
figure 3

Plots show covariate-adjusted cumulative incidence of (a, c severe-critical COVID-19 or b, d) moderate COVID-19 by Low, Medium, and High tertiles of D29 (a, b) 50% inhibitory dilution neutralizing antibody titer (nAb-ID50) or (b, d) anti-Spike IgG concentration in baseline SARS-CoV-2–seronegative per-protocol vaccine recipients. e Covariate-adjusted hazard ratios of severe-critical COVID-19 or of moderate COVID-19 across D29 antibody marker tertiles. Endpoint counts for (ad) calculated by inverse probability of sampling D29 marker weighting. The overall p value is from a two-sided generalized Wald test of whether the hazard rate of the designated COVID-19 endpoint differed across the Low, Medium, and High subgroups. Analyses adjusted for baseline behavioral risk score and geographic region. BAU binding antibody units, CI confidence interval, FDR false discovery rate, FWER family-wise error rate, IU international units, Pt. Est. point estimate. Source data are provided as a Source Data file.

For the D29 quantitative markers, inverse correlations with endpoints were also observed, again stronger for severe-critical COVID-19, with HR per 10-fold increase in nAb-ID50 titer 0.35 (0.13, 0.90; FWER p = 0.098) compared to 0.53 (0.34, 0.82; FWER p = 0.031) for moderate COVID-19 (Table 2). Inverse correlations, less strong, were observed for both bAb markers with both COVID-19 endpoints (Table 2).

Table 2 Covariate-adjusted hazard ratios of severe-critical COVID-19 or of moderate COVID-19 per tenfold increase or per standard deviation increase in each D29 antibody marker

Cumulative incidence of severe-critical COVID-19 through 170 days post-D29 decreased across the analyzed ranges of vaccine recipient subgroups defined by D29 antibody levels at a specific value. For nAb-ID50, the cumulative incidence of severe-critical COVID-19 was estimated by a nonparametric method over values ranging from unquantifiable titer to the 90th percentile (30.2 IU50/ml). Estimated cumulative incidence was 0.70% (0.45%, 1.05%) at unquantifiable titer, 0.28% (0.07%, 0.90%) at just-quantifiable titer of 5.2 IU50/ml, and 0.09% (0.06%, 0.26%) at the 90th percentile titer 30.2 IU50/ml (blue curve, Supplementary Fig. 17c). Cumulative incidence of moderate COVID-19 also decreased: 5.2% (4.1%, 6.6%), 5.1% (3.2%, 5.9%), and 3.0% (1.8%, 4.4%) at the same values of unquantifiable titer, 5.2 IU50/ml, and 30.2 IU50/ml, respectively (blue curve, Supplementary Fig. 18c). A similar decrease in cumulative incidence with increasing concentration of each bAb marker was observed (Supplementary Figs. 17, 18).

We also estimated by nonparametric regression the cumulative incidence of severe-critical COVID-19 through 170 days post-D29 across ranges of vaccine recipient subgroups defined by D29 antibody levels exceeding a specific value. Cumulative incidence of severe-critical COVID-19 was 0.26% (0.18%, 0.34%) for all vaccine recipients; at the threshold 5.2 IU50/ml just above the nAb-ID50 assay’s lower limit of quantitation (LLOQ, 4.8975 IU50/ml), estimated cumulative incidence decreased to 0.18% (0.074%, 0.29%) (Supplementary Fig. 25c). No further decrease in estimated cumulative incidence with increasing threshold was seen, even at nAb-ID50 thresholds much higher than the LLOQ. For moderate COVID-19, there was a smaller decrease in risk for all vaccine recipients vs. those with nAb-ID50 titer exceeding any threshold above the LLOQ (Supplementary Fig. 26c). Supplementary Figs. 25, 26 also show results for the bAb markers.

With a proportional hazards model including both D29 nAb-ID50 and D29 Spike IgG, the HR of severe-critical COVID-19 per 10-fold increase was 0.31 (0.11, 0.89; p = 0.029) for nAb-ID50 and 1.22 (0.49, 3.02; p = 0.67) for Spike IgG (Table 3), supporting nAb-ID50 as the independent correlate. A similar result was seen for moderate COVID-19: HR 0.55 (0.33, 0.92; p = 0.023) and 0.92 (0.57, 1.50; p = 0.74) per 10-fold increase for nAb-ID50 and Spike IgG, respectively (Table 3).

Table 3 Covariate-adjusted hazard ratios, assessed using multivariable models, of severe-critical COVID-19 or of moderate COVID-19 per tenfold increase in each D29 antibody marker

VE against severe-critical COVID-19 increased with D29 antibody level. For nAb-ID50, estimated VE at unquantifiable titer, just-quantifiable titer of 5.2 IU50/ml, and 90th percentile titer of 30.2 IU50/ml was 63.1% (40.0%, 77.3%), 85.2% (47.2%, 95.3%), and 95.1% (81.1%, 96.9%), respectively (Fig. 4a). In comparison, estimated VE against moderate COVID-19 at the same values of unquantifiable titer, 5.2 IU50/ml, and 30.2 IU50/ml was 32.5% (11.8%, 48.4%), 33.9% (19.1%, 59.3%), and 60.7% (40.4%, 76.4%), respectively (Fig. 4b). For Spike IgG, estimated VE against severe-critical COVID-19 at negative response, just-positive concentration of 11.1 BAU/ml, and 90th percentile concentration of 125 BAU/ml was 65.4% (25.6%, 83.9%), 69.8% (41.8%, 84.6%), and 82.0% (74.4%, 92.5%) (Fig. 4c). In comparison, estimated VE against moderate COVID-19 at the same values of negative response, 11.1 BAU/ml, and 125 BAU/ml was 14.8% (–36.2%, 46.7%), 32.7% (–13.2%, 53.2%), and 59.2% (53.4%, 64.1%), respectively (Fig. 4d).

Fig. 4: Vaccine efficacy against severe-critical COVID-19 or against moderate COVID-19 by D29 antibody marker level.
figure 4

Vaccine efficacy estimates against (a, c) severe-critical COVID-19 and against (b, d) moderate COVID-19 through 170 (severe-critical) or 181 (moderate) days post-D29 were obtained using a nonparametric implementation of the method of Gilbert et al.30. Each point on the curve represents the estimated controlled vaccine efficacy at the given D29 antibody marker level: (a, b) 50% inhibitory dilution neutralizing antibody (nAb-ID50) titer and (c, d) anti-Spike IgG binding antibody concentration. Dotted lines indicate bootstrap pointwise 95% CIs. The green histograms are frequency distributions of D29 marker level, with maroon dots representing marker levels of individual cases. Analyses adjusted for baseline behavioral risk score and geographic region. Curves are plotted over the nAb-ID50 titer range from unquantifiable to the 90th percentile (30.2 IU50/ml) and over the Spike IgG concentration range from negative response to the 90th percentile (125 BAU/ml). The horizontal gray line is the overall vaccine efficacy against (a, c) severe-critical COVID-19 or against (b, d) moderate COVID-19 through 170 (severe-critical) or 181 (moderate) days post-D29, with the dotted gray lines indicating the 95% CIs. BAU binding antibody units, CVE controlled vaccine efficacy, IU international units, LLOQ lower limit of quantitation, nAb-ID50 50% inhibitory dilution neutralizing antibody. nAb-ID50 LLOQ = 4.8975 IU50/ml; Spike IgG positivity cutoff = 10.8424 BAU/ml. Source data are provided as a Source Data file.

Mediation analysis of the D29 markers showed that an estimated 28.6% (8.5%, 48.7%) of VE against severe-critical COVID-19 was mediated by nAb-ID50 titer (Table 4), with a similar proportion, 24.3% (–21.4%, 70.0%), mediated by Spike IgG concentration. In comparison, the estimated proportions of VE against moderate COVID-19 mediated by nAb-ID50 titer and Spike IgG concentration were 50.5% (0.8%, 100%) and 103% (–2.2%, 208%), with notably wide confidence intervals (Table 4).

Table 4 Mediation effect estimates for D29 quantitative markers with 95% confidence intervals

Table 5 scorecards D29 antibody marker correlate performance7, using three categories of correlate-quality criteria: (1) CoR, (2) CoP – VE modification, and (3) CoP – VE mediation (see “Methods”). (See Gilbert et al.8 for a recent summary of four statistical frameworks for assessing immune CoPs, including the VE modification and VE mediation frameworks). We focused on three comparisons. In (A), each D29 marker was compared as a correlate of severe-critical vs. moderate COVID-19. In this comparison, Spike IgG ranked slightly better as a CoR of severe-critical COVID-19 than of moderate COVID-19, was an equally good VE Modification CoP against severe-critical COVID-19 and against moderate COVID-19, and was a better VE Mediation CoP against moderate COVID-19. RBD IgG ranked better as a CoR and as a CoP against moderate COVID-19, whereas nAb-ID50 ranked better as a CoR and a VE Modification CoP against severe-critical COVID-19 but better as a VE Mediation CoP against moderate COVID-19. In (B), the three D29 markers were compared as a correlate of moderate to severe-critical COVID-19. nAb-ID50 ranked as the best CoR and as the best VE Modification CoP, while Spike IgG was the best VE Mediation CoP. Comparison (C) repeated (B) for severe-critical COVID-19. nAb-ID50 ranked as the best CoR and CoP.

Table 5 Scorecard for ranking D29 antibody marker performance in each of three categories of immune correlate-quality criteria

We also applied a third statistical framework for assessing CoPs, stochastic interventional VE (SVE)9, to assess the D29 markers as CoPs against moderate to severe-critical COVID-19. In this framework, VE is estimated under hypothetical immune marker shifts applied to all individual vaccine recipients, relative to their observed immune marker levels. For D29 nAb-ID50, estimated VE generally increased with successive shifts in titer: At no D29 nAb-ID50 shift, estimated SVE was 47.7% (95% CI: 44.6%, 50.7%), and with 1.6-fold, 4-fold, and 10-fold shifts, estimated SVE was 57.3% (53.4%, 60.8%), 54.4% (47.9%, 60.1%), and 62.9% (54.2%, 69.9%), respectively (Supplementary Fig. 49c). The p-value for testing the hypothesis that VE changes as a function of shift in D29 nAb-ID50 titer (see Methods) was <0.001, providing further evidence in support of D29 nAb-ID50 as a CoP against moderate to severe-critical COVID-19. A similar result was seen for D29 RBD IgG, with estimated SVE increasing to 49.7% (46.0%, 53.1%), 58.5% (51.5%, 64.5%), and 69.7% (54.0%, 80.0%), respectively, at the same shifts of 1.6-fold, fourfold, and tenfold in IgG concentration (p = 0.007 for testing the hypothesis that VE changes as a function of shift in D29 RBD IgG concentration) (Supplementary Fig. 49b). For D29 Spike IgG, the increases in SVE were smaller with shifted IgG concentration and the p-value for testing the hypothesis that VE changes as a function of shift in D29 Spike IgG concentration was 0.12 (Supplementary Fig. 49a).

Region-specific correlates analyses of severe-critical COVID-19 could only be conducted for Latin America (31 severe-critical vaccine endpoints) due to the low numbers of severe-critical vaccine endpoints in South Africa (5) and the United States (6) (Supplementary Table 1). While the Latin America-specific results for assessing D29 markers as correlates were similar to those of the region-pooled analyses, the point estimates for the severe-critical endpoint tended to indicate stronger correlates and the p-values tended to indicate greater significance. For example, the HR in the Latin America cohort of severe-critical COVID-19 per 10-fold increase in nAb-ID50 was 0.20 (0.05, 0.73; FWER-adjusted p = 0.048) and for moderate COVID-19 it was 0.53 (0.32, 0.90; FWER-adjusted p = 0.085) (Supplementary Table 12). Full results of the Latin America-specific analyses are reported in the Supplementary Text.

Given that this analysis assesses immune correlates through ~7 months post-vaccination, whereas our previous correlates analysis of ENSEMBLE5 assessed through ~2.5 months post-vaccination, waning of antibody levels over time is important to consider. Using a measurement error statistical method, we performed an exposure-proximal correlates analysis for a hypothetical scenario where the antibody marker under study was repeatedly measured from serum samples collected on every day of follow-up, and the analysis assesses how the current value of this daily measured marker correlates with the hazard of COVID-19 (i.e., the probability of COVID-19 occurrence over the next day) (see Methods for details). From these current-marker conditional hazard curves, we generated current-marker conditional VE curves (exposure-proximal VE) by dividing the conditional hazard curve by the hazard of COVID-19 for the whole placebo arm. Figure 5a shows that exposure-proximal VE against severe-critical COVID-19 rose as current nAb-ID50 titer increased across the range of analyzed values (unquantifiable titer up to the 97.5th percentile). Similar results were obtained for current Spike IgG concentration, albeit with a less steep increase and with a wider 95% CI at the left end of the curve (Fig. 5c). Similarly, exposure-proximal VE against moderate COVID-19 increased with current nAb-ID50 titer (Fig. 5b) as well as with current Spike IgG concentration (Fig. 5d). Latin America-specific exposure-proximal VE curves against severe-critical COVID-19 are shown in Supplementary Fig. 54 and against moderate COVID-19 in Supplementary Fig. 55; these results, which were similar to those in the pooled analysis, are discussed in the Supplementary Text.

Fig. 5: Exposure-proximal vaccine efficacy against severe-critical COVID-19 or against moderate COVID-19 by current antibody marker level.
figure 5

Analyses were performed in baseline SARS-CoV-2 seronegative per-protocol vaccine recipients. Exposure-proximal vaccine efficacy estimates against (a, c) severe-critical COVID-19 and against (b, d) moderate COVID-19 through 170 (severe-critical) or 181 (moderate) days post-D29 by current antibody marker level were obtained using the method of Huang and Follmann44, with “current” referring to the true underlying antibody marker level not subject to technical measurement error, in a hypothetical scenario in which the value was available from serum samples collected every day over the follow-up period (see “Methods”). Each point on the curve represents the vaccine efficacy at the given current antibody marker level: (a, b) 50% inhibitory dilution neutralizing antibody (nAb-ID50) titer and (c, d) anti-Spike IgG binding antibody concentration. The dashed lines are bootstrap pointwise 95% CIs. Analyses adjusted for baseline behavioral risk score and geographic region. Curves are plotted over the range from negative binding antibody response (or unquantifiable neutralizing antibody titer) to the 97.5th percentile of each current antibody marker level: Spike IgG, negative response to 352 BAU/ml; RBD IgG, negative response to 486 BAU/ml; nAb-ID50, unquantifiable to 43.4 IU50/ml. Positivity cutoffs: 10.8424 BAU/ml for Spike and 14.0858 BAU/ml for RBD; nAb-ID50 LLOQ = 4.8975 IU50/ml. BAU binding antibody units, CI confidence interval, IU international units, LLOQ lower limit of quantitation, nAb-ID50 50% inhibitory dilution neutralizing antibody. Source data are provided as a Source Data file.

Discussion

The substantial estimated VE against severe-critical COVID-19 (63.1% at unquantifiable nAb-ID50 titer and 65.4% at negative Spike IgG response) for vaccine recipients with very low D29 antibody levels, along with the finding that only 28.6% (8.5%, 48.7%) of estimated VE was mediated by nAb-ID50 titer (Table 4), suggest that: (1) low antibody levels (unquantifiable/undetectable by the immune assays used here) may protect against severe-critical COVID-19; and (2) markers of other immune functions are likely also correlates of protection against severe-critical COVID-19. Moreover, the lower estimated VE against moderate COVID-19 at unquantifiable nAb-ID50 titer, 32.5% (14.8% at negative Spike IgG response), is consistent with the idea that T-cell responses play an important role in preventing severe disease even at low antibody levels10. In support of this hypothesis, CD8 + T-cell count was shown to associate with survival in patients with both COVID-19 and hematologic cancer (and hence impaired humoral immunity)11. Other studies have also provided evidence that T cells may play a role in preventing severe COVID-19: both the magnitude and frequency of Spike-specific CD4 + T-cell responses measured in the acute phase of COVID-19 were shown to correlate inversely with disease severity, as did CD4 + T-cell response polyantigenicity12. Moreover, SARS-CoV-2-specific CD4 + T-cell response magnitude and SARS-CoV-2-specific CD8 + T-cell response magnitude were each inversely associated with peak disease severity in a cohort of consisting of patients with acute COVID-19 and convalescent donors13. Mechanistic insight into the beneficial role of CD8 + T cells against severe disease was provided by Peng et al., who reported that NP105–113-B*07:02-specific CD8+ T cell response magnitude associated inversely with disease severity and that NP105–113-B*07:02-specific CD8+ T cells showed a highly diverse TCR repertoire, high functional avidity, and antiviral activity as measured by suppression of SARS-CoV-2 replication14.

It is also possible that non-neutralizing Fc effector functions contribute to protection against severe COVID-1915. Although relatively little data are currently available to support this hypothesis, passively administered non-neutralizing antibodies were shown to confer protection against severe disease in a mouse model of SARS-CoV-2 infection, and this protection was linked to their Fc effector functions16.

Our findings based on individual-level correlates of protection analysis are consistent with those of previous studies using complementary approaches to investigate correlates of protection against severe COVID-19 disease outcomes. Khoury et al. took a population-level modeling approach including data from seven phase 3 COVID-19 vaccine efficacy trials and one convalescent cohort and reported that a given nAb level (expressed as fold of convalescent, due to assay differences across the studies) predicts higher VE against severe vs. symptomatic COVID-19, with this difference greatest at lowest nAb levels (Fig. 3a in ref. 17). For example, the nAb level associated with 50% VE against severe COVID-19 was approximately sixfold lower than that associated with 50% VE against symptomatic COVID-19. Subsequently, Cromer et al. applied the model developed by Khoury et al. to show that the prediction for a given nAb level of higher VE against severe vs. symptomatic COVID-19 was most apparent for Ancestral COVID-19, but also held for Alpha, Delta, and Beta COVID-19 (Fig. 3b vs. 3a in ref. 18). Cromer et al. also validated the model of Khoury et al. with vaccine efficacy/effectiveness estimates from one phase 3 randomized, controlled COVID-19 vaccine trial, seven test-negative design studies, and six retrospective cohort studies, showing significant correlation between the predicted vs. reported vaccine efficacy/effectiveness estimates against severe (Fig. 3B) and against symptomatic (Fig. 3A) COVID-1919. A limitation of the model used in refs. 17,18,19 is its assumption that nAbs alone are responsible for protection against severe disease, and thus potential contributions of T-cell responses or other non-neutralizing functions are not considered.

This study examined whether post-vaccination antibody levels correlated with a severe COVID-19 disease outcome, as well as whether they associated with vaccine protection against the same severe outcome. A strength is that the study analyzed individual-level data from a phase 3 randomized, placebo-controlled efficacy trial (RCT), considered “gold standard” data20 due to the lack of biases and confounding that can be present in other types of studies. Another strength of the study, given that all data are from the same phase 3 study, is that all severe-critical cases included in the analysis met the same definition for severe COVID-19 disease. Furthermore, multiple distinct statistical frameworks8 were applied to assess the D29 markers as CoPs of severe-critical COVID-19 (two frameworks used) and against moderate to severe-critical COVID-19 (three frameworks used), with the stochastic interventional vaccine efficacy (SVE) framework not previously applied to assess immune markers as CoPs in the ENSEMBLE trial. Moreover, this analysis of the ENSEMBLE trial assessed current marker levels (as opposed to marker levels measured on a given day post-vaccination, as in our previous correlates analysis of ENSEMBLE) as correlates of instantaneous COVID-19 outcomes, including a severe outcome. Limitations of the present analysis include the fact that the follow-up period considered here was in the pre-Omicron era such that it is unknown whether the antibody measurements analyzed here have the same statistical relationship with VE against Omicron COVID-19, or whether antibody measurements against e.g. Omicron SARS-CoV-2 would be better correlates of Omicron COVID-19. Another limitation is that the analysis was done in baseline SARS-CoV-2 seronegative participants, whereas the majority of the global population is now SARS-CoV-2 seropositive21.

Together, the results of this work build evidence for neutralizing antibody titer in particular as a surrogate endpoint for adenovirus-based COVID-19 vaccine protection from severe COVID-19. The analyses support that a lower nAb titer is needed to achieve a high level of vaccine-induced protection against severe-critical COVID-19 than against moderate COVID-19. A potential implication for nAb-endpoint immunobridging studies is that lower nAb titers may be able to be used to infer effectiveness against a severe-critical endpoint than would be required to infer effectiveness against a moderate or “any-severity” endpoint.

Methods

Ethics statement

All participants provided written informed consent before enrollment. The ENSEMBLE trial (NCT04505722) adhered to the principles of the Declaration of Helsinki and to the Good Clinical Practice guidelines of the International Council for Harmonisation. The protocol (available with Sadoff et al.5) and all amendments were approved by the relevant local ethics committees and Institutional Review Boards (see the Inclusion and Ethics section below for a comprehensive list) according to local regulations. Site PIs were invited as co-authors according to the enrollments performed in the study, and were given the opportunity for intellectual contribution. All experiments were performed in accordance with the relevant guidelines and regulations.

Trial design

The ENSEMBLE trial enrolled and randomized 44,325 participants 1:1 to Ad26.COV2.S vaccine or placebo, with enrollment beginning on September 21, 20205. Participants were not compensated for their participation. All participants were naïve to any investigational COVID-19 vaccine at enrollment. Participants were enrolled at sites in Argentina, Brazil, Chile, Colombia, Mexico, Peru, South Africa, and the United States. See Supplementary Fig. 1 for a schematic diagram of the trial, sampling time points, and blinded follow up period.

Study endpoints and correlates analysis cohort

Moderate, severe-critical, and moderate to severe-critical COVID-19 endpoints were defined as in section 8.1.3.1 of the study protocol of Sadoff et al.5, except with the differences outlined in the “Trial design, study cohort, COVID primary endpoints and case/non-case definitions” section of Methods in Fong et al.6. Moderate, severe-critical, and moderate to severe-critical COVID-19 endpoints starting ≥7 days post D29 and ≥ 28 days post vaccination up to the end of the correlates study period are included. The rationale for starting to count endpoints at 7 days is that the D29 antibody markers in participants diagnosed with a COVID-19 endpoint between 1 and 6 days post D29 might have been influenced by SARS-CoV-2 infection.

As in Fong et al.6, correlates analyses were performed in baseline SARS-CoV-2 seronegative participants in the per-protocol cohort, with the same definition of per-protocol as in ref. 4 Participants with any evidence of SARS-CoV-2 infection or any right censoring up to 7 days post D29 were excluded. Within this correlates analysis cohort:

  • Moderate, severe-critical, and moderate to severe-critical cases were the corresponding COVID-19 disease endpoints occurring in the time frame described above.

  • Non-case vaccine recipients were sampled into the immunogenicity subcohort with no evidence of SARS-CoV-2 infection up to the end of the correlates study period, which was up to 220 days post D1 (Latin America analyses) or 140 days post D1 (South Africa, United States) but not later than the data cut-off date of July 9, 2021.

Laboratory methods

Pseudovirus neutralization assay

Neutralizing antibody titers against lentiviral particles pseudotyped with full-length SARS-CoV-2 Spike (index strain MN908947.3 harboring the D614G point mutation) were measured using the PhenoSense SARS CoV-2 Assay (Monogram Biosciences). This assay has been validated to CLIA/CAP standards and a detailed methods paper has been published22. Briefly, lentiviral particles were produced by co-transfecting HEK 293 cells23 [source: Master Cell Bank (LC0027490) established by Monogram Biosciences in 2001] with a plasmid driving expression of Spike (pCXAS-SARS-CoV-2-D614G) and a lentiviral backbone plasmid (F-lucP.CNDO∆U324). The lentiviral vector contains a firefly luciferase reporter gene, such that the SARS-CoV-2 pseudotyped virus expresses firefly luciferase after infection of HEK 293 cells. Luminescence, measured in relative light units, is directly proportional to virus inoculum infectivity.

At 2 days post-transfection, pseudovirus stock was collected, filtered, and frozen at < − 70 °C). Next, pseudovirus was incubated for one hour at 37 °C with 10 serial three-fold dilutions of serum samples (all human serum samples were heat-inactivated at 56 °C for 60 min before assays were run). A suspension of HEK 293 cells that had been transiently transfected 24 h prior to assay day with plasmids driving the expression of the ACE2 receptor and of the TMPRSS2 protease was then added to the serum-virus mixtures (10,000 cells per well), after which plates were incubated for 3 days at 37 °C in 7% CO2. After the addition of Steady Glo reagent (Promega) to each well, followed by a brief incubation, luciferase signal (relative luminescence units) was measured using a Luminoskan luminometer. Neutralization titers represent the inhibitory dilution (ID) of serum samples at which RLUs were reduced by 50% (ID50) compared to virus control wells (no serum wells). Data analysis (inhibition curve fitting and ID50 determinations) was done using Monogram proprietary analysis software. Using the WHO International Standard for anti-SARS-CoV-2 immunoglobulin 20/13625, assay readouts were converted to standardized International Units (IU50/ml) as described in Fong et al.6. Assay limits are given in Supplementary Table 10.

Solid-phase electrochemiluminescence S-binding IgG immunoassay

Serum IgG binding antibodies against SARS-CoV-2 Spike or the Spike receptor binding domain (RBD) were measured using a validated solid-phase electrochemiluminescence S-binding IgG immunoassay as described26. The assay used custom MSD SECTOR plates precoated by Meso Scale Discovery with Spike, RBD, and BSA (as a control) and was performed with a Beckman Coulter Biomek based automation integration platform. Assay steps included heat inactivation of test serum samples (30 min at 56 °C); blocking of plates for 60 min at room temperature with MSD blocker A solution; plate washing; and addition of reference standard, quality control test sample, and human serum test samples to plates. Each test sera sample was assayed in duplicate (within a run) in an 8-point dilution series. Samples were incubated at room temperature for 4 h with shaking, plates were washed to remove unbound antibodies, and antibodies bound to Spike or to RBD were detected using an MSD SULFO-TAG anti-human IgG detection antibody (Meso Scale Diagnostics, R32AJ-1, goat polyclonal) diluted to 1X from a 200X vendor-provided stock. MSD Discovery Workbench software (version 4.0) was used for analysis. Using the WHO International Standard for anti-SARS-CoV-2 immunoglobulin 20/13625, assay readouts were converted to standardized binding antibody units (BAU) as described26. Assay limits are given in Supplementary Table 10.

Statistical methods

Plots of variants causing the severe-critical cases over time and by region were done in R (version 4.3.1)27. Code for generating these plots is provided in the Supplementary Software 1 file.

Immune correlates analyses were performed as pre-specified in the SAP for the previous ENSEMBLE correlates analyses6, with updates noted below to accommodate the longer follow-up, to accommodate separate correlates analyses against the three study endpoints moderate COVID-19, severe-critical COVID-19, and moderate to severe-critical COVID-19, and to include correlates of protection analyses using the stochastic interventional vaccine efficacy framework as well as to assess exposure-proximal correlates of risk.

Covariate adjustment

All analyses adjusted for a baseline risk score as described in ref. 6 The geographic region-pooled analyses (Latin America, South Africa, United States) additionally adjusted for geographic region.

Correlates of risk in vaccine recipients

As in Fong et al.6, the covariate-adjusted hazard ratio of the relevant COVID-19 disease endpoint (moderate, severe-critical, or moderate to severe-critical), across marker tertiles, per 10-fold increase in quantitative marker, or per standard deviation-increase in quantitative marker, was estimated using inverse probability sampling-weighted Cox regression models with 95% confidence and Wald-based p-values. Cox model fits were done in the R package survey (version 4.0)28, with 95% CIs calculated using the percentile bootstrap. For the plots of marker-conditional cumulative incidence of the relevant COVID-19 disease endpoint, Cox model fits were implemented with the R package vaccine (version 1.2.1)29 with 95% CIs calculated analytically. Nonparametric dose-response regression was also performed with influence-function-based, Wald-based 95% CIs30. Point estimates and 95% CIs for marker-threshold-conditional cumulative incidence were calculated using nonparametric targeted minimum loss-based regression31.

Correlate of protection analyses

Controlled vaccine efficacy

Vaccine efficacy by level of each antibody marker was estimated using a Cox proportional hazards implementation [done using the R package vaccine (version 1.2.1)29] as well as a nonparametric monotone dose-response implementation of the controlled effects approach of Gilbert et al30. and Kenny32 [implemented in the R package vaccine (version 1.2.1)29]. The approach estimates the causal parameter of one minus the probability of the relevant COVID-19 disease endpoint (moderate, severe-critical, or moderate to severe-critical) by 181, 170, or 181 days (all for the geographic region-pooled analysis), respectively, post Day 29 under a hypothetical assignment of all participants to receive the vaccine and to have their D29 marker set to a certain specified value, divided by this probability under a hypothetical assignment of all participants to receive the placebo (see the SAP for the previous ENSEMBLE correlates analyses6, for further details).

Controlled vaccine efficacy sensitivity analysis

To assess the robustness of the controlled vaccine efficacy results to potential unmeasured confounders, a sensitivity analysis was conducted. This analysis repeated the analyses noted above assuming the existence of a specified unmeasured confounder that would make it harder to infer a correlate of protection (see the SAP for the previous ENSEMBLE correlates analyses6 for further details). A sensitivity analysis was also performed based on E-values33 of the vaccine recipient antibody tertile subgroups. See the Supplementary Appendix of ref. 26 and the SAP for the previous ENSEMBLE correlates analyses6 for further details.

Mediation analysis

Each antibody marker was assessed as a mediator of vaccine efficacy using the methods of Kenny 202332, as implemented in the R package vaccine (version 1.2.1)29. With this method, interest lies in the proportion of the vaccine-induced risk reduction that is mediated through a given marker at D29. This proportion is defined as \(1-\frac{\log \left({NDE}\right)}{\log \left({RR}\right)}\), where NDE is the natural direct effect of the vaccine (i.e., the risk ratio of the vaccine group with the antibody marker set to the value if assigned placebo relative to the placebo group, the Non-marker mediated VE) and RR is the risk ratio of the vaccine group relative to the placebo group (RR = 1 – Overall VE). Nonparametric estimators are used to estimate the NDE and the RR, which are in turn used to estimate the proportion mediated. From the proportion mediated formula, it is evident that the proportion is not a true proportion, in that it can take values outside of the interval [0, 1]. That is, given RR < 1, NDE < RR implies the proportion mediated is negative (i.e., greater Non-marker mediated VE than Overall VE implies negative proportion mediated).

Stochastic interventional VE analysis

The USG COVID-19 Response Team / CoVPN Vaccine Efficacy Trial Immune Correlates Statistical Analysis Plan34 describes the use of counterfactual risk and VE measures for hypothetical (analyst-specified) changes to the observed distributions of a fixed set of candidate immune correlates of protection; this has been termed a stochastic-interventional (risk or) vaccine efficacy (SVE) for its relation to causal inference parameters that may be defined based on stochastic interventions35 or modified treatment policies36.

We estimated the counterfactual mean probability of moderate to severe-critical COVID-19 by 181 days post D29 under posited mean shifts in the measured D29 Spike IgG, RBD IgG, and nAb-ID50 levels. For each D29 marker, measured levels were hypothetically shifted along a grid {0, 0.2, 0.4, 0.6, 0.8, 1}, given on the log10 scale such that −1.0 represents a 10-fold decrease in nAb-ID50 titer or concentration and 1.0 represents a 10-fold increase in its titer or IgG concentration, allowing for stochastic-interventional risk of the moderate to severe-critical COVID-19 endpoint to be evaluated via the techniques of Hejazi et al.9 and then translated to the VE scale as detailed in the CoVPN Immune Correlates SAP34. When hypothetical values of such shifts resulted in more than 10% of participants’ counterfactual marker values being placed below an assay’s lower limit of detection (LLOD) (for nAb-ID50), or positivity cut-off (for Spike IgG and RBD IgG), the corresponding hypothetical shifts were omitted from the grid. For the grid of shifts considered for analysis, the trajectory of the estimates of stochastic-interventional (risk or) VE along shifts in GM titer or concentration was summarized by a nonparametric working marginal structural model, resulting in a summary measure of the predicted impact of D29 Spike IgG, RBD IgG, or nAb-ID50 levels on VE. Based on the slope of this linear working model, a hypothesis test for whether VE as a function of shifts in GM titer or concentration changes with the shifts is performed, where a small 2-sided p-value supports a correlate of protection.

Code for conducting the stochastic interventional vaccine efficacy analysis is available in the Supplementary Software 2 file. Analyses were implemented using the txshift (version 0.3.8)37,38 and sl3 (version 1.4.6) packages39 for the R language and environment for statistical computing40; these methods were previously applied to the Moderna COVE vaccine efficacy trial41,42.

Antibody decay and Cox modeling for exposure-proximal correlates

For exposure-proximal immune correlates analyses, a regression calibration43 based approach was adopted as described in Huang and Follmann44. A hazards model was considered for time to event (moderate, severe-critical, or moderate to severe-critical COVID-19):

$$\lambda (s)={\lambda }_{0}(s)\exp (Z\,[{\beta }_{0}+{\beta }_{1}X\{s\}]+{\beta }_{2}W)\,I(\tau \, < \, s),$$
(1)

where s is calendar time, τ is study entry time, Z is treatment indicator (0 and 1 for the placebo and vaccine arm, respectively), X(s) is the true underlying immune marker at calendar time s had the immunoassay been performed on a sample collected on that day, and W are the baseline covariates baseline risk score and region (for analyses pooling over regions). For each immune marker (Spike IgG, RBD IgG, nAb-ID50), a linear mixed effects model was used to model the immune marker trajectory over time, with fixed effect for time since D29, age, sex and random intercept for individuals, adjusting for the case-control sampling weights. As these analyses restricted to data during the blinded phase, the linear mixed effects model with random intercepts is based on two time points: D29 and D71 (see Supplementary Fig. 57 for marker trajectory from a random sample). Only participants with both D29 and D71 measurements contributed to model fitting [paired D29, D71 measurements from n = 719 participants were used to fit the bAb (Spike, RBD) trajectory models and from n = 259 participants were used to fit the nAb-ID50 trajectory model]. Based on the linear mixed effects model fit, the expected value of the immune marker at every day post D29 was estimated conditional on age, sex, and observed history of immune response measures. Cox model parameters were estimated by maximizing the partial likelihood based on the induced hazard43. The instantaneous-hazard vaccine efficacy curve conditional on the immune marker taking current value x, V E(x) = 1 − exp(β0 + β1x), was then estimated based on the β estimates. The nonparametric bootstrap with 500 samples was used to construct the 95% pointwise confidence interval for V E(x).

Scorecard for ranking antibody marker immune correlate performance

As we have previously done7, we systematically ranked the antibody markers as correlates of the different COVID-19 endpoints. Our comparison was based on three categories of correlate-quality criteria: (1) CoR, (2) CoP—VE modification, and (3) CoP—VE mediation. The two criteria for ranking within category (1) were: the HR point estimate per 10-fold increase and the HR point estimate for High vs. Low tertile. The three criteria for ranking within category (2) were: the fold-increase on the multiplicative scale of controlled VE at unquantifiable/undetectable marker level to the 90th percentile of the marker as estimated using a Cox model, the same increase as estimated using a nonparametric model, and the E-value for the point estimate based on the marginalized Cox model (High vs. Low). The criterion for ranking within category (3) was the point estimate of the proportion of VE mediated through the antibody marker, and for comparing antibody markers for a given COVID-19 endpoint, the lower 95% confidence limit for the proportion of VE mediated was used as a second criterion.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.