Abstract
Background
The severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has been widespread since 2020 and will likely continue to cause substantial recurring epidemics. However, understanding the underlying infection burden and dynamics, particularly since late 2021 when the Omicron variant emerged, is challenging. Here, we leverage extensive surveillance data available in New York City (NYC) and a comprehensive model-inference system to reconstruct SARS-CoV-2 dynamics therein through August 2023.
Methods
We fit a metapopulation network SEIRSV (Susceptible-Exposed-Infectious-(re)Susceptible-Vaccination) model to age- and neighborhood-specific data of COVID-19 cases, emergency department visits, and deaths in NYC from the pandemic onset in March 2020 to August 2023. We further validate the model-inference estimates using independent SARS-CoV-2 wastewater viral load data.
Results
The validated model-inference estimates indicate a very high infection burden: the number of infections (i.e., including undetected asymptomatic/mild infections) totaled twice the population size (>5 times the documented case count) during the first 3.5 years. Estimated virus transmissibility increased around 3-fold, whereas estimated infection-fatality risk (IFR) decreased by >10-fold during this period. The detailed estimates also reveal highly complex variant dynamics and immune landscape, and higher infection risk during winter in NYC over the study period.
Conclusions
This study provides highly detailed epidemiological estimates and identifies key transmission dynamics and drivers of SARS-CoV-2 during its first 3.5 years of circulation in a large urban center (i.e., NYC). These transmission dynamics and drivers may be relevant to other populations and inform future planning to help mitigate the public health burden of SARS-CoV-2.
Plain Language Summary
Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) emerged in late 2019, causing the COVID-19 pandemic and multiple epidemics since. Using comprehensive surveillance data and mathematical tools, this study estimated SARS-CoV-2 infection burden and severity over time as well as examined key factors affecting the epidemic patterns, during its first 3.5 years of circulation in New York City. Study findings highlight the emergence of new SARS-CoV-2 strains and higher infection risk in winter as key epidemic drivers during the study period; these may be observed in other populations and could inform future planning to help mitigate the public health burden of SARS-CoV-2.
Introduction
The severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) emerged in late 2019. Within months, it quickly spread worldwide, prompting the World Health Organization (WHO) to declare coronavirus disease 2019 (COVID-19) a global public health emergency on January 30, 2020, a designation lasting for 3+ years through May 5, 20231. Populations worldwide have experienced multiple COVID-19 pandemic waves, and will likely continue to endure recurring epidemics, even after the declared ending of the pandemic phase. Given the disease’s historical importance, high potential to cause future epidemics, and long-term health impacts (e.g., long-COVID2), it is important to better understand its transmission dynamics, infection burden, and severity over time.
Many studies have reported SARS-CoV-2 transmission dynamics during the initial and subsequent pandemic waves3. However, transmission dynamics after the Omicron BA.1 wave remain less characterized. Many Omicron subvariants have emerged after BA.1, causing outbreaks with varying magnitude and quickly supplanting one another4. While surveillance systems (e.g., registries of laboratory-reported cases and death certificates) can provide valuable information, potential biases (e.g., due to differential test-seeking behaviors) could limit the understanding of epidemic dynamics5,6,7. For example, underlying SARS-CoV-2 infection rates were not completely captured by surveillance based on clinical testing due to high rates of asymptomatic and mild infection8,9 and use of at-home testing (at-home test results are not reported to health departments10), nor were they captured by serologic surveys due to high rates of reinfection. Due to these limitations, to what extent populations are (re)infected by each subvariant and what drives the Omicron-subvariant waves—e.g., increased transmissibility and/or immune evasion—remain unclear.
In this study, we leverage extensive surveillance data available in New York City (NYC) and a comprehensive model-inference system to reconstruct the underlying SARS-CoV-2 transmission dynamics therein during March 2020–August 2023. NYC is a densely populated, large, urban center with 8+ million people that became one of the first pandemic epicenters in March 2020. We have previously reported model-inference estimates for the first two pandemic waves11,12,13. Here, we fit a more detailed model to age- and neighborhood-specific data of COVID-19 cases, emergency department (ED) visits, and deaths, and validate model-inference estimates using independent SARS-CoV-2 wastewater viral load data, i.e., measurements of population-level SARS-CoV-2 fecal shedding that are less subject to testing biases. The validated model-inference estimates allow quantification of weekly infection rates by each (sub)variant and key epidemiologic features including the underlying population susceptibility, variant-specific transmissibility, and infection-fatality risk (IFR) over time since the pandemic onset. Overall, we estimate a very high infection burden totaling twice the population size (>5 times the case count) but decreasing IFRs (a >10-fold reduction across all age groups), and highlight several key factors driving transmission dynamics, during the initial 3.5 years of SARS-CoV-2 circulation.
Results
The model-inference system reconstructed underlying SARS-CoV-2 infection dynamics that are consistent with independent SARS-CoV-2 wastewater surveillance data
The model-inference system is able to recreate the epidemic curves of weekly cases, ED visits, and deaths, combining all ages (Fig. 1a–c) and for individual age groups (Supplementary Fig. 1). Given the large uncertainty due to changes in clinical testing and reporting requirements, we further validate the estimates using independent SARS-CoV-2 wastewater surveillance data not used for model inference. For this model validation, we aggregated the model estimates and wastewater surveillance data both to the city level. As shown in Fig. 1d–f, the estimated number of infectious people per 100,000 population per week closely tracked the measured SARS-CoV-2 viral load in wastewater. This close agreement is evident for all three major periods, i.e., the 2nd wave mostly due to the ancestral and Iota variants during Fall 2020/Winter 2021 (Fig. 1d; Pearson correlation coefficient r = 0.91, 95% confidence interval [CI]: 0.84–0.95), the Delta wave during Summer/Fall 2021 (Fig. 1e; r = 0.64, 95% CI: 0.29–0.84), and the Omicron period since late November 2021 (Fig. 1f and inset for recent months; r = 0.89, 95% CI: 0.84–0.93). These results indicate the model-inference system adequately accounted for changing infection-detection rates over time, and accurately reconstructed the underlying SARS-CoV-2 infection rates and transmission dynamics during the study period.
Upper panel shows model fit to weekly number of COVID-19 cases (a), emergency department (ED) visits (b), and deaths (c), during the week starting 03/01/20 (mm/dd/yy) to the week starting 08/27/23 (see x-axis). Blue lines show the median estimates and blue areas show 50% (darker) and 95% (lighter) credible intervals (CrIs); dots show the corresponding observations. Lower panel shows model validation using wastewater surveillance data, for the 2nd wave (d), Delta wave (e), the Omicron period (f). Lines and shaded areas show the estimated infection prevalence (i.e., the number of all infectious individuals including those not detected as cases; median, 50% and 95% CrIs; left y-axis). Dots show measured SARS-CoV-2 concentrations in wastewater (right y-axis, in million copies per day per population) for the corresponding weeks (black dots show measurements using RT-qPCR and red dots show measurements using RT-dPCR but converted to RT-qPCR equivalents; note that the wastewater concentrations are scaled for each wave/period to facilitate comparison with model estimates; see Methods for details).
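As a rough illustration of the correlation-based validation reported in this subsection, the sketch below computes a Pearson correlation coefficient and an approximate 95% confidence interval. The use of the Fisher z-transformation for the interval is an assumption made here for illustration (the text does not state how the intervals were obtained), and the two weekly series are made-up numbers, not study data.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def fisher_ci(r, n, z_crit=1.959964):
    """Approximate 95% CI for r via the Fisher z-transformation (assumed method)."""
    z = math.atanh(r)
    se = 1.0 / math.sqrt(n - 3)
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)

# Hypothetical weekly series: estimated infectious people per 100,000 and
# a scaled wastewater viral load for the same weeks (illustrative values only).
prevalence = [120.0, 150.0, 210.0, 340.0, 500.0, 460.0, 300.0, 180.0]
wastewater = [1.1, 1.6, 2.0, 3.5, 4.8, 4.9, 2.7, 2.0]

r = pearson_r(prevalence, wastewater)
ci_low, ci_high = fisher_ci(r, len(prevalence))
print(f"r = {r:.2f}, 95% CI: {ci_low:.2f} to {ci_high:.2f}")
```

Because the interval width depends on the number of weeks, a shorter period with a similar fit would be expected to give a wider interval.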
Overview of the COVID-19 pandemic/epidemic dynamics through August 2023
During the study period (the week starting March 1, 2020, to the week starting August 27, 2023), 3.2 million confirmed and probable cases were reported to the NYC Department of Health and Mental Hygiene (NYC Health Department) (or 38.2% of the size of the city’s population; Table 1). However, estimated infections totaled 17.4 million [95% credible interval (CrI): 14.2–21.5], more than five times the documented case count. During the pre-Omicron period (March 2020–November 2021), the model-inference system estimated cumulative infections totaling 53.7% (95% CrI: 43.7–64.6%) of the size of the city’s population; these estimates include all infections, and do not distinguish between initial and subsequent infections for the same individual. Most of these infections were caused by the ancestral and Iota variants during the 1st and 2nd waves (estimated 38.0% of the size of the city’s population, 95% CrI: 19.7–75.4%), followed by Delta (8.4%, 95% CrI: 4.9–21.3%) and Alpha (2.9%, 95% CrI: 1.5–5.6%).
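The headline ratios in this paragraph can be checked with a few lines of arithmetic. The sketch below uses only the rounded figures quoted above (3.2 million cases, 38.2% of the population, 17.4 million estimated infections); the implied population size is back-calculated from those rounded values and is therefore approximate.

```python
# Rounded figures quoted in the text
reported_cases = 3.2e6            # confirmed and probable cases
case_share_of_population = 0.382  # cases as a fraction of the city's population
estimated_infections = 17.4e6     # median model-inference estimate

# Population size implied by the two case figures (approximate, due to rounding)
implied_population = reported_cases / case_share_of_population

# Infections per documented case, and infections relative to population size
infections_per_case = estimated_infections / reported_cases
infections_per_capita = estimated_infections / implied_population

print(f"implied population: {implied_population / 1e6:.1f} million")
print(f"infections per documented case: {infections_per_case:.1f}")
print(f"infections relative to population size: {infections_per_capita:.2f}")
```

These values are consistent with the statements that estimated infections were more than five times the documented case count and roughly twice the population size.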
During the Omicron period (November 2021–August 2023), (re)infections by the Omicron subvariants alone tripled, totaling 12.9 million (95% CrI: 10.5–16.1), or 153.3% (95% CrI: 124.8–192.0%) of the size of the city’s population. The BA.1 wave was the largest Omicron-subvariant wave thus far, infecting around 40% of the size of the city’s population, or 3.5 million people (95% CrI: 2.2–5.6), within roughly two months (Table 1 and Fig. 2a). After BA.1 subsided, multiple Omicron subvariants circulated in NYC. By the end of August 2023, at least 14 Omicron subvariants including BA.1 had an estimated cumulative infection rate surpassing 1% of the city’s population size (vs. only four such variants prior to Omicron; Table 1). Multiple smaller Omicron-subvariant waves occurred, often with several subvariants cocirculating (Fig. 2a). Most notably, the BA.2/BA.2.12.1 wave occurred during Spring/Summer 2022, the BA.5 wave during Summer/Fall 2022, and the XBB.1.5 wave during Winter 2023, each infecting around 20% of the size of the city’s population (Table 1 and Fig. 2a).
a shows estimated infection rates; b shows estimated population immunity dynamics; and c shows estimated virus transmissibility. In (a), colored bars show estimated median weekly infection rates, for each variant (see legends). In (b), we overlay estimated population susceptibility [left y-axis; blue line = median, blue areas = 50% (darker) and 95% (lighter) CrIs], and proxies of cumulative infection (colored stacked bars from top to bottom, right y-axis; same legends as in (a) for different variants) and vaccine-induced immunity against infection (open bars; see Methods). In (c), we show estimated virus transmissibility [left y-axis; blue line = median, blue areas = 50% (darker) and 95% (lighter) CrIs] and infection rates [boxplot and right y-axis; middle bar = median, edges = 50% CrIs, and whiskers = 95% CrIs] for the corresponding weeks.
Key factors driving SARS-CoV-2 transmission dynamics
SARS-CoV-2 transmission dynamics have been driven by multiple factors, including use of nonpharmaceutical interventions (NPIs), population immunity (due to prior infection and/or vaccination), new variants, and seasonal risk of infection11,12,13, all of which were accounted for by the model-inference system (see Methods). Since NPIs have become less prevalent during more recent waves (e.g., mask mandate in NYC schools was lifted in March 202214), here we focus on reporting the impact of the other aforementioned factors.
First, population susceptibility varies following surges in infections, vaccinations, and circulation of immune-evasive variants (Fig. 2b), and in turn determines the epidemic trajectory. Before the Delta wave, mixed immunity from both prior infections and vaccinations collectively lowered population susceptibility such that the sums (stacked bars, from top to bottom, in Fig. 2b; see details in Methods) closely tracked the complement of estimated susceptibility (i.e., the estimated composite population immunity against infection; see blue line and shaded area in Fig. 2b). The Delta variant partially evaded both infection- and vaccine-induced immunity15,16 such that the estimated susceptibility substantially increased during the Delta wave; the estimated population immunity was lower than the sum of prior infections and vaccinations (note this sum would roughly reflect the maximum of expected population immunity, should there be no immune evasion; see the stacked bars dipping below the blue line in Fig. 2b). Nonetheless, strong mixed immunity at the time (>50%; see stacked bars and blue lines in Fig. 2b) likely helped to temper the intensity of the Delta wave in Summer 2021.
Omicron BA.1 was highly immune evasive against all pre-existing variants17,18,19. After adjusting for the lower vaccine effectiveness and weaker immunity from pre-Omicron infections (see Methods), the combined mixed immunity (stacked bars) closely matched the complement of estimated susceptibility (blue line) during the BA.1 wave (Fig. 2b). It is evident from Fig. 2b that rapid accumulation of BA.1 infection (pink bars) along with fast uptake of the 3rd vaccine dose (open bars) at the time substantially increased population immunity, which likely accelerated the decline of BA.1. The large BA.1-infection-induced immunity also appeared to curb immediate surge of subsequent Omicron subvariants—particularly, BA.2/BA.2.12.1 and BA.5, even though these subvariants were able to partially evade that prior immunity20,21 (Fig. 2b, see the stacked bars dipping below the blue line during Summer 2022). However, in Fall 2022/Winter 2023, infection-induced immunity appeared to come from a large number of post-BA.1 subvariants, accumulated through their continued spread (see increasing number of colors, each representing one subvariant, during the last part of the study period in Fig. 2a, b).
Second, virus transmissibility (RTX) can increase, helping newer variants to outcompete pre-existing ones. Here, to capture virus-specific transmissibility17,22, we separated the effects of changing population susceptibility, NPIs, and seasonal risk of infection. Unlike the effective reproduction number Rt (i.e., the average number of secondary infections23), which can fluctuate due to the aforementioned effects, changes in RTX closely followed the surge of major variants (see large drops in Rt around the pandemic onset due to NPIs in Supplementary Fig. 2a vs. the relatively stable RTX in Fig. 2c). That is, here RTX is akin to the basic reproduction number R0 (i.e., the average number of secondary infections in a naïve population23) that measures the inherent transmissibility of a virus, and can be tracked over time for, e.g., new variants (vs. R0 being estimated only at the pandemic onset when the entire population is susceptible; see Methods and refs. 17,22).
In the above context, we estimate that virus transmissibility (RTX) has increased by nearly 3-fold in three years, but has appeared to level off since the latter half of 2022 (Fig. 2c). As reported previously13,24, Iota and Alpha increased virus transmissibility, allowing them to outcompete the ancestral variant. In NYC, RTX increased by ~20% during the 2nd wave largely due to the mixed circulation of Iota and Alpha (Table 2). The Delta variant further increased virus transmissibility by another ~30% (or ~60% compared to the ancestral variant; Table 2), which, along with its immune evasive ability, allowed it to spread during Summer and Fall 2021 despite the relatively high population immunity at the time (Fig. 2b). The Omicron BA.1 subvariant further increased virus transmissibility. In NYC, average RTX during the BA.1 wave was 2.3 times that of the 1st (ancestral variant) wave and further increased by ~20% post-BA.1. Importantly, RTX remained around the same level through August 2023 (Table 2 and Fig. 2c), suggesting immune evasion and waning immunity (Supplementary Fig. 2d) have been stronger drivers of the subvariant turnover since Summer 2022. Overall, these estimates are consistent with those reported in the literature (see, e.g., previous studies related to Iota in ref. 13, Alpha in refs. 22,24, Delta in ref. 25 and discussion therein, and Omicron subvariants in refs. 17,26).
Third, seasonal conditions such as humidity and temperature may modulate the transmission of respiratory viruses including SARS-CoV-227,28,29,30; in particular, low humidity and low temperature conditions commonly seen during the winter are conducive to SARS-CoV-2 survival27. In addition, indoor crowding with reduced ventilation may also facilitate transmission31. While infection rates could surge during summer months when new variants emerge, higher infection rates have occurred during winter months, peaking in December or January in NYC during the 3.5-year study period (Fig. 3a). This pattern is further evident from Fig. 3b, where the scaled infection rates during winter months were often more than twice as high as those during the summer months. This timing, despite multiple concurrent drivers including new variant emergence, highlights higher SARS-CoV-2 infection risk during the winter months in NYC during this study period.
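To make the distinction between Rt and RTX more concrete, the following sketch assumes a simple multiplicative relationship in which Rt equals RTX scaled by population susceptibility, an NPI factor, and a seasonal factor. This functional form, the function names, and the numbers are illustrative assumptions only; the exact formulation used by the model-inference system is given in the Methods and refs. 17,22, not here.

```python
def effective_reproduction_number(rtx, susceptibility, npi_factor, seasonal_factor):
    """Rt under an assumed multiplicative form (illustrative, not the paper's equation).

    rtx             -- variant-specific transmissibility (akin to R0)
    susceptibility  -- fraction of the population susceptible (0 to 1)
    npi_factor      -- relative contact level under NPIs (1 = no reduction)
    seasonal_factor -- relative seasonal infection risk (1 = annual average)
    """
    return rtx * susceptibility * npi_factor * seasonal_factor

def variant_transmissibility(rt, susceptibility, npi_factor, seasonal_factor):
    """Invert the same assumed relationship to recover RTX from Rt."""
    return rt / (susceptibility * npi_factor * seasonal_factor)

# Hypothetical values: a naive population at pandemic onset, before and after NPIs
rt_before = effective_reproduction_number(2.5, 1.0, 1.0, 1.0)
rt_after = effective_reproduction_number(2.5, 1.0, 0.4, 1.0)
print(rt_before, rt_after)  # Rt drops sharply while RTX stays at 2.5
```

Under this assumed form, a large drop in Rt can reflect NPIs or reduced susceptibility alone, whereas a change in RTX indicates a change in the virus itself.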
a shows estimated infection rates (light blue bars, full height; i.e., not stacked) and reported case rates (darker blue portion) by month; error bars show estimated 95% CrIs. To examine the annual infection pattern, b shows the monthly infection rates scaled to the annual maximum (here, a year starts in September, the start of fall/cold months in the Northern Hemisphere, and ends in the next August, the end of winter/cold months in the Southern Hemisphere). March–August 2020 is not shown because the corresponding year is incomplete. c shows estimated infection-detection rate [blue line = median, blue areas = 50% (darker) and 95% (lighter) CrIs], for each week. Note the vertical dashed line indicates the week starting 11/21/21 when Omicron BA.1 was first detected in NYC, and estimates to the right of the dashed line are for Omicron (sub)variants alone.
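The scaling used for panel b (monthly infection rates divided by the maximum within a September-to-August year) can be sketched as follows. The function names and the monthly rates are hypothetical and serve only to illustrate the calculation described in the caption.

```python
from collections import defaultdict

def season_year(year, month):
    """Label of the September-to-August year containing (year, month)."""
    return year if month >= 9 else year - 1

def scale_to_annual_max(monthly_rates):
    """Scale each monthly rate by the maximum of its September-to-August year.

    monthly_rates -- dict mapping (year, month) to an infection rate
    """
    peak = defaultdict(float)
    for (year, month), rate in monthly_rates.items():
        key = season_year(year, month)
        peak[key] = max(peak[key], rate)
    return {
        (year, month): (rate / peak[season_year(year, month)]
                        if peak[season_year(year, month)] > 0 else 0.0)
        for (year, month), rate in monthly_rates.items()
    }

# Hypothetical monthly infection rates (% of population), for illustration only
rates = {
    (2021, 9): 2.0, (2021, 12): 4.0, (2022, 1): 10.0, (2022, 7): 5.0,
    (2022, 9): 3.0, (2022, 12): 6.0,
}
scaled = scale_to_annual_max(rates)
print(scaled[(2022, 1)], scaled[(2022, 7)])
```

In this toy example the July value is half of the January peak within the same September-to-August year, which is the kind of winter-to-summer contrast the panel is designed to show.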
Changes in infection-detection rate
The infection-detection rate (i.e., case ascertainment rate) represents the proportion of infections detected as cases, and is crucial for accurate estimation of specific outcomes (e.g., infection rates and infection-fatality risk) to inform public health response32. Estimating the infection-detection rate of SARS-CoV-2 has been challenging due to multiple factors (e.g., undetected asymptomatic/mild infections)5,6,32. Here we resolved these challenges by comprehensive model inference (see Methods) and validated estimated infection-detection rates in NYC using independent wastewater surveillance data (Fig. 1). In NYC, estimated infection-detection rates were very low at the onset of the 1st pandemic wave and the Omicron BA.1 wave—only 2.1% (95% CrI: 0.2–4.5%) and 3.0% (95% CrI: 1.1–5.4%) of infections were detected as cases, respectively (Fig. 3c). Estimated infection-detection rate increased substantially after the initial weeks of the pandemic but fluctuated over time (Fig. 3c); the highest rates were estimated during the week of January 3, 2021 (42.7%, 95% CrI: 27.9–56.2%) before the Omicron variant emerged, and during the week of December 12, 2021 following the emergence of Omicron BA.1 (45.7%, 95% CrI: 28.4–67.2%). However, estimated infection-detection rate decreased steadily over time after Spring 2022 and was just ~5% since Summer 2023 (Fig. 3c), which is comparable to the initial weeks of the pandemic.
Examining the estimates by age group, the models estimated higher infection-detection rates for the two oldest age groups (aged ≥65 years; see the orange and red lines in Supplementary Fig. 3), particularly during the first wave given the higher disease severity and more limited testing capacity at the time. With high vaccination coverage by the Delta wave and the vaccine protection against severe disease, estimated infection-detection rates became lower among the elderly (note people ≥65 years were prioritized for COVID-19 vaccinations). During the Omicron period, the estimated age differences reduced and the changes over time in general followed the citywide estimates (Supplementary Fig. 3).
Changes in infection-fatality risk (IFR)
IFR is a key indicator of COVID-19 severity. As reported previously, IFR of SARS-CoV-2 increased log-linearly with age33,34, particularly before mass-vaccination. Thus, we estimated IFR by age group (see Fig. 4, Table 2 and Supplementary Table 1). Consistent with previous reports12,13,35,36, estimated IFR in NYC was highest during the 1st wave (March–May 2020). By the 2nd wave (roughly October 2020–May 2021), IFR had declined by more than half for most age groups, even though it increased transiently due to the circulation of variants such as Iota and Alpha (Fig. 4d, e for those older than 65 years; and Supplementary Table 2). During the latter half of 2021, IFR continued to decline, which likely was influenced by greater population immunity (Fig. 2b). Another substantial decline in IFR occurred following the circulation of Omicron BA.1, which was milder than pre-existing variants as reported previously37. In NYC, estimated IFR during the Omicron BA.1 wave (December 2021–February 2022) declined by around half compared to the Delta wave (July–November 2021), for most age groups (Fig. 4 and Supplementary Table 1). After the Omicron BA.1 wave, overall IFR continued to decline, mostly driven by the lowering IFR among those aged 75 or older (Fig. 4e, f; Table 2 and Supplementary Table 1). Starting the week of April 23, 2023, COVID-19 deaths were classified in NYC per a revised definition (see Methods). To examine the potential impact due to this change, we have also computed IFR including only weeks before the April 23, 2023 revision. The stratified IFR estimates were similar to those through the week of August 27, 2023 (Supplementary Table 1).
a–e show estimates by age group and (f) shows the estimates combining all ages. Blue lines and shaded areas show the median estimates and 50% (darker blue) and 95% (lighter blue) CrIs, for each week (see date of week start in mm/dd/yy in the x-axis). For clarity, insets show estimates during the most recent months.
Discussion
Using comprehensive model inference and data, we have reconstructed the transmission dynamics of SARS-CoV-2 in NYC during March 2020–August 2023. The detailed model-inference estimates, further validated using independent SARS-CoV-2 wastewater surveillance data (Fig. 1), can be used to inform future planning in the city (e.g., to gauge future SARS-CoV-2 infection burden and related healthcare needs). In addition, these estimates help to reveal the highly complex infection dynamics of SARS-CoV-2 and illustrate the key drivers in its continued spread, which may be shared by other populations. Below we focus on highlighting the general SARS-CoV-2 dynamics and key driving mechanisms.
By the end of August 2023, the estimated infection rate totaled twice the population size, indicating the majority of NYC residents may have had at least 2 (re)infections during the first 3.5 years. In addition, 81% of the population received the primary COVID-19 vaccine series, 40% have had an additional monovalent dose, and 23% have had either two additional monovalent doses or a bivalent vaccine (NYC vaccination data; as of 8/31/2023). In combination, these estimates and data suggest a high mixed population immunity. The high mixed population immunity would likely help mitigate the severity of future epidemics. Estimated IFR dropped by more than 10-fold for most age groups by August 2023, potentially attributable to multiple factors, including accumulated mixed immunity, access to improved treatment, and circulation of the milder Omicron subvariants. The potential long-term population health impact of this high infection rate is uncertain given the possibility of post-acute sequelae of SARS-CoV-2 infection (i.e., long-COVID2,38).
As noted previously3, the ability of SARS-CoV-2 to sustain continued spread in an already highly infected/vaccinated population has largely come from new variants, which can evade prior immunity and/or increase transmissibility. However, the dynamics and relative importance of these drivers have changed over time. Here, our estimates for NYC help to inform the interaction of these drivers during the first 3.5 years. When the underlying infection rate was relatively low (e.g., the first two waves), our estimates showed increased virus transmissibility predominantly drove SARS-CoV-2 variant dynamics (e.g., Alpha outcompeting pre-existing variants). As infections and immunity accumulated, we found stronger immune evasion allowed new variants to outcompete pre-existing and co-circulating subvariants, though transmissibility could also increase (Delta and BA.1 are both exemplars). By mid-2022, virus transmissibility appeared to stabilize after a nearly 3-fold increase (Fig. 2c). Meanwhile, immune evasion continued but appeared to occur across multiple subvariants, each with a smaller subset of mutations, which may have allowed them to co-circulate and traverse pockets of resusceptible subpopulations (Fig. 2b, see small changes in susceptibility after the Omicron BA.1 wave, despite substantial infections by 10+ Omicron subvariants). Whether this is a typical pathway of viral evolution to endemicity or whether another major Omicron BA.1-like new variant would emerge due to the nearly saturated immune landscape remains unknown.
The decrease in COVID-19 testing and data collection since early 2022 has raised concerns of timely situational awareness including new variant detection39,40,41. Since late 2022/early 2023, the United States national surveillance strategy42,43 has further shifted to mainly monitoring infection trends and severity (e.g., hospitalizations and mortality), along with genomic surveillance and wastewater surveillance. Here, we estimated very low initial infection-detection rates—roughly, only 1 in 50 infections were detected—at the onset of the first pandemic wave and Omicron BA.1 wave in NYC. In addition, we found that population mobility (an indicator of community mitigation via social distancing11,44,45) was inversely correlated with infection-detection rates during the initial weeks—that is, the lack of community mitigation coincided with low infection-detection rates at the time (see preliminary analysis in Supplementary Table 2). The low infection-detection rates may have facilitated unchecked silent spread of SARS-CoV-2 during those initial weeks, as it likely did in other places46. While the infection-detection rate increased by more than 10-fold during the pandemic, it has again declined to a very low level (Fig. 3c). A fuller appreciation of under-detection in the design and implementation of surveillance systems is thus needed, as are innovative approaches to increase detection and awareness (e.g., wastewater surveillance with timely data sharing47).
Lastly, we note several study limitations. First, we did not account for population migration, which could lead to overestimation of the increase in susceptibility. In particular, the increase in population susceptibility after the Omicron BA.1 wave could be in part due to incoming population with a higher susceptibility than local residents (as NYC likely had a larger Omicron BA.1 wave and higher vaccination coverage than elsewhere), rather than entirely due to immune evasion of subsequent Omicron subvariants. Second, due to the discontinuation of SafeGraph mobility data48, for weeks in 2023, we used mobility trends constructed based on historical data during the pandemic years 2020–2022 (vs. real-time mobility data for weeks before 2023; see Methods) to account for the impact of NPIs. However, we do not expect this to substantially affect the model-inference estimates, as the historical mobility trend was consistent with real-time subway ridership data (Supplementary Fig. 4). Third, the variant proportions among sequenced samples were used to estimate the variant-specific infection rates. However, as these samples may not be representative of the NYC population, estimates may reflect biases in the populations for which SARS-CoV-2 testing and sequencing were conducted. Lastly, per recommendation of the Council of State and Territorial Epidemiologists (CSTE), COVID-19-associated deaths were classified using a revised definition based solely on cause of death listed on the death certificate, for weeks from April 23, 2023 onwards. This revised definition could lead to missing COVID-19-associated deaths and thus underestimation of IFR afterwards. Nonetheless, a similar decline in IFR was estimated in the stratified analysis excluding weeks after the revision (Supplementary Table 1), indicating a true continued decline in IFR after the Omicron BA.1 wave.
In summary, using comprehensive epidemiological data and model inference, we have described potential transmission dynamics of SARS-CoV-2 during its first 3.5 years of circulation in NYC, a large urban center. Study findings highlight immune evasion, transmissibility increases, and higher infection risk during winter as key transmission drivers during the study period; these may be observed in other populations and could inform future planning to help mitigate the public health burden of SARS-CoV-2.
Methods
Data sources and processing
For the model-inference system, we utilized multiple sources of epidemiologic data, including confirmed and probable COVID-19 cases, ED visits, deaths, vaccination, and variant proportions49,50,51. As done and described previously11,12,13, we aggregated all COVID-19 confirmed and probable cases52,53, COVID-19-associated ED visits13,54, and COVID-19-associated deaths53 reported to the NYC Health Department by age group (<1, 1-4, 5-14, 15-24, 25-44, 45-64, 65-74, and 75+ year-olds), neighborhood of residence (42 United Hospital Fund neighborhoods in NYC55), and week of occurrence13. For mortality, we note a change in COVID-19-associated death definitions. From March 1, 2020 to April 2, 2023, COVID-19-associated deaths included (1) deaths occurring in persons with laboratory-confirmed SARS-CoV-2 infection (i.e., confirmed COVID-19-associated death) at any point (March 1, 2020–July 23, 2020), within 60 days (July 24, 2020–August 2, 2021), or within 30 days (August 3, 2021–April 2, 2023) of diagnosis; and (2) deaths with COVID-19, SARS-CoV-2 or a similar term listed on the death certificate as an immediate, underlying, or contributing cause of death but without laboratory-confirmation of COVID-19 (i.e., probable COVID-19-associated death)56. From April 3, 2023 through the week of August 27, 2023 (i.e., end of this study), COVID-19-associated deaths included any death where the death certificate included COVID-19 or a common variation of COVID-19, SARS-CoV-2, coronavirus, etc.49. For vaccinations, we included all available vaccine doses to date (i.e., 1st to 5th dose), and aggregated data for each vaccine dose to the same age/neighborhood strata, by date of vaccination50.
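The aggregation by age group, neighborhood, and week described above can be sketched as follows. The record layout, the neighborhood codes, and the convention that weeks start on Sunday (suggested by the week starting 03/01/20, but still an assumption) are all illustrative and do not represent the Health Department's data schema.

```python
from bisect import bisect_right
from collections import Counter
from datetime import date, timedelta

# Lower bounds (in years) and labels of the eight age groups listed in the text
AGE_LOWER_BOUNDS = [0, 1, 5, 15, 25, 45, 65, 75]
AGE_LABELS = ["<1", "1-4", "5-14", "15-24", "25-44", "45-64", "65-74", "75+"]

def age_group(age_years):
    """Map an age in years to one of the eight age-group labels."""
    if age_years < 0:
        raise ValueError("age must be non-negative")
    return AGE_LABELS[bisect_right(AGE_LOWER_BOUNDS, age_years) - 1]

def week_start(d):
    """Sunday starting the week that contains date d (assumed week convention)."""
    return d - timedelta(days=(d.weekday() + 1) % 7)

# Hypothetical line list: (age in years, neighborhood code, event date)
records = [
    (0.5, "UHF-A", date(2020, 3, 4)),
    (67, "UHF-A", date(2020, 3, 5)),
    (70, "UHF-A", date(2020, 3, 7)),
    (30, "UHF-B", date(2020, 3, 8)),
]

# Count events per (age group, neighborhood, week) stratum
weekly_counts = Counter(
    (age_group(age), hood, week_start(day)) for age, hood, day in records
)
for stratum, count in sorted(weekly_counts.items()):
    print(stratum, count)
```

The same stratification would apply to cases, ED visits, deaths, and vaccine doses, so that all data streams share one set of age/neighborhood/week strata.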
To model the impact of NPIs, as done previously11,12,13, we used mobility data from SafeGraph48 to adjust SARS-CoV-2 transmission rate. Note, however, the model-inference system also included a parameter to capture the overall impacts of NPIs not limited to mobility reduction (e.g., additional interventions such as masking; see below). The SafeGraph data were aggregated to the neighborhood level by week without age stratification, and available from the week of March 1, 2020 to the week of December 19, 2022. For the week of December 26, 2022 to the week of August 27, 2023 (i.e., end of our study period), a comparison of historical SafeGraph data (i.e., weeks during March 2020–December 2022, using the maximum mobility recorded for the corresponding week of year to account for seasonal changes) showed a close agreement with real-time subway ridership data (Supplementary Fig. 4). Thus, we used historical SafeGraph data for those weeks.
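The historical mobility trend described above (the maximum mobility recorded for the corresponding week of year) can be sketched as below. The data layout, function name, and mobility values are hypothetical; they only illustrate how a week-of-year profile might be built from past years and reused for weeks without real-time data.

```python
def historical_max_by_week(mobility):
    """Maximum mobility recorded for each week of year across available years.

    mobility -- dict mapping (year, week_of_year) to a relative mobility value
    """
    profile = {}
    for (_year, week), value in mobility.items():
        profile[week] = max(value, profile.get(week, value))
    return profile

# Hypothetical relative mobility (1.0 = pre-pandemic baseline), for illustration
observed = {
    (2020, 10): 0.45, (2021, 10): 0.70, (2022, 10): 0.85,
    (2020, 30): 0.60, (2021, 30): 0.80, (2022, 30): 0.78,
}
profile = historical_max_by_week(observed)

# Fill in a 2023 week that has no real-time mobility data
mobility_2023_week10 = profile[10]
print(mobility_2023_week10)
```

Taking the maximum rather than the mean favors the most recent, least restricted pattern for each week of year, which seems a reasonable choice once most NPIs had been lifted; that rationale is an interpretation, not a statement from the text.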
To compute the variant-specific cases, ED visits, deaths, and estimated infection rates, we used the reported weekly percentage of individual variants among sequenced samples51,57. Variant proportion data started from the week of December 27, 2020, and likely did not fully capture the share of Iota, a major variant that emerged around Fall 2020. Therefore, we combined the ancestral and Iota variants when computing the total number of cases or infections attributable to these variants.
For model validation, we used SARS-CoV-2 wastewater surveillance data, available from August 31, 2020 onward. Note that the measurement of SARS-CoV-2 concentration in wastewater during each week (here, averaged over samples from the same week given the roughly 1-week timespan from infection to recovery) provided a snapshot of the prevalence of SARS-CoV-2 presence in the population (i.e., a proxy of SARS-CoV-2 infection prevalence in the population). Specifically, SARS-CoV-2 RNA concentrations were measured at each of the city’s 14 wastewater treatment plants, often twice per week, using quantitative reverse transcription polymerase chain reaction (RT-qPCR) assays during August 31, 2020–April 11, 2023 and reverse transcription digital PCR (RT-dPCR) assays from November 1, 2022 through the week of August 27, 2023 (i.e., end of this study). For weeks after April 11, 2023, when the samples were measured using RT-dPCR alone, we converted the RT-dPCR measurements to RT-qPCR equivalents by multiplying them by a simple conversion ratio (i.e., the mean of all RT-qPCR measurements divided by the mean of all RT-dPCR measurements during November 1, 2022–April 11, 2023, when both assays were conducted). To compute the citywide weekly per-capita SARS-CoV-2 wastewater concentrations, we first averaged the per-capita SARS-CoV-2 concentrations (i.e., normalized by sewershed flow rate and population size) for each week and sewershed, and then further aggregated the sewershed-level measurements to the city level (i.e., weighted mean per the population size).
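The two arithmetic steps described above (the assay conversion ratio and the population-weighted citywide mean) can be sketched as below. The function names and the dictionary layout keyed by sewershed are illustrative assumptions, not the authors' code.

```python
from statistics import mean
from typing import Dict, Sequence


def conversion_ratio(qpcr_overlap: Sequence[float],
                     dpcr_overlap: Sequence[float]) -> float:
    """Mean RT-qPCR measurement divided by mean RT-dPCR measurement,
    both taken over the period when the two assays were run in parallel."""
    return mean(qpcr_overlap) / mean(dpcr_overlap)


def to_qpcr_equivalent(dpcr_value: float, ratio: float) -> float:
    """Convert an RT-dPCR measurement to an RT-qPCR equivalent."""
    return dpcr_value * ratio


def citywide_weighted_mean(conc_by_sewershed: Dict[str, float],
                           pop_by_sewershed: Dict[str, float]) -> float:
    """Population-weighted mean of weekly per-capita concentrations."""
    total_pop = sum(pop_by_sewershed[s] for s in conc_by_sewershed)
    weighted = sum(conc_by_sewershed[s] * pop_by_sewershed[s]
                   for s in conc_by_sewershed)
    return weighted / total_pop
```

With concentrations of 2.0 and 4.0 in two sewersheds serving 100 and 300 people, for instance, the weighted mean is (2.0 × 100 + 4.0 × 300) / 400 = 3.5.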
This activity was classified as public health surveillance and exempt from ethical review and informed consent by the Institutional Review Boards of both Columbia University and NYC Health Department.
Model inference to estimate key epidemiological variables and parameters
We used a model-inference system to estimate epidemiological variables and parameters based on case, ED visit, and mortality data, accounting for NPIs, vaccinations, under-detection of infection, and seasonal changes. Built on an approach described in Yang et al.13, here the model-inference system additionally tracks the number of vaccinated individuals and accounts for all vaccine doses as done in ref. 58. Briefly, the model-inference system uses a metapopulation network SEIRSV (Susceptible-Exposed-Infectious-(re)Susceptible-Vaccination) model (Eq. 1) to simulate the transmission of SARS-CoV-2 across the 42 neighborhoods in the city by age group:
where Si, Ei, Ii, Ri, Vi, and Ni are the numbers of susceptible, exposed (but not yet infectious), infectious, recovered and immune (i.e., protected against infection), vaccinated and immune individuals, and the total population59, respectively, from a given age group (i.e., <1, 1–4, 5–14, 15–24, 25–44, 45–64, 65–74, or 75+ years) in neighborhood-i (i = 1, …, 42, for the 42 neighborhoods in the city). \({\beta }_{{city}}\) is the average citywide transmission rate; bs is the estimated seasonal trend12. The term bi represents the neighborhood-level transmission rate relative to the city average. The term mij represents the changes in contact rate in each neighborhood (for i = j) or spatial transmission from neighborhood-j to i (for i ≠ j) and was computed based on the mobility data12. Here, we did not explicitly model the impact of individual NPIs such as masking, due to the lack of data and the minor impact of masking at the population level (estimated 5–20% reduction11,60). Rather, to account for the overall impact of NPIs including masking, we scaled the mobility data by a multiplicative factor to capture the overall NPI effectiveness when computing mij12. Z, D, and L are the latency period, infectious period, and immunity period, respectively. Note that because all state variables and parameters are time-varying and specified for each age group separately, Eq. 1 omits time (t) and age from the subscripts.
To account for vaccination, \({\upsilon }_{i,k}\) is the number of neighborhood-i residents who were immunized after the k-th dose (k = 1, 2, …, 5 here for up to 5 doses of vaccines to date) at the time step (t), and was computed using vaccination data adjusting for vaccine effectiveness (VE) against infection61,62,63,64,65. Thus, the term \({\sum }_{k=1}^{k=K}{\upsilon }_{i,k}\) represents the total number of neighborhood-i residents immunized by any dose of vaccine at the time step. The term \({\sum }_{\tau =0}^{\tau =T}{\rho }_{\tau }{V}_{i,t-\tau }\) accounts for the waning of vaccine protection against infection, where Vi,t−τ is the number of neighborhood-i residents who got vaccinated τ days ago and lost protection on day-t, and ρτ is the VE waning probability computed based on VE duration data64. Note that here we focused on modeling the impact of vaccination on population susceptibility, and that the posterior estimates of population susceptibility were made jointly with other factors (e.g., infection) using several data streams and model inference as described below.
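The symbol definitions above are consistent with a compartmental system of the following general form. This is an illustrative sketch and not a transcription of Eq. 1: the placement of the neighborhood factor \(b_i\), the normalization of the force of infection by \(N_j\), and the assumption that vaccination moves individuals only out of \(S_i\) are assumptions made here.

```latex
\begin{aligned}
\frac{dS_i}{dt} &= \frac{R_i}{L}
  - \sum_{j=1}^{42} \frac{b_s\, b_i\, \beta_{city}\, m_{ij}\, S_i\, I_j}{N_j}
  - \sum_{k=1}^{K} \upsilon_{i,k}
  + \sum_{\tau=0}^{T} \rho_{\tau} V_{i,t-\tau} \\
\frac{dE_i}{dt} &= \sum_{j=1}^{42} \frac{b_s\, b_i\, \beta_{city}\, m_{ij}\, S_i\, I_j}{N_j}
  - \frac{E_i}{Z} \\
\frac{dI_i}{dt} &= \frac{E_i}{Z} - \frac{I_i}{D} \\
\frac{dR_i}{dt} &= \frac{I_i}{D} - \frac{R_i}{L} \\
\frac{dV_i}{dt} &= \sum_{k=1}^{K} \upsilon_{i,k}
  - \sum_{\tau=0}^{T} \rho_{\tau} V_{i,t-\tau}
\end{aligned}
```

In this form, each flow leaving one compartment enters another, so the total \(S_i + E_i + I_i + R_i + V_i\) is conserved in the absence of demographic change.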
Using the model-simulated number of infections occurring each day, we further computed the number of cases, ED visits, and deaths each week to match with the observations, as described in refs. 12,13. Using cases as an example, we multiplied the model-simulated number of new infections per day by the infection-detection rate (i.e., case ascertainment rate, or the fraction of infections reported as cases), and further distributed these estimates in time per a distribution of time-from-infection-to-case-detection (Supplementary Table 3); we then aggregated the daily lagged, simulated estimates to weekly totals for model inference.
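The mapping from simulated daily infections to weekly case counts can be sketched as a scale-then-delay-then-aggregate computation. The delay distribution is represented here as a discrete probability mass function over whole days; that representation, the constant detection rate, and the function name are assumptions for illustration (the distributions actually used are given in Supplementary Table 3).

```python
from typing import List, Sequence


def simulate_weekly_cases(daily_infections: Sequence[float],
                          detection_rate: float,
                          delay_pmf: Sequence[float]) -> List[float]:
    """Convert daily infections to weekly reported cases.

    delay_pmf[lag] is the probability that a detected infection is
    reported `lag` days after infection (assumed discrete, summing to 1).
    Cases that would be reported after the last simulated day are dropped.
    """
    n = len(daily_infections)
    daily_cases = [0.0] * n
    for t, infections in enumerate(daily_infections):
        detected = infections * detection_rate
        for lag, p in enumerate(delay_pmf):
            if t + lag < n:
                daily_cases[t + lag] += detected * p
    # Sum complete 7-day blocks; a trailing partial week is ignored.
    return [sum(daily_cases[w:w + 7]) for w in range(0, n - n % 7, 7)]
```

The same structure applies to ED visits and deaths, with the detection rate replaced by the corresponding probability (e.g., the IFR for deaths) and a different delay distribution.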
Each week, the system uses the ensemble adjustment Kalman filter (EAKF)66 to compute the posterior estimates of model state variables and parameters based on the model (prior) estimates and observed case, ED visit, and mortality data per Bayes’ rule12,13. In particular, model posterior estimates include (1) the underlying infection rate including those not reported as cases, for each week (Fig. 2a and Fig. 2c); (2) the number of susceptible individuals (i.e., Si), which provides estimates of population susceptibility over time (Fig. 2b); (3) the citywide transmission rate (\({\beta }_{{city}}\)) and infectious period (see estimates in Supplementary Fig. 2B and C); (4) the time-varying virus transmissibility (RTX, a measure of variant-specific infectiousness as described in refs. 17,22; Fig. 2c), computed as the product of the citywide transmission rate and infectious period; and (5) other key parameters such as the infection-detection rate (Fig. 3c), IFR (Fig. 4), and the real-time reproduction number (Rt; see estimates in Supplementary Fig. 2A). Computer code for a previous study22 using a similar model-inference approach is available at Zenodo67.
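For readers unfamiliar with the EAKF, the update for a single scalar observation can be sketched as follows. This is a generic, minimal version of the filter for one observed quantity and one unobserved state variable or parameter; it is not the authors' implementation (see the Zenodo code67 for that), and the function name and arguments are illustrative. It assumes the prior ensemble of the observed quantity has nonzero variance.

```python
from statistics import mean, variance
from typing import List, Sequence, Tuple


def eakf_update(obs_ens: Sequence[float], state_ens: Sequence[float],
                obs: float, obs_var: float) -> Tuple[List[float], List[float]]:
    """One EAKF update with a scalar observation.

    obs_ens: prior ensemble of the observed quantity (e.g., weekly cases)
    state_ens: prior ensemble of an unobserved variable or parameter
    obs, obs_var: the observation and its error variance
    """
    n = len(obs_ens)
    prior_mean = mean(obs_ens)
    prior_var = variance(obs_ens)  # sample variance; must be > 0

    # Posterior moments of the observed quantity (product of two Gaussians).
    post_var = prior_var * obs_var / (prior_var + obs_var)
    post_mean = post_var * (prior_mean / prior_var + obs / obs_var)

    # Deterministically shift and contract the ensemble to those moments.
    shrink = (obs_var / (obs_var + prior_var)) ** 0.5
    obs_post = [post_mean + shrink * (y - prior_mean) for y in obs_ens]

    # Regress the increments onto the unobserved variable via the
    # prior ensemble covariance.
    state_mean = mean(state_ens)
    cov = sum((x - state_mean) * (y - prior_mean)
              for x, y in zip(state_ens, obs_ens)) / (n - 1)
    gain = cov / prior_var
    state_post = [x + gain * (yp - y)
                  for x, yp, y in zip(state_ens, obs_post, obs_ens)]
    return obs_post, state_post
```

Because the adjustment is deterministic, the posterior ensemble of the observed quantity has exactly the posterior mean and variance implied by the prior ensemble and the observation error variance.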
We ran the model-inference system for the pre-Omicron and Omicron periods separately, given the large differences in variant characteristics and thus prior distributions for the model states and parameters (Supplementary Table 3). For the pre-Omicron period, we initiated the system at the week of March 1, 2020 (i.e., the week the first cases were detected in NYC), and ran it continuously through the week of December 5, 2021 (i.e., the week before the Omicron BA.1 variant was detected in >50% of sequenced cases). For the Omicron period, we reinitiated the system at the week of November 21, 2021 and ran it continuously through the end of the study period; given the initial overlap with the Delta variant in November/December 2021, we computed the number of cases, ED visits, and deaths due to Omicron based on the variant proportion data and used those variant-specific estimates for inference. To account for model uncertainty, we ran the model-inference system 10 times, each with 500 ensemble members randomly drawn from the initial prior ranges (Supplementary Table 3), and combined the posteriors from all runs, as done in ref. 12. Note the credible intervals (CrIs) were constructed directly using the ensemble members; as such, the 95% CrIs tended to be wide particularly during early weeks of the study period when data were sparse, which reflected the large model uncertainty (see, e.g., the wide lighter blue area in Fig. 2b showing the 95% CrIs for the susceptibility estimates during the 1st and 2nd waves) and model outliers (vs. the much tighter 50% CrIs in darker blue in Fig. 2b for the susceptibility estimates).
Model validation using SARS-CoV-2 wastewater surveillance data
To validate model-inference estimates, we compared the infection prevalence estimates (i.e., the estimated number of infectious individuals, including those not detected as cases, in the population each week) to independent SARS-CoV-2 wastewater concentration data (i.e., the collective SARS-CoV-2 viral shedding of the population, regardless of clinical testing practices). While both quantities represent the presence of SARS-CoV-2 in the population, the measurements are on different scales and viral shedding per infection could vary by the infecting variant. Thus, for comparison, we separated the data into three periods: (i) August 31, 2020 (i.e., the first day of wastewater surveillance) – June 26, 2021, predominantly the ancestral and Iota variants; (ii) June 27, 2021 (i.e., the first week the share of Delta exceeded 50% among the sequenced samples) – November 20, 2021, predominantly the Delta variant; and (iii) November 21, 2021 (i.e., the first week Omicron BA.1 was detected) – August 29, 2023 (i.e., the last wastewater sample during the study period), predominantly the Omicron subvariants. We scaled the wastewater measurements by multiplying them by the ratio of the mean infection prevalence estimate to the mean wastewater concentration across all weeks of each period, and overlaid the two time series for visual inspection (see Fig. 1b). For simplicity and given that wastewater viral load data and the infection prevalence estimates both represent active SARS-CoV-2 viral shedding, we used concurrent measures/estimates for the same week (i.e., with no lead/lag time) for this comparison.
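The period-specific scaling can be sketched as below. The period boundaries are those given in the text; the function name and the parallel-list input layout are assumptions for illustration.

```python
from datetime import date
from statistics import mean
from typing import List, Optional, Sequence

# Period boundaries as stated in the text.
PERIODS = [
    (date(2020, 8, 31), date(2021, 6, 26)),   # ancestral and Iota
    (date(2021, 6, 27), date(2021, 11, 20)),  # Delta
    (date(2021, 11, 21), date(2023, 8, 29)),  # Omicron subvariants
]


def scale_wastewater(weeks: Sequence[date],
                     prevalence: Sequence[float],
                     wastewater: Sequence[float]) -> List[Optional[float]]:
    """Rescale wastewater concentrations to the prevalence scale.

    Within each period, multiply by (mean prevalence / mean wastewater).
    Weeks outside all periods are returned as None.
    """
    scaled: List[Optional[float]] = [None] * len(weeks)
    for start, end in PERIODS:
        idx = [i for i, d in enumerate(weeks) if start <= d <= end]
        if not idx:
            continue
        ratio = (mean([prevalence[i] for i in idx])
                 / mean([wastewater[i] for i in idx]))
        for i in idx:
            scaled[i] = wastewater[i] * ratio
    return scaled
```

By construction, the scaled wastewater series and the prevalence estimates have the same mean within each period, so the comparison tests agreement in shape and timing rather than in absolute level.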
Estimating variant-specific infection rates
The weekly infection rate estimates from the model-inference system are based on surveillance data combining all reported variants and thus represent infections by any variant circulating during the week. To estimate the variant-specific infection rates for each week, we multiplied the overall infection rate estimate by each variant's proportion among the sequenced samples during that week. To compute the total variant-specific infection rate, we then summed the weekly estimates across all weeks in which a given variant was detected. For each variant, to identify the main circulation period (i.e., calendar weeks when 95% of all infections occurred), we recorded the first weeks in which the cumulative infection rate surpassed 2.5% (i.e., the start) and 97.5% (i.e., the end) of the total.
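A minimal sketch of the two computations described above is given here. The function names are illustrative, and weeks are identified by their index in the input lists.

```python
from typing import List, Optional, Sequence, Tuple


def variant_weekly_infections(infection_rate: Sequence[float],
                              proportion: Sequence[float]) -> List[float]:
    """Overall weekly infection rate times the variant's weekly share
    among sequenced samples."""
    return [r * p for r, p in zip(infection_rate, proportion)]


def main_circulation_period(
        weekly: Sequence[float]) -> Tuple[Optional[int], Optional[int]]:
    """Indices of the first weeks in which the cumulative total surpasses
    2.5% (start) and 97.5% (end) of the variant's overall total."""
    total = sum(weekly)
    start: Optional[int] = None
    end: Optional[int] = None
    cumulative = 0.0
    for week, value in enumerate(weekly):
        cumulative += value
        if start is None and cumulative > 0.025 * total:
            start = week
        if cumulative > 0.975 * total:
            end = week
            break
    return start, end
```

If a variant was never detected (a total of zero), both indices are returned as `None`.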
Qualitative illustration of immunity from vaccinations and infections by different variants
The model-inference system accounted for immunity conferred by prior infection and vaccination, as well as waning (Eq. 1), to compute the posterior estimates of population susceptibility, using epidemiological data and the EAKF inference algorithm as described above. However, because the two immune components overlap (e.g., a recoveree could subsequently get vaccinated and have mixed immunity from both) and the EAKF may not perfectly preserve mass balance, it is difficult to separately quantify their contributions. Thus, to qualitatively examine the population immunity landscape, we used the rolling sum of prior infections as a proxy of infection-induced immunity and that of vaccinations as a proxy of vaccine-induced immunity (shown in Fig. 2b). Specifically, the rolling sum of prior infections was computed by adding all estimated infections during the preceding 0.5Trs days (i.e., half of the estimated immunity period; see Supplementary Fig. 2d). The rolling sum of vaccinations was computed by adding vaccinations of the primary series and the 3rd, 4th, and 5th doses during the preceding 0.5Tvax days (Tvax is the estimated vaccine-induced immunity period) and further multiplying by the estimated variant-specific VE (Supplementary Table 3).
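The rolling-sum proxies can be sketched as follows. For simplicity this illustration uses a fixed window length in whole days and a single VE value, and it includes the current day in the window; these choices and the function names are assumptions, whereas the estimated immunity periods and VE in the analysis vary over time and by variant.

```python
from typing import List, Sequence


def rolling_sum(daily: Sequence[float], window_days: int) -> List[float]:
    """Sum of the most recent `window_days` values, including the
    current day (the window is shorter at the start of the series)."""
    out: List[float] = []
    running = 0.0
    for t, value in enumerate(daily):
        running += value
        if t >= window_days:
            running -= daily[t - window_days]
        out.append(running)
    return out


def vaccine_immunity_proxy(daily_doses: Sequence[float],
                           window_days: int, ve: float) -> List[float]:
    """Rolling sum of vaccinations scaled by vaccine effectiveness."""
    return [ve * s for s in rolling_sum(daily_doses, window_days)]
```

Here `window_days` would correspond to 0.5Trs for the infection proxy and 0.5Tvax for the vaccination proxy.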
Reporting summary
Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.
Data availability
The SARS-CoV-2/COVID-19 cases, emergency department visits, mortality, and wastewater surveillance data were used with permission under a Data Use and Nondisclosure Agreement between the NYC Health Department and Columbia University. The NYC Health Department also has a comprehensive, publicly available data website here: https://github.com/nychealth/coronavirus-data. Additional data sources are detailed in the manuscript. Source data for Figs. 1–4 can be found in Supplementary Data 1.
References
Harris, E. WHO Declares End of COVID-19 Global Health Emergency. JAMA-J. Am. Med. Assoc. 329, 1817 (2023).
Davis, H. E., McCorkell, L., Vogel, J. M. & Topol, E. J. Long COVID: major findings, mechanisms and recommendations. Nat. Rev. Microbiol. 21, 133–146 (2023).
Koelle, K., Martin, M. A., Antia, R., Lopman, B. & Dean, N. E. The changing epidemiology of SARS-CoV-2. Science 375, 1116–1121 (2022).
Global Initiative on Sharing All Influenza Data (GISAID), Tracking of hCoV-19 Variants. https://www.gisaid.org/hcov19-variants/.
Riley, S. et al. Resurgence of SARS-CoV-2: Detection by community viral surveillance. Science 372, 990–995 (2021).
Elliott, P. et al. Design and Implementation of a National SARS-CoV-2 Monitoring Program in England: REACT-1 Study. Am. J. Public Health 113, 545–554 (2023).
Brainard, J. et al. Comparison of surveillance systems for monitoring COVID-19 in England: a retrospective observational study. Lancet Public Health 8, e850–e858 (2023).
Sah, P. et al. Asymptomatic SARS-CoV-2 infection: A systematic review and meta-analysis. Proc. Natl. Acad. Sci. USA 118, e2109229118 (2021).
Feng, Z. J. et al. The Epidemiological Characteristics of an Outbreak of 2019 Novel Coronavirus Diseases (COVID-19) - China, 2020. China CDC Wkly. 2, 113–122 (2020).
Rader, B. et al. Use of At-Home COVID-19 Tests—United States, August 23, 2021–March 12, 2022. Morbidity Mortal. Wkly. Rep. 71, 489 (2022).
Yang, W., Shaff, J. & Shaman, J. Effectiveness of non-pharmaceutical interventions to contain COVID-19: a case study of the 2020 spring pandemic wave in New York City. J. R. Soc. Interface 18, 20200822 (2021).
Yang, W. et al. Estimating the infection-fatality risk of SARS-CoV-2 in New York City during the spring 2020 pandemic wave: a model-based analysis. Lancet Infect. Dis. 21, 203–212 (2021).
Yang, W. et al. Epidemiological characteristics of the B.1.526 SARS-CoV-2 variant. Sci. Adv. 8, eabm0300 (2022).
The New York Times, New York City says it will end school mask and indoor proof-of-vaccination mandates. https://www.nytimes.com/2022/02/27/nyregion/new-york-mask-mandate-schools.html.
Mlcochova, P. et al. SARS-CoV-2 B.1.617.2 Delta variant replication and immune evasion. Nature 599, 114–119 (2021).
Bernal, J. L. et al. Effectiveness of Covid-19 Vaccines against the B.1.617.2 (Delta) Variant. N. Engl. J. Med. 385, 585–594 (2021).
Yang, W. & Shaman, J. L. COVID-19 pandemic dynamics in South Africa and epidemiological characteristics of three variants of concern (Beta, Delta, and Omicron). Elife 11, e78933 (2022).
Cele, S. et al. Omicron extensively but incompletely escapes Pfizer BNT162b2 neutralization. Nature 602, 654–656 (2022).
Pulliam, J. R. C. et al. Increased risk of SARS-CoV-2 reinfection associated with emergence of Omicron in South Africa. Science 376, eabn4947 (2022).
Wang, Q. et al. Antibody evasion by SARS-CoV-2 Omicron subvariants BA.2.12.1, BA.4 and BA.5. Nature 608, 603–608 (2022).
Cao, Y. et al. BA.2.12.1, BA.4 and BA.5 escape antibodies elicited by Omicron infection. Nature 608, 593–602 (2022).
Yang, W. & Shaman, J. Development of a model-inference system for estimating epidemiological characteristics of SARS-CoV-2 variants of concern. Nat. Commun. 12, 5573 (2021).
Anderson, R. et al. Reproduction number (R) and growth rate (r) of the COVID-19 epidemic in the UK: methods of estimation, data sources, causes of heterogeneity, and use as a guide in policy formulation. The Royal Society 2020, (2020).
Volz, E. et al. Assessing transmissibility of SARS-CoV-2 lineage B.1.1.7 in England. Nature 593, 266–269 (2021).
Yang, W. & Shaman, J. COVID-19 pandemic dynamics in India, the SARS-CoV-2 Delta variant and implications for vaccination. J. R. Soc. Interface 19, 20210900 (2022).
Chadeau-Hyam, M. et al. Omicron SARS-CoV-2 epidemic in England during February 2022: A series of cross-sectional community surveys. Lancet regional health Eur. 21, 100462 (2022).
Morris, D. H. et al. Mechanistic theory predicts the effects of temperature and humidity on inactivation of SARS-CoV-2 and other enveloped viruses. Elife 10, e65902 (2021).
Marr, L. C., Tang, J. W., Van Mullekom, J. & Lakdawala, S. S. Mechanistic insights into the effect of humidity on airborne influenza virus survival, transmission and incidence. J. R. Soc. Interface 16, 20180298 (2019).
Yang, W. & Marr, L. C. Mechanisms by which ambient humidity may affect viruses in aerosols. Appl Environ. Microbiol 78, 6781–6788 (2012).
Huynh, E. et al. Evidence for a semisolid phase state of aerosols and droplets relevant to the airborne and surface survival of pathogens. Proc. Natl Acad. Sci. USA 119, e2109750119 (2022).
Morawska, L. et al. How can airborne transmission of COVID-19 indoors be minimised? Environ. Int. 142, 105832 (2020).
Russell, T. W. et al. Reconstructing the early global dynamics of under-ascertained COVID-19 cases and infections. BMC Med 18, 332 (2020).
Levin, A. T. et al. Assessing the age specificity of infection fatality rates for COVID-19: systematic review, meta-analysis, and public policy implications. Eur. J. Epidemiol. 35, 1123–1138 (2020).
O’Driscoll, M. et al. Age-specific mortality and immunity patterns of SARS-CoV-2. Nature (2020).
Hoogenboom, W. S. et al. Clinical characteristics of the first and second COVID-19 waves in the Bronx, New York: A retrospective cohort study. Lancet Reg. Health Am. 3, 100041 (2021).
Davies, N. G. et al. Increased mortality in community-tested cases of SARS-CoV-2 lineage B.1.1.7. Nature 593, 270–274 (2021).
Wolter, N. et al. Early assessment of the clinical severity of the SARS-CoV-2 omicron variant in South Africa: a data linkage study. Lancet 399, 437–446 (2022).
National Academies of Sciences, Engineering, and Medicine, Long-Term Health Effects of COVID-19: Disability and Function Following SARS-CoV-2 Infection. P. A. Volberding, B. X. Chu, C. M. Spicer, Eds., (The National Academies Press, Washington, DC, 2024), pp. 264.
Ungar, L. Pandemic gets tougher to track as COVID testing plunges. https://apnews.com/article/covid-us-testing-decline-14bf5b0901260b063e4fa444633f4d31.
Kekatos, M. COVID call centers and testing sites close in further sign US is moving past the pandemic. https://abcnews.go.com/Health/covid-call-centers-testing-sites-close-sign-us/story?id=97580639.
Stein, R. As the pandemic ebbs, an influential COVID tracker shuts down. https://www.npr.org/sections/health-shots/2023/02/10/1155790201/as-the-pandemic-ebbs-an-influential-covid-tracker-shuts-down.
Silk, B. J. et al. COVID-19 Surveillance After Expiration of the Public Health Emergency Declaration - United States, May 11, 2023. MMWR Morb. Mortal. Wkly Rep. 72, 523–528 (2023).
The Council of State and Territorial Epidemiologists, Association of Public Health Laboratories, Interim CSTE and APHL Strategic Framework for SARS-CoV-2 Infection and COVID-19 Surveillance: Priorities and Approaches for State, Territorial, Local, and Tribal Public Health Agencies. https://preparedness.cste.org/wp-content/uploads/2022/10/Interim-CSTE-APHL-COVID-Surveillance-Framework.pdf.
Lasry, A. et al. CDC Public Health Law Program, New York City Department of Health and Mental Hygiene, Louisiana Department of Health, Public Health Seattle and King County, San Francisco COVID-Response Team, Alameda County Public Health Department, San Mateo County Health Department, Marin County Division of Public Health, Timing of Community Mitigation and Changes in Reported COVID-19 and Community Mobility - Four U.S. Metropolitan Areas, February 26-April 1, 2020. Mmwr. Morbidity Mortal. Wkly. Rep. 69, 451–457 (2020).
Kraemer, M. U. G. et al. The effect of human mobility and control measures on the COVID-19 epidemic in China. Science 368, 493–497 (2020).
Li, R. et al. Substantial undocumented infection facilitates the rapid dissemination of novel coronavirus (SARS-CoV-2). Science 368, 489–493 (2020).
DeJonge, P. M. Wastewater surveillance data as a complement to emergency department visit data for tracking incidence of influenza A and respiratory syncytial virus—Wisconsin, August 2022–March 2023. Mmwr. Morbidity Mortal. Wkly. Rep. 72, 1005–1009 (2023).
SafeGraph, Weekly Patterns: Foot Traffic Data To Understand The COVID-19 Pandemic. https://www.safegraph.com/weekly-foot-traffic-patterns.
New York City Department of Health and Mental Hygiene, NYC Coronavirus Disease 2019 (COVID-19) Data. 1/10/2024. https://github.com/nychealth/coronavirus-data.
New York City Department of Health and Mental Hygiene, NYC Coronavirus 2019 (COVID-19) Vaccine Data. https://github.com/nychealth/covid-vaccine-data.
New York City Department of Health and Mental Hygiene, Variants. https://github.com/nychealth/coronavirus-data/tree/master/variants.
Centers for Disease Control and Prevention, National Notifiable Diseases Surveillance System (NNDSS) - Coronavirus Disease 2019 (COVID-19). https://ndc.services.cdc.gov/conditions/coronavirus-disease-2019-covid-19/.
New York City Department of Health and Mental Hygiene, Defining confirmed and probable cases and deaths. https://www1.nyc.gov/site/doh/covid/covid-19-data.page.
Lall, R. et al. Advancing the Use of Emergency Department Syndromic Surveillance Data, New York City, 2012-2016. Public Health Rep. 132, 23s–30s (2017).
New York City Department of Health and Mental Hygiene, NYC UHF 42 Neighborhoods. http://a816-dohbesp.nyc.gov/IndicatorPublic/EPHTPDF/uhf42.pdf.
New York City Department of Health and Mental Hygiene (DOHMH) COVID-19 Response Team. Preliminary Estimate of Excess Mortality During the COVID-19 Outbreak — New York City, March 11–May 2, 2020. Mmwr. Morbidity Mortal. Wkly. Rep. 69, 603–605 (2020).
Thompson, C. N. et al. Rapid Emergence and Epidemiologic Characteristics of the SARS-CoV-2 B.1.526 Variant - New York City, New York, January 1-April 5, 2021. MMWR Morb. Mortal. Wkly Rep. 70, 712–716 (2021).
Yang, W. & Shaman, J. Development of Accurate Long-lead COVID-19 Forecast. PLoS Comput Biol. 19, e1011278 (2023).
New York City Department of Health and Mental Hygiene, NYC DOHMH population estimates, modified from US Census Bureau interpolated intercensal population estimates, 2000-2018. Updated August 2019.
Jingyi Xiao, Kevin Escandón, Benjamin J. Cowling. Re: Effectiveness of public health measures in reducing the incidence of covid-19, SARS-CoV-2 transmission, and covid-19 mortality: systematic review and meta-analysis. BMJ, 2021. Available at: https://www.bmj.com/content/375/bmj-2021-068302/rr-14.
Polack, F. P. et al. Safety and Efficacy of the BNT162b2 mRNA Covid-19 Vaccine. N. Engl. J. Med. 383, 2603–2615 (2020).
Baden, L. R. et al. Efficacy and Safety of the mRNA-1273 SARS-CoV-2 Vaccine. N. Engl. J. Med. 384, 403–416 (2021).
Haas, E. J. et al. Impact and effectiveness of mRNA BNT162b2 vaccine against SARS-CoV-2 infections and COVID-19 cases, hospitalisations, and deaths following a nationwide vaccination campaign in Israel: an observational study using national surveillance data. Lancet 397, 1819–1829 (2021).
UK Health Security Agency, COVID-19 vaccine surveillance report (Week 17, 28 April 2022). https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/1072064/Vaccine-surveillance-report-week-17.pdf.
Kirsebom, F. C. M. et al. COVID-19 vaccine effectiveness against the omicron (BA.2) variant in England. The Lancet Infectious Diseases. 22, 931–933 (2022).
Anderson, J. L. An ensemble adjustment Kalman filter for data assimilation. Mon. Weather Rev. 129, 2884–2903 (2001).
Yang, W. Data and model code for Yang & Shaman, Development of a model-inference system for estimating epidemiological characteristics of SARS-CoV-2 variants of concern, in Nature Communications 12, 5573 Zenodo. https://doi.org/10.5281/zenodo.5715611 (2021).
Acknowledgements
This study was in part supported by the National Institute of Allergy and Infectious Diseases (AI145883 and AI175747), the Centers for Disease Control and Prevention (CDC) and the Council of State and Territorial Epidemiologists (CSTE; contract no.: NU38OT00297), and the National Science Foundation (DMS-2027369). The authors thank Lauren Firestein for overseeing the data use agreement and facilitating data sharing for this project; Ramona Lall for providing syndromic surveillance emergency department data; Iris Cheng for providing immunization data; Jubayer Ahmed, Nelson De La Cruz, Brandon Nguyen, and Greta Ohanian for managing and providing wastewater data; Elizabeth Luoma and Rebecca Rohrer for their management and provision of variant data; the NYC Health Department COVID data team for overarching data management and provision of data for this project; and Shama Ahuja, Sharon Greene, Scott Harper, Elizabeth Luoma, Aaron Olson, Enoma Omoregie, Mamta Parakh, Celia Quinn, Ulrike Siemetzki-Kapoor, Faten Taki, and Gretchen Van Wye for their input on this manuscript.
Author information
Contributions
W.Y. designed the study, developed the model-inference system, performed the analysis, and wrote the first draft; H.P. and E.L. provided the COVID-19 case and emergency department visit data; W.L. provided the COVID-19-associated mortality data; E.A.W. provided the SARS-CoV-2 wastewater surveillance data. H.Y. compiled the mobility data. All authors contributed to the final draft.
Ethics declarations
Competing interests
The authors declare no conflict of interest.
Peer review
Peer review information
Communications Medicine thanks Rafael Lopes, Fuqing Wu and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. [A peer review file is available].
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Yang, W., Parton, H., Li, W. et al. SARS-CoV-2 dynamics in New York City during March 2020–August 2023. Commun Med 5, 102 (2025). https://doi.org/10.1038/s43856-025-00826-6