Introduction

Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is the causal agent of coronavirus disease 2019 (COVID-19), a highly contagious infectious disease with diverse clinical presentations1 that spread worldwide as a pandemic2,3,4. By the end of December 2023, more than 773 million cases and at least 7 million deaths had been reported worldwide (https://data.who.int/dashboards/covid19/data).

SARS-CoV-2 belongs to the Coronaviridae family of enveloped, positive-sense single-stranded RNA viruses5. Its genomic sequence has been shown to share 80% sequence identity with SARS-CoV and 50% with MERS-CoV, coronaviruses previously associated with outbreaks5. The WHO has designated multiple variants according to their potential to expand and replace previous variants, to cause new waves of infection with greater circulation, and to require adjustments in public health actions. These designations comprise variants of concern (VOC, such as Alpha, Beta, Gamma, Delta, and the original Omicron lineage), variants of interest (VOI), and variants under monitoring (VUM).

Hundreds of studies have provided insights regarding the pandemic’s biology, clinical outcome, and social impact. Even though the emergency was declared officially over6 in May 2023, outbreaks have continued in several regions. This behavior is due to the emergence of mutant variants of SARS-CoV-2 through genetic evolution and adaptation, which generate new genome versions7.

COVID-19 has had a significant impact in Latin America and the Caribbean (LAC) that extends beyond public health. The spread of the virus followed different patterns between countries, which could be attributed to variations in health infrastructure and economic strength, as well as to the prevention measures adopted by each country during the pandemic8. Additionally, the ecological and social diversity of Latin America is one of the causes of the wide variety of genotypes found at the local level, since it allows SARS-CoV-2 to spread and evolve in the region8,9,10. These factors have made Latin America one of the most strongly impacted regions, with more than 83 million reported infections and 1.7 million deaths up to December 2023 (https://data.who.int/dashboards/covid19/data). The milestones of the COVID-19 pandemic in Latin America are shown in Fig. 1.

Fig. 1 Timeline addressing the most relevant COVID-19 milestones from 2020 to 2024 for Latin America and the Caribbean.

Brazil became the first country in Latin America to confirm a case of COVID-19, on February 26, 2020, in a man who had traveled to Italy, where a significant outbreak had been recorded. Subsequently, Argentina reported the first death in the region on March 7 of the same year. The arrival of the pandemic in Latin America was accompanied by constraints on prevention and control measures arising from the living conditions and public health limitations of low- and middle-income countries11. This region experienced the highest infection rate in proportion to the population, corresponding to 17% of cases worldwide, a figure comparable to that of India, despite having a population almost one billion smaller12. Furthermore, many Latin American countries, especially those with low income, faced limitations in conducting genomic surveillance compared to developed countries12. A study conducted in the region, one of its first integrated analyses, demonstrated that the variants associated with multiple reintroductions and their mutation frequencies align with global trends13.

Considering this situation, the CABANAnet initiative (Capacity building for Bioinformatics in Latin America, https://cabana.network/) has supported the development of a regional project entitled "Genomic, socio-environmental, and capability patterns regarding the circulation of the SARS-CoV-2 virus in Latin America until the year 2024 to support epidemiological surveillance using a multivariate approach". In this project, it was considered crucial to update the data from the last four years of the pandemic (2020–2023), not only to cover genomic surveillance in Latin America and the Caribbean but also to establish possible relationships between COVID-19 parameters and country-level indicators (social, economic, environmental, and demographic; hereafter "socio-environmental"), as well as to assess sequencing capabilities. Consequently, this study aimed to determine genomic, socio-environmental, and sequencing capacity patterns associated with the circulation of the SARS-CoV-2 virus in Latin America until 2024 to support epidemiological surveillance at the regional level.

Material and methods

The study used data from 24 countries in Latin America and the Caribbean (Argentina, Belize, Bolivia, Brazil, Chile, Colombia, Costa Rica, Cuba, the Dominican Republic, Ecuador, El Salvador, Guatemala, Guyana, Haiti, Honduras, Jamaica, Mexico, Nicaragua, Panama, Paraguay, Peru, Trinidad and Tobago, Uruguay, and Venezuela). The data correspond to samples and information available in databases until December 31, 2023. For the general data (Table 1), demographic parameters and COVID-19 indicators (cases and deaths) were retrieved from the Worldometers platform (https://www.worldometers.info/coronavirus/), while sequencing data were obtained from the GISAID database (https://gisaid.org/). From the GISAID data, recombinant lineages were identified based on the Pango lineage ID and the annotation found in “Pango-Designation”, available at https://github.com/cov-lineages/pango-designation/blob/master/lineage_notes.txt. See below for details of other databases used in specific analyses. The general pipeline is presented in Fig. 2.

Table 1 Absolute and relative COVID-19 metadata in Latin America and the Caribbean until December 2023.

Fig. 2 General workflow based on the use of public databases to study genomic, socio-environmental, and capability patterns regarding the circulation of the SARS-CoV-2 virus in Latin America until December 2023.

Analysis of circulating genotypes of SARS-CoV-2

SARS-CoV-2 sequences and metadata from 24 countries were retrieved from the GISAID database (https://gisaid.org/), encompassing a total of 3 066 sequences based on a globally subsampled SARS-CoV-2 dataset from Nextstrain (https://nextstrain.org/ncov/gisaid/global/all-time?dmax=2023-12-31). All metadata from these records are available as Supplementary material. Utilizing the GISAID platform, the most prevalent variants were analyzed to study the divergence process of SARS-CoV-2 in the region. The analysis involved constructing a tree to elucidate the divergence process in relation to the accumulation of mutations. Additionally, the most prevalent variants and their metadata were cataloged by country, providing a visualization of the diversity and frequency of each variant.

Phylogenomic analysis

Phylogenomic analysis was performed using the 3 066 sequences. All sequences were aligned with MAFFT14 (v7.508). The phylogenetic tree was built with IQ-TREE15 (v2.1.2), using ModelFinder to select the best nucleotide substitution model; under the Bayesian Information Criterion (BIC), the best model was GTR + F + R2. The phylogenomic tree was visualized with iTOL16 (v4), incorporating year and variant data, as previously reported14. Analyses were performed using the High-Performance Computing Cluster of the Center for Research in Materials Science and Engineering (CICIMA-HPC), University of Costa Rica.
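The alignment and tree-building steps described above can be sketched as two command lines assembled in Python. This is a minimal illustration, not the authors' actual invocation: the text does not state which options were used, so the `--auto` strategy for MAFFT, the `-m MFP` and `-T AUTO` options for IQ-TREE, the `iqtree2` executable name, and all file names are assumptions.

```python
import subprocess
from pathlib import Path
from typing import List


def build_alignment_command(fasta_in: Path) -> List[str]:
    """Return a MAFFT command; MAFFT writes the alignment to stdout."""
    # --auto lets MAFFT pick a strategy (assumption: the strategy is not stated in the text).
    return ["mafft", "--auto", str(fasta_in)]


def build_tree_command(alignment: Path, threads: str = "AUTO") -> List[str]:
    """Return an IQ-TREE 2 command that runs ModelFinder before tree inference."""
    # -m MFP: ModelFinder followed by tree search with the selected model.
    # BIC is ModelFinder's default criterion, so no extra flag is passed for it.
    return ["iqtree2", "-s", str(alignment), "-m", "MFP", "-T", threads]


def run_pipeline(fasta_in: Path, alignment_out: Path) -> None:
    """Run both steps in order; requires mafft and iqtree2 on the PATH."""
    with alignment_out.open("w") as handle:
        subprocess.run(build_alignment_command(fasta_in), stdout=handle, check=True)
    subprocess.run(build_tree_command(alignment_out), check=True)


# Hypothetical file names, shown only to illustrate the assembled commands.
print(" ".join(build_alignment_command(Path("lac_subsample.fasta"))))
print(" ".join(build_tree_command(Path("lac_subsample.aln.fasta"))))
```

Nothing is executed unless `run_pipeline` is called, so the commands can be inspected before any compute time is spent on a cluster.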

Analysis of demographic, social, economic, environmental and SARS-CoV-2 indicators

A total of 155 demographic, social, economic, and SARS-CoV-2 indicators from the 24 countries were retrieved from several public databases (https://www.worldometers.info/coronavirus/, https://data.unicef.org/resources/resource-type/datasets/, https://hdr.undp.org/data-center/documentation-and-downloads, https://worldpopulationreview.com/country-rankings/average-iq-by-country, https://data.worldbank.org/indicator/AG.LND.PRCP.MM?most_recent_value_desc=true, https://tradingeconomics.com/country-list/temperature, https://countryeconomy.com/gdp, https://gisaid.org/). Indicators were retained only if they were available for all countries; otherwise, they were excluded from the subsequent statistical analysis. Following this selection and completeness assessment, 89 socio-environmental indicators and five COVID-19 indicators (cases per 1 million people, deaths per 1 million people, deaths per 1 million COVID-19 cases, tests per 1 million people, and percentage of sequenced genomes) were selected for subsequent statistical analyses. The whole dataset of 155 indicators and the 94 selected indicators by country are included as Supplementary Material. Using R software (https://www.r-project.org/) and the ‘corrplot’ package (https://cran.r-project.org/web/packages/corrplot/), Pearson correlations between all socio-environmental indicators and the SARS-CoV-2 indicators were computed using the ‘corrplot.mixed’ function, based on values for all 24 countries. Subsequently, indicators with a Pearson correlation coefficient of > 0.3 were selected after visualization and analysis with the ‘psych’ package (https://cran.r-project.org/web/packages/psych/index.html). After assumption validation, generalized linear models (GLM) were implemented to study the association between demographic, social, and economic indicators as possible predictors of SARS-CoV-2 parameters (normalized values for infections, deaths, and sequencing) using the ‘glm’ function in R. Predictors with a p-value < 0.05 were considered significant in each model.
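The correlation screening step can be illustrated with a short, dependency-free sketch. The analysis itself was run in R; the Python below only mirrors the logic of keeping indicators whose Pearson coefficient exceeds the 0.3 threshold. Applying the threshold to the absolute value of the coefficient is an assumption made here because negative correlations are also reported; the indicator names and values are invented for illustration and are not data from the study.

```python
from math import sqrt
from typing import Dict, List, Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equally long samples."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / sqrt(var_x * var_y)


def screen_indicators(indicators: Dict[str, List[float]],
                      outcome: List[float],
                      threshold: float = 0.3) -> Dict[str, float]:
    """Keep indicators with |r| above the threshold (absolute value is an assumption)."""
    selected = {}
    for name, values in indicators.items():
        r = pearson_r(values, outcome)
        if abs(r) > threshold:
            selected[name] = r
    return selected


# Toy values for five hypothetical countries (not data from the study).
toy_indicators = {
    "mean_temperature": [26.0, 24.0, 18.0, 12.0, 9.0],
    "unrelated_index": [1.0, 3.0, 2.0, 3.0, 1.0],
}
toy_cases_per_million = [50_000.0, 80_000.0, 150_000.0, 250_000.0, 290_000.0]
selected = screen_indicators(toy_indicators, toy_cases_per_million)
print(selected)
```

In this toy example only the temperature indicator passes the screen, with a strongly negative coefficient. Indicators that pass would then go on to the GLM step, which is not reproduced here.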

Results

The analysis of COVID-19 cases from LAC countries (Table 1) shows that Brazil had the highest absolute numbers of cases, deaths, and SARS-CoV-2 genome sequences. Nicaragua had the lowest numbers of reported COVID-19 cases and deaths, while Cuba had the lowest number of SARS-CoV-2 sequences. When the data from each country are adjusted by population, the results differ. Uruguay and Chile reached the highest rates, with 297 799 and 276 925 COVID-19 cases per million inhabitants, respectively. Peru stands out with 6 595 deaths per million inhabitants and 48 823 deaths per million cases, the highest COVID-19 death rates in the region. Chile and Uruguay also lead in testing, with 2 605 252 and 1 749 083 tests per million population, respectively. According to the data obtained from official reports, Nicaragua has the highest number of SARS-CoV-2 genome sequences per case in the region. In contrast, Haiti and Nicaragua reported the lowest rates of cases, deaths, and COVID-19 tests per million people, while Bolivia and Cuba had the lowest percentages of sequenced COVID-19 cases (Table 1).
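The population-adjusted indicators above are simple ratios, and two of them can be combined to recover a third. The sketch below is illustrative arithmetic only, with hypothetical function names: it shows the per-million normalization and uses the two death rates quoted for Peru to derive the case rate they imply.

```python
def per_million(count: float, denominator: float) -> float:
    """Express a count as a rate per one million units of the denominator."""
    if denominator <= 0:
        raise ValueError("denominator must be positive")
    return count / denominator * 1_000_000


def implied_cases_per_million(deaths_per_million_people: float,
                              deaths_per_million_cases: float) -> float:
    """(deaths / population) divided by (deaths / cases) equals cases / population."""
    return per_million(deaths_per_million_people, deaths_per_million_cases)


# Peru's two death rates as quoted in the paragraph above.
peru_cases_per_million = implied_cases_per_million(6_595, 48_823)
print(round(peru_cases_per_million))
```

The result is about 135,000 cases per million inhabitants, less than half the rates reported for Uruguay and Chile. This shows how a country can have the highest death indicators in the region without having the highest case rate.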

A phylogenomic analysis was then performed with the subsampled sequences. The phylogenomic tree (Fig. 3) shows that, during the first two years of the COVID-19 pandemic in Latin America and the Caribbean, diverse SARS-CoV-2 variants circulated, including the VOCs Alpha, Delta, and Gamma, as well as non-VOC variants. By 2022, the Omicron VOC had become predominant, and in 2023, Omicron and its sub-lineages were mainly present. This information is complemented by the distribution and predominance of specific VOCs across the countries in the region (Fig. 4a), as well as by the SARS-CoV-2 mutation rate in the region, estimated at 8.39 × 10⁻⁴ substitutions per site per year (Fig. 4b).

Fig. 3 Phylogenomic analysis based on maximum likelihood of SARS-CoV-2 virus sequences obtained until December 2023. The inner circle with the blue gradients indicates the years when the sequences were obtained. The outermost circle provides a visualization of SARS-CoV-2 variants.

Fig. 4 Landscape of the SARS-CoV-2 variants circulating in Latin America and the Caribbean until December 2023. (a) Distribution of SARS-CoV-2 variants. Pie charts indicate the relative abundance of distinct SARS-CoV-2 variants in each country. (b) Mutation rate of the SARS-CoV-2 genomes during the pandemic, including different variants (colors) and the model (black line).

Regarding the metadata of the COVID-19 sequences, five of the parameters exhibited varying degrees of completeness. For example, “location” had a completeness value as low as 40.5% (Table 2), indicating a loss of specific information about the sample’s origin. The “sex” parameter was represented in five categories, four arising from differences in language and one from a typographical error. Furthermore, three of these parameters lacked uniformity, with the same information presented in two or more distinct formats.

Table 2 Metadata pertaining to COVID-19 sequences from Latin America and the Caribbean obtained from the GISAID database with missing and non-uniform data.

Finally, 89 socio-environmental indicators and five COVID-19 indicators were included in the statistical analyses. Details are presented in the Supplementary Material. As shown in Table 3, four COVID-19 parameters (COVID-19 cases per million population, COVID-19 deaths per million population, COVID-19 deaths per million COVID-19 cases, and COVID-19 tests per million population) presented strong correlations (positive or negative) and a statistically significant association with at least one socio-environmental indicator. In contrast, the percentage of SARS-CoV-2 sequencing did not show a statistically significant correlation or association with any of the indicators under investigation.

Table 3 Socio-environmental indicators with a statistically significant correlation with COVID-19 indicators for Latin America and the Caribbean.

Discussion

Latin America and the Caribbean (LAC) is one of the regions most affected by the COVID-19 emergency, with 1.8 million cumulative deaths by the end of 2023, compared with 2.3 million deaths reported in Europe (https://data.who.int/dashboards/covid19/deaths?n=c). LAC is home to eight of the 20 most unequal countries in the world, and three-quarters of its nations are classified as low or lower-middle income15. The region experienced a significant disparity in COVID-19 deaths, ranging from 74 to 6,595 deaths per million inhabitants. Additionally, the pandemic has intensified economic and social inequalities, disproportionately affecting vulnerable populations and highlighting the need for integrated strategies to strengthen health systems and improve resilience to future crises16. Social containment measures to reduce the spread of the virus had devastating impacts of varying severity, exacerbating the vulnerability of informal workers and limiting their access to social protection. This inequality mainly affected the elderly and disadvantaged populations17, with a wider digital gap and limited access to education and essential services15, revealing a lack of preparedness for addressing the crisis18.

In this study, data from 24 countries in LAC were analyzed using public databases as the primary source of information. A subsample of SARS-CoV-2 sequences from cases in the region enabled a phylogenomic study that elucidated the temporal distribution of VOCs, mutations, and recombinants. Additionally, correlation analyses and GLM-based association analyses were performed between socio-environmental and SARS-CoV-2 indicators, with data from each country included in the study.

Patterns of sequencing capability and genomic surveillance

Genomic surveillance, which combines epidemiological analysis, high-throughput molecular technologies, and bioinformatics, has proven to be an essential tool for monitoring and managing infectious diseases. In the case of SARS-CoV-2, this surveillance has been crucial for tracking the evolution, lineage diversity, and global spread of the virus, as well as for optimizing molecular tests, treatments, and vaccines to guide public health responses globally14,19,20,21. However, global SARS-CoV-2 genomic surveillance has shown imbalances influenced by socioeconomic factors and by the surveillance and laboratory capacity available before the pandemic19. Europe and the Americas have led contributions of SARS-CoV-2 sequences to public repositories, mainly from high-income countries, which presented a proportion of confirmed cases 16 times higher than low- and middle-income countries22.

Regarding sequencing, it has been proposed that at least 5% of SARS-CoV-2 positive samples should be sequenced to detect viral lineages with a prevalence of 0.1 to 1.0%23. Regional reports indicate that this value has yet to be achieved in any country in LAC. In the present study, by the end of 2023, the sequencing percentage of total cases in LAC was 0.7%, surpassing the 0.4% previously reported for the region13 as of December 2021. Only four countries (Haiti, Jamaica, Trinidad and Tobago, and Nicaragua) sequenced more than 2% of confirmed cases, with Nicaragua being the only country with a sequencing rate above 5%. However, these numbers should be interpreted with caution, as these countries have the lowest rates of cases per million inhabitants in the region, which underreporting of cases could explain. Thus, these data may not reflect the true extent of the disease, as was particularly observed in Nicaragua due to the minimization of the pandemic24. As indicated in our previous study, the data for most countries with high sequencing percentages are biased by the small number of reported cases adjusted for population13.
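The link between sequencing effort and the ability to detect a rare lineage can be made concrete with a simple binomial calculation. The sketch below assumes that sequenced samples are a random draw from all positive samples, so the probability of seeing a lineage of prevalence p at least once among n sequences is 1 − (1 − p)ⁿ. This is a generic textbook model given for intuition, not necessarily the derivation behind the 5% recommendation, and the 95% confidence level is an assumption.

```python
from math import ceil, log


def detection_probability(prevalence: float, n_sequenced: int) -> float:
    """Probability of sampling a lineage at least once among n random sequences."""
    if not 0 < prevalence < 1:
        raise ValueError("prevalence must be between 0 and 1")
    return 1.0 - (1.0 - prevalence) ** n_sequenced


def sequences_needed(prevalence: float, confidence: float = 0.95) -> int:
    """Smallest n for which the detection probability reaches the confidence level."""
    if not 0 < prevalence < 1 or not 0 < confidence < 1:
        raise ValueError("prevalence and confidence must be between 0 and 1")
    return ceil(log(1.0 - confidence) / log(1.0 - prevalence))


def sequencing_percentage(n_sequences: int, n_cases: int) -> float:
    """Percentage of confirmed cases that were sequenced."""
    if n_cases <= 0:
        raise ValueError("n_cases must be positive")
    return 100.0 * n_sequences / n_cases


# Sequences required to detect lineages at the two prevalence bounds quoted above.
for prevalence in (0.01, 0.001):
    print(prevalence, sequences_needed(prevalence))
```

Under this model, the number of sequences needed depends on the absolute count of randomly sampled genomes rather than on the percentage alone. This is one reason why a high sequencing percentage in a country with few reported cases should be read with caution.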

Regarding the metadata associated with sequences, variability was observed in the completeness and uniformity of data in the GISAID database, particularly for location (40.5% completeness, i.e., 59.5% missing data) and sex (five non-uniform categories). This deficiency in data availability has been previously documented, with approximately 63% of sequences lacking demographic data such as age and gender, a gap that is more frequent in high-income regions22. The United States and the United Kingdom are examples of this: although they have uploaded many sequences, a high percentage of their metadata for age and gender is missing25. In contrast, Slovakia and Slovenia have provided few sequences but with high metadata completeness25. As noted in previous studies, the lack of metadata limits the proper interpretation of results26. Therefore, it is crucial to emphasize the importance of providing more complete metadata to maximize its utility in epidemiological analyses.
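The two metadata problems described above, missing values and non-uniform labels, can both be quantified with a few lines of code. The sketch below is a hypothetical illustration: the text reports that the "sex" field appeared in five categories because of language differences and a typographical error but does not list them, so the label map, the missing-value markers, and the example records are all assumptions.

```python
from collections import Counter
from typing import Iterable, Optional

# Assumed markers for missing values; the actual conventions may differ.
MISSING_MARKERS = {"", "unknown", "na", "n/a"}

# Hypothetical label map covering English, Spanish, and Portuguese spellings.
SEX_LABELS = {
    "male": "Male", "masculino": "Male",
    "female": "Female", "femenino": "Female", "feminino": "Female",
}


def is_missing(value: Optional[str]) -> bool:
    """True when a metadata value is absent or uses a missing-value marker."""
    return value is None or value.strip().lower() in MISSING_MARKERS


def completeness(values: Iterable[Optional[str]]) -> float:
    """Percentage of records with a usable value for one metadata field."""
    values = list(values)
    if not values:
        raise ValueError("no records supplied")
    present = sum(1 for value in values if not is_missing(value))
    return 100.0 * present / len(values)


def normalize_sex(value: Optional[str]) -> str:
    """Map language variants to one label; flag anything outside the map."""
    if is_missing(value):
        return "Unknown"
    return SEX_LABELS.get(value.strip().lower(), "Unrecognized")


# Invented records for illustration only.
records = ["Male", "Femenino", "masculino", "unknown", None, "Female ", "Feminino", "M"]
print(completeness(records))
print(Counter(normalize_sex(value) for value in records))
```

Flagging unmapped values as "Unrecognized" instead of guessing keeps typographical errors visible, so they can be corrected at the source rather than silently merged.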

Phylogenomic patterns in the LAC region

Phylogenomic analyses were run with a subsample of 3 066 sequences collected until 2023. During the first two years of the COVID-19 pandemic in LAC, diverse SARS-CoV-2 variants circulated, including VOCs such as Alpha, Delta, and Gamma, as well as variants not classified as VOCs, in line with previous regional reports13,27. Gamma is the only VOC first identified in LAC, specifically in Brazil28. Colombia and Venezuela played a significant role in its spread29. This variant stood out for its high transmissibility and prevalence in the region during the first half of 2021.

Other lineages, including the VOIs Mu and Lambda, were first reported in Colombia and Peru, respectively30, as were unique lineages in Costa Rica and Central America14,31. In May 2021, Mu outcompeted all other variants in Colombia, including the Gamma VOC32. A similar pattern was found for Lambda in Peru13 from March to June 2021.

Later, Delta emerged as the dominant variant in the second half of the same year, possibly associated with border openings and their crucial role in lineage dispersal29, unlike the first half of 2021, when lineage dominance was more specific to each country13.

By 2022, the Omicron variant had become dominant, and in 2023, Omicron sub-lineages prevailed, consistent with the global dynamics of this variant’s spread28. This dynamic included reports of many sequences with more mutations and recombinant genomes, a phenomenon associated with the co-circulation of variants in the region, which highlights the complexity of virus transmission and evolution19,22, not only through co-infections that promote recombination33,34 but also through sustained increases in mutation rates. In the latter case, the substitution rate estimated in our analysis (8.39 × 10⁻⁴ substitutions per site per year) is similar to values reported worldwide in the literature, which range from 8 × 10⁻⁴ to 20 × 10⁻⁴ substitutions per site per year13,35,36,37.
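A per-site rate is easier to interpret when converted to substitutions per genome, and one common way to obtain such a rate is a root-to-tip regression of divergence against sampling date. The sketch below illustrates both ideas. It is not a reconstruction of the model shown in Fig. 4b: the sampling dates and the root date are synthetic, and the only external value used is the 29,903-nucleotide length of the Wuhan-Hu-1 reference genome (NC_045512.2).

```python
from typing import Sequence, Tuple

GENOME_LENGTH = 29_903  # Wuhan-Hu-1 reference genome (NC_045512.2), in nucleotides


def least_squares_line(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least-squares slope and intercept of y against x."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxx = sum((a - mean_x) ** 2 for a in x)
    if sxx == 0:
        raise ValueError("all x values are identical")
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def substitutions_per_genome_per_year(rate_per_site: float,
                                      genome_length: int = GENOME_LENGTH) -> float:
    """Convert a per-site yearly rate into an expected per-genome yearly count."""
    return rate_per_site * genome_length


# Synthetic tips generated from the reported rate and a hypothetical root date.
sampling_dates = [2020.0, 2020.5, 2021.0, 2022.0, 2023.5]
root_to_tip = [8.39e-4 * (date - 2019.9) for date in sampling_dates]
rate, intercept = least_squares_line(sampling_dates, root_to_tip)
print(rate, substitutions_per_genome_per_year(rate))
```

At the reported rate, a genome is expected to accumulate about 25 substitutions per year. Real root-to-tip data scatter around the fitted line, so the slope is only an estimate.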

Patterns between socio-environmental factors and COVID-19 development in LAC

Correlation and GLM association analyses were conducted using socio-environmental and SARS-CoV-2 indicators, totaling 94 parameters. Of the socio-environmental indicators analyzed in this study, 9% (8/89) both showed a Pearson correlation above the threshold and were statistically significant in the GLM analysis with COVID-19 indicators. Temperature was the only environmental predictor associated with two COVID-19 indicators (cases and tests performed, each adjusted per million inhabitants). The negative correlation observed between COVID-19 cases and temperature has been previously reported in LAC countries10 and globally38. This finding suggests that countries with lower temperatures tend to have more COVID-19 cases. A possible explanation is that SARS-CoV-2 in aerosols decays more rapidly at higher temperatures39, while at lower temperatures the virus persists longer in the environment, increasing the possibility of transmission40.

COVID-19 deaths (adjusted for population) and the Human Development Index (HDI) showed a positive correlation, contrary to expectations, as a higher HDI reflects higher life expectancy, education, and income. This positive correlation may be due to the inequality gap existing in the region10. The number of deaths from COVID-19 (adjusted per number of cases) showed a negative correlation with years of schooling, a phenomenon previously documented41. People with lower education levels have limited access to accurate information on COVID-19 prevention and face greater barriers to implementing recommended prevention strategies42. Moreover, they are often employed in informal jobs10, making it difficult for them to stay home during the pandemic. An important caveat is that COVID-19 parameters may not accurately reflect the situation in different countries. For instance, the number of diagnosed cases affects the adjusted parameters, leading to potential biases. Such discrepancies may be explained by factors such as a decline in patient visits for laboratory testing, self-diagnosis using home rapid tests, and public fatigue toward restrictive measures. As a result, certain indicators, such as deaths adjusted per million inhabitants and deaths per million cases, may show different associations in correlations and generalized linear models, despite an expected relationship.

The COVID-19 testing indicator showed a positive correlation with male life expectancy at age 80 and average population age, and a negative correlation with mortality between ages 15 and 50 and mortality before age 40. These results suggest that increased COVID-19 testing could benefit the population, as more tests allow earlier diagnosis and treatment of infected individuals, helping to prevent the virus’s spread10. Furthermore, the number of COVID-19 tests performed showed a positive correlation with the elderly dependency ratio, likely reflecting the heightened care needs of this age group during the pandemic to mitigate mortality risks. The negative correlation between COVID-19 tests and temperature could arise because viral respiratory infections increase during colder months, leading to a rise in laboratory testing to confirm or rule out possible causative agents.

Lastly, the sequencing percentage was the only COVID-19 parameter that showed neither correlation nor significance in the GLM with any of the 89 possible predictors. These results may be related to the small number of countries (only 24) and the generally low sequencing percentage in the LAC region, which complicate the identification of a pattern. Unlike our results, a global study linked sequencing percentages to social and economic development parameters per capita, gross domestic product (GDP) per capita, socio-demographic index, and genomic surveillance capacity before the COVID-19 pandemic43.

Integration and final remarks

During the COVID-19 pandemic, an unprecedented volume of genomic data was generated compared to other pathogens. Despite limited access to resources for genomic surveillance (infrastructure, human resources, consumables, etc.) in LAC, most countries in this region joined individual and collective efforts to generate information to aid epidemiological decision-making regarding the pandemic. Thus, countries in LAC implemented diverse strategies to combat the COVID-19 pandemic, with mixed success and significant challenges. As summarized elsewhere44,45,46, Argentina’s early quarantine measures strengthened hospital services and fostered coordination, while Brazil utilized its surveillance system and rapidly increased hospital capacity. Chile achieved low death rates with dynamic quarantines and adequate ICU availability. In terms of genomic surveillance, Chile and Brazil provided protocols used regionally to amplify and sequence the SARS-CoV-2 genome47,48. Both countries also contributed a significant number of viral genome sequences in the region. Colombia expanded testing and ICU capacity and supported vulnerable populations through a national fund. Costa Rica’s Ministry of Health provided strategic leadership, facilitating public–private collaboration and mitigating socioeconomic impacts. Both Costa Rica and Brazil consolidated bioinformatics protocols to assemble and compare SARS-CoV-2 genomes, which were used in several regional studies13,14,49,50,51. Ecuador centralized emergency response coordination, and Mexico maintained public awareness through daily updates by a health undersecretary. In Peru, strong presidential leadership prioritized health, expanded testing, and supported low-income groups. In addition, Peru helped Bolivia obtain the country’s first genome sequences13.

However, challenges abounded in LAC44. Argentina’s weakened health system strained businesses, and Brazil faced testing and ICU capacity issues exacerbated by political tensions. In Chile, regional policy inconsistencies and resource shortages heightened transmission risks. Colombia experienced political discord and inconsistent adherence to restrictions. Costa Rica grappled with resource shortages and economic pressures, while Ecuador’s response was undermined by economic recession and healthcare budget cuts. Mexico faced health service gaps due to reforms and socioeconomic disparities, alongside conflicting governmental messages and rising crime. Peru struggled with a fragmented health system, overcrowding, rural migration, and inconsistent mortality reporting.

These varied outcomes highlight the need for robust health systems, clear communication, and political unity to manage public health crises effectively, including regional collaborations13. In this sense, several efforts arose from diverse organizations to face the pandemic in LAC. Coordination across the health system with international units was essential to controlling and then treating COVID-19 and will be key for future pandemics and global emergencies52. Collaboration and coordination among healthcare professionals, public health officials, researchers, and policymakers aid efforts to deliver health services at all levels, especially in a public health crisis13. This includes sharing information, resources, and expertise, as well as implementing consistent measures and guidelines to prevent the spread of the virus and manage infections46.

From a regional and global point of view, diverse organizations were pivotal in helping LAC countries control the transmission of SARS-CoV-2 and manage the COVID-19 pandemic. The Pan American Health Organization (PAHO), in collaboration with reference and public health laboratories in the region’s countries, established the Regional COVID-19 Genomic Surveillance Network (COVIGEN), which aimed to improve sequencing and generate timely genomic data19,53.

From February 2020 to March 2022, COVIGEN fostered the generation of 126,985 SARS-CoV-2 genomes in 32 countries29. During the first two years of the pandemic, approximately 43% of all SARS-CoV-2 genomes in the LAC region were produced by PAHO-backed laboratories, highlighting the importance of international collaboration networks29. As of December 2024, 591,977 virus sequences had been reported for LAC members of PAHO54. Despite the large number of available genomes, deep and complete phylogenomic studies in the region are scarce, and discussions about the quality of assembled genomes remain open55.

The network, later extended to other respiratory viruses and renamed the Respiratory Virus Genomic Surveillance Regional Network (RESVIGEN), is an initiative that enhances sequencing capabilities to track virus evolution, support diagnostics, and inform vaccine development, as implemented for SARS-CoV-2. The network includes 33 regional and national laboratories, of which seven are reference sequencing laboratories for LAC (they support sequencing for other laboratories within the network): Fundação Oswaldo Cruz (Brazil), Instituto de Salud Pública de Chile (Chile), Instituto Nacional de Salud (Colombia), Instituto Costarricense de Investigación y Enseñanza en Nutrición y Salud (Costa Rica), Instituto de Diagnóstico y Referencia Epidemiológicos (Mexico), Instituto Conmemorativo Gorgas de Estudios de la Salud (Panama), and University of the West Indies (Trinidad and Tobago). All 33 participating laboratories receive training and technical support to analyze variants and mutations of public health concern, and the network promotes collaboration among countries to better understand the epidemiology of respiratory viruses54. In terms of vaccination, PAHO also supported the manufacture of COVID-19 mRNA vaccines and worked to guarantee vaccine supplies within the LAC region45.

Because viral genomic surveillance capacity was minimal in most Caribbean countries until December 2020, the project "COVID-19: Infectious disease Molecular epidemiology for Pathogen Control & Tracking" (COVID-19 IMPACT) was created in Trinidad and Tobago to provide SARS-CoV-2 sequencing services in the region56.

Additionally, the CABANAnet initiative57, a project aimed at strengthening bioinformatics capacities in Latin America (https://cabana.network/), has supported SARS-CoV-2 surveillance in LAC since the pandemic began, coordinating different countries to analyze sequences and socio-environmental data, as in the present study.

Among many other initiatives, these networks are part of the regional effort to support genomic surveillance systems. This effort is crucial for low- and middle-income countries, where the infrastructure for local sequencing and human resources for data analysis must still be expanded, ensuring preparedness for future pandemics and monitoring endemic pathogens58, thereby improving the global capacity to respond to health emergencies20.

In addition, a substantial portion of the global funds allocated by the United Nations Children’s Fund (UNICEF) and the World Bank to address the pandemic’s impacts was directed to LAC to meet humanitarian needs related to COVID-19 and enhance emergency preparedness and response throughout the region59,60. Other organizations focused their support on tackling the economic, financial, and social impacts of the COVID-19 crisis through ongoing initiatives, regional projects, and targeted guidance, while upholding internationally recognized standards61. These institutions included the International Labour Organization (ILO), Organization for Economic Cooperation and Development (OECD), Office of the United Nations High Commissioner for Human Rights (OHCHR), Special Rapporteurship on Economic, Social, Cultural, and Environmental Rights (REDESCA, by its Spanish acronym) of the Inter-American Commission on Human Rights (IACHR), United Nations Children’s Fund (UNICEF), United Nations Global Compact (UN Global Compact), and the UN Working Group on Business and Human Rights. Their aim has been to protect human, labor, and children’s rights, address gender equality, safeguard the environment, and promote anti-corruption practices in the management of the COVID-19 pandemic and its consequences. Thus, by working together, they stress the importance of strengthening principles to protect vulnerable communities, support sustainable recovery, and build long-term inclusive and resilient growth in the LAC region61.

Finally, this study presents several limitations that should be considered when interpreting the results. The main limitation lies in the reliance on data from public databases. In particular, the quality and completeness of the data extracted from the GISAID database were variable, which could influence the conclusions derived from the analysis. Additionally, the available sequences may only partially represent virus circulation in the region due to differences in sequencing rates between countries. A similar issue arises with the socio-environmental indicators: only those with data available for all countries were used, which may have excluded potentially significant indicators. Lastly, using a subsample of SARS-CoV-2 sequences will only partially reflect the genetic diversity of the virus in the region, particularly in countries with limited data.

Conclusions

This study highlights the complexity of the COVID-19 emergency in the region, characterized by a diversity of variants with the predominance of some during specific periods, mainly VOCs and some recombinant cases, in line with other parts of the world. Despite the advances in sequencing and phylogenomic analysis, low sequencing rates in several countries underscore the urgent need to strengthen public health infrastructure and improve access to diagnostic technologies, such as genomic sequencing.

Additionally, the findings indicate significant correlations between eight socio-environmental indicators across 24 LAC countries and four variables associated with cases, deaths, and diagnostic tests related to the virus in the region, although not for sequencing percentages. Our findings demonstrate that social inequalities have directly influenced the development of the COVID-19 disease. Therefore, emphasis must be placed on implementing an integrated epidemiological surveillance approach to improve preparedness for future infections that may affect the world. During the COVID-19 pandemic, it became evident that investment in local capabilities and international cooperation is essential to address the challenges that may arise with the emergence of new variants.