Introduction

Although COVID-19 seems to be under better control since its initial outbreak in 2020, it remains a threat due to the unpredictability of new strains. This global viral pandemic has already resulted in over 700 million infections and 7 million deaths worldwide, including 22 million reported cases in the United Kingdom (UK). In this study, we aim to investigate the time-varying effects of outdoor air pollutant PM2.5 and other important factors on COVID-19 infection rates in the UK. We propose a causal graph-based deep-learning framework that accounts for various confounders, including meteorology, human mobility, public health interventions, demography, co-morbidities, and healthcare access.

Outdoor PM2.5 is the focus of our study for two primary reasons. First, COVID-19 infection rates may increase due to the potential deposition of viral droplets on PM pollutants (Setti et al. 2020). A rigorous study on COVID-19 aerodynamics found that viral aerosol droplets 0.25–1 µm in size could remain suspended in the air (Liu et al. 2020). When these viral droplets are combined with suspended particulates, they can travel greater distances, remain viable in the air for hours, and be inhaled deeply into the lungs, increasing the potential for airborne viral infection (Prather et al. 2020). Second, PM2.5 negatively impacts the human respiratory system, which may further increase COVID-19 infection rates (Xing et al. 2016). Additionally, we will investigate other important factors, such as NO2 and O3 pollutants (Zoran et al. 2020) and public health policy interventions (Mendez-Brito et al. 2021).

Previous evidence has suggested that air pollution can significantly contribute to COVID-19 infection (Conticini et al. 2020; Copat et al. 2020; Han et al. 2021; Lim et al. 2021; Lolli et al. 2020; Solimini et al. 2021). Interestingly, the relationship between outdoor PM2.5 and COVID-19 infection can vary over time. When infections reach a certain threshold, the effect of PM2.5 is likely reduced to a minimal level compared to factors such as mobility and population density. A study in the US suggested that PM2.5 pollution was a more significant contributor to the COVID-19 infection rate during the early outbreak compared to population density at the county level (Messner and Payson, 2020). A cross-country study suggested that PM2.5 was one of the most significant factors associated with the COVID-19 infection rate during the early stages of the outbreak worldwide (Duhon et al. 2021). However, more rigorous modeling and control methodologies that account for time-varying causal relationships and confounding effects are needed to investigate whether there is a causal relationship between outdoor PM2.5 and COVID-19 infection rates and how this relationship may change over time.

Statistical and machine learning methods have been utilized to investigate causal factors affecting COVID-19 infection. Regression-based statistical causal models were developed to examine the impacts of long-term air pollution exposure and socio-demographics on COVID-19 mortality in England (Konstantinoudis et al. 2021) and the US (Torrats-Espinosa, 2021), while considering confounders such as demographics and co-morbidities. Graph-based causal models, utilizing structural equations (causal paths), were used to assess the cumulative impacts of public health policy interventions on COVID-19 cases and deaths in the US (Chernozhukov et al. 2021). Moreover, causal graphs based on domain knowledge or learned from data were utilized to evaluate environmental factors and public health policy interventions affecting COVID-19 cases or deaths in the UK (Kyono and Van der Schaar, 2021), China (Kang et al. 2021), and Germany (Steiger et al. 2021). A graph neural network (GNN) model was developed to evaluate the dynamic causal effects of public health interventions on COVID-19 cases and deaths by capturing the relationship between different counties in the US (i.e., via a county-to-county distance network) while controlling for time-varying confounders (Ma et al. 2022). However, these causal studies often assume that the underlying causal graph, including known and hypothetical causal links, remains unchanged.

Discovering causal structures in dynamic systems has been a central topic in many fields such as physics, biology, and social sciences. Recent advances in deep learning methods have made it possible to explore causal relationships and effects based on causal structure learning and high-dimensional observational data (Vorbach et al. 2021; Xia et al. 2021). Recent studies have highlighted the discovery of causal graphs based on time-series data using graph neural networks (Kipf et al. 2018; Löwe et al. 2022). To account for the time-varying causal relationship between outdoor PM2.5 and COVID-19 infection rates and to incorporate expert knowledge into the data-driven learning process, a time-varying causal graph learning framework guided by domain-specific knowledge is needed.

While previous research has broadly explored the relationships between various outdoor air pollutants, public health interventions, and COVID-19 infection, the time-varying causal effect of outdoor PM2.5 on COVID-19 infection rates, and how other factors, such as public health policy intervention measures, affect infection rates over time, have yet to be thoroughly investigated. By investigating the time-varying causal effects, this study aims to provide targeted insights crucial for developing specific public health strategies to mitigate COVID-19 infection during different pandemic stages.

Motivated by the need to understand the complex dynamics among various factors, such as outdoor air pollutants, public health interventions, and COVID-19 infection rates, this study develops a Bayesian causal deep learning framework. This framework explores the time-varying causal relationships between outdoor PM2.5 concentration and COVID-19 infection rates in the UK. This innovative framework integrates a GNN and a recurrent neural network (RNN) to dynamically model and predict these relationships while accounting for confounding effects such as meteorology, human mobility, and public health interventions. Our findings demonstrate the superior performance of the proposed model in predicting infection rates over traditional methods, highlighting the evolving nature of the causal effects of PM2.5 and public health intervention measures on COVID-19 infection and transmission.

The proposed data-driven Bayesian causal deep learning framework offers several advantages for understanding the complex time-varying relationship between outdoor PM2.5 pollutant concentration and COVID-19 infection. First, the time-varying graph-based encoder-decoder framework can effectively capture dynamic changes in causal relationships over time. The GNN component excels at modeling time-varying causal graph structures, representing the causal connections between multiple variables over time, while the RNN component is capable of processing sequential data, providing more accurate time-series predictions of COVID-19 infection rates based on the evolving causal graph structures. Second, the proposed framework can improve the interpretability of data-driven AI models by uncovering and visualizing time-varying causal relationships over time, providing new insights into how the impact of outdoor PM2.5 pollutant concentration and public health interventions on COVID-19 infection rates evolves across different phases of the pandemic. By leveraging this advanced time-varying causal deep learning framework, our study not only achieves higher predictive accuracy but also generates more interpretable insights crucial for effective pandemic response and preparedness.

Understanding the time-varying causal impacts of various factors affecting disease infection is essential for developing effective public health policies. The findings from this study have significant implications for pandemic response and preparedness. Timely interventions targeting air quality management, such as reducing outdoor PM2.5 levels and enhancing indoor ventilation, can play a crucial role in mitigating respiratory infectious disease transmission. Furthermore, the time-varying effectiveness of public health intervention measures, including school closure and public transport closure, across different pandemic stages, highlights the need for more targeted and adaptive lockdown strategies that respond to changing epidemiological conditions. By leveraging interpretable AI-driven causal analysis, policymakers can make more informed decisions that optimize the timing and effectiveness of policy interventions, ultimately reducing the societal and economic impacts of pandemics. Our proposed approach can be utilized in future pandemic preparedness and applied to other environmental and public health challenges.

Method

Data Collection and Unit of Analysis

The data collected include the following categories across the UK, including England, Wales, Scotland, and Northern Ireland, covering the period from March 2020 to March 2022: (1) COVID-19 infection, testing capacity, and vaccination, (2) short-term and long-term air pollution concentrations, (3) meteorology, (4) human mobility, (5) lockdowns and public health interventions, (6) demography and socioeconomic status (SES), (7) co-morbidity, and (8) healthcare access. During the study period, the COVID-19 infection rate peaked and dropped (see Fig. 1), while the lockdown measures were strictly exercised, lifted, and re-exercised again (see Fig. S1 in Supplementary Information). Three COVID-19 waves, as defined by the latest report published by the Office for National Statistics (ONS) of the UK in January 2022 (ONS, 2022), were covered, including the first wave (from 14 March 2020 to 11 September 2020), the second wave (from 12 September 2020 to 11 June 2021), and the third wave (from 12 June 2021 to 21 January 2022). The third wave was extended to 14 March 2022 to cover the entire study period.

Fig. 1
figure 1

Daily average COVID-19 infection rate (cases per 100,000 population) at the lower tier local authority (LTLA) level in the UK.

The spatial unit of analysis was lower-tier local authority (LTLA), and the temporal unit was one day. Most data types were collected at the LTLA level (see Fig. S2 in Supplementary Information), although some health and census data were initially collected at the general practitioner (GP) catchment area, health trust/board, or national level and were mapped to the LTLA level during data preprocessing. Most data types (including confirmed cases, testing capacity, vaccination, meteorology, human mobility, and non-pharmaceutical public health interventions) were collected daily, while health and census data (including demography and SES, co-morbidity, and healthcare access) were collected monthly, quarterly, or yearly. In total, 380 LTLAs and 731 days were included in the analysis. All data are publicly available from governmental or authoritative sources and are of high quality (see Table 1 for a summary). The details are listed below.

Table 1 Data sources.

COVID-19 infection, testing capacity, and vaccination

Daily numbers of COVID-19 confirmed cases, tests, and vaccinations (first dose, second dose, and third injection) were collected at the LTLA level from Public Health England (PHE) (PHE, n.d.-a). All virus tests, including polymerase chain reaction (PCR) and lateral flow device (LFD) tests, were considered. The third injection included the third dose and booster dose. COVID-19 testing data at the LTLA level were unavailable for Scotland, Wales, and Northern Ireland, and vaccination data at the LTLA level were unavailable for Wales and Northern Ireland. Population-weighted estimates based on national-level data were used for these unavailable data types.

Air pollution

Daily air pollutant concentration data at the monitoring station level across the UK were collected from the Department for Environment, Food & Rural Affairs (Defra) (Defra, n.d.-a). Four pollutant types were considered, including PM2.5, PM10, NO2, and O3. Pre-pandemic daily air pollutant concentration data from March 2015 to March 2020 were also collected from Defra (Defra, n.d.-a).

Meteorology

Daily meteorological data, including temperature, humidity, air pressure, wind speed, precipitation, and sunshine, measured at the monitoring station level across the UK, were collected from the Met Office via the Centre for Environmental Data Analysis (CEDA) (Met Office, 2012). Daily ultraviolet (UV) index data at the monitoring station level were collected from Defra (Defra, n.d.-b).

Human mobility

Daily human mobility data at the LTLA level, including (1) the percent change in retail and recreation from baseline, (2) the percent change in grocery and pharmacy from baseline, (3) the percent change in parks from baseline, (4) the percent change in transit stations from baseline, (5) the percent change in workplaces from baseline, and (6) the percent change in residential from baseline, were collected from Google Maps mobility reports (Google Inc., n.d.). The baseline was determined by the median during a pre-pandemic period (from 3 January to 6 February 2020).

Public health interventions

The effective periods of non-pharmaceutical public health interventions introduced at the national level in the UK (four nations in total) were collected from the Oxford COVID-19 government response tracker (Hale et al. 2021). The following public health policy intervention measures at the daily level were included: school closure, workplace closure, public events cancelation, restrictions on gatherings, public transport closure, stay-at-home requirements, restrictions on internal movement, international travel controls, contact tracing, and facial coverings (see Figure S1 in Supplementary Information). The intervention measures were coded by non-negative integers, with zero indicating no policy intervention and higher values indicating stricter levels of intervention.

Demography and SES

Census data at the LTLA level were collected from the ONS. The most updated 2021 census results for England and Wales were collected, including population, population density, the percentage of young people (0–19 years old), the percentage of older people (≥65 years old), and the ratio of male and female (ONS, n.d.-a). Given that some statistics from the 2021 census were not published or unavailable for Scotland and Northern Ireland at the time of data collection, the 2011 census results were used to complement the 2021 data, including the percentage of the white population, the percentage of the black population, and the percentage of low education (people with no qualifications) (ONS, n.d.-b).

Co-morbidity

Co-morbidity data, including asthma, COPD, hypertension, diabetes, stroke, obesity, dementia, smoking, and HIV, were collected from the most updated annual data sources (up to 2020/21). Disease prevalence data at the GP catchment area level across the UK (excluding Northern Ireland) and at the health trust level in Northern Ireland were collected based on the Quality and Outcomes Framework (QOF) achievement data supplied by the NHS (QOF Database, n.d.). Smoking prevalence data were collected at the LTLA level across the UK (excluding Northern Ireland) (ONS, n.d.-c) and at the health trust level in Northern Ireland (Department of Health in Northern Ireland, n.d.-a). HIV prevalence (per 1,000 population) data were collected at the LTLA level in England from PHE (PHE, n.d.-b) and at the national level from the corresponding public health departments in Wales (Public Health Wales, n.d.), Scotland (Public Health Scotland, n.d.-b), and Northern Ireland (HSC Public Health Agency, n.d.).

Healthcare access

Monthly aggregated numbers of available hospital beds at the NHS trust level in England were collected from the NHS (NHS, n.d.). Quarterly or yearly aggregated numbers of available hospital beds at the health trust/board level were collected from the corresponding statistics or public health departments in Wales (StatsWales, n.d.), Scotland (Public Health Scotland, n.d.-a), and Northern Ireland (Department of Health in Northern Ireland, n.d.-b).

Data preprocessing and variable selection

Data preprocessing was performed as follows. First, the dependent (outcome) variable, COVID-19 cases, was normalized as the COVID-19 infection rate (i.e., cases per 100,000 population) (see Fig. 1 and Fig. S2 in Supplementary Information). The infection rate was log-transformed using log(x + 1), a commonly used transformation for count data preprocessing and analysis (Torrats-Espinosa, 2021). Throughout this paper, results are reported with the log-transformed infection rate reverted to the original scale. Independent variables (covariates) included the following categories: COVID-19 testing and vaccination (per 100,000 population), short-term and long-term air pollution concentrations, meteorology, human mobility, public health policy interventions, demography and SES, co-morbidities, and healthcare access, totaling 55 variables. Table S1 lists the definition of each variable.

Second, mismatches in geographical boundaries across years were aligned, given that LTLA definitions changed over time (e.g., some LTLAs in the 2011 census were later combined into larger LTLAs), resulting in 380 LTLAs in total.

Third, monitored environmental data were aggregated at the LTLA level using average values. Missing air pollution and meteorology data were first interpolated using a linear temporal interpolation via the nearest three-day observations and then spatially interpolated based on distance-weighed K-nearest neighbor (KNN) imputation via the nearest three locations. Daily air pollution data before the pandemic were averaged to represent the long-term air pollution level for each LTLA.

Fourth, daily time-series data were generated for each LTLA. Static data (unchanged during the study period but different across LTLAs), including long-term air pollution levels, demographics, SES, and co-morbidities, were converted to constant time series. Healthcare access data were converted to daily values by repeating the corresponding monthly/quarterly/yearly values. Time-series data were further preprocessed to account for (1) noise and fluctuations in COVID-19 data reporting by using smoothed COVID-19 infection rate, testing, and vaccination data based on three-day moving averages (Kang et al. 2021), (2) the relative importance of each numeric covariate (excluding public health policy interventions) by using feature standardization based on mean and standardized deviation, and (3) the lag effects affecting the COVID-19 infection rate by lagging all covariates by up to fourteen days (the optimal lag was chosen by cross-validation during model training and evaluation).

Finally, before further analysis, a greedy and hierarchical selection procedure was adopted based on the preprocessed data to reduce collinearity among factors affecting the COVID-19 infection rate. Spearman correlation coefficients and variance inflation factor (VIF) were used to quantify the levels of collinearity among variables. Specifically, for all variables in each category (including COVID-19 testing capacity and vaccination, air pollution, meteorology, human mobility, policy interventions, demographics and SES, co-morbidities, or healthcare), a greedy variable selection procedure was performed to first remove collinearity within the same category. The details are listed below.

  1. 1.

    Set the absolute correlation threshold (|Spearman correlation|<0.5) and the VIF threshold (VIF <10). The cut-off values were determined based on previous related literature (Han et al. 2021).

  2. 2.

    Variables are sorted by their F-statistics when predicting the COVID-19 infection rate using a univariate regression model.

  3. 3.

    Select the first feature.

  4. 4.

    Select the next feature if both the correlation and VIF constraints hold.

  5. 5.

    Keep adding features if the constraints hold.

The collinearity of the selected variables from each category was further reduced by selecting a final set of variables using the same greedy selection procedure. Each variable in the final set was removed if the p-value associated with its F-statistic was less significant (p value >0.1). Figure 2 shows the absolute Spearman correlations among the features before and after variable selection (a lighter color indicates a higher level of correlation). The two correlation matrices were generated from the smoothed and normalized time-series data. Before variable selection, variable clusters among air pollution, meteorology, human mobility measures, public health interventions, demographics, and co-morbidities were observed (see Fig. 2a). Some variables were highly correlated, such as PM2.5 vs. PM10, long-term NO2 vs. O3, human mobility vs. lockdown measures, and long-term pollution vs. co-morbidities. After performing the variable selection procedure described above, sixteen variables (including the outcome variable) were selected for further analysis (see Fig. 2b).

Fig. 2: Variable selection to remove collinearity among multiple factors affecting COVID-19 infection rate in the UK.
figure 2

a Variable correlation matrix before variable selection. b Variable correlation matrix after variable selection.

Time-varying graph-based encoder-decoder framework

Deep neural networks allow for capturing non-linear relationships across multifactorial temporal-spatial (T-S) datasets covering COVID-19 infection, environmental data, demography, SES, and health data. Building on recent work that combined causal graph techniques with deep learning methods and time-series data (Löwe et al. 2022), we developed an encoder-decoder framework to investigate if there is any causal relationship between ambient outdoor PM2.5 concentrations and COVID-19 infection rates across different areas in the UK, while addressing confounding bias (see Fig. 3 for an overview). At a high level, the encoder learns a directed acyclic graph (DAG) consisting of multiple nodes, where each node represents a time-series variable, and parent nodes have directed edges to, or “cause” their child nodes. A decoder performs one-step-ahead time-series forecasting using historical time-series data and the learned causal structure from the encoder. The learned causal structure is regularized by non-informed knowledge (e.g., graph sparsity and DAG constraint) and domain-specific knowledge (e.g., a known causal link between NO2 and O3).

Fig. 3
figure 3

A time-varying graph-based encoder-decoder framework for causal graph identification and time-series forecasting in the next time step.

In this study, two major modifications were made to the original encoder-decoder framework proposed by Löwe et al. (2022). First, the causal graph was assumed to be time-varying at every time step, depending on the input data (e.g., different phases during the pandemic), enabling the decoder to make more accurate predictions over time. Second, although the actual causal graph remained not fully understood, some known causal links were used for model regularization. Given the lack of ground truth causal relationships, the model was evaluated based on time-series forecasting performance.

Specifically, we adopted Pearl’s causal graph framework to characterize the underlying T-S data-generating process. The proposed causal discovery and inference framework made two major assumptions. First, we assumed that the causal effect of a variable (e.g., outdoor PM2.5 concentration) on the COVID-19 infection rate could vary over time. Second, we assumed no hidden confounders existed in the causal graph. The second assumption is often made in causal inference modeling using observational data (Hartford et al. 2017). Sensitivity analysis can be further performed to examine the unmeasured confounding effect (Kang et al. 2021).

Observational data points were measured at the daily LTLA level (i.e., local government at the district council level) across the UK over three COVID-19 waves spanning two years (from March 2020 to March 2022). The model output variable was the log-transformed daily number of confirmed cases per 100,000 population (i.e., COVID-19 infection rate). After variable selection (see “Data Preprocessing and Variable Selection” for more details), the model input variables included time-varying features, including testing capacity per 100,000 population, vaccination per 100,000 population, short-term outdoor air pollution concentrations (PM2.5, NO2, and O3), long-term outdoor air pollution concentrations (O3), meteorology (UV index), public health policy interventions and lockdown measures (school closure and public transport closure), and static features, including demographics and SES (the percentage of older people) and comorbidities (the prevalence of dementia, stroke, obesity, smoking, and HIV). The number of days since the start of the corresponding COVID-19 wave was also included to represent the time trends and infection cycles, accounting for unobserved confounding effects over time.

The proposed framework included two components: an encoder and a decoder. After data preprocessing, the dataset consisted of time-series data \({{X}_{i}^{1\ldots T}=\{x}_{i}^{1},{x}_{i}^{2},\,\ldots {x}_{i}^{T}\}\) observed for each variable \(i\), covering the time steps from \(1\) to \(T\), where \(t\) represents one day, and \(T\) is the total number of days during the study period. Each \({x}_{i}^{t}\) is a feature vector representing the observations of variable \(i\) across all LTLAs at time step \(t\). Using a sliding window of size \(L+1\) (i.e., the historical time steps from \(t-L+1\) to \(t\) plus the next time step \(t+1\)), the entire input sequence from \(1\) to \(T\) was converted to the model input dataset \(D\) consisting of \(N\) samples. The sequence length of each sample is \(L+1\).

The encoder performed a causal graph edge prediction task using a GNN model that propagated information from a fully connected bidirectional graph to find a best-fit causal graph consisting of nodes and uni-directional edges for downstream tasks (i.e., time-series forecasting). Each node represented one time-series variable, and each edge represented the potential causal relationship between two time-series variables, such as ambient outdoor PM2.5 concentration and the COVID-19 infection rate. For each time-varying variable observed across LTLAs, its past observations from time step \(t-L+1\) to time step \(t\) were fed into the encoder. For each static variable, such as demography and SES, a constant value was used to represent the fixed effect observed at each LTLA. Specifically, the encoder adopted a GNN model \({f}_{\theta }\) developed by Kipf et al. (2018), with multiple rounds of node-to-edge operation (concatenating node embeddings connected by an edge) and edge-to-node operation (aggregating edge embeddings from all incoming edges), implemented by four multilayer perceptron (MLP) modules with a final Bayesian fully connected layer to predict the edge embedding between one node \(i\) and another node \(j\) at time step \(t\): \({e}_{i,j}^{t}\) (see Eq. (1)). After applying a softmax activation function to the edge embeddings plus a Gumbel distributed noise \(g\) with temperature \(\tau\) to facilitate the learning of the edge distribution (Löwe et al. 2022), the output of the encoder became an adjacency matrix \({{\boldsymbol{Z}}}^{t}\) representing the predicted causal edges at time step \(t\), where each element \({z}_{i,j}^{t}=1\) or \({z}_{i,j}^{t}=0\) indicated the existence or non-existence of a causal link from node \(i\) to node \(j\) at time step \(t\), respectively (see Eq. (2)). The uncertainty in edge prediction was addressed by sampling the weight distribution of the final Bayesian layer in the encoder (Blundell et al. 2015).

$${e}_{i,j}^{t}={f}_{\theta }\left({{x}_{i}^{t-L+1,\ldots ,t},x}_{j}^{t-L+1,\ldots ,t}\right)$$
(1)
$${z}_{i,j}^{t}={\rm{softmax}}\left(\frac{{e}_{i,j}^{t}+g}{\tau }\right)$$
(2)

The decoder performed a one-step-ahead time-series forecasting task, utilizing the predicted causal structure and the historical values from time step \(t-L+1\) to the current time step t to forecast the time-series values at time step \(t+1\). An RNN model with gated recurrent units (GRUs) was adopted to aggregate the incoming information provided by the causal edges at each time step, ensuring that one time-series Granger-caused another only when the corresponding edge was predicted to exist. As a result, the predicted edges were expected to represent Granger causality across variables. A Granger-causal graph generally equates to Pearl’s causal graph structure under the no-hidden confounder assumption (Peters et al. 2017). Specifically, for each time step, the edge embedding \({h}_{i,j}^{t}\) of two nodes was first calculated based on a two-layer fully connected network \({f}_{\varphi }\) using the concatenation of their hidden states \({h}_{i}^{t}\) and \({h}_{j}^{t}\) (see Eq. (3)). For each node \(j\), its incoming information \({\text{MSG}}_{j}^{t}\) was then calculated as the sum of all incoming edge embeddings \({h}_{i,j}^{t}\) (see Eq. (3)). The input \({x}_{j}^{t}\), the aggregated message \({\text{MSG}}_{j}^{t}\), and the hidden state \({h}_{j}^{t}\) were used together to update the hidden state at the next time step \(t+1\) via the GRU-style gate aggregation mechanism (see Eq. (4)). Finally, a three-layer Bayesian fully connected network \({f}_{\omega }\) was applied to the final hidden state at time step \(t+1\) to estimate the variable \(j\)’s future value \({x}_{j}^{t+1}\) according to its previous value \({x}_{j}^{t}\) (see Eq. (5)). The uncertainty in the time-series forecasting was addressed by sampling the weight distribution of the final Bayesian layers in the decoder model (Blundell et al. 2015).

$${h}_{i,j}^{t}=\mathop{\sum}\limits_{{z}_{i,j} > 0}{z}_{i,j}^{t}* {f}_{\varphi }\left(\left[{h}_{i}^{t},{h}_{j}^{t}\right]\right),{{\rm{MSG}}}_{j}^{t}=\mathop{\sum}\limits_{i\ne j}{h}_{i,j}^{t}$$
(3)
$${h}_{j}^{t+1}={\rm{GRU}}\left({x}_{j}^{t},{\text{MSG}}_{j}^{t},{h}_{j}^{t}\right)$$
(4)
$${x}_{j}^{t+1}={x}_{j}^{t}+{f}_{\omega }\left({h}_{j}^{t+1}\right)$$
(5)

The loss function of the encoder-decoder framework included the mean squared error (MSE) loss for predicting time-series values at the next time step. Moreover, the loss function included a regularization term, calculated by the cross-entropy loss (for edge classification), to incorporate non-informed knowledge and domain-specific knowledge during the causal structure learning process. Specifically, for non-informed knowledge, a uniform weight \(\alpha \in [0,\,1]\) was assigned to each non-existent edge to enforce graph sparsity. For domain-specific knowledge, another uniform weight \(\beta \in [0,\,1]\) was assigned to each known link to penalize predicted causal edges that contradict prior knowledge. From a probability perspective, the assigned edge weights represented edge priors before learning from the data. The known causal links were determined by expert knowledge in previous literature, such as environmental science, epidemiology, public health emergency management. In particular, the following relationships were incorporated based on previous literature, including the link between NO2 and O3 (Kang et al. 2021), the links between surface-level UV and O3 and between surface-level UV and NO2 (Munir et al. 2014), and the links between COVID-19 testing and infection, and between vaccination and infection (Roy and Ghosh, 2020; Tregoning et al. 2021).

Finally, a DAG constraint was incorporated into the loss function and the encoder-decoder training iterations to ensure the learned causal graph was directed and acyclic (Yu et al. 2019). The acyclicity constraint was formulated as a function of the causal graph’s adjacency matrix, and the constraint problem was converted into an unconstrained problem by minimizing the augmented Lagrangian, where the Lagrange multiplier was updated iteratively when optimizing the neural network loss function (see Yu et al. (2019) for the detailed procedure). Although cycles were not allowed in the causal graph at each time step, they could exist in an overall causal graph that included causal links observed over a certain period.

Model training and evaluation

Due to the unavailability of the actual causal graph, the average error of forecasting the next-day COVID-19 infection rate was used to evaluate the model performance and generalization ability. The SMAPE, a commonly used metric for time-series forecasting tasks, was used for model evaluation and comparison (see Eq. (6)). A lower SMAPE indicates higher prediction accuracy.

$${\rm{SMAPE}}=\frac{1}{N}\mathop{\sum}\limits_{i=1}^{N}\frac{\left|{F}_{i}-{A}_{i}\right|}{\left(\left|{F}_{i}\right|+\left|{A}_{i}\right|\right)/2}$$
(6)

where \(i\) represents one sample, \(N\) is the total number of samples, \({A}_{i}\) is the actual value, and \({F}_{i}\) is the forecast value.

We used a random 80/10/10 split of the dataset as the training, validation, and testing sets. The Adam optimizer was used for model training. The validation set was utilized to select hyperparameters and the best model. Specifically, the optimal time lag (\(L=7\) or \(L=14\)), i.e., the length of past observations affecting the future value, including the COVID-19 infection rate, was selected. Other fixed hyperparameters included the number of hidden units (64), the learning rate (0.01), the maximum number of training epochs (50), the batch size (32), the regularization weights (\(\alpha =0.9\) and \(\beta =0.99\)), and the Gaussian mixture prior used in the Bayesian layers (\(\pi\) = 0.5, \(-\log {\sigma }_{1}\) = 1, and \(-\log {\sigma }_{2}\) = 3). An early stopping criterion was adopted if model performance did not improve on the validation set after five consecutive epochs.

During model training, the loss function consisted of four components, including (1) an MSE loss based on the predicted values from the encoder-decoder model and the ground truth values observed at the next time step, (2) the Kullback-Leibler (KL) divergence term to regularize the Bayesian layers and prevent overfitting, (3) a cross-entropy loss to penalize predicted causal links from the GNN encoder that led to a dense graph or contradicted known causal links (prior knowledge), and (4) a DAG constraint term to avoid cycles in the learned causal graph. The model achieving the minimum validation MSE was selected as the final model, which was further evaluated on the testing set.

To compare the performance of the proposed model to other relevant models, the following baselines were selected, including regression models, graph-based methods, and deep learning models (see Section 1 in Supplementary Information for more details). The same training/validation/testing datasets were used for the baseline models. For non-time-series models, covariates were averaged over the past \(L\) days to account for the lag effects. For graph-based methods (including PC, GES, GSP, LiNGAM, and linear Granger causality), variables that were predecessors (or “causes”) of the outcome variable were selected based on the identified causal graph (Piccininni et al. 2020). A gradient-boosting machine learning model, capable of non-linear modeling, was then trained for infection rate prediction based on the selected variables.

Model prediction and time-varying causal effect estimation

After model training and evaluation, the causal structure was identified, capturing the time-varying causal relationships among the selected covariates and the outcome variable. We performed counterfactual prediction on the entire dataset to estimate the causal effects of various variables of interest over time, including ambient air pollution concentrations and public health interventions, that could influence the COVID-19 infection rate across different LTLAs in the UK.

Specifically, we calculated the ATE of continuous and categorical variables across all LTLAs over time (see Eq. (7)). For a variable of interest, while keeping other factors unchanged, we added a specific level of perturbation \(\Delta\) to each unit of analysis (an LTLA at a time step \(t\)). The change in the predicted outcome was compared to the observed outcome, i.e., the COVID-19 infection rate (see Eq. (8)). For continuous variables (e.g., outdoor daily PM2.5 concentration), we used a one-unit hypothetical change (i.e., \(\Delta =1\)) to quantify the marginal causal effect, which is widely used for evaluating continuous variables (Li et al. 2020). For categorical variables (e.g., an indicator showing the status of a lockdown policy, where zero indicates “no policy” and one indicates “the policy is implemented”), using a one-unit hypothetical change to the intervention variable, the causal effect was calculated based on the difference between the predicted outcome and observed outcome, subject to the current intervention status (see Eq. (8)).

$${{\rm{ATE}}}_{i}^{t}=\frac{1}{N}\sum \left({Y}_{i,1}^{t}-{Y}_{i,0}^{t}\right)$$
(7)
$${Y}_{i,1}^{t}-{Y}_{i,0}^{t}=\left\{\begin{array}{c}F\left({x}_{i}^{t}+1,\,\ldots \right)-{y}^{t},{\rm{if}}\,{i}\,{\rm{is}}\;{\rm{a}}\; {\rm{continuous}}\; {\rm{variable}}\\ {F}\left(1-{I}_{i}^{t},\,\ldots \right)-{y}^{t},{\rm{if}}\,{i}\,{\rm{is}}\; {\rm{an}}\; {\rm{indicator}}\; {\rm{and}}\,{I}_{i}^{t}=0\\ {y}^{t}-{F}\left({I}_{i}^{t}-1,\,\ldots \right),{\rm{if}}\,{i}\,{\rm{is}}\; {\rm{an}}\; {\rm{indicator}}\; {\rm{and}}\,{I}_{i}^{t}\,>\, 0\end{array}\right.$$
(8)

where N denotes the number of LTLAs, t denotes a time step, i denotes the variable of interest, y denotes the observed outcome, x represents i’s observed value (for continuous variable), I represents i’s observed value coded by non-negative integers (for categorical variable such as policy indicators, where 0 represents “no policy”), \({Y}_{i,1}^{t}\) and \({Y}_{i,0}^{t}\) are the potential outcomes corresponding to “treatment” and “control” (only one outcome can be observed in the data), and F represents the trained encoder-decoder model for COVID-19 infection rate prediction using time-varying causal graph information.

Furthermore, we calculated the confidence intervals of the causal effect estimations. Since Bayesian layers were incorporated into the proposed encoder-decoder model, the uncertainty in COVID-19 infection rate predictions was addressed by the model’s weight parameters drawn from their posterior distributions. Given the potentially non-Gaussian distribution of the COVID-19 infection rate, we calculated the 95% confidence levels of ATE for each variable of interest based on bootstrapping using 1000 bootstrap samples.

Finally, to determine if any dominant causal variables other than ambient outdoor PM2.5 concentration existed, we also predicted time-varying ATE for other selected variables that changed over time, such as other air pollution concentrations and public health interventions. Since the numeric covariates were standardized by their mean and standard deviation during data preprocessing, we directly compared the relative importance of different variables over time using their ATE estimations.

Results

Time-varying causal graph neural network model evaluation

The average symmetric mean absolute percentage error (SMAPE) of forecasting the next day’s COVID-19 infection rate was used to evaluate model performance and generalization ability (see “Model Training and Evaluation” for more details). Lower SMAPE values indicate higher prediction accuracy. We utilized an 80/10/10 split of the dataset for model training, validation, and testing. The model was trained on the training set, and the best model was selected based on the validation performance. After model training, predictive performance was evaluated on the testing set. The baseline models used the same training, validation, and testing sets.

Figure 4 shows the average prediction error of the proposed time-varying causal GNN model versus the baselines. Graph-based models have generally outperformed regression models, given that the causal structure was utilized for prediction and that non-linear relationships were addressed. The GNN-based model has achieved a lower error (SMAPE = 0.233), likely due to the simultaneous incorporation of causal structure discovery and time-series prediction. This result indicates that GNN-based models are more effective in capturing latent causal relationships inferred from high-dimensional time-series data compared to traditional graph-based models. The proposed time-varying causal GNN model has achieved the lowest error (SMAPE = 0.204), indicating that domain-specific regularization and time-varying graph structures have positively contributed to improving predictive accuracy.

Fig. 4
figure 4

The average prediction error of the proposed time-varying causal GNN model versus the baselines, including the regression models, the traditional causal graph-based methods, and the GNN model adopting static causal graph over time.

The superior performance of the time-varying GNN model has highlighted the importance of incorporating temporal dynamics and time-varying causal structures in predicting COVID-19 infection rates. This time-varying modeling is crucial for characterizing pandemic-related data, where relationships between variables can change rapidly due to interventions, behavioral changes, and other factors. Specifically, the time-varying GNN model integrates graph structure learning and temporal dynamics to capture complex, non-linear relationships between variables that evolve over time, utilizing both data-driven causal discovery and domain-specific knowledge. It can dynamically discover and update the causal graph structure at each time step, adjusting its predictions based on the evolving causal structure, leading to improved predictive accuracy over time. In contrast, traditional models, such as regression models or static causal graphs, often fail to fully account for these dynamic changes, resulting in lower performance compared to time-varying GNN modeling.

Causal graph identification

Using the trained model, time-varying graphs were estimated at each time step using the whole dataset. For each COVID-19 wave, a summary graph was generated, consisting of nodes representing time-series variables and directed edges predicted by the causal GNN model at each time step. The wave-specific summary graphs were then combined into an overall summary graph covering the entire study period. Although the causal graph is assumed to be a directed graph (uni-directional) without any cycles at each time step, some cycles (bi-directional, e.g., the cause-and-effect relationship might reverse over time) can exist in the summary graph. Figure 5 shows the overall causal graph for the study period, highlighting key time-varying factors linked to COVID-19 infection, including short-term daily air pollution concentrations, meteorology, and public health interventions. The summary graph indicates that short-term PM2.5, NO2, and O3 pollutants and public health interventions can “cause” COVID-19 infection. Interactions between air pollutants and meteorology, consistent with domain-specific knowledge, can also be observed. For example, the formation of O3 from NO2 and UV based on photochemical reactions (UV - > NO2 - > O3) and the co-occurrence of PM2.5 and O3 (PM2.5- > O3 and O3- > PM2.5) (Miao et al. 2021). Moreover, public health interventions, such as school closure and public transport closure, can “cause” changes in air pollution levels, likely due to changes in emissions during lockdown periods (Jephcote et al. 2021).

Fig. 5: An overall causal graph covering the entire study period and showing the key time-varying factors affecting the COVID-19 infection rate.
figure 5

The other control variables were removed for visualization purpose, including the number-of-days variable and the COVID-19 vaccination and testing variables.

Moreover, the cause-and-effect relationships between PM2.5 and other variables change over time during different phases of an infection cycle. Specifically, Fig. 6 demonstrates the time-varying causal relationships between air pollution, COVID-19 infection, and other key variables, particularly public health policy interventions, before and after the peak in infections during each pandemic wave. The peak in infection was when the infection rate was at its maximum during each wave. Each subfigure in Fig. 6 represents a causal graph covering one key time-varying variable linked to its parent and child nodes. The cause-and-effect relationships are generally more similar during the second wave and more varied during the first and third waves. According to Fig. 6a, PM2.5 was a causal factor for the COVID-19 infection rate during the first wave and interacted more with other variables (especially NO2 and public transport closure) after the peak in the first wave. However, it became non-causal to the COVID-19 infection rate after the peak in the third wave. Similarly, O3 was a causal factor for the COVID-19 infection rate during the first wave and interacted more with other variables (especially public transport closure) after the peak in the first wave. However, it stopped being a causal factor for the COVID-19 infection rate in the later stage of the pandemic (see Fig. 6c). Unlike PM2.5 and O3, NO2 was affected by the public transport closure policy before the peak and remained a causal factor for the COVID-19 infection rate throughout the later stage of the pandemic (see Fig. 6b). Additionally, some interactions between PM2.5, NO2, and O3 were cyclic, especially after the early phase of the pandemic, suggesting complex interactions between different pollutants and meteorology over time.

Fig. 6: Time-varying causal graphs highlighting how the causal relationships between air pollutants, COVID-19 infection, and other key variables (in particular, public health policy interventions) change before and after the peak in infections during each pandemic wave.
figure 6

a Time-varying causal graphs highlighting PM2.5. b Time-varying causal graphs highlighting NO2. c Time-varying causal graphs highlighting O3. d Time-varying causal graphs highlighting school closure. e Time-varying causal graphs highlighting public transport closure.

Further, variations in causal relationships can be observed concerning different public health policy interventions, including school closure and public transport closure (see Fig. 6d, e). Both public health interventions were not causal to the COVID-19 infection rate before the peak in the first wave but became causal afterward. One possible explanation is that public health policy interventions introduced/implemented across the UK might experience a time lag in effectiveness due to institutional or personal behavioral constraints, particularly when policies covered large regions (country-level). The effects of these policy interventions also varied across different pollutants during different pandemic waves, especially the first wave. Before the peak, while school closure tended to affect secondary pollutants, i.e., PM2.5 and O3, public transport closure often affected NO2, suggesting that public policy interventions might impact different emission sources at different phases.

In addition to pandemic phases and waves, time-varying patterns in the cause-and-effect relationships between PM2.5 and other variables can be observed over different quarters of the pandemic stages. Figure 7 shows the time-varying causal graphs for key time-varying variables summarized quarterly. The patterns observed in Fig. 7 are generally consistent with the findings from Fig. 6. Specifically, PM2.5 and O3 became non-causal to the COVID-19 infection rate during the later stage of the pandemic (i.e., in the second quarter of 2021 and the first quarter of 2022). Public transport closure did not change the COVID-19 infection rate in the early stage (i.e., in the second quarter of 2020). However, it gradually became causal, affecting the COVID-19 infection rate as the pandemic evolved.

Fig. 7
figure 7

Time-varying causal graphs covering the three COVID-19 waves in the UK, highlighting the changes in causal relationships between air pollution, COVID-19 infection, and other key variables (in particular, public health policy interventions) at the quarterly level.

Overall and time-varying causal effects

With the help of time-varying causal graphs, guided by domain-specific knowledge and data-driven automatic searches over the causal orders of correlated variables over time, the causal effect of one variable of interest can be estimated individually, while accounting for other time-varying factors affecting the COVID-19 infection rate simultaneously. Average treatment effects (ATEs) were estimated using counterfactual predictions (see “Model Prediction and Time-varying Causal Effect Estimation” for more details). At a high level, we aimed to answer counterfactual questions such as what would have happened if the air pollution concentration had increased by one unit or if the lockdown policy had not been implemented. Daily ATEs were aggregated using mean values to calculate the overall causal effects, and confidence intervals were calculated using bootstrapping.

Figure 8 shows the average ATEs with 95% confidence intervals during the entire study period. Overall, higher levels of short-term outdoor PM2.5, NO2, or O3 pollution have resulted in an increase in the COVID-19 infection rate. However, the effects of public health interventions on the COVID-19 infection rate were mixed. On the one hand, in contrast to a review highlighting the effectiveness of school closure for COVID-19 infection control (Mendez-Brito et al. 2021), school closure failed to reduce the infection rate throughout the study period. On the other hand, public transport closure has led to a decrease in the COVID-19 infection rate, although the beneficial effect of these measures was relatively small, echoing an earlier study that highlighted the zero additional public health benefit gained from public transport closure (Islam et al. 2020).

Fig. 8: Overall causal effects during the whole study period.
figure 8

The error bars represent 95% confidence intervals.

In terms of the relative importance of the three air pollutants in affecting the COVID-19 infection rate, O3 was the most important factor contributing to the increase in infection rates. One possible explanation is that a decrease in other air pollutants due to lockdowns increases atmospheric oxidation capacity, accelerating the formation of secondary aerosols such as ozone and leading to a higher respiratory health risk (Zoran et al. 2020).

Moreover, different effects of public health interventions have been observed when considering the overall causal effects across different COVID-19 waves (see Fig. 9 and Table 2). Generally, the absolute effect size became larger as the pandemic progressed from the first to the third wave. Air pollutants consistently had a positive effect on the COVID-19 infection rate, except for PM2.5 pollutants during the peak of the second wave (see Fig. 10 for a more fine-grained visualization at the monthly level). The effectiveness of public health interventions varied across different waves. School closure was effective during the first two waves but failed to reduce the infection rate during the third wave. In contrast, public transport closure did not add beneficial effects to infection control in the first two waves but became more effective during the third wave.

Fig. 9: Overall causal effects across different COVID-19 waves.
figure 9

The error bars represent 95% confidence intervals; # indicates insignificant ATE at the significance level of 5%.

Table 2 Overall causal effects and confidence intervals (CIs) across different COVID-19 waves from March 2020 to March 2022.
Fig. 10: Average time-varying causal effects at the monthly level during three different COVID-19 waves during March 2020 to March 2022.
figure 10

a Time-varying causal effects of air pollutants. b Time-varying causal effects of public health policy interventions.

Furthermore, daily ATEs were averaged monthly to examine the time-varying causal effects across different COVID-19 waves. Figure 10 shows the estimated ATEs over time. For air pollutants, similar patterns were observed during different COVID-19 waves. The contribution of short-term air pollution to the COVID-19 infection rate increased during the initial phase but became less significant as the epidemic progressed to a certain stage. The inverted U-shape relationship appeared more than once within the same COVID-19 wave (e.g., the third wave), suggesting cyclical infection patterns. While O3 was a more important factor affecting the COVID-19 infection rate over the long term (see Fig. 10), the effect of PM2.5 was more significant during the first wave. Although the three air pollutants exhibited consistent patterns most of the time, a significant anti-correlation between O3 and PM2.5 was observed during the peak of the second wave, indicating that the effects of different pollutants on the COVID-19 infection rate may change over time. During the second wave, O3 continued to play an important role in affecting the COVID-19 infection rate, even as the number of infections accumulated. In contrast, the role of PM2.5 became less significant after the initial growth.

For public health interventions, their relative impacts on the COVID-19 infection rate varied across the three waves. During the first and the second wave, public transport closure failed to stop the infection rate from increasing, while school closure moderately reduced the infection rate during the first wave and countered the infection peak during the second wave. During the third wave, however, both public transport closure and school closure had minimal effects on the infection rate at the early stage. Public transport closure became more effective only after the infection numbers had begun to rise, suggesting that public health interventions applicable to a wider range of people might be effective in curbing the infection rate during the third wave. This trend is consistent with the previous findings that public health interventions were less effective during the early phase but became more effective as the growth rate increased and more public health interventions were introduced (Duhon et al. 2021). Additionally, during the third wave, the effectiveness of policy interventions was reduced after the peak ATE reduction in December 2021, indicating the start of a new infection cycle.

Discussion

Key findings revealing the causal relationships between outdoor PM2.5 pollutant and COVID-19 infection rate

Previous research has identified links between three outdoor air pollutants, public health interventions, and COVID-19 infection. However, the time-varying cause-and-effect relationships between these factors have yet to be fully explored. Utilizing a Bayesian deep learning-based time-varying causal GNN model, this study finds that the causal relationships between outdoor air pollutants (including PM2.5, NO2, and O3), public health interventions (including school closure and public transport closure), and the COVID-19 infection rate varied across different periods of the pandemic. Specifically, although PM2.5 and O3 were causal factors affecting the COVID-19 infection rate throughout the entire study period, they became non-causal during the late pandemic phase. While public transport closure did not cause any changes to the COVID-19 infection rate during the early pandemic phase, it became a causal factor in the later phase. Moreover, the causal relationships between air pollutants and public health policy interventions varied over time. During the early phase of the pandemic, NO2 levels were more likely to be affected by public transport closure, while the PM2.5 and O3 levels were more likely to be affected by school closure. These time-varying causal relationships provide insights into how individual factors can cause changes in the COVID-19 infection rate and how they interact over time.

Further, through counterfactual estimation performed by the time-varying causal GNN model, any time-varying causal relationships can be quantified by observing the time-varying causal effects on the COVID-19 infection rate across different pandemic phases. Specifically, an inverted U-shape relationship has been observed across different infection cycles and pandemic waves for the causal effects of outdoor air pollution on COVID-19 infection. Although PM2.5 and other pollutants contributed to early COVID-19 infection rate growth, their effects became less significant once infections reached a certain threshold. Moreover, the effectiveness of public health interventions varied across different phases of the pandemic. School closure was effective against the COVID-19 infection rate in the early phase, while public transport closure became effective in controlling the COVID-19 infection rate during the later phase when the number of infections started increasing rapidly. The time-varying effects of air pollutants and public health policy interventions provide insights into infection control policymaking for upcoming waves or future pandemics. Specifically, early interventions to improve air quality and indoor ventilation are recommended (Han et al. 2021; Han et al. 2020). School closure is recommended during the early phase of the pandemic, whereas public transport closure is recommended during the later stage of the pandemic when the infection growth rate is significant.

Advantages of interpretable AI-driven time-varying causal inference methodology for pandemic preparedness

Our model offers numerous advantages over those proposed in previous literature covering air pollution-related COVID-19 epidemiological studies in three aspects. First, our methodology can systematically address a wide spectrum of confounders and disentangle causal effects from associative effects. Potential factors affecting observations concerning the effect of air pollutants on the COVID-19 infection rate have been gradually incorporated into the proposed model, including the following categories: meteorological, mobility, public health policy interventions demography, SES, co-morbidity, and healthcare variables. Specifically, during data pre-processing, to tackle collinearities among various variables, the most representative variables from each category were selected based on their univariate predictive power. During model training, the relationships between these selected variables were learned from time-series data and regularized by domain-specific knowledge. Moreover, unobserved time-varying factors have been addressed by including an additional variable representing time trends and infection cycles. The learned causal graph has provided guidance for estimating causal effects. For example, although the associative effect of O3 on the COVID-19 infection rate was negative using a univariate regression model, likely due to its anti-correlation with other pollutants (e.g., increased O3 along with reduced NO2 (Brancher, 2021)), the causal effect of O3 became positive, suggesting that the time-varying causal graph could better address the time-varying interactions between multiple pollutants and meteorology.

Second, to the best of our knowledge, our study is the first to demonstrate time-varying causal relationships between outdoor pollutants and the COVID-19 infection rate and between public health interventions and the COVID-19 infection rate in the UK using a data-driven time-varying causal graph approach. Previous studies have exploited causal graph structures, constructed based on domain knowledge or learned from data, to evaluate the causal factors affecting COVID-19 infection or mortality in the UK (Kyono and Van der Schaar, 2021), China (Kang et al. 2021), and Germany (Steiger et al. 2021). In contrast to these approaches, which often assumed a static causal graph across the study period, this study proposes a time-varying causal graph model to explore the time-varying cause-and-effect relationships between various factors. The time-varying causal effects can reveal when and which factors are important in affecting the spread of COVID-19, providing valuable policy insights to facilitate environmental and public health policymaking at different stages and prepare for future waves or pandemics.

Third, our proposed methodology advances causality analysis for T-S data. It highlights the use of domain-specific knowledge in T-S causality analysis. Unlike previous data-driven deep learning-based causal discovery methods, which mainly aimed to learn structural information from observational data only (Kipf et al. 2018; Löwe et al. 2022; Yu et al. 2019), the causal diagram in this study has incorporated T-S patterns learned from observed data and expert-guided knowledge about how certain factors affect each other. Such a causal diagram can encode assumptions and prior knowledge about how one variable “causes” another in a transparent and interactive way, allowing for feedback loops. The causal diagram makes it possible to resolve causal directions through domain expertise when it is challenging to identify the correct causal relationships based on observational data. Moreover, it provides an integrated framework for causal discovery and inference using T-S data. Based on the time-varying causal graphs generated by latent graph representation learning, time-varying causal effects can be estimated based on counterfactual prediction while accounting for the time-varying confounding effects using the latent causal graph structures. This approach allows for using causal graphs and potential outcomes, the two major causality languages, together in T-S analysis to answer causal questions in epidemiology and beyond.

Contribution to explainable generative AI studies

Recent developments in Generative AI can enhance predictive models for identifying potential outbreak hotspots, forecasting pandemic trajectories, and informing preparedness efforts. On the one hand, large language models can analyze vast amounts of textual data from various sources, such as social media (Müller et al. 2023) and electronic health records (Homburg et al. 2023), to identify patterns indicating the emergence of new infectious diseases and track outbreaks over time. On the other hand, Generative AI can produce synthetic data and create digital twins that mimic real-world infection processes (Giuffrè and Shung, 2023). Such simulations can predict disease spread across different scenarios (Barat et al. 2021), using factors such as population structures, environmental conditions, and travel patterns, enabling more accurate and timely projections of impacts on various populations, especially the most susceptible groups, and allowing for faster emergency responses and more targeted interventions at different stages.

A critical challenge in using Generative AI models for infection control and public health decision-making is their black-box nature, which lacks transparency and interpretability. AI- and data-driven public health policymaking remains challenging without understanding the underlying mechanisms driving the spread of infectious diseases. The causal AI paradigm proposed in our study can help address this challenge by providing a more transparent and trustworthy approach to interpreting AI predictions. Causal graph and inference techniques can be injected into the Generative AI models to offer more interpretable and domain-specific insights into the key drivers of disease spread and the effectiveness of various control measures across different stages. This domain-specific explainable AI approach can further unlock the potential of Generative AI for informed decision-making, leading to more efficient and targeted public health control measures that can help mitigate the impact of future pandemics.

Limitations and future work

The current study does, however, present several limitations. Some directions for future work are listed as follows. First, although a time trend variable (the number of days) has been included to account for unobserved effects over time, other types of unobserved factors may need to be considered. In the future, sensitivity analysis can be performed to examine whether the no-hidden confounding assumption remains valid and to provide uncertainty quantification due to unmodeled hidden confounders (Jesson et al. 2022). Second, the current causal inference procedure is based on the counterfactuals predicted by the time-varying causal GNN model. The causal relationships are implicitly encoded as graph representations via message passing in the graph. In the future, traditional causal graph-based de-confounding techniques, such as the backdoor criterion, can be incorporated into deep learning-based causal inference to improve the interpretability of causal estimations. The explicit use of the causal structure information will also help distinguish the indirect effect (the effect of the treatment on the outcome mediated by other covariates) from the direct effect (the effect of the treatment on the outcome un-mediated by other covariates). Third, the effect of policy intervention combinations, including different interventions (e.g., targeting different populations) and sequences (e.g., implementing in different orders), has yet to be investigated in this study. In the future, intervention combinations can be encoded in the causal graph, e.g., based on a hierarchical graph structure. The effect of shared versus specific information for different policy interventions can also be pursued when evaluating the combinatory effect. Fourth, this study only focuses on a particular intervention outcome, namely, the COVID-19 infection rate. How different factors may affect other intervention outcomes, e.g., COVID-19 mortality, can be further investigated in the future.

Furthermore, building on the proposed time-varying causal deep learning framework, new research directions can be explored to develop more accurate and interpretable models for predicting and controlling infectious disease outbreaks. First, the current time-varying causal graph model focuses on how different variables affect each other over time. Leveraging insights from information propagation modeling (e.g., reaction-diffusion models) on how information spreads through networks (Ke et al. 2022; Ma et al. 2023; Zhu et al. 2024; Zhu and Yuan, 2023), analogous to how diseases spread through populations, future work can examine how various variables affect each other over space by considering social or contact network structures and identifying the causal pathways among different districts or regions over time (Mastakouri and Schölkopf, 2020). Second, the current time-varying causal graph model incorporates high-level domain-specific knowledge, such as which variable “causes” another variable in the causal graph. Future work can exploit more scientifically-driven domain-specific knowledge, such as physical models governing COVID-19 infection and transmission processes, whenever such information is readily available, for instance, the susceptible-exposed-infectious-recovered (SEIR) model for COVID-19 infection modeling (Qian et al. 2020; Qian et al. 2021).

Finally, while our study focuses on COVID-19 in the UK, the proposed methodology does not rely on country-specific assumptions. Our proposed Bayesian causal deep learning framework offers a versatile and adaptable approach to understanding the time-varying relationships between various factors, such as outdoor PM2.5 concentration and COVID-19 infection rate. It can be adopted in other countries/regions and validated in future pandemic scenarios. For our framework to be effectively implemented in different regions/countries or examined in future pandemic scenarios, similar types of data are required, including time-series data on infection rates, air pollutant concentrations, meteorology, mobility, public health intervention measures, and sociodemographic factors. Given that many countries have been collecting detailed data related to public health, the required inputs for our proposed framework are likely available. Future research can conduct comparative studies across different geographical contexts to identify common patterns and location-specific factors affecting COVID-19 infection and investigate the applicability of the proposed framework to similar pandemic scenarios.

Policy implications for pandemic preparedness

Our study offers insights beneficial to future infection control and public health policymaking in the UK and globally. Based on large-scale, multimodal, and readily available T-S datasets covering disease infection, environment, demography, SES, and health, our time-varying causal graph approach can simulate the path of a future pandemic and how different factors will affect the infection rate over time. The findings can help guide infection control and public health policymaking, ensuring that different public health policies catered to different time frames are provided to achieve the best policy outcome over time. For example, the possibility that COVID-19 is an airborne infection may lead to public health measures, such as requiring citizens to wear face masks, to reduce the possibility of COVID-19 infection during the early phase of the pandemic through the viral-airborne transmission pathway (Han et al. 2021; Han et al. 2020), especially in countries with high population density, mobility, and particulate pollution levels. Moreover, our research methodology can be transferable to other countries where evidence-based public health policymaking for curbing future waves of COVID-19 transmission or future pandemics is most needed.