Introduction

The COVID-19 pandemic has significantly strained global healthcare systems, highlighting the urgent need for accurate predictive tools and efficient resource allocation strategies. Accurate forecasting of infection trends is critical for proactive decision-making, particularly in managing limited resources such as ICU beds, ventilators, and healthcare personnel1,2. Neural networks, a class of machine learning algorithms, have demonstrated exceptional performance in modeling complex, nonlinear relationships and are well-suited for predicting disease dynamics3. The integration of neural networks into epidemiological modeling has been explored in various studies, showing their ability to incorporate diverse datasets for accurate disease forecasting4,5. For instance, recurrent neural networks (RNNs) have been successfully employed for time-series forecasting of infection rates, while convolutional neural networks (CNNs) have been applied to analyze spatial and temporal dynamics of disease spread6. These techniques provide actionable insights into disease transmission patterns, which are crucial for healthcare planning and resource management7.

Healthcare systems face significant challenges in managing limited resources, particularly during pandemics. Resource allocation models incorporating optimization algorithms and machine learning have been proposed to address these challenges, ensuring efficient utilization of available healthcare assets8,9. Integrating predictive modeling with optimization strategies can enable real-time decision-making and enhance the operational efficiency of healthcare systems10.

This study proposes a neural network framework that combines prediction and optimization to address these challenges. The model integrates epidemiological parameters, such as infection rates and vaccination data, with environmental factors and mobility trends to forecast COVID-19 case dynamics11. Additionally, the framework employs an optimization algorithm to allocate healthcare resources efficiently, reducing operational delays and minimizing resource wastage12.

The performance of the proposed framework is validated through extensive experiments using real-world datasets. The results demonstrate its ability to deliver accurate forecasts, improve resource allocation, and enable real-time decision-making13. By integrating computational techniques with engineering principles, this research contributes to developing scalable solutions for public health crises, with potential applications in future pandemic management and healthcare system optimization14.

This research investigates the application of neural networks for predicting COVID-19 transmission dynamics and optimizing healthcare resource allocation. The proposed model integrates epidemiological, mobility, vaccination, and environmental data to forecast case trends and estimate healthcare demands under varying conditions. A multi-layered neural network is employed to improve prediction accuracy, while an optimization algorithm is integrated to effectively allocate critical healthcare resources, such as ICU beds, ventilators, and medical personnel. The study incorporates engineering principles to ensure scalability and efficiency in real-time applications. The framework is validated using real-world datasets and assessed for prediction accuracy, robustness, and computational efficiency. Results demonstrate the proposed model’s ability to enhance decision-making, reduce resource mismanagement, and ensure optimal utilization during pandemics. Applying the framework to healthcare systems reveals its potential for improving operational efficiency and mitigating the impact of public health crises.

Related work

The application of neural networks in predicting COVID-19 transmission dynamics and optimizing healthcare resource allocation has gained significant attention, particularly in the context of the pandemic’s global impact. Several studies have explored machine learning models to predict the spread of COVID-19 and allocate resources efficiently in real time.

Early studies highlighted recurrent neural networks (RNNs) and long short-term memory (LSTM) networks for forecasting COVID-19 cases. Chimmula and Zhang (2020) proposed an LSTM-based model to predict COVID-19 cases across various countries, showing promising results for time-series forecasting in epidemiological contexts15. Similarly, Zhou et al. (2020) utilized RNNs to model the interaction between public health interventions, mobility, and COVID-19 transmission dynamics, underlining the potential of machine learning in epidemic forecasting16.

Recent research has advanced these efforts by integrating environmental and behavioural data into neural network models. Shahid et al. (2021) applied convolutional neural networks (CNNs) to assess the role of mobility data in predicting COVID-19 spread, finding that mobility patterns had a substantial impact on the accuracy of transmission forecasts17. Zhou and Wang (2022) further extended these models by integrating vaccination data and climate factors, improving the robustness of COVID-19 predictions in varying geographical contexts18.

Optimization techniques have been used alongside neural networks to enhance resource allocation during the pandemic. Basu et al. (2021) developed a hybrid decision-support framework combining machine learning and optimization algorithms to dynamically allocate healthcare resources, such as ICU beds and ventilators, in response to changing infection rates19. Chen et al. (2022) employed deep reinforcement learning to optimize the distribution of critical medical supplies, reducing delays and ensuring resource availability where needed most20.

In 2023, studies focused on real-time decision-making in healthcare systems. Vittorio et al. (2023) proposed an AI-based framework for healthcare resource optimization during pandemics, integrating predictive models with real-time data to enhance hospital management and minimize overcrowding21. Wang et al. (2024) also explored the use of neural networks in managing hospital capacities and healthcare personnel deployment, emphasizing the need for scalable models that can adapt to rapidly changing pandemic scenarios22.

Integrating machine learning with healthcare engineering has improved real-time applications and decision-making capabilities. Cui et al. (2023) combined deep learning with network optimization techniques to dynamically allocate healthcare resources in response to COVID-19 and other health emergencies23. These studies demonstrate the growing potential of neural networks in improving the operational efficiency of healthcare systems, especially in high-demand scenarios such as pandemics. Advanced machine learning models have significantly improved predictive accuracy across various domains. Alkhammash et al.24 demonstrated the efficacy of Optimized Multivariate Adaptive Regression Splines (MARS) in forecasting crude oil demand in Saudi Arabia, showcasing its potential for energy market predictions. Similarly, Alkhammash et al.25 employed an optimized Binary Particle Swarm Optimization (BPSO) model to predict COVID-19 spread, highlighting the relevance of machine learning in epidemiological modeling. In the agricultural sector, Elshewey et al.26 utilized deep learning algorithms, including the Waterwheel Plant Algorithm and the Sine Cosine Algorithm, for potato blight detection, offering promising advancements in plant disease monitoring.

Furthermore, Elshewey et al.27 introduced the hyOPTGB framework to enhance Hepatitis C Virus (HCV) disease prediction in Egypt, reinforcing the role of optimization techniques in medical diagnostics. Additionally, Alzakari et al.28 integrated Convolutional Neural Networks (CNN) with Long Short-Term Memory (LSTM) networks for early potato disease detection, illustrating the benefits of deep learning in precision agriculture. Collectively, these studies emphasize the transformative potential of machine learning in diverse fields, from healthcare and epidemiology to energy forecasting and agricultural sustainability.

Unlike previous studies focusing on time-series forecasting, our approach integrates real-time optimization with predictive modeling. While Chimmula and Zhang (2020) used LSTM for case prediction, our model incorporates environmental and healthcare capacity data for resource allocation, enhancing decision-making beyond prediction alone.

Methodology

This study employs a hybrid modeling approach to analyze COVID-19 transmission dynamics and evaluate the effectiveness of interventions, combining statistical models with machine learning techniques. The optimization process uses a Genetic Algorithm whose fitness function is designed to minimize ICU shortages and maximize patient survival, subject to regional resource-capacity constraints. The algorithm iteratively refines allocation strategies, minimizing resource wastage while maximizing healthcare system capacity, and runs until convergence (defined as < 1% change over five consecutive iterations). A sketch of this workflow is given below.
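The sketch below illustrates the allocation step using the R GA package; the regional demand figures, capacity bounds, and penalty weights are hypothetical placeholders for illustration, not values from the study.

```r
# Sketch of the Genetic Algorithm allocation step (GA package).
# Demand, capacity, and penalty weights below are illustrative assumptions.
library(GA)

icu_demand   <- c(420, 310, 150)  # forecast ICU demand per region (hypothetical)
icu_capacity <- c(500, 280, 200)  # regional ICU capacity constraints (hypothetical)
total_beds   <- 900               # system-wide beds available to allocate

# Fitness: GA maximizes, so return the negative of the penalty for
# unmet ICU demand (shortage) and for exceeding the system-wide budget.
fitness_fn <- function(alloc) {
  shortage  <- pmax(icu_demand - alloc, 0)
  overshoot <- max(sum(alloc) - total_beds, 0)
  -(sum(shortage) + 10 * overshoot)
}

result <- ga(
  type    = "real-valued",
  fitness = fitness_fn,
  lower   = rep(0, 3),
  upper   = icu_capacity,  # regional capacities as box constraints
  popSize = 100,
  maxiter = 500,
  run     = 5              # stop after 5 generations without improvement,
)                          # approximating the < 1% convergence rule

round(result@solution)     # optimized per-region ICU bed allocation
```

Encoding regional capacities as box constraints keeps every candidate feasible at the regional level, so the penalty term only has to police shortages and the system-wide budget.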

Data collection

The approach integrates diverse datasets, including COVID-19 case data (daily confirmed cases and deaths by country/region, sourced from Johns Hopkins University COVID-19 Dashboard)29, vaccination data (daily vaccination rates and total coverage by country, available from Our World in Data)30, mobility data (trends in workplace and residential movements, sourced from Google Mobility Reports)31, healthcare infrastructure data (hospital beds, workforce, and healthcare spending, from the World Health Organization)32, and socioeconomic data (population density, GDP, urbanization indicators, from the World Bank)33. By incorporating these datasets, the study addresses the research gap in understanding real-world heterogeneity, enabling more accurate predictions of COVID-19 trends. The flow diagram of the research methodology below visualizes the overall process.

Figure: Flow diagram of the research methodology.

Table 1 summarizes the datasets used for the analysis in this article, along with their sources.

Table 1 Datasets summary.

These datasets provide critical information for analyzing the impact of various factors on COVID-19 transmission, including mobility patterns, vaccination rates, healthcare infrastructure, and socioeconomic factors. They provide a comprehensive understanding of the pandemic’s dynamics across different regions.

Statistical modeling

To build a comprehensive framework for analyzing transmission dynamics and evaluating the efficacy of interventions, we used the following models, implemented in RStudio (R version 4.3.3).

The neural network consists of an input layer, three hidden layers (128, 64, 32 neurons), and an output layer. ReLU activation is used in hidden layers to introduce non-linearity, while the output layer employs Sigmoid for probability estimation. A dropout rate of 20% prevents overfitting. Hyperparameters were optimized via Bayesian optimization, selecting batch sizes from {16, 32, 64}, learning rates from {0.001, 0.0005}, and regularization terms from {0.001, 0.0001}.
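A minimal sketch of this architecture using the keras interface for R follows; the input dimension, loss, optimizer, and dropout placement are assumptions for illustration, since the study reports only the layer sizes, activations, dropout rate, and tuned hyperparameter grids.

```r
# Sketch of the described network in R keras. Loss, optimizer choice,
# dropout placement, and input dimension (n_features) are assumptions.
library(keras)

n_features <- 5  # e.g., the five selected predictors (assumed)

model <- keras_model_sequential() %>%
  layer_dense(units = 128, activation = "relu", input_shape = c(n_features)) %>%
  layer_dropout(rate = 0.2) %>%                # 20% dropout against overfitting
  layer_dense(units = 64, activation = "relu") %>%
  layer_dropout(rate = 0.2) %>%
  layer_dense(units = 32, activation = "relu") %>%
  layer_dense(units = 1, activation = "sigmoid")  # probability output

model %>% compile(
  optimizer = optimizer_adam(learning_rate = 0.001),  # from the tuned grid
  loss      = "binary_crossentropy",                  # assumed for the sigmoid output
  metrics   = "accuracy"
)
```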

Time-series analysis

To capture the temporal trends in COVID-19 cases and deaths, we apply time-series analysis using models like ARIMA (Auto-Regressive Integrated Moving Average) and STL (Seasonal and Trend decomposition using Loess).

ARIMA model

ARIMA forecasts future infection rates based on historical data. It combines three components: autoregressive (AR), differencing (I), and moving average (MA). The ARIMA model can be written as:

$$Y_{t}=c+\sum_{i=1}^{p}\varphi_{i}Y_{t-i}+\sum_{j=1}^{q}\theta_{j}\epsilon_{t-j}+\epsilon_{t}$$
(1)

where

\(Y_{t}\) is the observed time-series value at time t,

\(\varphi_{i}\) are the autoregressive coefficients,

\(\theta_{j}\) are the moving-average coefficients,

\(\epsilon_{t}\) is the white-noise error term,

p and q are the orders of the AR and MA components, respectively, and

c is a constant.

The ARIMA model predicts future values by considering past values (AR) and past error terms (MA).
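In R, such a model can be fitted with the forecast package; the sketch below uses the ARIMA(2,1,1) order reported in Table 2, while the covid_df data frame and weekly seasonality are placeholder assumptions.

```r
# ARIMA fitting and forecasting sketch (forecast package).
# 'covid_df' is a placeholder data frame of daily case counts.
library(forecast)

cases <- ts(covid_df$daily_cases, frequency = 7)  # weekly seasonality assumed
fit   <- Arima(cases, order = c(2, 1, 1))         # order reported in Table 2
summary(fit)                                      # phi, theta, and c estimates
autoplot(forecast(fit, h = 30))                   # 30-day-ahead forecast
```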

STL decomposition

The STL method is used for seasonal decomposition and forecasting. The model splits the time series into trend, seasonal, and remainder (noise). The decomposition is given by:

$$Y_{t}=T_{t}+S_{t}+R_{t}$$
(2)

where

\(Y_{t}\) is the original time series,

\(T_{t}\) is the trend component,

\(S_{t}\) is the seasonal component, and

\(R_{t}\) is the remainder (residual) component.
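The decomposition can be reproduced with base R's stl(); the sketch below reuses the same placeholder series as above, and s.window = "periodic" is an assumed setting for a stable seasonal pattern.

```r
# STL decomposition sketch; s.window = "periodic" assumes a stable
# seasonal pattern across the series.
cases_ts <- ts(covid_df$daily_cases, frequency = 7)  # placeholder series
decomp   <- stl(cases_ts, s.window = "periodic")
plot(decomp)  # trend (T), seasonal (S), and remainder (R) panels
trend_component <- decomp$time.series[, "trend"]
```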

Regression analysis

Regression models quantify the relationship between COVID-19 transmission (dependent variable) and various independent variables such as vaccination rates, mobility trends, and healthcare capacity.

Multivariable linear regression

This method is used to examine the effect of several predictors on the transmission rate. The formula for multivariable regression is:

$$Y_{t}=\beta_{0}+\beta_{1}X_{1}+\beta_{2}X_{2}+\dots+\beta_{n}X_{n}+\epsilon_{t}$$
(3)

where

\(Y_{t}\) is the dependent variable (COVID-19 transmission rate),

\(X_{1},X_{2},\dots,X_{n}\) are the independent variables (vaccination rates, mobility trends, etc.),

\(\beta_{0}\) is the intercept,

\(\beta_{1},\beta_{2},\dots,\beta_{n}\) are the coefficients, and

\(\epsilon_{t}\) is the error term.

The regression coefficients (\(\beta\)) help assess the contribution of each predictor to the transmission rate.
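A sketch of this model in R follows; the column names are hypothetical placeholders for the predictors listed above.

```r
# Multivariable linear regression (Eq. 3) via lm(); variable names are
# hypothetical placeholders.
fit_lm <- lm(transmission_rate ~ vaccination_rate + mobility_index +
               healthcare_capacity, data = covid_df)
summary(fit_lm)  # beta estimates and p-values per predictor
confint(fit_lm)  # 95% confidence intervals for each beta
```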

Logistic regression

Logistic regression is used when the outcome is binary, such as predicting the likelihood of a COVID-19 outbreak (yes/no) in a given region.

The logistic model is expressed as:

$$P\left(Y=1\right)=\frac{1}{1+e^{-\left(\beta_{0}+\beta_{1}X_{1}+\beta_{2}X_{2}+\dots+\beta_{n}X_{n}\right)}}$$
(4)

where

\(P\left(Y=1\right)\) is the probability of an outbreak,

\(X_{1},X_{2},\dots,X_{n}\) are the predictors (e.g., vaccination rate, mobility), and

\(\beta_{0},\beta_{1},\dots,\beta_{n}\) are the model coefficients.

These regression models are used to identify the significant predictors of COVID-19 transmission and quantify their effects, thus providing insights into how factors like vaccination rates, mobility restrictions, and healthcare capacity influence the dynamics of the pandemic.
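The corresponding R call uses glm() with a binomial family; the binary outbreak indicator and predictor names are assumptions for illustration.

```r
# Logistic regression (Eq. 4) via glm(); 'outbreak' is a hypothetical
# binary indicator (1 = outbreak observed in the region).
fit_logit <- glm(outbreak ~ vaccination_rate + mobility_index +
                   healthcare_capacity,
                 data = covid_df, family = binomial)
summary(fit_logit)
exp(coef(fit_logit))  # odds ratios for each predictor
```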

The proposed methodology integrates epidemiological modeling, machine learning, and optimization techniques to improve the prediction of COVID-19 dynamics and the allocation of healthcare resources. The synergy between these components provides a robust and scalable solution for managing pandemics more efficiently.

One key limitation is potential biases in the data sources used. Variations in healthcare reporting systems across countries may affect model predictions. Addressing these biases through data harmonization techniques and sensitivity analyses is crucial for improving the model’s robustness.

Feature selection & importance

Feature selection was performed using PCA and Mutual Information (MI). PCA reduced dimensionality by 30%, while MI ranked feature importance, selecting vaccination rate, mobility index, ICU availability, case fatality rate, and hospitalization rate as the top predictors. These features contributed significantly to improving model accuracy.
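A sketch of this two-stage selection in R follows; prcomp() handles the PCA, while mutual information is computed with the infotheo package (an assumed choice, since the study does not name an MI implementation), and the column names are placeholders.

```r
# Feature selection sketch: PCA (prcomp) plus mutual information (infotheo).
# Column names and the MI library choice are illustrative assumptions.
library(infotheo)

X <- covid_df[, c("vaccination_rate", "mobility_index", "icu_availability",
                  "case_fatality_rate", "hospitalization_rate")]

pca <- prcomp(X, center = TRUE, scale. = TRUE)
summary(pca)  # proportion of variance explained per component

# Rank features by MI with the target; infotheo requires discretized inputs
mi <- sapply(X, function(col)
  mutinformation(discretize(col), discretize(covid_df$daily_cases)))
sort(mi, decreasing = TRUE)  # top-ranked predictors
```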

Simulation results

All analyses for this study were performed using R version 4.3.3, leveraging a combination of statistical and machine learning techniques to forecast COVID-19 transmission and optimize healthcare resource allocation. The methodologies employed include time-series forecasting models, neural network designs, and optimization algorithms applied to datasets from public health sources. The performance of these models was assessed on accuracy, robustness, and real-world applicability to healthcare management during the pandemic.

Table 2 ARIMA and STL Time-Series analysis.

Table 2 summarizes the ARIMA(2,1,1) and STL model parameters for analyzing COVID-19 cases. The ARIMA model indicates significant autocorrelation terms (AR and MA parameters) with a positive constant trend. The STL decomposition highlights the contributions of the trend, seasonal, and remainder components, all statistically significant (p < 0.001).

Table 3 Predictor importance (LASSO Regression).

Vaccination rate has the highest predictive importance for COVID-19 cases, with a significant coefficient (p < 0.001). Healthcare capacity is ranked second, while mobility has the least but still statistically significant influence (p = 0.015) (Table 3).
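For reference, a LASSO fit of this kind can be sketched with the glmnet package, with the penalty weight chosen by cross-validation; the predictor names remain placeholders.

```r
# LASSO predictor-importance sketch (glmnet); alpha = 1 selects the
# LASSO penalty, and lambda is tuned by cross-validation.
library(glmnet)

X <- as.matrix(covid_df[, c("vaccination_rate", "mobility_index",
                            "healthcare_capacity")])
y <- covid_df$daily_cases

cv_fit <- cv.glmnet(X, y, alpha = 1)
coef(cv_fit, s = "lambda.min")  # shrunken coefficients rank the predictors
```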

Table 4 Intervention effectiveness summary.

Vaccination showed the highest global effectiveness, with a 75% reduction in cases (p < 0.001). Lockdowns reduced mobility by 50% in Region A (p = 0.015), while mask mandates in Region B led to a 30% reduction in cases (p = 0.05) (Table 4).

Table 5 Forecast accuracy metrics.

Among the forecasting models, Random Forest performs best, with the lowest errors (MAE = 10.8, RMSE = 17.1) and the highest R² (0.92), owing to its ability to capture nonlinear dependencies and complex interactions between features. ARIMA also performs well (R² = 0.89), followed by STL + regression. A t-test comparing forecast errors confirmed that Random Forest achieved a significantly lower mean absolute error (p < 0.05), validating its superior predictive performance (Table 5).
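A sketch of this comparison in R is given below; the train/test split and the arima_pred vector (held-out ARIMA forecasts) are placeholder objects.

```r
# Random Forest fit plus the accuracy metrics and paired t-test described
# above; 'train', 'test', and 'arima_pred' are placeholder objects.
library(randomForest)

rf   <- randomForest(daily_cases ~ ., data = train, ntree = 500)
pred <- predict(rf, newdata = test)

mae  <- mean(abs(test$daily_cases - pred))
rmse <- sqrt(mean((test$daily_cases - pred)^2))
r2   <- 1 - sum((test$daily_cases - pred)^2) /
            sum((test$daily_cases - mean(test$daily_cases))^2)

# Paired t-test on absolute forecast errors (Random Forest vs. ARIMA)
t.test(abs(test$daily_cases - pred),
       abs(test$daily_cases - arima_pred), paired = TRUE)
```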

Table 6 Counterfactual scenario simulations.

Scenarios with higher vaccination rates and mobility reductions demonstrate substantial reductions in predicted cases. The combined intervention yields the most significant reduction (50%), emphasizing the synergistic benefits of multiple strategies (Table 6).

Table 7 Hypothesis testing for regional differences.

Regional analysis shows significant vaccination and healthcare capacity effects in Region A. In Region B, none of the predictors demonstrate statistical significance, indicating variability in intervention impacts (Table 7).

Table 8 Linear regression summary.

Linear regression shows no statistically significant effects of vaccination rate, mobility, or healthcare capacity on cases. The coefficients are small, with wide confidence intervals overlapping zero (Table 8).

Table 9 Logistic regression summary.

Logistic regression analysis confirms non-significant effects of vaccination rate, mobility, and healthcare capacity on the likelihood of specific outcomes. The results suggest weak predictive power for these factors individually (Table 9).

Table 10 Healthcare resource allocation table.

Table 10 presents the predicted demand for key healthcare resources based on the neural network forecasts. These estimates are critical for optimizing resource allocation and ensuring that healthcare systems do not exceed capacity, especially during surges.

Table 11 Optimization algorithm results (Resource Allocation).

Table 11 presents the optimal allocation of healthcare resources using optimization algorithms like Genetic Algorithms or Particle Swarm Optimization. These predictions are intended to reduce resource wastage and improve care quality during the pandemic (Chen et al., 2022).

Table 12 Baseline comparisons.

Table 12 compares the proposed model against baseline architectures. The proposed model achieved an R² of 0.92, outperforming LSTM (0.89) and CNN (0.90); the Transformer-based model performed slightly better (R² = 0.93), suggesting that attention-based models could further enhance predictions. Future work will explore integrating transformer architectures (Table 12).

Table 13 Real-Time applicability & computational benchmarks.

The proposed neural network (NN) model has the shortest training time (5.2 h) and a moderate prediction time (120 ms), making it efficient for quick deployment. Despite having the longest training time (9.1 h), the Transformer model offers the fastest prediction time (90 ms), suggesting strong real-time performance capabilities.

Fig. 1 ARIMA Forecast for COVID-19 Cases.

Figure 1 presents the ARIMA model’s forecast, demonstrating a close fit to observed COVID-19 cases. The trend line effectively captures fluctuations, reflecting the model’s robustness in time-series prediction.

Fig. 2 STL Decomposition of COVID-19 Cases.

The STL decomposition visualizes the time-series data’s trends, seasonality, and residuals. The trend component dominates, suggesting persistent patterns, while seasonality captures periodic variations in cases (Fig. 2).

Fig. 3 Impact of Vaccination Rate on COVID-19 Cases.

The scatterplot shows a weak association between vaccination rate and COVID-19 cases. The regression line indicates minimal change in cases with increased vaccination, highlighting the need for additional interventions (Fig. 3).

Fig. 4 Predicted ICU Beds Required Over Time (X-axis: date; Y-axis: number of ICU beds). The dotted line represents the actual ICU bed demand, while the solid line shows the model’s prediction.

This time-series plot shows the predicted number of ICU beds required in the USA by day. Predicted demand increases gradually, indicating escalating pressure on healthcare resources as case numbers rise; by June 2021, required ICU beds were projected to reach 47,000, highlighting the potential strain on the healthcare system during COVID-19 surges. The plot underscores the importance of accurate forecasting of ICU bed requirements, allowing healthcare systems to prepare and allocate resources efficiently, particularly during peak periods (Fig. 4).

Conclusion, limitations and future scope

The proposed neural network framework improves forecasting accuracy and provides an effective strategy for optimizing healthcare resource allocation. Integrating predictive modeling with optimization techniques enhances pandemic preparedness and operational efficiency.

Future research aims to address the model’s limitations, such as data-source bias and sensitivity to reporting variability, by exploring its applicability to other infectious diseases and incorporating more diverse datasets. Enhancing the model’s computational efficiency and developing strategies to handle data variability will be crucial for broader implementation. Additionally, integrating this framework with other public health tools and systems could further improve its utility and impact in managing healthcare resources during pandemics and other health crises.

Real-time deployment of the model requires significant computational resources, particularly for large-scale predictions. Future research will explore cloud-based solutions and distributed computing techniques to enhance scalability.