Introduction

Storm surge, the temporary increase in sea level caused by severe storms, is one of the most destructive components of tropical cyclones (TCs), motivating the development of accurate tools and models for its prediction1,2. In recent decades, high-fidelity numerical models have been produced that are able to estimate storm surge generated by tropical cyclone wind fields with high accuracy. However, these hydrodynamic simulations are typically computationally intensive, requiring high-performance computing resources, and their outputs can still exhibit considerable bias3,4.

Because physically based models (i.e., those solving fluid dynamics equations) like the ADvanced CIRCulation (ADCIRC) model5 can be computationally expensive, surrogate models have been developed to predict storm surge without incurring the same cost6. The use of surrogate models, or meta-models, has increased rapidly in the field of coastal flood hazard research7. Surrogate models are useful in many fields because they emulate the behavior of complex models that themselves approximate complex systems. Moreover, their computational efficiency makes them a convenient approach for tasks like optimization or modeling large ensembles of events and scenarios8,9.

Risk assessments, using techniques like joint probability methods to estimate a hazard curve, require simulation of a large number of synthetic storm events (i.e., thousands or tens of thousands). Computational limitations motivated advances such as the joint probability method with optimal sampling (JPM-OS)10,11,12,13 and the use of heuristic algorithms to further reduce the number of simulations required for probabilistic flood risk4,14.

However, these approaches have limits, and consequently, planning studies still face meaningful constraints imposed by computational budgets. When evaluating the benefits of risk reduction infrastructure with long useful life spans, it is important to consider risk and risk reduction over long planning horizons. Protection systems (e.g., levees, dikes, seawalls, pumping stations) must be designed to withstand and mitigate the effects of extreme events over many decades. This increases the necessity to consider uncertainties in factors that, over time, reshape the coastal landscape (e.g., land subsidence, land-use change, impacts of saltwater intrusion on vegetation) and boundary conditions (e.g., sea level rise (SLR)) that determine risk to coastal communities; scenario analyses examine multiple future states of the world with different realizations of uncertain parameters. Integrated coastal management plans like Louisiana’s Comprehensive Master Plan for a Sustainable Coast (Coastal Master Plan) evaluate the performance of a range of flood protection and coastal restoration projects implemented in different sequences, necessitating the modeling of multiple future time periods15. Mokrech et al. (2011)16 stress the importance of developing an integrated framework to assess long-term coastal impacts and thus make rational management decisions16. Wamsley et al. (2009)17 investigated the storm surge and wave reduction benefits of different environmental restoration features (e.g., marsh restoration and barrier island changes), as well as the impact of future wetland degradation on local conditions, concluding that “consideration of natural features is required” to properly assess flood risk17.

Studies that include multiple future time periods, states of the world, and project portfolios must evaluate risk on a large number of landscapes, and simple math dictates that under a fixed computational budget, the more landscapes planners want to model, the fewer events can be simulated per landscape. Using a lower-resolution mesh or a model simulating fewer physical processes may be an undesirable solution if it would introduce unacceptable biases and/or uncertainty in storm surge estimates or compromise the ability of the model to resolve the impact of protection features like levees.

In this paper, we introduce a surrogate model using artificial neural networks (ANN) that can be used to resolve this computational constraint. We train the model on synthetic storms simulated on multiple landscapes using the ADCIRC model, including not only the storm parameters as features, but also landscape features (e.g., topographic/bathymetric elevations, canopy) and boundary conditions (e.g., mean sea level). We evaluate the accuracy to predict peak storm surge elevations, as well as the accuracy of annual exceedance probability (AEP) distributions estimated using the predicted surge values, finding that the model is sufficiently accurate for use as a scenario generator in planning studies.

Previous studies have taken various approaches to applying surrogate models for prediction of storm surge and waves. Commonly, this means predicting peak storm surge elevations and peak significant wave heights at many points on a spatial grid as a function of the storm’s characteristics at landfall. Studies vary in their choice of geography and TC characteristics, with the latter typically including parameters such as landfall location, angle of landfall, central pressure, forward velocity, radius of maximum windspeed, Holland-B parameter and/or tide level. Techniques for the surrogate models include kriging4,18, kriging combined with principal component analysis19,20, support vector regression21, and ANN22. Al Kajbaf and Bensi (2020)21 provide a comparative assessment of the performance of these techniques21.

Many other studies have focused on the use of surrogate models for forecasting future water surface elevations over the course of a storm23,24,25,26,27,28,29. Recent works have incorporated SLR into predictions on a static landscape6, but landscape morphology plays a significant role in modeling flood inundation and flood risk accurately30,31. In areas exhibiting substantial land subsidence, erosion, barrier island migration, and other phenomena impacting morphology, incorporating SLR is necessary but insufficient for projecting future storm surges and inundation risks. Canopy and vegetation impact wind attenuation and surface friction, as shown in studies of mangrove forests and coastal wetlands on the Gulf coast of South Florida32. Mangrove forests reduced storm inundation areas and restricted surge inundation within a Category 3 hurricane zone, according to the study, finding that the width of the mangrove zone had a nonlinear effect on reducing surge amplitudes.

Although these previous studies investigated storm surge surrogate modeling from other perspectives, the impact of the combination of SLR, landscape and TC parameters on storm surge has not been thoroughly investigated. In this study, we aim to fill that gap by developing a surrogate model using deep neural networks for the prediction of peak storm surge elevations from synthetic TCs as a function of their characteristics at landfall in coastal Louisiana, four landscape parameters impacting storm surge, and mean sea level.

The synthetic TCs used in this study are characterized by their overall tracks and five parameters at landfall: forward velocity, radius of maximum windspeed, central pressure, landfall coordinates, and heading. The corpus of 645 synthetic storms was developed by the US Army Corps of Engineers for use in flood risk assessments based on the JPM-OS methodology33; each synthetic storm’s landfall parameters serve as input data for the predictive model and are provided in Supplementary Information Table S1. We utilize only the landfall parameters because of the synthetic (i.e., idealized) nature of the storm tracks and parameters; synthetic TCs follow their heading at landfall and exhibit similar patterns of decay in intensity as they move inland, so the variability in synthetic storm behavior and potential to generate storm surge is reasonably captured by their characteristics at landfall. Note that this means our approach would likely yield lower accuracy if the trained models were asked to infer peak storm surge elevations from historic or more realistic TCs.

Hydrodynamic simulations from a coupled ADCIRC + SWAN model were available from Louisiana’s 2023 Coastal Master Plan for all 645 synthetic storms, simulated on the plan’s “Existing Conditions” landscape (i.e., 2020). A subset of 90 synthetic storms was simulated on each of 10 future landscapes representing decadal snapshots (i.e., 2030, 2040, 2050, 2060, 2070) under two different scenarios, a Lower and Higher Scenario, that vary in their assumptions about the rate of SLR, land subsidence, and other environmental factors15,34. For each synthetic storm and landscape, peak storm surge elevations were extracted from the ADCIRC + SWAN simulations at 94,013 locations representing grid points from the Coastal Louisiana Risk Assessment model (CLARA) not located within fully-enclosed protection systems; such points were excluded from analysis because of the additional complexity of flood dynamics in enclosed polders (e.g., overtopping, levee fragility/breaches, pumping, rainfall). The points form a mixed-resolution grid consisting of a regular 1-km resolution grid, with more points added so that every 2010 U.S. Census block contains at least one grid point35. This has the effect of adding higher resolution in densely populated areas. Grid cell polygons are associated with each grid point as Thiessen polygons formed within each census block (i.e., polygons consisting of all pixels in the census block closer to a given grid point than any other in the same block).
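The Thiessen polygon construction described above can be sketched as a nearest-point assignment within a single census block. This is a minimal illustration rather than the CLARA implementation: the function name, the pixel coordinates, and the two grid points are hypothetical, and distances are assumed to be Euclidean in a projected coordinate system.

```python
import math
from collections import defaultdict


def assign_pixels_to_grid_points(pixels, grid_points):
    """Group the pixels of one census block by their nearest grid point.

    pixels      -- list of (x, y) pixel centers inside a single census block
    grid_points -- dict {point_id: (x, y)} of grid points in the same block
    Returns {point_id: [pixels]}; each list approximates one Thiessen cell.
    """
    cells = defaultdict(list)
    for pixel in pixels:
        nearest = min(grid_points, key=lambda pid: math.dist(pixel, grid_points[pid]))
        cells[nearest].append(pixel)
    return dict(cells)


# Hypothetical block: a 4 x 2 patch of pixel centers and two grid points.
block_pixels = [(x + 0.5, y + 0.5) for x in range(4) for y in range(2)]
block_points = {"A": (1.0, 1.0), "B": (3.0, 1.0)}
cells = assign_pixels_to_grid_points(block_pixels, block_points)
```

Restricting the search to grid points in the same block is what keeps each cell inside its census block, which matches the description in the text.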

Each landscape is characterized by a digital elevation model defining the topography and bathymetry of the study region, as well as rasters defining other inputs to the ADCIRC model (with a resolution of 30 meters): the Manning’s \(n\) value (i.e., bottom roughness coefficient), free surface roughness \(z_0\), and a surface canopy coefficient that captures the reduction in wind stress on water surfaces produced by local vegetation. All landscape characteristics were represented as GeoTIFFs with values extracted at each of the 94,013 grid point locations for use in the surrogate model. Full details regarding the Integrated Compartment Model used to develop the landscape representations are found in White et al. (2019)36 and Reed and White (2023)37, and details regarding the ADCIRC + SWAN model and Louisiana mesh are found in Cobell and Roberts (2021)34 and Roberts and Cobell (2017)38. Mean sea level (NAVD88 m) assumed for each scenario as a boundary condition is provided in Table 1.

Table 1 Summary of mean sea level assumptions and statistical outcomes for all cases evaluated

Methods

This study used multilayer, feed-forward ANN models with multiple outputs to predict storm surge at each location under current and future landscape conditions. Models ranging from 128 to 256 neurons were evaluated before selecting the models described here, as specifying too few neurons could impede the learning process while using too many could result in overtraining/overfitting39. All hidden layers used the rectified linear unit (ReLU) activation function, the final layer used a linear activation function to predict surge values, and the models were trained with a learning rate of 0.001.
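To make the architecture concrete, the following is a minimal forward-pass sketch of a feed-forward network with ReLU hidden layers and a linear output, written in plain Python. It is not the authors' implementation: the input dimension of 12, the width of 128 neurons in each of four hidden layers, the He-style weight initialization, and all function names are assumptions made for illustration. Training (for example, gradient descent with the stated learning rate of 0.001) is omitted.

```python
import random


def relu(values):
    """Rectified linear unit applied elementwise."""
    return [max(0.0, v) for v in values]


def dense(values, weights, biases):
    """Affine layer; weights holds one row of input weights per output neuron."""
    return [sum(w * v for w, v in zip(row, values)) + b
            for row, b in zip(weights, biases)]


def init_layer(n_in, n_out, rng):
    """Random weights (He-style scaling, an assumption) and zero biases."""
    scale = (2.0 / n_in) ** 0.5
    weights = [[rng.gauss(0.0, scale) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def build_network(n_inputs, hidden_sizes, n_outputs, seed=0):
    """Return a list of (weights, biases) pairs, one per layer."""
    rng = random.Random(seed)
    sizes = [n_inputs, *hidden_sizes, n_outputs]
    return [init_layer(a, b, rng) for a, b in zip(sizes, sizes[1:])]


def predict(network, features):
    """ReLU on every hidden layer, linear activation on the output layer."""
    values = list(features)
    for weights, biases in network[:-1]:
        values = relu(dense(values, weights, biases))
    weights, biases = network[-1]
    return dense(values, weights, biases)


# Hypothetical sizes: 12 input features, four hidden layers of 128 neurons.
network = build_network(n_inputs=12, hidden_sizes=[128, 128, 128, 128], n_outputs=1)
surge = predict(network, [0.1] * 12)  # untrained, so the value is meaningless
```

The linear output layer leaves the prediction unbounded, which is the usual choice for regressing a continuous quantity such as surge elevation.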

First, we examined the value of including landscape parameters in a predictive model of storm surge for a single landscape only. ANN models were trained for current conditions on 645 synthetic storms: a multi-layered feed-forward architecture with four hidden layers and an output layer of 1 dimension was used for predicting peak storm surge elevations at different locations. A “storm-only model” at each location included only the synthetic storm parameters at landfall as inputs. The “full model” included the synthetic storm parameters but also all grid points’ landscape parameters from the current landscape (latitudinal and longitudinal coordinates, topo/bathy NAVD88 elevation, surface canopy, \(z_0\), and Manning’s \(n\)). Sea level was excluded from the full model in this test because the local mean sea level was assumed constant throughout the study region, and thus only varies when multiple landscapes are taken in as input data. For the storm-only and full models, predictive accuracy is reported for 15% of storms randomly held out in a cross-validation procedure, with the remaining 85% of storms used for training and validation.
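The difference between the two input sets, and the 85/15 holdout, can be illustrated as follows. This is a sketch under stated assumptions: the key names, the example storm and grid-point values, the single random split, and the treatment of landfall coordinates as two separate inputs are all hypothetical; only the list of parameters and the split proportions come from the text.

```python
import random

# Hypothetical key names for the parameters listed in the text.
STORM_KEYS = ("forward_velocity", "radius_max_wind", "central_pressure",
              "landfall_lon", "landfall_lat", "heading")
LANDSCAPE_KEYS = ("lat", "lon", "elevation", "canopy", "z0", "manning_n")


def storm_only_features(storm):
    """Inputs of the storm-only model: landfall parameters only."""
    return [storm[key] for key in STORM_KEYS]


def full_features(storm, point):
    """Inputs of the full model: landfall parameters plus landscape parameters."""
    return storm_only_features(storm) + [point[key] for key in LANDSCAPE_KEYS]


def split_storms(storm_ids, holdout_fraction, seed=0):
    """Randomly hold out a fraction of storms; returns (train_ids, holdout_ids)."""
    ids = list(storm_ids)
    random.Random(seed).shuffle(ids)
    n_holdout = round(holdout_fraction * len(ids))
    return ids[n_holdout:], ids[:n_holdout]


# Illustrative values only; they do not describe any storm in the corpus.
storm = {"forward_velocity": 5.0, "radius_max_wind": 30.0, "central_pressure": 960.0,
         "landfall_lon": -90.5, "landfall_lat": 29.2, "heading": 0.0}
point = {"lat": 29.8, "lon": -90.1, "elevation": 0.4, "canopy": 1.0,
         "z0": 0.05, "manning_n": 0.03}

train_ids, holdout_ids = split_storms(range(1, 646), holdout_fraction=0.15)
```

With 645 storms, a 15% holdout rounds to 97 storms, leaving 548 for training and validation.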

Next, to investigate the impacts of climate change and the slowly evolving landscape, we trained the full model using the 645 synthetic storms from the 2020 landscape condition, as well as the 10 future landscapes, each with the same 90 synthetic storms. Predictive accuracy of the full model was evaluated utilizing leave-one-landscape-out cross-validation (LOOCV) on the future landscapes; in other words, for each fold of the CV procedure, the model was trained on the 2020 landscape and 9 of the 10 future landscapes (\(n=1455\) storms), with predictions made on the tenth future landscape (\(n=90\) storms). We did this to reflect a real-world use case in which the full model could serve as a scenario generator, training on a set of landscapes run through ADCIRC and then predicting outcomes in novel landscapes. The current conditions landscape’s 645 storms were included to represent a realistic case in which a larger suite of synthetic TCs could be run on a single landscape as an input to a storm selection process that would identify the subset of 90 storms to run on other landscapes. Each model was trained for 100 epochs on an AMD Epyc 7662 CPU at 2.0 GHz; training in preparation for making predictions on a new landscape took less than 7 h, and all folds of the cross-validation process together took 70 h. Once the model is trained, generating predictions for a novel landscape takes less than 4 min.
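The fold structure of the leave-one-landscape-out procedure, and the training-set size of 1455 storms per fold, can be checked with a short sketch. The landscape labels and function name below are hypothetical; the storm counts are those reported in the text.

```python
CURRENT = "2020"
FUTURE = [f"{scenario}-{year}"
          for scenario in ("Lower", "Higher")
          for year in (2030, 2040, 2050, 2060, 2070)]

# 645 storms on the current landscape, the same 90 storms on each future one.
STORMS_PER_LANDSCAPE = {CURRENT: 645, **{name: 90 for name in FUTURE}}


def leave_one_landscape_out(current, future):
    """Yield (training_landscapes, held_out_landscape) for each fold."""
    for held_out in future:
        training = [current] + [name for name in future if name != held_out]
        yield training, held_out


folds = list(leave_one_landscape_out(CURRENT, FUTURE))
train_sizes = [sum(STORMS_PER_LANDSCAPE[name] for name in training)
               for training, _ in folds]
```

Each of the ten folds trains on 645 + 9 × 90 = 1455 storm simulations and predicts the 90 storms of the held-out landscape; the current landscape is never held out.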

Finally, we also wanted to know how errors in the predicted peak storm surge from each synthetic storm propagate to differences in the estimated annual probability distribution of experiencing storm surge of varying elevations. This is ultimately what planners may care about when making decisions about flood protection projects. For this task, we employed the Coastal Louisiana Risk Assessment (CLARA) model, an implementation of JPM-OS that is used to estimate flood hazard for Louisiana’s Coastal Master Plan35,40. Full details on the CLARA model’s methodology are in Johnson et al. (2023)35; in this analysis, we compared peak surge elevation exceedance curves (i.e., surge elevations as a function of AEP) generated from the simulated surge elevations from ADCIRC to exceedance curves generated from the predicted surge elevations from the LOOCV procedure. The resulting empirical distributions were compared using a two-sample Kolmogorov-Smirnov (K-S) test, which calculates the maximum difference between two empirical samples’ cumulative distribution functions to test a null hypothesis that they have been drawn from the same underlying probability distribution function41. CLARA produces estimates of surge exceedances at 23 return periods ranging from a 50% AEP to a 0.05% AEP (i.e., the 2-year event to the 2000-year event), so the two-sample K-S test dictates that the null hypothesis be rejected at significance level \(\alpha\) if

$$\mathop{\sup }\limits_{x}\left|{F}_{{ADCIRC}}\left(x\right)-{F}_{{ANN}}\left(x\right)\right| > \sqrt{-\mathrm{ln}\left(\frac{\alpha }{2}\right)\,\cdot\, \frac{1}{23}}$$
(1)

where \(\sup_x\) denotes the supremum over \(x\), and \({F}_{{ADCIRC}}(x)\) and \({F}_{{ANN}}(x)\) are the sample CDFs associated with the ADCIRC simulations and ANN predictions, respectively.
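The threshold in Eq. (1) is the standard large-sample two-sample K-S critical value, \(\sqrt{-\ln(\alpha/2)\cdot\frac{1}{2}}\cdot\sqrt{\frac{n+m}{nm}}\), evaluated at \(n=m=23\). The sketch below computes both the critical value and the K-S statistic in plain Python. The two example samples are hypothetical, and because the formula is an asymptotic approximation, exact small-sample critical values may differ somewhat.

```python
import bisect
import math


def ks_critical_value(alpha, n, m):
    """Large-sample critical value for the two-sample K-S statistic."""
    return math.sqrt(-math.log(alpha / 2.0) * 0.5 * (n + m) / (n * m))


def ks_statistic(sample_a, sample_b):
    """Maximum absolute difference between the two empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)

    def ecdf(sorted_sample, x):
        return bisect.bisect_right(sorted_sample, x) / len(sorted_sample)

    # The supremum is attained at one of the observed values.
    return max(abs(ecdf(a, x) - ecdf(b, x)) for x in a + b)


# Hypothetical hazard-curve values (m) at 23 return periods for one grid point.
adcirc_curve = [0.5 + 0.1 * i for i in range(23)]
ann_curve = [value + 0.03 for value in adcirc_curve]

critical = ks_critical_value(alpha=0.05, n=23, m=23)
statistic = ks_statistic(adcirc_curve, ann_curve)
reject_null = statistic > critical
```

For \(\alpha = 0.05\) the critical value is approximately 0.40, so the two empirical CDFs would need to differ by more than about 0.40 somewhere for the null hypothesis to be rejected at a grid point.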

Results

The ANN model that includes landscape parameters performs markedly better than the model with only storm parameters when predicting surge from relatively intense storms, as shown for two illustrative storms in Fig. 1. Each pane plots ADCIRC-simulated values against the ANN-predicted values at ~3500 points on a west-to-east transect at \(29.8^{\circ}\) N, a latitude selected for its nearly continuous series of grid points uninterrupted by major water bodies or enclosed polders. Note that the sudden decrease in points with simulated surge below 0.36 NAVD88 m occurs because this is the mean sea level assumed for the current conditions landscape. Grid points over water are initialized at this level, leaving very few points along the chosen transect with lower topographic elevation, typically due to being pumped and drained. Blue points represent the Full Model, which includes landscape parameters, and red points represent the Storm-Only Model, which excludes them. The left-hand pane shows Storm 495, a weaker TC with central pressure of 975 mb at landfall, while the right-hand pane shows the much stronger Storm 11 with a landfalling central pressure of 905 mb (landfall parameters for all 645 synthetic storms are provided as Supplementary Table 1).

Fig. 1: Simulated versus predicted storm surge for representative synthetic storms.

Simulated versus predicted storm surge for Storm 495 (left pane) and Storm 11 (right pane) at grid points along a transect at 29.8° N, in the cases where the ANN model input includes only storm parameters (red) and both storm and landscape parameters (blue). Note: Axis ranges vary between left and right panes.

Across all 645 synthetic storms and grid points in the current conditions landscape, the Storm-Only Model reached an overall RMSE of 0.31 m, while the Full Model achieved an RMSE of 0.28 m (Table 1). While this does represent an improvement of roughly 10 percent, primarily the result of greater accuracy for larger surge elevations, we expected the difference between these models to be minimal when trained only on the current conditions landscape. This is because landscape parameters do not vary across the synthetic storms at each point, whereas TC parameters vary considerably.

Examining the spatial pattern of the Full Model when trained on current and future scenarios, we see that points with higher RMSE over all storms and landscapes are generally further inland (Fig. 2). This is expected, given that such points generally have fewer storms in the corpus that produce wetting, and we did not employ any dry-node correction techniques like those used in ref. 42 or ref. 18; instead, non-wetting observations were simply removed from the training set. The model also performed less accurately in areas with more complex topography and hydrodynamics, such as in unpopulated wetlands in the Atchafalaya River Basin (between 91° and 92° W longitude on the northern portion of the model domain), where the ADCIRC model also has greater uncertainty and bias when validated against historic TCs34.

Fig. 2: RMSE values across all landscapes and synthetic storms for all grid points in the study domain.

Darker colors indicate lower RMSE values, i.e., locations of greater agreement between simulated and predicted storm surge.

Considering the RMSE of the Full Model averaged over all landscapes and synthetic storms, the RMSE at 90% of grid points is less than 0.18 m, at 99% of grid points less than 0.38 m, and at 99.9% of grid points less than 0.79 m (Fig. 3). Over the ten future scenarios used in the LOOCV procedure, the Full Model produced a grand RMSE of 0.086 m and grand mean absolute error (MAE) of 0.050 m (Table 1). These results compare favorably to the calibration and validation results from the ADCIRC + SWAN model used to generate the hydrodynamic simulations, which reported a standard error in simulated high-water marks of ~0.46 m over seven historical storms (hurricanes Katrina, Rita, Gustav, Ike, Isaac, Nate, and Harvey)34. Further analysis incorporated into the 2023 Coastal Master Plan estimated an average standard error of 0.15 m in peak surge elevations over the grid cells included in this analysis (authors’ own calculations).
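For reference, the error metrics quoted above can be computed as in the following sketch, which also shows how a statement such as "the RMSE at 90% of grid points is less than 0.18 m" corresponds to the share of per-point RMSE values below a threshold. All numbers in the example are hypothetical and the function names are illustrative.

```python
import math


def rmse(simulated, predicted):
    """Root-mean-square error between paired simulated and predicted values."""
    squared = [(p - s) ** 2 for s, p in zip(simulated, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def mae(simulated, predicted):
    """Mean absolute error between paired simulated and predicted values."""
    absolute = [abs(p - s) for s, p in zip(simulated, predicted)]
    return sum(absolute) / len(absolute)


def share_below(values, threshold):
    """Fraction of values strictly below a threshold."""
    return sum(1 for v in values if v < threshold) / len(values)


# Hypothetical surge elevations (m) at one grid point over three storms.
simulated = [1.0, 2.0, 3.0]
predicted = [1.1, 1.9, 3.2]
point_rmse = rmse(simulated, predicted)
point_mae = mae(simulated, predicted)

# Hypothetical per-point RMSE values (m) for five grid points.
per_point_rmse = [0.05, 0.10, 0.12, 0.17, 0.40]
share = share_below(per_point_rmse, 0.18)
```

Because RMSE squares the errors before averaging, it is never smaller than the MAE on the same data, which is consistent with the grand RMSE exceeding the grand MAE in Table 1.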

Fig. 3: Exceedance percentage of RMSE values by grid point.

Each point represents the RMSE for a particular spatial grid point, averaged across all landscapes and synthetic storms.

Figure 4 further disaggregates the model predictions to show the frequency distribution of predicted versus simulated surge elevations in each of the future landscapes over the synthetic storms and grid points. The overall distributions appear nearly indistinguishable except in the Higher scenario’s 2070 landscape, the most extreme scenario with respect to its assumptions about mean sea level and cumulative land subsidence. That this scenario would be an outlier compared to the others is intuitive, given its more extreme assumptions about environmental conditions; in this sense, the 2070 Higher scenario landscape is subject to the common difficulty of extrapolating beyond training data in the leave-one-landscape-out experimental design. That said, the directionality of the difference is somewhat counterintuitive. In this scenario, the predicted storm surge is, on average, greater than the simulated values, though the primary non-linear difference in the scenario is an accelerating rate of SLR. Despite this acceleration, it appears that the Full Model overestimates the gradient in storm surge associated with changes in mean sea levels.

Fig. 4: Frequency distribution of peak storm surge elevation for all future landscapes.

Red indicates the distribution of predicted values, while blue indicates the distribution of simulated values.

Figures 5 and 6 show the RMSE over all grid points as a function of surge elevation relative to NAVD88 (a fixed datum) and of surge depth above prevailing ground, respectively. Prevailing ground elevations are calculated as the median topographic elevation of land pixels in each grid cell polygon. In grid cells containing open water, such as over Lake Pontchartrain, the prevailing elevation used is the mean sea level associated with the specific landscape, representing the prevailing water surface elevation that storm surge would build upon. Simulated ADCIRC observations are binned into 0.1-m intervals with respect to their surge elevations or depths, and the RMSE of predictions for the locations and storms in each bin is shown on the vertical axis. In Fig. 5, all landscapes show similar performance except for the 2070 landscape of the Higher scenario, which has slightly larger RMSE values. Larger surge elevations do not necessarily correspond to larger RMSE values. Performance was similar across the landscapes of the Lower scenario, with a maximum RMSE of 0.4 m and a mean RMSE of less than 0.1 m. In the Higher scenario, all landscapes except 2070 likewise have a mean RMSE of less than 0.1 m. Generally, the model is least accurate for smaller surge elevations, meaning that extremes which would cause more damage are still captured relatively well.
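The binning procedure can be sketched as follows: each simulated value is assigned to a 0.1-m bin, and the RMSE of the corresponding predictions is computed per bin. This is an illustrative sketch, not the authors' code. The function name and sample values are hypothetical, and the rounding step that guards against floating-point error at bin edges is an added assumption.

```python
import math
from collections import defaultdict


def rmse_by_bin(simulated, predicted, bin_width=0.1):
    """RMSE of predictions grouped by the bin of the simulated value.

    Returns {bin_lower_edge: rmse}, ordered by increasing bin edge.
    """
    squared_errors = defaultdict(list)
    for s, p in zip(simulated, predicted):
        # Round before flooring so values such as 0.3 are not pushed into
        # the bin below by floating-point error (0.3 / 0.1 = 2.999...).
        bin_index = math.floor(round(s / bin_width, 9))
        squared_errors[bin_index].append((p - s) ** 2)
    return {round(index * bin_width, 10): math.sqrt(sum(errs) / len(errs))
            for index, errs in sorted(squared_errors.items())}


# Hypothetical simulated and predicted surge elevations (NAVD88 m).
simulated = [0.52, 0.57, 1.23, 1.28]
predicted = [0.50, 0.60, 1.25, 1.20]
binned = rmse_by_bin(simulated, predicted)
```

To bin by surge depth instead, the prevailing ground elevation of each grid cell would be subtracted from the simulated values before calling the function; negative depths then fall into bins with negative lower edges.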

Fig. 5: RMSE over all grid points and storms, as a function of surge elevation.

Each line represents a different year (varying by color) of the Lower (left pane) and Higher (right pane) scenarios.

Fig. 6: RMSE over all grid points and storms, as a function of surge depth relative to prevailing ground.

Each line represents a different year (varying by color) of the Lower (left pane) and Higher (right pane) scenarios.

This is further confirmed by Fig. 6, which is similar to Fig. 5 but with the bins on the horizontal axis representing surge elevation relative to prevailing ground elevation (i.e., surge depth instead of surge elevation relative to a fixed datum). We see in these results that most scenarios exhibit an RMSE below 0.1 m consistently for surge elevations above prevailing ground. In general, the model predictions are actually less accurate for storm surge values below prevailing ground, likely due to a relatively smaller set of observations in the training data. Based on Fig. 5 and Fig. 6, it can be concluded that for most scenarios, the performance of the model is acceptable across a range of surge elevations, including extreme values which would cause more damage.

Examining the hazard aggregated over multiple TCs, the errors associated with surge predictions do not appear to meaningfully compound once aggregated to AEP curves, in the sense that the RMSEs over all grid points at a range of return periods are in a similar range to the RMSEs over all grid points and synthetic storms (between 0.05 and 0.1 m for all landscapes but the most extreme, as shown in Fig. 7). The RMSE generally is larger at lower AEPs, consistent with an intuition that prediction is more challenging for extreme events associated with storm surge values near the upper bounds of observations in the simulated training sets.
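As an illustration of this aggregation, the sketch below converts between AEP and return period (return period = 1 / AEP) and computes the RMSE over grid points at each return period from two sets of hazard curves. The data structure, function names, and values are hypothetical and do not reflect how CLARA stores its results.

```python
import math


def aep_to_return_period(aep):
    """Return period in years for an annual exceedance probability (fraction)."""
    return 1.0 / aep


def rmse_by_return_period(simulated_curves, predicted_curves):
    """RMSE over grid points at each return period.

    Both arguments map point_id -> {return_period: surge_elevation}.
    """
    squared_errors = {}
    for point_id, sim_curve in simulated_curves.items():
        for return_period, sim_value in sim_curve.items():
            pred_value = predicted_curves[point_id][return_period]
            squared_errors.setdefault(return_period, []).append(
                (pred_value - sim_value) ** 2)
    return {rp: math.sqrt(sum(errs) / len(errs))
            for rp, errs in sorted(squared_errors.items())}


# Hypothetical hazard curves (m) at two grid points and two return periods.
adcirc_curves = {"p1": {10: 1.00, 100: 2.0}, "p2": {10: 1.50, 100: 2.5}}
ann_curves = {"p1": {10: 1.05, 100: 2.1}, "p2": {10: 1.45, 100: 2.4}}
rmse_per_rp = rmse_by_return_period(adcirc_curves, ann_curves)
```

In this toy example the RMSE grows from 0.05 m at the 10-year return period to 0.1 m at the 100-year return period, mirroring the qualitative pattern of larger errors at lower AEPs.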

Fig. 7: RMSE over all grid points, by annual exceedance probability and landscape.

Each line represents a different year (varying by color) of the Lower (left pane) and Higher (right pane) scenarios.

Considering the full AEP distribution of storm surge at each point and in each landscape, the two-sample K-S tests further indicate that the surge predictions are accurate enough to usefully inform probabilistic risk studies. In eight of the ten future landscapes, the null hypothesis, that the empirical distributions generated with the ADCIRC simulations and the ANN predictions are drawn from the same underlying probability distribution, is rejected at level \(\alpha =0.05\) for less than one percent of the grid points (Table 1). This table also reports an MAE over the grid points below 0.05 m and a correlation between simulated and predicted values over 0.99 for all landscapes but the 2070 Higher Scenario.

From Table 1, it is evident that by incorporating landscape parameters into the ANN model, storm surge can be predicted accurately for a variety of different scenarios.

Figure 8 highlights the spatial pattern of points at which the two-sample K-S test rejected the null hypothesis for an illustrative landscape, the Higher scenario in 2060. Red points indicate the locations where the test rejects the null hypothesis at \(\alpha =0.05\), while the blue points indicate locations where the evidence fails to rule out the possibility of the hazard estimates coming from the same underlying AEP distribution.

Fig. 8: Results of two-sample K-S test for the year 2060 of the Higher scenario.

Red points indicate locations where the null hypothesis is rejected.

Discussion

We have presented a machine learning-based surrogate model of peak storm surge elevations whose accuracy relative to ADCIRC simulations is comparable to or greater than the accuracy of the ADCIRC model itself relative to the historic observations used for its calibration and validation. The addition of future landscapes with variation in landscape parameters and mean sea level conditions provides more features and heterogeneity in training data to improve the model’s accuracy. In a LOOCV exercise, the model produced an MAE of ~0.04 m in nine out of ten of the future landscapes, with the exception being the year 2070 of the Higher scenario, the most extreme landscape with respect to having the greatest SLR and land subsidence. A two-sample K-S test rejected the null hypothesis, that points on hazard curves generated by the ADCIRC simulations and ANN predictions are drawn from the same underlying distribution, at less than 1% of grid points in eight out of ten of the future landscapes. It is worth mentioning that, in addition to the capabilities of the developed surrogate models, part of the residual sensitivity is attributable to variability in the training dataset being used; as such, the accuracy of surrogate models trained on simulation data from ADCIRC will be limited by the accuracy of the calibrated and validated simulations.

This highlights an important caveat: this analysis utilized ADCIRC simulations that were readily available from Louisiana’s 2023 Coastal Master Plan, meaning that the scenarios and time periods were not chosen with the idea of using ADCIRC to train a surrogate model already in mind. This work therefore represents a proof of concept where the ANN model produced predictions suitable for planning studies from a training set of convenience. If planners are interested in estimating risk over a 50-year planning horizon ending in 2070, it may be that accuracy could be improved for the same computational cost by replacing one of the “intermediate” landscapes with a landscape corresponding to the year 2080 instead, to mitigate the challenges of ML models in extrapolating beyond data in their training set.

This also has implications for the storm selection process, given that the 90 storms simulated in each future landscape were chosen by comparing hazard curves to the curve associated with the full 645-storm suite in the current conditions landscape43. Prior research has suggested a difficulty in using ML methods to predict extreme storm surge elevations44,45. While accurate reproduction of extreme individual events is important in controlling the overall RMSE and MAE of the model, extreme storms (i.e., with lower central pressures at landfall) are relatively rare, thus having smaller probability masses when contributing to AEP distributions and making smaller contributions to expected annual damage calculations.

Consequently, adoption of surrogate models as scenario generators would also benefit from a rigorous consideration of how optimal sampling techniques could be extended to include heterogeneity in landscape parameters and boundary conditions. When planning an analysis that will span a range of future states of the world, it is likely that greater computational efficiency could be achieved by sampling different synthetic storm events on each landscape, rather than simulating the same 90 storms as was done for the Coastal Master Plan. We note that in this analysis, the surrogate model does benefit from predicting storm surge on the same 90 storms in the left-out landscape that were present in the training data from other landscapes during the LOOCV procedure. Accuracy would almost certainly be lower if called upon to predict storm surge in both novel landscapes and for storms with novel landfall parameters.

All the future landscapes used for training our ANN came from the Coastal Master Plan’s Future Without Action scenarios (i.e., no additional projects implemented on the landscape). This means we have restricted our predictions to analyzing a slowly evolving landscape without major coastal management interventions. As such, developing a surrogate modeling framework to capture the impacts of projects such as constructing or upgrading levees and floodwalls is a more challenging problem and subject for future research. However, the surrogate model developed in this study would also have utility in evaluating the flood risk impacts of coastal restoration projects that affect landscape morphology over time scales ranging from the immediate (e.g., beach nourishment) to decadal (e.g., river diversions).

This study enables better modeling of future climate and environmental conditions by policy makers and water resource managers. Moreover, the developed model makes it possible to evaluate risk under a greater number and range of future scenarios and time periods, opening the door to the use of computationally expensive models like ADCIRC for planning studies utilizing techniques for decision-making under deep uncertainty (DMDU) that require or benefit from the use of large ensembles of future states of the world46. We note, though, that not all institutions have the resources and planning capacity to adopt DMDU approaches or to generate the same number of landscapes used for training data in this study. Even with a more limited set of future scenarios, the concept introduced here, of incorporating landscape and boundary condition features to train a surrogate model over multiple futures, could be applied to improve risk assessments that currently employ simplifying assumptions such as linearly interpolating storm surge or hazard over time or applying a bathtub model of SLR.