Introduction

The fourth edition of the ESA-ECMWF Workshop on Machine Learning for Earth Observation and Prediction (ML4ESOP) took place from 7 to 10 May 2024 at ESA Frascati, Italy. Reflecting the trend of steadily growing public interest in machine learning (ML) technologies and their applications, this edition of the workshop hosted a record number of over 190 attendees on site plus a large and active virtual audience with about 800 registered participants. The attendance count, together with 246 abstract submissions and accepted work presented in nine oral and two poster sessions, highlighted the large interest in ML for Earth system science applications and the growing popularity of the ECMWF–ESA workshop series as a reference meeting and discussion venue in this area. The workshop content (recordings, slides, and e-posters) is available on the workshop’s webpage: https://www.ml4esop.esa.int/.

In line with the successful format of prior workshop editions, two leading experts were invited to provide broad overviews of the state-of-the-art in their domain of expertise, as well as current opportunities and challenges.

Professor Xiaoxiang Zhu’s keynote presentation showcased recent advancements of large-scale data and ML models for Earth Observation (EO). Empowered by weather modelling and spaceborne assets, current approaches enable urban flood forecasting1, glacier monitoring2 and solar potential simulations3 for climate action. Finally, the talk highlighted the significance of EO and climate foundation models4, an emerging trend in ML4ESOP to address a wide range of societal grand challenges.

Professor Martin Schultz’s talk contextualized the recent progress of ML models for weather forecasting as well as the significance of recent Artificial Intelligence (AI) developments for weather and climate modelling5. Current limitations and future opportunities for advancement were outlined: as analysed in recent work, physically implausible predictions6 and forecasting instabilities7 remain important challenges to be resolved, while issues such as forecast blurriness are already receiving increased attention and were actively discussed at the workshop5,7.

Both keynote presentations underscored the growing importance of large-scale and data-driven Earth system models for academic research and operational deployment alike. While many advances in the broader field of AI and related ML-driven domains prove equally beneficial in ESOP, tailored solutions are nonetheless required, and the community is adapting them at a rapid pace and with compelling results.

To cover the major trends in the field, the workshop was structured into multiple thematic areas (TA) featuring the main applications of ML in EO, numerical weather prediction (NWP), and climate modelling as follows:

  • TA1: ML for Destination Earth

  • TA2: Multidomain ML for Earth System Observation and Prediction

  • TA3: End-to-End ML for Weather and Climate Prediction

  • TA4: Hybrid ML-NWP/Climate

  • TA5: New Generation Computing for AI

To distil the workshop’s numerous presentations and discussions, the following five sections summarize in greater detail the outcomes of the working groups on each TA.

TA1: ML for Destination Earth

Current ML applications

The working group was chaired by Mariana Clare (ECMWF) and Rochelle Schneider (ESA) and explored how ML can be used within the context of data lakes, platforms, and the digital twins in the Destination Earth (DestinE) project8. ML is already being used to enhance data quality and data harmonisation (e.g.9) and for super-resolution and downscaling (e.g.10). It is also proving invaluable for the quantification of uncertainty in km-scale simulations where traditional methods for probabilistic forecasting are too computationally expensive to run operationally. This ability of AI to add uncertainty information further motivates the development of data-driven Earth-System models as part of the DestinE Digital Twin. Finally, a central principle behind DestinE is to bring data to the users, be they scientists at universities or policymakers in local governments. AI is playing a crucial role by enhancing user experience on web-based platforms and improving user interactivity through large language models (LLMs)11.

Limitations, opportunities, and challenges

Despite the current ML applications within the DestinE digital twins, there are clear limitations and challenges. At the forefront is ML models’ need for open data that is easily available to all users, in a harmonised manner. Ideally, both the data and models should be available to both users on laptops (local applications) and cloud-based services (global applications), which poses its own set of challenges. This data availability offers the opportunity to build high-quality benchmark datasets, which are very important for progressing research12.

Whilst integrating fully data-driven models into the DestinE digital twin is fairly simple, integrating partial AI models into existing larger numerical models poses a challenge. However, the partial emulation of physical processes by ML also offers significant opportunities, as discussed in TA4, so we encourage the development of packages like infero13, which can help with this integration.

AI models offer opportunities for scientific research with higher model performance and more generalisability to different tasks than traditional modelling approaches, which is extremely useful for DestinE. We would also like to highlight the clear socio-economic impacts that DestinE components (e.g., digital twins, data lake, and digital platform) offer and the opportunities for improving Earth Sciences education, both at school and university level.

Future directions

It is interesting to ponder what an ML for DestinE session might look like at the next workshop. We will likely see foundation models for digital twins similar to existing models such as Aurora14 and AtmoRep15. We will also see ML-built Digital Twin engine segments coupled into a universal Digital Earth System. There is already research into ocean models in the wider literature16, and the potential of AI for other Earth System components is currently being explored at ECMWF.

The generalisability of ML means there is also the potential for data-agnostic models available on the DestinE platform, developed by ESA, leading to service harmonisation. AI will also improve the efficiency and interactivity of the platform and, if implemented correctly, could help verify the integrity of data on the platform. Moreover, the DestinE digital twins rely greatly on the hosting systems, and AI can also help with quality control of service performance, optimise high-performance computing (HPC) service demand, and predict outages. Finally, harnessing ML and virtual reality applications could make the DestinE platform an invaluable tool for Earth science education on extreme weather events and changing climates.

TA2: Multidomain ML for Earth System Observation and Prediction

The working group was chaired by Maryam Pourshamsi (ESA) and Anna Jungbluth (ESA). The focus of the session was on showcasing novel ML approaches integrating diverse sources of data (e.g., different modalities of EO data, or combining images and text). The group hosted 10 oral presentations and a series of posters that highlighted current applications, tools, and future directions of multi-domain ML in the context of Earth system observation and prediction.

Current ML applications

The presentations covered a range of innovative ML applications, such as school detection and connectivity prediction aimed at improving educational infrastructure17, monitoring power lines to enhance grid reliability and safety18, and detecting multi-layer clouds to improve weather models and predictions. They also addressed extreme event detection and prediction to aid disaster preparedness, detecting and quantifying CO2 plumes from power plants to monitor emissions and environmental impact19, and monitoring tropical cyclones to enhance early warning systems. Additional applications included monitoring volcanic clouds for improved air travel safety20, detecting Antarctic sea ice to understand climate change effects21, monitoring methane super-emitter plumes to track greenhouse gas emissions22, and biomass estimation and modelling to support carbon cycle studies.

Across these diverse topics, various ML approaches were employed. These included classification, segmentation, and prediction for identifying schools with internet access17, segmentation and classification for creating risk maps of trees around power lines18,23, and classification to identify multi-layer clouds. ML-based segmentation and prediction were used to detect atmospheric extreme events, inversion techniques were used to detect and quantify CO2 plumes19, and classification approaches were used to monitor tropical cyclones24. Furthermore, ML-based detection and regression were applied to monitor volcanic clouds, state-of-the-art segmentation methods25 were used to detect Antarctic sea ice, and multi-data fusion techniques aided in monitoring methane plumes22 and estimating above-ground biomass.

Limitations, opportunities, and challenges

Key themes emerged from the discussions, emphasising the need for high quality labels for supervised model training. Multiple presentations discussed the benefits of transfer learning after model pre-training on large image datasets. This pre-training was generally done on natural images, rather than the scientific data used in the final detection or prediction tasks. Furthermore, in line with the theme of the session, many presentations discussed the benefits of fusing diverse data types for more robust predictions (e.g.21,22). The connection between research, operational services, and decision-making was also highlighted.

Several advantages and opportunities of using ML were identified, including improved data analysis and forecasting, e.g., through automated feature extraction, integration of geospatial information, and the fusion of vast amounts of diverse data types. The advantages of ML techniques over traditional statistical methods were discussed, highlighting the ability to leverage vast amounts of diverse data and bridge complicated analysis pipelines. This offers significant potential benefits for various (scientific) applications.

However, several limitations and challenges were also noted. Data quality and availability, especially for labelled examples, still limit the usefulness of supervised methods for EO applications. In addition, differences in data characteristics across sensors and data types present a challenge for the effective integration of diverse data sources. Beyond this, the need for costly computational resources and limited scalability, transferability, and explainability were highlighted as limitations. Finally, the usefulness of pre-trained models was questioned, especially when models are pre-trained on data formats (e.g., RGB color channel images with pixel values between 0 and 255) that are different to the format of scientific data.
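The format mismatch noted above can be illustrated with a minimal sketch. It assumes a hypothetical band of brightness temperatures in kelvin and two hypothetical helpers, `to_uint8` and `from_uint8`, that linearly map a chosen value range to and from the 0 to 255 integer range expected by models pre-trained on natural images. Neither the values nor the functions come from the workshop material.

```python
def to_uint8(values, lo, hi):
    """Linearly map values from [lo, hi] to integers in [0, 255], clipping outliers."""
    scaled = []
    for v in values:
        clipped = min(max(v, lo), hi)
        scaled.append(round(255 * (clipped - lo) / (hi - lo)))
    return scaled


def from_uint8(codes, lo, hi):
    """Invert to_uint8 up to the quantisation error."""
    return [lo + c * (hi - lo) / 255 for c in codes]


# Hypothetical brightness temperatures in kelvin (illustrative values only).
band = [180.0, 220.5, 220.9, 265.3, 310.0, 345.0]
lo, hi = 180.0, 320.0  # assumed physical range of interest

codes = to_uint8(band, lo, hi)
restored = from_uint8(codes, lo, hi)

# Width of one 8-bit level in kelvin: values closer than this may share a code.
step = (hi - lo) / 255
# Largest round-trip error for the in-range (clipped) values.
max_error = max(abs(r - min(max(v, lo), hi)) for v, r in zip(band, restored))
```

In this toy case the two measurements 220.5 K and 220.9 K collapse to the same 8-bit code and the out-of-range value 345.0 K is clipped, which is one way in which squeezing scientific data into a natural-image format can discard information.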

Future directions

Looking to the future, there is a clear need for open-access methods, models and datasets, especially including labelled training examples. Increasing spatial and temporal overlap between different data sources is essential to maximise returns from multi-domain approaches. Improved spatial and temporal resolution, access to real- or near-real-time data, and better global data quality and coverage, especially outside Europe and North America, are also necessary. A general gap in data and methods for extreme event applications was noted, along with the need for partnerships with non-governmental organizations, governments, and local entities to translate research into action.

An additional discussion on cloud computing platforms pointed out that they are more expensive than in-house resources, though tensor and graphics processing units greatly speed up computation. Multiple user interactions and open online services with private backends were discussed, with concerns about lock-in by these services, rapid scale-up capabilities, lack of support, and data protection and security.

TA3: End-to-End ML for Weather and Climate Prediction

Current ML applications

The working group was chaired by Mihai Alexe (ECMWF) and Matthew Chantry (ECMWF) and explored end-to-end ML solutions. This was noted as an area of rapid growth since the last ECMWF-ESA ML4ESOP workshop, which took place just after the Pangu-Weather paper26 had been uploaded as a preprint. Benchmarking was a recurring topic from the last workshop, with a follow-up presentation from Stephan Rasp on WeatherBench 2 (ref. 12). Since the last workshop, data-driven weather forecasts have begun to be run daily, allowing direct comparison of their outputs with those of traditional physics-based models. Operational weather forecasting centres are now exploring the technology and developing their own models27. Data-driven models for climate prediction were a new area, with many open questions but early signs of promise7. The more mature field of nowcasting was also represented, utilising technologies from language and vision transformer models. Probabilistic prediction, most prominently through the use of diffusion models, was a regular feature for atmospheric forecasting [jl24] but also for other domains such as sea-ice modelling28. Data assimilation was increasingly being explored in developing truly end-to-end systems that map observations to future states, but with many open questions. Bringing physics into these systems remained a topic of interest from the previous event, with an increasingly blurry divide between end-to-end and hybrid ML-NWP solutions (e.g.7).

Limitations, opportunities, and challenges

The headline opportunities for end-to-end weather and climate models are clear. Once trained, the energy, time, and compute required to make a forecast are greatly reduced compared to high-resolution physics-based models. There is also evidence that for some applications, e.g. tropical cyclone track forecasting, data-driven weather forecasting models can outperform physics-based models. Many current limitations were also discussed. Data-driven models typically satisfy physical balance and realism (e.g. energy spectra) to a worse degree than physics-based systems [mb24]6. They are even harder to interpret than physics-based systems, and currently they struggle to make predictions of small-scale extreme events (e.g. tropical cyclone intensity). Broadly there was optimism that these were current limitations, rather than fundamental ones, and that adopting a probabilistic framing of the problem could help overcome at least some of these challenges.

Future directions

The TA discussion foresaw further growth in the exploration of data assimilation, or observations-to-future-state prediction, with ML, embracing the fullest end-to-end problem. It was foreseen that generative methods would become the default approach in the field, with training based on minimising mean squared error or similar metrics becoming rare. The topics of extended-range forecasting and atmospheric composition forecasting have both seen early activity, and both were expected to grow. Connecting these threads, the TA panel anticipated that foundation models, trained on many and varied datasets for a variety of downstream activities, will be more present at future events. It was noted, however, that the compute resources needed to train such models might limit who can actively contribute to their research.
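One common motivation for moving away from mean-squared-error training is the blurriness issue raised earlier in this report. The toy sketch below assumes a hypothetical scalar quantity with two equally likely outcomes and shows that the single deterministic value minimising the expected squared error is the mean of the outcomes, which matches neither of them, whereas a sample from a generative forecast is one of the plausible outcomes. The values and function names are illustrative assumptions only.

```python
import random


def expected_squared_error(prediction, outcomes):
    """Mean squared error of one deterministic prediction over equally likely outcomes."""
    return sum((prediction - y) ** 2 for y in outcomes) / len(outcomes)


# Two equally likely futures (hypothetical values in arbitrary units),
# e.g. a feature ending up on one side of a site or the other.
outcomes = [-1.0, 1.0]

# Scan candidate deterministic predictions on a grid from -1.5 to 1.5.
candidates = [i / 100 for i in range(-150, 151)]
best = min(candidates, key=lambda p: expected_squared_error(p, outcomes))

# A sampling-based (generative) forecast returns one plausible outcome instead.
rng = random.Random(0)
sample = rng.choice(outcomes)
```

Here `best` is the average of the two outcomes, a state that never occurs, which is the scalar analogue of a blurred forecast field.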

TA4: Hybrid ML-NWP/Climate

Current ML applications

The working group was chaired by Massimo Bonavita (ECMWF) and Patrick Ebel (ESA) and explored the combination of data-driven and numerical approaches. Two main applications were identified within the recent literature. The first is the use of physical knowledge as guidance for ML. Such guidance can be more or less explicit, and ranges from softly encouraging to strictly enforcing physically plausible predictions, e.g. via specifically designed loss functions or dedicated hybrid modules7,28. Furthermore, physical priors may facilitate forecasting plausible sub-seasonal to seasonal dynamics and the simulation of climate or unobserved state variables29. The second major use case of hybrid approaches is for AI to internalise dynamics30, by emulating physics models or learning slowly evolving trajectories to complement knowledge-based models. Notably, the surveyed ML models differ substantially in their complexity, ranging from simple perceptrons to diffusion and foundation models. The discussion highlighted that there is no one-size-fits-all solution, as the choice and design of neural network architectures depend greatly on the task as well as on the quality and quantity of the available training data.
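The range from softly encouraging to strictly enforcing physical plausibility can be sketched as follows. The example assumes a hypothetical conservation constraint (the predicted values must sum to a known total), a penalty weight `lam`, and helper names invented for illustration; it is not taken from any of the cited works.

```python
def mse(pred, target):
    """Mean squared error between two equal-length sequences."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def soft_constrained_loss(pred, target, total, lam=1.0):
    """Data loss plus a penalty on violating sum(pred) == total (soft guidance)."""
    violation = sum(pred) - total
    return mse(pred, target) + lam * violation ** 2


def project_to_total(pred, total):
    """Shift all predictions uniformly so they sum to total (hard enforcement).

    A uniform shift is the least-squares projection onto the constraint.
    """
    correction = (total - sum(pred)) / len(pred)
    return [p + correction for p in pred]


# Hypothetical prediction that overshoots the conserved total by about 0.3.
target = [1.0, 2.0, 3.0]
pred = [1.1, 2.1, 3.1]
total = sum(target)

loss_soft = soft_constrained_loss(pred, target, total, lam=1.0)
pred_hard = project_to_total(pred, total)
```

With the soft variant the model is only penalised for violating the constraint, so small violations can remain after training; the hard variant guarantees the constraint for every output, up to floating-point rounding, at the cost of fixing the form of the correction in advance.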

Limitations, opportunities, and challenges

Hybrid approaches allow for a trade-off between physical guidance and data-driven flexibility. Insufficient quantity or quality of training data may limit ML approaches and risk overfitting, which can be compensated for via physics-informed priors30. Yet, ML approaches are flexible and can facilitate fusing diverse data sources. This versatility and data-driven fusion are particularly valuable in the context of Earth system modelling (also see the related discussions on DestinE in TA1), where combining heterogeneous components is needed to capture the system dynamics in their entirety. ML hybridization can facilitate large-scale simulations by providing predictions in a short time and enables modelling complex structures in forecasts, not limited to unimodal data distributions31. While powerful, ML models are black boxes, which limits their interpretability compared to the clear semantics of physical simulations and poses a hurdle for current applications. Finally, the discussion highlighted that combining physical models and data-driven neural networks for end-to-end pipelines and training is not trivial: there is no established framework, so the benefits and implementation strategies of hybridization solutions vary with the individual use case.

Future directions

The TA discussion underlined the potential of hybrid approaches, particularly for AI components to complement our physical models of emission and sensing processes. Learning directly from raw observations and unsupervised learning to compensate for limited label availability may be promising approaches for addressing the aforementioned limitations. Conversely, further research on how knowledge-based models may compensate for the incompleteness of observations and facilitate the interpretation of black-box models was identified as a further direction. Moreover, the exploration of generative AI for making ensembling in assimilation more efficient and the hybridization of impact forecasting models were highlighted as promising avenues. Altogether, decades of ESOP research have generated valuable insights and physical understanding to guide future model development. This knowledge may help ensure plausible outputs, while ML approaches in return enable fast inference and thus large ensembles of planet-scale simulations.

TA5: New Generation Computing for AI

Current ML applications

The working group was chaired by Alessandro Sebastianelli (ESA) and Marcin Chrust (ECMWF). The themes discussed in this session were related to HPC, edge computing, with focus on on-board AI, and quantum computing32. The presented AI applications ranged from methane (CH4) super emitter detection33 and storm nowcasting34 to El Niño forecasting35. While the talks offered an outlook on the evolution of computing platforms for AI applications, the ML tools and approaches presented in this thematic area mostly focused on the use of Transformers, Graph Neural Networks, and Convolutional Neural Networks, applied to tasks such as classification, detection, segmentation, and noise filtering.

Limitations, opportunities, and challenges

The development of ever more efficient HPC systems, the advent of Quantum Computing (QC), and the rapid progress in satellite on-board processing capabilities all present an indisputable opportunity to push the boundaries of ML applications in the Earth system sciences. Higher computational throughput at a reduced energy-to-solution will democratize access to ML-based applications for a broad range of users. At the same time, the development of data-driven models is simpler and easily customizable to user requirements, allowing for quick adaptation. On the other hand, the fact that current ML models rely solely on data, and therefore lack a mechanism to ensure their physical consistency, was identified as their main limitation. To some extent, this limitation can be addressed by Physics-Informed Neural Networks, which open an opportunity to embed physical laws in the learning process. Another identified challenge concerns tackling heterogeneous data with non-stationary errors. The development of novel data fusion techniques and the adaptation of AI architectures that can effectively exploit such data, such as transformers, open new opportunities. Although some technological hurdles persist, for instance the lack of low-latency and cost-effective communication in Low Earth Orbit in the case of on-board processing, further progress in hardware development will open a path towards more “green”, easily adaptable deep learning solutions that promise improved performance at a reduced cost.

Future directions

The panellists, together with the lively and engaged audience, clearly identified the future directions for the field of computing for AI. The need for further integration of physical laws within ML and deep learning models, the utilization of diverse and complementary data sources, the deployment of more efficient hardware allowing execution of sophisticated models for diverse applications, the encouragement of AI on-board applications due to their potential to significantly benefit society, and the advancement of quantum computing-based solutions were all identified as promising avenues. These efforts will collectively enhance the capabilities and impact of computational technologies across various domains.

Conclusion

This year’s edition of the ML4ESOP workshop set new records in terms of submissions and attendance, highlighting a growing interest in the field and the event itself. The growth in interest and uptake is partly driven by the customizability and approachability of ML along with the accessibility of its underlying data, as evidenced by the accelerating progress and variety of applications showcased at the event. In light of such fast-paced advances and a thriving community, we consider the workshop’s importance as a platform for exchange and discussions greater than ever before. In addition to topical themes such as end-to-end forecasting or multi-modal and hybrid modelling, the workshop featured a thematic area on the emerging topic of digital twins and planet-scale simulations, coinciding with the launch of the seminal Destination Earth initiative. In line with increasing computational demands and data requirements to fuel the next generation of ESOP models, we perceived a vivid interest in the topic and consider it a promising platform to tackle the community’s needs. Across all thematic areas, the working groups agreed on the increasing relevance of ML. Interestingly, we saw several challenges and opportunities outlined in past workshop editions being addressed through innovative research. For example, drawbacks of data-driven methods such as their failure to enforce physical plausibility were compensated for via bespoke hybrid solutions. As such, there may be no optimal one-size-fits-all recipe, but many domain-specific adaptations tailored to individual needs and data availability. Furthermore, the discussions highlighted the extent to which areas are interlinked: examples are the close relation between Hybrid ML and End-to-End approaches, while the Multidomain ML session emphasized the importance of unified data access as addressed via the DestinE platform.
Conversely, digital twins and underlying platforms benefit from progress in HPC and further directions covered in the New Generation Computing session. Overall, we witnessed how the future of ESOP is increasingly cross-disciplinary, encouraging researchers and developers to build upon progress in adjacent areas. Consequently, we envision that future editions of the ML4ESOP workshop will further facilitate a lively exchange of knowledge and expertise across the ML and ESOP communities.