Introduction

Climate change is a global threat to human and natural systems, and the ocean has been receiving increasing attention for its potential to provide ‘ocean-based solutions’ to mitigate atmospheric greenhouse gases and bolster adaptation to climate change impacts (e.g.,1,2,3,4,5,6). Concerning mitigation actions, the ocean supports sources of renewable energy7, and is also a large natural carbon sink where proposed interventions seek to increase its capacity to sequester CO2 (refs. 8,9). Maritime industries also provide opportunities to implement technologies and operational changes to increase efficiency and reduce net emissions10,11,12. In addition, ocean-related hazards exacerbated by climate change, termed ‘climatic impact-drivers’13,14 (e.g., sea-level rise, climate extremes including storm surges and heat waves, and increased sea surface temperatures), have given rise to actions promoting resilience in natural systems (which may also deliver mitigation co-benefits) and societal adaptation. Not all ‘solutions’ share the same level of societal acceptance, nor is there consensus on how they compare in terms of effectiveness, co-benefits, or disbenefits. To acknowledge this uncertainty, we propose the nomenclature ‘ocean-related options’ (OROs, cf. Fig. 1).

Fig. 1: Ocean-related option (ORO) typology.

The typology is inspired by Gattuso et al.2 and specifies two hierarchical levels of ORO types (represented by the centre circle). The first level classifies all OROs into three main ‘branches’ based on their mechanisms of mitigating climate change or adapting to climatic impact-drivers (from left to right): to mitigate climate change by reducing atmospheric greenhouse gases; improving the resilience of natural systems to adapt (with potential mitigation co-benefits); and improving societal adaptation. These branches are further classified in a second level specifying ORO ‘type’. NB. to avoid double-counting interventions, each intervention is coded into only one ORO type. Since ‘natural resilience’ OROs often target multiple climate objectives (e.g., conservation of blue carbon habitats may target mitigation and adaptation), we allow for this in our typology.

The mounting urgency to increase the development, implementation, and up-scaling of climate actions15, alongside the exponential growth of the peer-reviewed climate change literature16,17, demands the rapid synthesis of large amounts of evidence. Scientific assessment reports (e.g., IPCC) demonstrate one effective method for meeting this need: compiling contributions from domain-specific experts across diverse fields through a rigorous multi-stage process. There is also value in approaching a research question using complementary methodologies, allowing us to triangulate more accurate and nuanced conclusions. Systematic reviews aim to be a comprehensive alternative, minimising publication bias with the goal of assessing the full spectrum of available evidence through a structured, objective, and transparent protocol18,19. The first exploratory step of a systematic review is often a ‘systematic map’, which catalogues and describes all relevant evidence. However, the time-intensive protocol is logistically difficult to apply to broad research questions, and recent syntheses on OROs are therefore limited in scope to specific OROs or a subset thereof (Supplementary Table 1). As a first step to addressing this gap, a systematic map covering all OROs under the unified umbrella of one methodology is needed.

Large-scale systematic maps are now achievable using predictions generated from large language models (LLMs)20,21. By adapting a previously validated and effective protocol21, we map scientific evidence on OROs with the overall goal of informing research and policy for climate action interventions. Using a hand-coded subset of articles, we first fine-tune a pre-trained LLM to screen for relevant ORO research, then on separate classification tasks to categorise each relevant article according to ORO type, research methodology, environmental characteristics, climatic impact-driver and outcome type. These descriptive variables shed light on the following topics of enquiry:

  • How many articles are relevant to each type of ORO? Does this distribution change over time? Is ORO research an increasingly popular topic within the domain of ocean and climate science as climate actions take increasing prominence in the modern zeitgeist?

  • Which research methods are employed to study OROs? In what environments are these studies conducted? Is the ORO linked to a particular climatic impact-driver and are climate-relevant outcomes studied?

  • Do global patterns in national-level mitigation vs adaptation ORO research correspond to indicators that may motivate the prioritisation of one of these climate goals over another?

Results

We search citation-indexed databases (Web of Science and Scopus) using keywords related to ‘OROs’ AND ‘climate change’, except for marine renewable energies for which ‘climate change’ is implicit by definition of what they are intended to address (c.f. ‘Methods’ Section 4.3, Supplementary Fig. 3). This search returns 265,436 unique results, from which our screening LLM predicts 44,193 ± 11,615 relevant articles using article abstract, title, and keyword text (hereafter ‘article text’). We classify each relevant article according to metadata variables that we pre-define, using predictions from separate LLMs (i.e., one model per variable). ORO type is classified using a typology we adapted from ref. 2 (Fig. 1), which first differentiates OROs into one of three ‘branches’ corresponding to the climate challenge addressed: mitigation, adaptation through improved natural resilience (with possible mitigation co-benefits), and societal adaptation. Each branch is subdivided further into ‘types’ (hereafter italicised in text). Articles are also classified according to study methodology, environmental characteristics, climatic impact-driver and outcome type. These LLM classifications are complemented by bibliometric data retrieved from the citation-indexed databases, such as author affiliation and publication year, as well as geographic location extracted from the unstructured article text using a geoparser22.

The performance of all LLM classifiers presented here was assessed using the F1 score (the harmonic mean of recall and precision, cf. Supplementary Fig. 1, Supplementary Data File “summary_model_scores.csv”), in keeping with common practice in the machine learning and computer science domains23. F1 scores that indicate a classifier is fit-for-purpose often range between 0.7 and 0.9 for relatively simple binary classification tasks (i.e. relevance screening), and are lower (e.g., 0.45–0.7 in ref. 24) for more challenging multi-label classification tasks (i.e. metadata coding, cf. ref. 23 and the references therein). We used these thresholds to guide our decisions of whether to include an LLM in our results (cf. ‘Methods’ Sections ‘Deviations from the protocol’ and ‘Training the LLM classifier: Fitting model predictions’). Of note, even an F1 score within these accepted ranges implies a certain amount of prediction error, which informs how these models should be applied. Therefore, our use of all machine learning model predictions (from our classification LLMs and geoparser) is in accordance with the recommendations in ref. 25, which positions LLM classifiers as well-suited to characterising the extent and distribution of an evidence base.
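As a minimal illustration of the metric, the sketch below computes an F1 score for a single binary label from confusion counts. The function name and the counts are hypothetical and are not taken from the study's model evaluation.

```python
def f1_score(true_positives: int, false_positives: int, false_negatives: int) -> float:
    """Harmonic mean of precision and recall for one binary label."""
    predicted_positive = true_positives + false_positives
    actual_positive = true_positives + false_negatives
    # With no predicted or no actual positives, precision or recall is undefined;
    # this sketch returns 0.0 in that case (a common convention, assumed here).
    if predicted_positive == 0 or actual_positive == 0:
        return 0.0
    precision = true_positives / predicted_positive
    recall = true_positives / actual_positive
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical confusion counts (illustrative only, not study data)
example = f1_score(true_positives=80, false_positives=20, false_negatives=20)
```

Because the harmonic mean is pulled towards the smaller of the two values, a classifier cannot reach a high F1 score by maximising recall at the expense of precision, or the reverse.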

ORO research is unbalanced but exponentially increasing

Research effort is unevenly distributed across the different types of OROs (Fig. 2a). We find 80% of all ORO publications study mitigation OROs (n = 35,463), in particular marine renewable energy (n = 25,093). In contrast, natural resilience OROs account for 12% of publications (n = 5098), and societal adaptation for 14% (n = 6228). We find a mismatch between the foci of the scientific literature, which emphasises mitigation, and the IPCC AR6 reports, where ORO citations (totalling only 0.41% of all ORO articles) focus disproportionately (67.6%) on societal adaptation (mitigation = 11.6%, natural resilience = 20.8%). Despite the large number of ORO articles identified, only 3% (n = 1342) of articles are classified as relevant to OROs from more than one branch (Fig. 2b). At a finer scale, 8% of articles (n = 3716, Supplementary Fig. 10) combine multiple ORO types, most frequently built infrastructure and technology with socio-institutional OROs (n = 2035), conservation with CO2 removal and storage (n = 480), and conservation with built infrastructure and technology (n = 401). Since 1980, annual publication rates have increased exponentially for all ORO branches alongside the wider ocean and climate literature (Fig. 2c, Supplementary Fig. 7 and Supplementary Table 4). The rate of this increase has been 10% more per year for adaptation OROs compared to mitigation OROs (p < 0.01).

Fig. 2: Distribution of research by ORO type.

a The number of articles predicted relevant for each ORO type. Error bars indicate lower and upper bounds of the prediction confidence interval. A single article may be counted more than once if predicted relevant for multiple ORO types. The proportion of each bar that is composed of conference proceedings is shown in a lighter colour. The number inside the doughnut plot indicates the percentage of articles relevant to OROs (n = 44,193, ±11,615) cited in the IPCC AR6 reports (0.41%), and the size of the bars in the surrounding doughnut plot shows the distribution of these citations by ORO type. b Interactions between ORO branches. The numbers inside each circle indicate the number of articles predicted relevant to a given set. An interaction shown in the intersecting areas of the Venn diagram represents a single article mentioning different OROs belonging to different branches. For example, an article studying a conservation ORO (branch: Natural resilience) would also need to mention a second (or third) ORO from a different branch, such as marine renewable energy (branch: Mitigation). Interactions found between all ORO types are shown in Supplementary Fig. 10. Note that articles predicted relevant but not classified to a particular branch (n = 1435) are not shown. Numbers of total articles per branch in panel b may not equal the sum of all ORO type articles within a branch in panel a, as in panel a articles can be double-counted if predicted relevant for multiple ORO types within the same branch. c The change in the log-transformed number of articles published per year for each ORO branch, shown in contrast to the broader field of ocean and climate science. NB. Since the y-axis is log-transformed, a linear relationship indicates an exponential increase. The number of ocean and climate science articles was obtained from the search results returned using the ‘population’ and ‘climate change’ term blocks in the ‘General’ search sub-string (see Supplementary Data File “search-string-blocks.xlsx”).
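The link between a straight line on a log-transformed axis and exponential growth can be sketched as follows. This is an illustrative ordinary least-squares fit on synthetic counts; the function name and the data are assumptions, and it is not the statistical model used in the study.

```python
import math


def log_linear_growth(years, counts):
    """Fit ln(count) = intercept + slope * year by ordinary least squares."""
    logs = [math.log(c) for c in counts]
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(logs) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, logs))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Synthetic series growing by 10% per year (illustrative only, not study data)
years = list(range(1980, 1990))
counts = [100 * 1.10 ** (year - 1980) for year in years]

intercept, slope = log_linear_growth(years, counts)
# A constant slope on the log scale corresponds to a constant fractional
# increase per year on the original scale.
annual_growth = math.exp(slope) - 1
```

Comparing the fitted slopes of two such series is one simple way to express how much faster one body of literature is growing than another.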

Insights and knowledge gaps from ORO research methodologies

In addition to ORO type, LLM predictions classify articles across variables describing study methodology, environmental characteristics, climatic impact-driver and outcome. The distribution of research across methodology varies by ORO type (Fig. 3, Supplementary Table 7 and Supplementary Fig. 11). Modelling studies form a large and growing proportion of marine renewable energy and increasing efficiency research. Empirical approaches form approximately half of CO2 removal and storage, conservation, built infrastructure and technology and socio-institutional studies, but dominate the literature on human-assisted evolution (84%). The empirical studies on societal adaptation OROs consist mostly of social-science approaches (e.g., surveys, focus groups or interviews: 83% for built infrastructure and technology and 96% for socio-institutional).

Fig. 3: Study methodology across ORO types.

The height of each bar is proportional to the percent contribution of each method type towards the total number of publications predicted relevant for each ORO type (labelled in black numbers on the top of each bar). The combined percent contribution of both types of empirical studies is labelled in green on the top of each bar. If an article is not classified as relevant for any of the coded method types, it is classified as ‘other’.

LLM classifications also uncover a land-to-sea gradient in ORO research (Fig. 4a, b). Sea-level rise is the most common climatic impact-driver identified in connection with the high concentrations of adaptation OROs (i.e. natural resilience and societal adaptation branches) studied in coastal land and ocean, except for human-assisted evolution OROs for which over half the articles are linked to temperature. Zooming in on the coastal marine system (n = 3147 articles), mangroves (n = 1453) and coral reefs (n = 1044) are the most-studied; mangroves for CO2 removal and storage ORO research, and coral reefs for natural resilience (Fig. 4c, Supplementary Fig. 13). As ORO objectives can include mitigation and/or adaptation, we disentangle outcomes predicted to be relevant for each of these objectives. Mitigation outcomes studied within coastal ecosystems are most frequently linked to CO2 removal and storage, making coastal nature-based research 35% of all CO2 removal and storage publications. Outcomes linked to natural resilience OROs are predominantly in line with their primary objective of natural resilience (58% and 81% of all publications for conservation and human-assisted evolution, respectively), and demonstrate relatively sparse evidence on societal adaptation co-benefits. When societal adaptation co-benefits are identified, they are most often linked to mangrove ecosystems. Overall, we find ORO research is limited to specific topics at the intersection of different marine systems, climatic impact drivers and outcomes.

Fig. 4: Distribution of ORO research across different marine systems, climatic-impact drivers, ecosystem types and outcomes.

a An infographic describing the three marine system classifications: coastal land, coastal ocean and open-ocean. b The distribution of literature between the different ORO types (x-axis), climatic impact-drivers (y-axis) and marine systems (vertical panels). The percent of all articles for each ORO type attributed to each climatic impact-driver × marine system combination is represented by the blue colour scale, with the absolute number of articles labelled in black in each cell. c The distribution of literature between the different ORO types (x-axis) and ecosystem types (y-axis) with either mitigation or adaptation outcomes. Only articles with a predicted relevance for the coastal ocean marine system (i.e., middle row in (b)) are shown, with a mitigation outcome for mitigation ORO types, and an adaptation outcome (decomposed by human vs. natural adaptation) for natural resilience and societal adaptation ORO types. The percent contribution of each ecosystem type × ORO type combination out of all coastal ecosystem articles is shown by the green colour scale. All the interactions shown were found to be statistically significant using Poisson regression models (p < 0.01 for all models). Red and black bars in (b, c) delimit ORO branches. For the article counts represented in this figure, see Supplementary Tables 9 and 10.

Matches and mismatches between research effort, responsibilities, and risks

To describe the geographical distribution of research effort, we attribute a country or territory to each article using two different methods: (1) first author affiliation (indicating national research investment or prioritisation), and (2) geoparsing geographical entities from article text (indicating where the research is conducted). We then contrast this distribution with indicators of historical responsibility in climate change and adaptation need.

Authors affiliated with tropical coastal nations and small island developing states (SIDS) are over 5 times more likely to prioritise ORO research within the ocean & climate sciences (Fig. 5, Supplementary Table 8 and Supplementary Fig. 14). Comparing the country of first author affiliation to the location identified by geoparsing reveals that authors affiliated with institutions from Asia, Europe and North America conduct the most research abroad (Fig. 5d), while regions with many SIDS in Oceania, South America, Africa, the Caribbean and Central America receive the most foreign-led research (Supplementary Fig. 18).
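The comparison between first-author affiliation and geoparsed study location can be sketched as a simple tally. The records, the country labels and the function name below are hypothetical placeholders, and the study's own attribution procedure is described in its Methods.

```python
from collections import Counter


def foreign_led_counts(articles):
    """Count, per study country, the articles led by an author affiliated elsewhere."""
    received = Counter()
    for affiliation_country, study_countries in articles:
        # Use a set so a country mentioned twice in one article is counted once.
        for country in set(study_countries):
            if country != affiliation_country:
                received[country] += 1
    return received


# Hypothetical records: (first-author affiliation, geoparsed study countries)
articles = [
    ("Country A", ["Country C"]),
    ("Country C", ["Country C"]),
    ("Country B", ["Country C", "Country B"]),
    ("Country A", ["Country D"]),
]

received = foreign_led_counts(articles)
```

In this made-up example, Country C appears in three articles but only two of them are foreign-led, which is the kind of contrast the affiliation versus geoparsing comparison is meant to surface.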

Fig. 5: Geographical distribution of ORO research based on country of first author affiliation.

a The colour scale represents the relative importance placed on OROs compared to the wider ocean and climate literature. The size of the points in the centre of each country indicates the total number of ORO publications. b The colour scale indicates the slope of the inter-annual (1980–2022) trend in the number of ORO publications. A greater slope indicates annual ORO publications increased more over time compared to a slope closer to zero indicating the number of ORO publications remained stable. The slope is determined using a linear regression fit using generalised least squares with an AR1 temporal autocorrelation coefficient. ‘NS’ indicates that the trend was not significant, and ‘NA’ indicates that the model was not able to converge. c The relative importance of mitigation OROs in comparison to adaptation OROs (i.e. natural resilience and societal adaptation branches). This is expressed as the percent of all ORO publications that are mitigation OROs; thus 100% indicates all publications are mitigation OROs, 50% an equal partitioning, and 0% all adaptation OROs. d The links between the geographical region of first author affiliation (nodes on left-hand side) with where the research was conducted (nodes right-hand side, determined from geoparsing). The size of the grey shaded paths connecting the nodes indicates the size of the link.

We further hypothesise that each country’s coastal hazard risk motivates prioritisation of adaptation ORO research over mitigation. To quantify coastal hazard risk, we classify each country as land-locked, coastal (non-SIDS) or SIDS, as well as according to per-capita gross domestic product (GDP). These variables disentangle two main drivers of coastal hazard risk (where risk = exposure × vulnerability):

  • Exposure: Coastal countries and SIDS are exposed to coastal hazards (compared to land-locked countries), motivating investment in adaptation OROs. Exposure is likely elevated for SIDS and developing coastal countries with low GDP that have higher population densities in low-elevation coastal zones26,27,28.

  • Vulnerability: SIDS comprise countries with developing status, which increases vulnerability to coastal hazards compared to coastal countries with higher GDP and land-locked countries, due to inequality and poverty (C.3.1 in ref. 29).

Authors affiliated with a SIDS institution or coastal developing country prioritise adaptation research (Supplementary Figs. 16, 17 and Supplementary Table 6), with SIDS-affiliated authors being 42% more likely to study adaptation OROs compared to authors from land-locked countries (p < 0.01, Supplementary Table 5 and Supplementary Fig. 15). Therefore, authors publish ORO research that largely matches the coastal hazard risk of the countries to which their affiliated institutions belong. This indicates institutional research strategies are aligned with national research needs, but does not necessarily represent where empirical research on OROs is conducted.

To disentangle these differences, we examine locations geoparsed from article text (Fig. 6). The number of mitigation ORO publications shows an overall match, scaling with cumulative (1850−2022) CO2 emissions (cf. ‘Methods’ Section ‘Analysis and modelling’ for metric calculations). The statistical significance of this match is confirmed using a generalised linear model (p < 0.01), but the model also reveals outliers (the USA, China, and Russia) where less mitigation research is published than would be expected given the countries’ historically high emissions (Supplementary Fig. 19a). In contrast, adaptation ORO research shows no significant relationship between coastal hazard risk and research effort (p = 0.08, Supplementary Fig. 19b). Most concerning are the mismatches created from low research effort despite high risk from coastal hazards, particularly in countries/territories in northern Africa and SIDS (Fig. 6).

Fig. 6: Drivers of empirical research effort.

Empirical research effort (y-axis on bivariate colour scale) is represented by the number of ORO articles predicted relevant for mitigation or adaptation branches and attributed to a country/territory using geoparsing (see ‘Methods’ section ‘Identifying study locations: text geoparsing’). This is compared to hypothesised drivers (x-axis on bivariate colour scale). a Mitigation research is compared to cumulative CO2 emissions (including land use change, from 1850 to 2022; ref. 78; c.f. 4.8 for metric calculations) as an indicator of ‘historical responsibility in climate change’. b Adaptation research is compared to coastal risk (see ‘Online Methods’ Section ‘Analysis and modelling’) as an indicator of ‘adaptation need’. NB. The breaks in the colour scale indicate quantiles. For a statistical analysis accounting for the probability distribution of each response variable, see Supplementary Fig. 19.

Discussion

ORO research, particularly studies targeting adaptation to climate change, has increased exponentially over the last 42 years (Fig. 2c), which may echo the shifting focus in policy and academic discourse away from an ‘alert’ phase towards a ‘solution-oriented’ phase30.

We find the distribution of academic research is skewed towards mitigation OROs, in contrast to the higher proportion of adaptation ORO publications cited within the IPCC AR6 report (Fig. 2a, supported by ref. 17). Several factors may contribute to this asymmetry. Firstly, the increased citation of adaptation research (directly by the authors, and/or indirectly by the structuring of the chapters’ content) could aim to increase the visibility of adaptation in the climate change conversation. This could acknowledge that many long-term impacts of climate change are irreversible on decadal timescales even with strong mitigation, and a dual-pronged approach combining mitigation and adaptation is required (C1.1 in ref. 29 and refs. 31,32). There are other advantages in promoting adaptation interventions due to their long history of use for improving resilience against coastal hazards within the pre-industrial envelope of climate variability. These adaptation interventions likely have more social acceptance and existing guidance/methods for implementation compared to newer mitigation interventions (e.g. geoengineering). However, transformative change and/or redesign will likely be required to respond to the unprecedented pressures of climate change33,34,35,36.

Secondly, preferentially citing adaptation ORO publications may counteract factors that inflate mitigation ORO publications and confound the signal of relative importance of each topic for climate change research. High numbers of mitigation ORO publications can be caused by discipline-specific differences in publication outputs which are higher in the physical and engineering sciences researching mitigation OROs (Supplementary Fig. 20), or by mitigation OROs being easier to monetise and attract private investment for research. The latter may be due to difficulties in quantifying the success of adaptation interventions (and conversely the cost of inaction) and comparing them to monetary values37, or that adaptation OROs often act on small scales while mitigation OROs are expected to provide more far-reaching global benefits. Minimising trade-offs and maximising synergies between different ORO objectives can provide a pathway to harmonising this dichotomy.

Despite this need, we find ORO research to be siloed, with relatively few studies addressing multiple ORO types. While this result may be directionally true, it is likely that our analysis failed to identify articles examining interactions due to the limitation of analysing only title, abstract, and keywords. Indeed, studies may only mention the primary intervention of interest in the abstract, and interactions found with other OROs may be discussed in the full text. This knowledge gap warrants verification with analyses at the full-text level, possibly through multiple manual syntheses or using LLMs (which would require substantial computational capacity and programmatic access to full-text files). Advancing our understanding of ORO inter-comparison is invaluable for informing decision-making when choosing the most effective OROs for a range of contexts, time-frames and management objectives, while synergies and trade-offs can inform portfolios involving co-located options to maximise desired benefits. As an example of the former, the slow implementation of mitigation options can result in the increased release of embedded emissions38; in these limited contexts, adaptation options can provide lower-emission complementary approaches. Overall, analysing the distribution of evidence across ORO types reveals crucial gaps where evidence may be lacking; by contrast, a high number of publications alone is not sufficient to conclude technological readiness39.

We hypothesise, however, that differences in the distribution of publications across study methodologies may be understood by considering each ORO’s trajectory through different development stages. Modelling research on marine renewable energy and increasing efficiency options has been increasing, indicating an advanced stage of technological readiness where modelling can be used to predict up-scaling and future siting3. Conversely, a high proportion of early modelling studies on CO2 removal and storage reflects a scarcity of empirical research, where modelling from first principles was needed to explore their potential and feasibility. Natural resilience and socio-institutional OROs are consistently researched using empirical methods; however, this is likely not an indicator of readiness, as many of these interventions have already been implemented. Rather, this could signal a gap for further development of modelling approaches (e.g. integrating socio-ecological interactions, fisheries management models incorporating climate change-related uncertainties) or a need for modelling outcomes specific to climate change mitigation and adaptation.

In addition to knowledge gaps, we identify knowledge clusters where ORO research is constrained to specific topics. The distribution of ORO research across the different marine systems and climatic impact-drivers is largely explained by our typology (Fig. 1), shifting from adaptation OROs targeting climatic impact-drivers on coastal land and coastal ocean, to mitigation OROs in the open-ocean. However, climatic impact-drivers also impact the open-ocean40, yet research on open-ocean climate adaptation is lagging40,41. This may be influenced by soft limits such as gaps in socio-institutional structures that govern and enforce activities in the high seas (although this is an active area of change42,43 with new regulations aiming at decarbonising the shipping industry44), or a bias resulting from a hypothesised reduced vulnerability due to reduced non-climatic anthropogenic stressors40,45,46, or hard limits such as logistical challenges from reduced accessibility and/or a sparsity of human activity.

We also observe knowledge clusters within coastal ecosystem-linked research. As a relatively new field, this uneven distribution is likely influenced by inaugural publications creating a myopic focus on particular topics. For example, CO2 removal and storage OROs linked to mangrove and seagrass ecosystems are likely seeded by publications in the early 2000s heralding these systems for their high carbon sequestration potentials1,47,48,49,50,51. Emerging literature suggesting these ecosystems might not have as great a mitigation potential as originally estimated52,53, coupled with their limited global extent, may motivate a shift towards currently under-researched topics. For example, while linking conservation OROs to natural resilience outcomes is well-evidenced, there is less work studying outcomes on co-benefits (e.g., human adaptation54). Human-assisted evolution research is in an even more nascent stage, and publications remain concentrated on the initial application of improving coral adaptation to temperature stress55,56, but may already be expanding into other ecosystems and climatic impact-drivers as knowledge progresses (e.g.,33,57,58).

On the international stage, critical mismatches carry implications for negative impacts on SIDS and developing coastal countries. Lead authors affiliated with these countries prioritise adaptation ORO research, demonstrating national research priorities motivated by heightened coastal hazard risk (exposure × vulnerability)26,29. Financial barriers may impose an additional soft limit to investing in more costly mitigation options. Despite demonstrating relatively high initiative to address their heightened risks, SIDS and developing coastal countries are hot spots for attracting foreign-led research. This reveals an underutilised opportunity to support domestic researchers in developing at-risk countries in their adaptation research59. Domestic researchers likely have better local networks and understanding of the socio-cultural context—advantages that can facilitate deeper engagement from participatory approaches and result in more positive research outcomes60,61,62. Providing financial research support is particularly important for SIDS where we found extreme mismatches between high risk and low adaptation research, as financial constraints impose soft limits on adaptation (C.3.2 in ref. 29, and ref. 63). The timely provisioning of support is critical, as positive feedbacks between increasing climate impacts incurring losses and damages, further reducing the availability of financial resources, can amplify this disparity in future (C.3.2 in ref. 29 and ref. 64). Finally, we identify gaps where historically high CO2 emitting countries should be encouraged to increase their mitigation research.

Our findings also provide the foundation for future work on finer-scale evidence syntheses. For example, while some mitigation OROs, such as marine renewable energies have large bodies of evidence and are at an advanced stage of technological readiness, others, such as CO2 removal and storage may have high publication numbers due to rising optimism, which may not correlate with agreement within the academic community (e.g.,9,65). This potential mismatch in mitigation OROs between the academic literature, policy, and industry deployment warrants further disentanglement. In addition, while natural resilience and societal adaptation OROs are cited more frequently in the IPCC AR6 reports, there is still a lack of collective conclusion regarding their effectiveness to deliver on climate change objectives. Lastly, our map did not resolve details on the distribution of evidence across outcomes that qualified or quantified effectiveness, co-benefits, side-effects or trade-offs, but they are important considerations when implementing interventions (e.g.,66,67,68,69). Addressing these synthesis needs will be important for providing a more detailed picture of how OROs can contribute to climate change mitigation and adaptation goals. Our work illustrates the benefits to be expected from machine learning models to conduct time-efficient and reproducible synthesis of ORO research. This advancement represents a step towards providing evidence-based clarity to meet the modern needs of timely decision-making amidst a landscape of rapidly evolving literature. It also provides a protocol whereby the findings can be rapidly updated as new evidence emerges. We highlight research gaps important for positioning OROs within climate action portfolios, such as addressing environmental knowledge gaps and social inequalities.

Methods

Overview

This machine learning-based evidence map followed the detailed methods described in an a priori protocol70. It was performed, where relevant, in strict concordance with the Guidelines and Standards for Evidence Synthesis in Environmental Management provided by The Collaboration for Environmental Evidence18. We deviated from the Guidelines notably with regard to article coding classifications, which were automated using predictions from LLMs inspired by previous similar methodologies (e.g.,21).

We used a combination of methodologies to define descriptive metadata variables. Metadata was either (1) retrieved from citation-indexed databases, (2) extracted from each article’s title, abstract and keywords using a geoparser22 (see Section ‘Identifying study locations: text geoparsing’), or (3) classified using bespoke categorisation (hereafter referred to as ‘coding’) using predictions from an LLM (DistilBERT71) which we fine-tuned. These methods were used to define the following variables we considered important for describing the landscape of ORO research:

  • Publication year (retrieved from database): Indicates a given article’s publication year and used to infer temporal trends in the distribution of evidence.

  • Country—first author affiliation (retrieved from database): The country given in the postal address of the first author’s affiliated research institute. This was used as a proxy for national research investment.

  • Country—empirical research sites (extracted using geoparser): The country determined from the locations geoparsed from the article text. This was used as a proxy for where research is conducted.

  • ORO branch and type (coding from LLM): Bespoke categorisation of the intervention(s) studied in each article according to our typology (Fig. 1).

  • Method (coding from LLM): Bespoke categorisation of the type of study method employed.

  • Climatic impact-driver (coding from LLM): Bespoke categorisation of the ocean-related climatic impact drivers considered as the motivating factor for research undertaken on a given ORO. For example, a sea wall is a societal adaptation ORO in response to the climatic impact-driver “sea level rise”. This provides an indication of what hazards are predominantly driving ORO research.

  • Marine system (coding from LLM): Bespoke categorisation of the study site into three marine systems: open-ocean, coastal ocean, and coastal land. Each system faces different challenges in terms of climatic impact-drivers as well as having different motivations and limitations for ORO implementation influenced by environmental factors and human activities.

  • Ecosystem type (coding from LLM): Bespoke categorisation of whether a study is conducted in a coastal ecosystem characterised by a habitat-forming species (e.g. mangrove, coral reef, etc.). Different ecosystem types vary in their perceived potential for mitigation through carbon sequestration, potential for attenuating impacts from climatic impact-drivers (e.g. wave height attenuation), value derived from varying ecosystem services, and inherent vulnerability to climatic impact-drivers that may motivate the type of ORO studied.

  • Outcome type (coding from LLM): Bespoke categorisation of whether a mitigation or adaptation (natural resilience or human adaptation) outcome was linked with the ORO in the study.

For more detail, consult the full protocol70, which describes our principal research question (and sub-questions), corresponding search strategy, study eligibility criteria, and coding method to identify and characterise the evidence base.

Deviations from the protocol

The protocol described in ref. 70 was followed, with two main changes (a reduction in Test-list size and the discarding of some variables) that were necessary due to time and resource constraints, as well as model performance. Firstly, after inter-reviewer consensus was achieved during the screening training, it was decided that 8 of the initial 94 Test-list articles were irrelevant due to changes in the scope of the eligibility criteria that were implemented to reach consensus. These 8 articles were removed from the corpus to produce a refined Test-list (n = 86 articles, see the Supplementary Data File “Test_list.xlsx”). We decided to retain the same search strategy that was developed under the previous Test-list, as it was still 100% comprehensive.

Secondly, reviewers coded 30 variables (see codebook in ref. 70), acknowledging that it was impossible to know a priori whether each model would achieve adequately performing predictions. Ultimately, models for the variables ORO branch, ORO type, Adapt to threat, Climatic impact-driver, Ecosystem type, Marine system and Method type achieved good performance (F1/F1 macro scores ranging from 0.7 to 0.89, Supplementary Fig. 1, Supplementary Data File “summary_model_scores.csv”), and are presented in this study. The models for other variables, with F1 scores below 0.65, were deemed unfit for purpose and were excluded from the results. In addition, we attributed a country to each study (where possible) using two methods: (i) the country of first author affiliation, and (ii) the country identified from geoparsing (identification of geographical entity names from article title and abstract text, and attributing them to their corresponding country).

Literature search

Our search strategy was designed to retrieve a broad range of articles covering the topic of OROs for climate change mitigation and adaptation. To do so, five unique substrings were constructed; each substring represented different types of OROs (Supplementary Fig. 2, Supplementary Data Files “wos-search-strings.txt” & “scopus-search-strings.txt”). This allowed us to tailor the architecture of each search string to target interventions that we defined as ‘OROs’. When defining OROs and the perimeter of our Evidence map, we decided to target ocean-related interventions specifically developed and/or implemented to address climate change (mitigation or adaptation). This decision reflected our desire to inform climate action research and policy and influenced each search string. For example, marine renewable energies, by definition, have a focus on climate change mitigation. Therefore, our search string for marine renewable energies included combined ‘keyword blocks’ including terms and synonyms for ‘oceans’ AND ‘renewable energy’. By contrast, other ORO types such as natural resilience interventions can be developed/implemented to address other issues, with climate benefits occurring as an unintentional co-benefit (e.g.,5). Instead, we were interested in interventions where achieving climate benefits were explicitly considered in the design and/or implementation (although other co-benefits may still exist) and therefore added an additional term block with keywords for ‘climate change’.

To calibrate the search strings, comprehensiveness was assessed using a list of 86 (initially 94, see deviations from the protocol for justification of Test-list size reduction) benchmark scientific articles (Supplementary Table 2). These articles, identified by the review team prior to the mapping process, were assembled to cover relevant literature. An iterative process of modifying search terms was undertaken in Web of Science Core Collection until all benchmark articles were retrieved (i.e. increasing sensitivity), while controlling the total number of search results (i.e. favouring specificity).

The five substrings were used to query two citation-indexed databases, Web of Science Core Collection and Scopus (accessed from Bibliothèque La Perouse), between 27 and 30 January 2023 (Supplementary Data File “search-string-blocks.xlsx”). Searches were performed in English. Search results were exported as *.bib files and imported into the R Statistical Computing Environment72, where de-duplication between databases was undertaken using the “find_duplicates” function in the revtools package73, first based on DOI, and then on fuzzy title and year matching if a DOI was unavailable. This resulted in 265,436 unique publications (Supplementary Fig. 3).
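The two-stage de-duplication logic (DOI first, then fuzzy title and year matching when a DOI is unavailable) can be illustrated with a minimal Python sketch. This is not the revtools implementation used in the study: the record fields, the `normalise_title` helper and the 0.95 similarity threshold are assumptions chosen purely for illustration.

```python
from difflib import SequenceMatcher


def normalise_title(title):
    """Lower-case a title and keep only letters, digits and single spaces."""
    cleaned = "".join(ch if ch.isalnum() else " " for ch in title.lower())
    return " ".join(cleaned.split())


def deduplicate(records, threshold=0.95):
    """Return the records with duplicates removed, keeping first occurrences.

    Stage 1: a record whose DOI (case-insensitive) was already seen is dropped.
    Stage 2: a record without a DOI is dropped if its normalised title closely
    matches a kept record from the same year.
    `threshold` is a hypothetical similarity cut-off, not a value from the study.
    """
    kept = []
    seen_dois = set()
    for rec in records:
        doi = (rec.get("doi") or "").strip().lower()
        if doi and doi in seen_dois:
            continue  # duplicate by DOI
        if not doi:
            title = normalise_title(rec["title"])
            is_duplicate = any(
                other["year"] == rec["year"]
                and SequenceMatcher(
                    None, title, normalise_title(other["title"])
                ).ratio() >= threshold
                for other in kept
            )
            if is_duplicate:
                continue  # duplicate by fuzzy title and year
        if doi:
            seen_dois.add(doi)
        kept.append(rec)
    return kept
```

In this sketch, only records lacking a DOI go through the slower fuzzy comparison, which mirrors the order of operations described above.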

Training the LLM classifier: manual document screening

Manual screening of a subset of articles was needed to train a binary LLM classifier to predict document relevance from zero (irrelevant) to one (relevant) based on pre-defined eligibility criteria (Supplementary Data File “Eligibility criteria.docx”). This was necessary to define the perimeter of the evidence map. The subset of documents (4000 documents from the total 265,436 deduplicated search results) was created by randomly selecting 1000 articles from each of the General, Mitigation, Natural resilience, and Societal adaptation substring search results (Supplementary Figs. 2 and 3).

Before manual screening commenced, a pilot test on a sample of 30 titles and abstracts, randomly selected from the 4000, was undertaken by six expert reviewers from the authorship team (D.V., L.B., J.-P.G, Y.-J.S., F.V., A.C.). Pilot testing helps ensure agreement between screeners; if agreement is not reached this should lead to a revision of the eligibility criteria74 (Supplementary Data File “Eligibility criteria.docx”). A Fleiss’ kappa score of 0.773 was achieved and deemed acceptable74. That said, all remaining disagreements were discussed and resolved before proceeding to screening. Following this, articles from the subset were screened individually and in random order, except for 300 randomly-selected articles, which were double-screened (NB. any disagreements were resolved by a third reviewer). At the end of this first screening stage, 2083 articles were screened in random order, comprising 427 inclusions and 1656 exclusions.
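For readers unfamiliar with the agreement statistic, Fleiss' kappa for a pilot test like the one above can be computed as in the following sketch. The function follows the standard textbook definition; the input layout (one row per article, one column per decision, each cell holding the number of reviewers who chose that decision) is an assumption about how the pilot data might be tabulated.

```python
def fleiss_kappa(ratings):
    """Compute Fleiss' kappa.

    `ratings` is a list of rows, one per item. Each row holds the number of
    raters who assigned the item to each category, and every row must sum to
    the same number of raters. Undefined (division by zero) if all ratings
    fall in a single category.
    """
    n_items = len(ratings)
    n_raters = sum(ratings[0])
    n_categories = len(ratings[0])

    # Share of all assignments that fall in each category.
    p_cat = [
        sum(row[j] for row in ratings) / (n_items * n_raters)
        for j in range(n_categories)
    ]
    # Observed agreement for each item.
    p_item = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in ratings
    ]
    p_observed = sum(p_item) / n_items
    p_expected = sum(p * p for p in p_cat)
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical pilot: 4 articles, 6 reviewers, columns = (include, exclude).
print(fleiss_kappa([[6, 0], [0, 6], [6, 0], [0, 6]]))  # 1.0 (perfect agreement)
```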

Ideally, to effectively train an LLM classifier, an adequate representation of each class (i.e., inclusion and exclusion decisions) is required in the training sample. Since class imbalances are expected from a randomly generated sample of articles, we implemented a second active-learning stage in our manual screening procedure (adapted from ref. 21). We used the relevance ranking algorithm embedded within the sysrev platform (www.sysrev.com) to predict relevance for the remaining unscreened documents in the subset (4000 − 2083 = 1917). We then screened all documents with a predicted relevance >60% until no documents above that relevance threshold remained. This algorithm had no implication on the weights of the screening LLM presented in this article, but served merely as a tool to rapidly screen more relevant articles to achieve a more balanced training sample. At the end of the second screening stage, and including the 86 test list articles, the manually-screened subset totalled 702 included and 1795 excluded articles, with reviewers screening individually a mean of 496.7 articles (s.d. = 458.9, median = 433, n = 6 reviewers).

Training the LLM classifier: manual document coding

For each of the 702 articles identified for inclusion, metadata variables were coded by the same team of six reviewers using a standardised codebook (Supplementary Data Files ‘Codebook instructions.xlsx’ and ‘Codebook template.xlsm’). Key metadata variables included: (i) bibliographic information (e.g., author affiliation, publication year, etc.), (ii) ORO description (e.g., type of intervention, climatic impact-driver motivating the research or implementation, etc.), (iii) study design (e.g. methodology used) and (iv) outcome types (e.g., mitigation outcomes, adaptation outcomes). However, several important variable labels were under-represented in the pilot screening sample. Thus, additional articles were identified from the remaining unsampled pool of articles (265,436 search results − 4000 randomly sampled − 86 test list) using keywords (Supplementary Data File ‘keyword-search-tokens.csv’) to search for articles from the infrequently appearing categories. From the retrieved articles, another 254 were manually included and coded. Therefore, the model training and validation data set consisted of 2752 manually-reviewed documents (of which 2083 were screened in random order, 328 were screened due to high predicted relevance, and 254 were added at the coding stage from keyword searches).

Training the LLM classifier: fitting model predictions

The classification models were fitted using the method detailed in ref. 70, which was adapted from ref. 21. In a preliminary step, to verify which classification model was best suited to our data, two binary classifiers (a support vector machine (SVM) and a pre-trained LLM called DistilBERT71) were fitted to the screening decisions from the manually-reviewed subset and compared across the following performance metrics: F1, precision, recall and ROC AUC (Supplementary Fig. 6). The models were fitted using a 5-fold nested cross-validation procedure using only documents screened at random (i.e., not through the active-learning pipeline or keyword search) for the validation and test sets. The DistilBERT model outperformed the SVM across all metrics except precision (Supplementary Fig. 6), achieving an F1 score of 0.68, indicating a suitable predictive performance. We decided that the DistilBERT model was the most fit for purpose compared to the SVM, as our systematic evidence synthesis methodology prioritised sensitivity over specificity. The final configuration of the best-performing model (DistilBERT) was chosen using a combination of parameters that yielded the highest F1 score. For models predicting labels for the metadata variables, only DistilBERT was used as it had already out-performed the SVM on the screening classification task. Since the DistilBERT models predicting the metadata variables were trained using a smaller dataset of only the documents screened for inclusion, 3 inner and outer folds were used. Models were selected by optimising the macro F1 score (see Supplementary Data File ‘summary_model_scores.csv’), which weights all labels with equal importance. In summary, all results presented in the main text from an LLM were produced using DistilBERT.
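The constraint that validation and test sets contain only documents screened at random, while documents from the active-learning and keyword stages are used for training only, can be sketched as follows. This is an illustrative reconstruction and not the code from ref. 70: the function names, the interleaved chunking and the seed are all assumptions.

```python
import random


def split_chunks(items, k):
    """Split a list into k nearly equal, interleaved chunks."""
    return [items[i::k] for i in range(k)]


def nested_cv_splits(random_ids, other_ids, n_outer=5, n_inner=5, seed=0):
    """Build nested cross-validation splits.

    `random_ids`: documents screened in random order (eligible for validation
    and test sets). `other_ids`: documents from active learning or keyword
    searches (only ever used for training).
    """
    rng = random.Random(seed)
    eligible = list(random_ids)
    rng.shuffle(eligible)
    outer_chunks = split_chunks(eligible, n_outer)

    splits = []
    for k, test in enumerate(outer_chunks):
        # Randomly screened documents not held out in this outer fold.
        pool = [d for j, chunk in enumerate(outer_chunks) if j != k for d in chunk]
        inner = []
        for val in split_chunks(pool, n_inner):
            val_set = set(val)
            train = [d for d in pool if d not in val_set] + list(other_ids)
            inner.append((train, val))
        splits.append(
            {"train": pool + list(other_ids), "test": test, "inner": inner}
        )
    return splits
```

The inner splits would be used to choose a parameter configuration and the outer test folds to estimate performance, so that the reported scores reflect documents drawn at random from the search results.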

While the F1 evaluation metric indicates good predictive capability for all the models presented, the model for the salt marsh label, for the ecosystem type variable, performed with high precision (0.93) but low recall (0.62). This resulted in no articles being predicted relevant for salt marshes using the 0.5 mean prediction cut-off (Supplementary Tables 9 and 10). We initially considered being more lenient with document inclusions by using the upper relevance boundary instead of the mean. However, this would have resulted in reduced specificity concerning the salt marsh label, thus making it impossible to compare to other ecosystem types.
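As a worked example of how precision and recall combine, the sketch below computes F1 as their harmonic mean and macro F1 as the unweighted mean across labels. Applying it directly to the salt marsh values quoted above is an illustration only; the scores reported in the Supplementary Data File were averaged over cross-validation folds and may differ.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def macro_f1(per_label_f1):
    """Unweighted mean of per-label F1 scores, so every label counts equally."""
    return sum(per_label_f1) / len(per_label_f1)


# Salt marsh label: precision and recall as quoted in the text.
salt_marsh_f1 = f1_score(0.93, 0.62)
print(round(salt_marsh_f1, 2))  # 0.74
```

A label can therefore reach an acceptable F1 while still missing a large share of truly relevant documents, which is the situation described for salt marshes.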

Following the selection of each model’s parameter configuration, mean relevance predictions ranging from 0 (not relevant) to 1 (relevant) (±standard deviation) were calculated for each unique document in the corpus. This was achieved by using 5 (3 for metadata variables) versions of the final screening model trained on 5 (3 for metadata variables) folds of data, from which the mean prediction and standard deviation were calculated. For the metadata variables, predictions were only calculated for the documents predicted to be relevant by the binary screening classifier.
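The aggregation of fold-specific predictions into a mean relevance score with a standard deviation can be sketched as below. The use of the population standard deviation, the definition of the upper boundary as mean plus one standard deviation, and the example scores are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def summarise_relevance(fold_predictions, cutoff=0.5):
    """Summarise one document's relevance scores from the fold-specific models."""
    avg = mean(fold_predictions)
    sd = pstdev(fold_predictions)  # population s.d.; the study's choice is not stated
    return {
        "mean": avg,
        "sd": sd,
        "upper": min(1.0, avg + sd),  # hypothetical "upper relevance boundary"
        "included": avg >= cutoff,
    }


# Hypothetical scores from five screening models for a single document.
summary = summarise_relevance([0.41, 0.55, 0.62, 0.48, 0.59])
print(summary["included"])  # True
```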

Following ref. 21, we used a 0.5 predicted mean relevance cut-off to classify an article as included (below this threshold equated to excluded). This conservative cut-off value was chosen to avoid excluding potentially relevant articles. However, this increases the risk of false positive errors (i.e., incorrectly including a non-relevant article). To explore the sensitivity of our results to this methodological choice, we calculated the number of articles predicted relevant for each type of ORO over a range of increasing cut-off values from 0.6 to 0.9 and tested for differences from the proportions achieved using the 0.5 cut-off using a chi-squared test. We found that although the chi-squared test indicated statistically significant differences at all the alternative thresholds tested, the relative distribution across the different ORO types was qualitatively robust at moderate thresholds (~0.6 to 0.7) (Supplementary Fig. 5). Since the model performance for labels for the Natural Resilience ORO types was the least sensitive (Supplementary Data File ‘summary_model_scores.csv’), these articles are preferentially sacrificed from inclusion as the relevance threshold is raised. Given the low number of articles in this ORO branch relative to mitigation OROs, we decided to retain the conservative 0.5 cut-off. Therefore, the findings we present on the distribution and extent of the evidence are qualitatively robust to moderate changes in relevance cut-off. However, additional manual screening steps to filter out false positives will be needed when using the Evidence Map’s corpus to conduct downstream evidence syntheses.
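The threshold sensitivity check can be sketched as a goodness-of-fit comparison, in which the counts per ORO type at a stricter cut-off are compared with the counts expected under the proportions observed at the 0.5 cut-off. Treating the test as a goodness-of-fit test is an assumption, as are the function names and example data; obtaining a p-value would additionally require a chi-squared distribution function, which is omitted here.

```python
def counts_at_cutoff(docs, cutoff, types):
    """Count documents per ORO type whose mean relevance is at or above `cutoff`.

    `docs` is a list of (oro_type, mean_relevance) pairs.
    """
    return [
        sum(1 for t, score in docs if t == oro_type and score >= cutoff)
        for oro_type in types
    ]


def chi_squared_vs_baseline(baseline_counts, observed_counts):
    """Chi-squared statistic and degrees of freedom against baseline proportions.

    Assumes every baseline count is positive.
    """
    total_base = sum(baseline_counts)
    total_obs = sum(observed_counts)
    stat = 0.0
    for base, obs in zip(baseline_counts, observed_counts):
        expected = total_obs * base / total_base
        stat += (obs - expected) ** 2 / expected
    return stat, len(baseline_counts) - 1


# Hypothetical counts for three ORO types at the 0.5 and 0.8 cut-offs.
stat, dof = chi_squared_vs_baseline([50, 30, 20], [30, 10, 10])
```

A statistic of zero would mean the relative distribution across ORO types is unchanged; with large corpora, even small shifts in proportions can be statistically significant, which is why the qualitative comparison matters.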

Identifying study locations: text geoparsing

Geoparsing is a two-step process that involves (1) identifying words representing geographical entities from each article’s text, and (2) attributing each unique geographical entity to a location. The first step was achieved using the geoparser Mordecai22, which was previously used in the same application in a Systematic map21. The output from Mordecai resulted in a longitude and latitude coordinate representing the centroid of each unique geographical entity mentioned in each article. The entity names were then attributed to shapefiles representing the geographical extent of each entity by querying the Natural Earth Integrated Vector Map Dataset (v 4.0.0, naturalearthdata.com). When no shapefile match was found, the latitude and longitude coordinates representing the centre of the geographical entity were used. The number of unique articles occurring within the geographical extent of a given country or territory were summed (i.e., a single document that spanned the geographical boundaries of two countries would be counted twice). We used these data as an indicator of where studies were conducted. Therefore, we present only geoparsing results extracted from articles predicted relevant for empirical method types (predicted relevant for either the ‘empirical—other’ or ‘empirical—social primary’ labels from the ‘method type’ multi-label classifier). We only present results for locations in the marine realm, or on land within 200 km from the coast.
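The counting rule described above, where each article contributes once to every country in which it has at least one geoparsed location, can be sketched as follows. The input format of (article identifier, country) pairs is an assumption, and the sketch omits the shapefile matching and coastal distance filter.

```python
from collections import Counter


def count_articles_per_country(mentions):
    """Count unique articles per country.

    `mentions` is an iterable of (article_id, country) pairs, one per geoparsed
    entity. Repeated mentions of the same country within an article count once,
    while an article spanning two countries is counted in both.
    """
    unique_pairs = set(mentions)
    return Counter(country for _, country in unique_pairs)


# Hypothetical geoparsed mentions.
mentions = [
    ("a1", "France"),
    ("a1", "France"),
    ("a1", "Spain"),
    ("a2", "France"),
]
counts = count_articles_per_country(mentions)
print(counts["France"], counts["Spain"])  # 2 1
```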

A separate assessment of Mordecai’s performance reported an error distance with a median value of 161.9 km (±2.9) and a mean value of 1101.3 km (±24.1)75. While at a global scale this error distance would be acceptable, we implemented the following additional cleaning measures. First, we performed a manual scoping exercise of 180 articles and coded the reason for error (see Supplementary Data File “test_geop_results_consistency.xlsx”). These data were used to develop and implement a data cleaning strategy to systematically correct the main sources of error we identified (Supplementary Table 3). While errors still remain that contribute to noise in the data, by correcting the largest systematic biases the signal of the global spatial patterns of geographical entities should be reasonably well represented.

Analysis and modelling

We first aimed to compare the distribution of the number of publications for each ORO type within the scientific literature to the percentage of these articles that were cited in the IPCC AR6 reports. We accessed the .bib files for each working group chapter from their respective downloads pages (e.g., for WG1 Chapter 1: https://www.ipcc.ch/report/ar6/wg1/downloads/report/IPCC_AR6_WGI_References_Chapter01.bib). All the references were de-duplicated using the “find_duplicates” function in the revtools package73, first based on DOI, and then on fuzzy title and year matching if a DOI was unavailable, resulting in 82,438 unique references. Using the same protocol of searching for matches by DOI then title and year, we identified which articles predicted relevant for an ORO were also found in the IPCC AR6 reference list.

To analyse temporal trends across the different ORO branches in comparison to the wider ocean and climate literature (Fig. 2c), the log-transformed number of articles in each category published per year (1980–2022) was modelled as a linear function of the interaction between publication year and dataset. The ‘dataset’ variable had four levels: Ocean and climate (from the Scopus database, as the Web of Science time series displayed an irregular dip in article numbers before 1990, Fig. 2c), mitigation, natural resilience, and societal adaptation. The model had an AR1 correlation structure for year grouped by dataset, and was fitted using generalised least squares. We then conducted a focused comparison between the number of adaptation publications (the sum of both the natural resilience and societal adaptation branches) and mitigation publications (the mitigation branch) over the same time period (1980−2022). This ratio was modelled using a logistic regression as a function of publication year.
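For orientation, the two models described above could be written as follows. The notation is illustrative and not taken from the protocol: the symbols are assumptions, as is the interpretation of the adaptation to mitigation ratio as the proportion of adaptation publications among both groups.

```latex
\[
\log N_{d,t} = \alpha_d + \beta_d\,t + \varepsilon_{d,t},
\qquad
\varepsilon_{d,t} = \phi\,\varepsilon_{d,t-1} + \eta_{d,t}
\]
\[
\operatorname{logit}(p_t) = \gamma_0 + \gamma_1\,t,
\qquad
p_t = \frac{A_t}{A_t + M_t}
\]
```

Here \(N_{d,t}\) is the number of articles in dataset \(d\) published in year \(t\); \(\alpha_d\) and \(\beta_d\) are the dataset-specific intercept and slope implied by the year × dataset interaction; \(\phi\) is the AR1 coefficient and \(\eta_{d,t}\) is uncorrelated noise. In the second model, \(A_t\) and \(M_t\) are the numbers of adaptation and mitigation publications in year \(t\).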

The statistical significance of the distribution of literature across the metadata variables: ORO type, marine system, ecosystem type, and climatic impact-driver was also quantified. The significance of the three-way interaction between ORO type, climatic impact-driver, and marine system was quantified using a generalised linear model where the number of publications was modelled with a Poisson distribution. The significance of the two-way interaction between ORO type and ecosystem type was confirmed using two generalised linear models, where the number of publications for mitigation and adaptation outcomes were modelled with a Poisson distribution.

We then quantified whether shared country characteristics influenced the likelihood of two binomial outcomes: publishing ORO research vs. other research topics in ocean and climate science, and publishing research on adaptation vs. mitigation OROs. First, each country was classified into one of three country types that group similar countries based on their relationship to oceanic resources and their developmental status:

  • Small island developing state (SIDS): Defined by the United Nations as a distinct group of low-lying coastal countries that share similar developmental challenges76.

  • Coastal (non-SIDS): Countries with a coastline on an ocean or sea that are not a SIDS.

  • Land-locked: Countries that do not have a coastline on an ocean or sea.

The ratio of ORO publications to general ocean and climate publications was modelled as a function of country type and each country’s percent tropical land cover77 using a logistic regression.

The ratio of mitigation to adaptation publications was modelled as a function of country type using a logistic regression. Then, within each country group, we modelled whether per capita GDP was also a driver of this ratio, by adding an interaction term of GDP per capita and country type. Within the coastal (non-SIDS) country type, the countries with the highest 5% of publications skewed the weight of the data towards a negative relationship between per capita GDP and the proportion of adaptation publications (Supplementary Fig. 17a). We therefore decided to remove these outliers so that the modelled relationship was more representative of the majority of countries.

Lastly, we quantified how the geographical distribution of mitigation and adaptation research effort corresponds to the following indicators of responsibility in climate change and adaptation need, respectively:

  • Historical responsibility in climate change: cumulative (1850−2022) CO2 emissions. We hypothesised this variable would serve as an indicator of historical responsibility in climate change, influencing socio-political motivation to invest in mitigation research in order to offset climate change contributions. Furthermore, CO2 emissions are highly correlated with national GDP, which can indicate a higher capacity to invest in more expensive mitigation technologies. This metric was obtained from the variable ‘cumulative CO2 emissions’, representing the total cumulative (1850−2022) emissions of carbon dioxide, including land-use change, measured in million tonnes78. Due to the resolution of the dataset, this analysis was conducted at the country-level.

  • Adaptation need: Coastal risk. This is considered a direct motivator for investing in adaptation OROs. We represented coastal risk as the product of coastal climatic impact-driver-related hazard and exposure79. Hazard was represented by estimates of a prominent climatic impact-driver-related hazard affecting coasts: extreme sea levels associated with storm surges80. To correct for countries/territories spanning large latitude ranges resulting in a large range of cell area, the value of each metric was aggregated to a country/territory level by taking the area-weighted sum of all cell values. Coastal exposure was represented as the percentage of the population occupying low elevation coastal zones (LECZ) in 2000. This was calculated as \(\frac{\text{population in LECZ}}{\text{total population}}\) (refs. 27, 28). We used data at the sovereign territory-level when available, as large differences in hazard exposure can exist between sovereign countries and overseas territories (e.g., mainland France vs. French Polynesia). When data were not available at the territory-level, the country-level data were used (shown in grey in Supplementary Fig. 4).
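The coastal risk calculation can be sketched in a few lines. The interpretation of the area-weighted sum as a sum of cell values weighted by each cell's share of total area is an assumption, as are the function names and the example numbers; exposure is expressed here as a fraction rather than a percentage.

```python
def area_weighted_sum(cell_values, cell_areas):
    """Aggregate gridded values, weighting each cell by its share of total area."""
    total_area = sum(cell_areas)
    return sum(v * a / total_area for v, a in zip(cell_values, cell_areas))


def coastal_exposure(population_in_lecz, total_population):
    """Fraction of the population living in low elevation coastal zones."""
    return population_in_lecz / total_population


def coastal_risk(hazard, exposure):
    """Coastal risk as the product of hazard and exposure."""
    return hazard * exposure


# Hypothetical territory with two grid cells of unequal area.
hazard = area_weighted_sum([2.0, 4.0], [1.0, 3.0])
exposure = coastal_exposure(250, 1000)
print(coastal_risk(hazard, exposure))  # 0.875
```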

The number of mitigation or adaptation publications attributed to a territory using geoparsing was modelled as a function of the respective historical responsibility in climate change or adaptation need metrics described above. These relationships were modelled first using generalised linear models with Poisson distributions, but these models were found to be over-dispersed (dispersion ratio = 78.4 for the mitigation model and 90.0 for the adaptation model). Therefore, negative binomial distributions were used. To compare the slope of the trends between climate change responsibility and the number of mitigation publications, and adaptation needs and the number of adaptation publications, we scaled each explanatory variable (cumulative (1850−2022) CO2 emissions and coastal risk) between 0 and 1.
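Two of the steps above, checking a Poisson model for over-dispersion and rescaling an explanatory variable to the 0 to 1 range, can be sketched as follows. The dispersion ratio is computed here as the Pearson chi-squared statistic divided by the residual degrees of freedom, which is a common definition but an assumption about how the reported values were obtained.

```python
def dispersion_ratio(observed, fitted, n_params):
    """Pearson chi-squared divided by residual degrees of freedom.

    Values well above 1 suggest over-dispersion relative to a Poisson model.
    `fitted` holds the model's predicted means, which must be positive.
    """
    pearson = sum((y - mu) ** 2 / mu for y, mu in zip(observed, fitted))
    return pearson / (len(observed) - n_params)


def min_max_scale(values):
    """Rescale values linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical publication counts and fitted means from an intercept-only model.
print(dispersion_ratio([0, 4, 8], [4, 4, 4], n_params=1))  # 4.0
print(min_max_scale([2, 4, 6]))  # [0.0, 0.5, 1.0]
```

Scaling both explanatory variables to a common range is what makes the fitted slopes for the mitigation and adaptation models comparable.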