Abstract
Carbon dioxide removal plays an important role in any strategy to limit global warming to well below 2 °C. Keeping abreast of the scientific evidence using rigorous evidence synthesis methods is an important prerequisite for sustainably scaling these methods. Here, we use artificial intelligence to provide a comprehensive systematic map of carbon dioxide removal research. We find a total of 28,976 studies on carbon dioxide removal, 3–4 times more than previously suggested. Growth in research is faster than for the field of climate change research as a whole, but very concentrated in specific areas, such as biochar, certain research methods like lab and field experiments, and particular regions like China. Patterns of carbon dioxide removal research contrast with trends in patenting and deployment, highlighting the differing development stages of these technologies. As carbon dioxide removal gains importance for the Paris climate goals, our systematic map can support rigorous evidence synthesis for the IPCC and other assessments.
Introduction
To comply with the Paris Agreement and to limit global warming to well below 2 °C, rapid and deep GHG emissions reductions need to be complemented with carbon dioxide removal (CDR), potentially at the gigaton scale by mid-century and beyond1,2.
CDR has three distinct roles in climate change mitigation1: first, to reduce net CO₂ and greenhouse gas emissions in the near term, specifically in the land sector; second, to offset residual emissions from “hard to mitigate” sectors like industry, long-distance transport, and agriculture in the medium term3,4; and third, to support sustained net-negative emissions in the long term, helping to lower global temperatures in overshoot scenarios and stabilising warming at or below 1.5 °C1. Of course, CDR cannot substitute for stringent emission reductions, which need to be prioritised even in hard to mitigate sectors5. There are also deep uncertainties with respect to how fast CDR can be sustainably scaled up, and whether the reversal of temperature overshoot can be safely achieved6. This underlines the need to reduce emissions as fast as possible, while providing sufficient policy support so that CDR can actually deliver gigatons of removals in the second half of the 21st century2,7.
CDR has been a key part of climate change mitigation discussions in the scientific literature, but has often been separated into distinct knowledge domains. A stream of literature going back to the first IPCC assessment reports has considered the potential contributions of enhanced natural sinks through afforestation or soil carbon sequestration to achieve net emissions reductions8,9. This area has since broadened to include analogous nature-based approaches in other ecosystems, such as coastal blue carbon10, alongside options aimed at enhancing ecosystems’ ability to absorb and store CO₂, like ocean fertilisation and macroalgae afforestation11,12. Bioenergy with carbon capture and storage (BECCS) technologies gained prominence in the early 2010s as an explicit option for achieving negative emissions in the integrated assessment modelling (IAM) literature13,14,15, while a range of other technologies such as biochar produced by pyrolysis, direct air carbon capture and storage (DACCS), enhanced weathering (EW), and ocean-based approaches such as ocean alkalinity enhancement (OAE) are now gaining more scientific attention15,16,17.
In the policy domain, CDR has gained increasing attention in recent years18,19, but many countries still lack concrete policies to scale CDR2. This has led to a considerable gap between countries’ (so far limited) plans to develop and deploy CDR versus CDR’s estimated role in mitigation scenarios that stabilise global temperatures at an increase well below 2 °C2,19,20,21. One of the challenges here is that there is a large spread of possible CDR levels that countries might aim for, in part driven by model assumptions of technological innovation and potential market adoption22.
In the age of big literature23,24—where the scientific literature grows at increasing rates—balancing a research question’s scope and the resource demands for reviews, like reviewer time, is increasingly challenging25. To address this issue, systematic mapping methodologies (systematic maps, evidence gap maps, etc.) have been developed by the evidence synthesis community26,27, to map existing literature, identify knowledge gaps and clusters, and guide where reviews are most beneficial. However, these methods remain resource-intensive, prompting discussions about the prospects of automation28,29. Proposals for implementing such automated synthesis approaches have been developed across various scopes and scales23,30,31,32.
There is currently little systematic oversight of the available CDR literature. As the IPCC’s 7th Assessment Cycle is starting and CDR-related policies and targets are being established, it is timely to assess the current landscape of evidence for CDR. Previous research suggests that there is a large and fast-growing evidence base on CDR, but the few available overviews of the field have rapidly become outdated33,34, only give a coarse overview2,35 or are limited in scope by manual efforts supported by community-crowdsourcing36.
The diverse range of CDR options and multidisciplinary fields involved in CDR research also adds to the complexity of this task, as researchers from different disciplines, each with their own specialised languages and methodologies, may be working on the same issues without fully knowing or engaging with each other due to misaligned terminology. It is also crucial to identify and keep track of gaps in the literature in order to effectively allocate research resources.
Here, we follow a systematic mapping methodology to comprehensively lay out the body of knowledge on CDR. We ask an open-framed question—“what is the available evidence on CDR?”—and follow a robust, stepwise methodological procedure that ensures transparency, comprehensiveness and reproducibility. Traditionally, systematic maps have been compiled manually and therefore are often limited in scope. Here, we use an approach that deploys machine learning methods to automate labour intensive tasks to provide an assessment at scale. By doing so, we are not only able to quantify the size and scope of the research landscape of CDR and its temporal dynamics, but are also able to assess the distribution of research efforts across various dimensions, including CDR options, research methodology, disciplinary structure and geographic focus. Furthermore, our machine-learning approach enables swift updating of the dataset in the future. Given the growing importance of CDR in the context of net-zero strategies and temperature overshoot, our publicly available database of CDR research will be of benefit to the research community as well as upcoming scientific assessments of CDR.
In this article, we first quantify the total volume of CDR literature and examine its temporal trends, as well as dissecting the literature by individual CDR options to highlight shifts in research focus. Next, we investigate the origins of these studies, exploring regional profiles and analysing research that specifies geographic locations to identify patterns in CDR research distribution. We then assess the focus of the studies, including the scientific methods employed, to understand how research approaches have evolved. Additionally, we evaluate the representation of CDR literature in the recent IPCC report, comparing it to the overall CDR literature to highlight any discrepancies. Finally, we compare the attention given to different CDR options across various contexts, including Integrated Assessment Model (IAM) scenarios, deployment strategies, and investment patterns.
Results
Literature on CDR is much larger than previously estimated
There is a much larger body of CDR research than previously suggested. Based on our machine learning assisted approach that enables us to identify CDR studies with high precision (0.88 ± 0.0119, meaning the proportion of relevant studies among those identified is high) and recall (0.93 ± 0.005, indicating most relevant studies are captured)—see “Methods”, Supplementary Methods 3 and 4 and Figs. 1 and 2—we predict a total of 28,976 ± 3800 scientific studies in the Web of Science and Scopus (the two largest bibliographic core collections). This is 3–4 times larger than what previous scientometric studies33 or ongoing community efforts to manually track CDR research36 have suggested when comparing the same time range. For the former study, this discrepancy likely arises from their reliance on non-machine learning methods, which forced a high-precision, low-recall search approach. In the case of the manual tracking efforts, the rapid expansion of CDR literature has simply made comprehensive tracking unfeasible.
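To make the two validation metrics above concrete, the following minimal sketch computes precision, recall and their harmonic mean (F1) from confusion-matrix counts. The function name and the counts are hypothetical and chosen only to illustrate the arithmetic; they are not the validation data used in this study.

```python
from typing import Tuple


def precision_recall_f1(tp: int, fp: int, fn: int) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) from confusion-matrix counts.

    tp: relevant documents predicted relevant
    fp: irrelevant documents predicted relevant
    fn: relevant documents predicted irrelevant
    """
    # Precision: share of predicted-relevant documents that are truly relevant.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: share of truly relevant documents that were found.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical counts (assumption, for illustration only).
p, r, f1 = precision_recall_f1(tp=88, fp=12, fn=7)
print(f"precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
```

A high-precision, low-recall search corresponds to a small `fp` but a large `fn`: few irrelevant hits, but many relevant studies missed.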
Squares symbolise documents; a coloured square is a document with labels, either assigned by hand (solid colour) or automatically (faded colour). Red documents are excluded, blue ones included. Step 1: 70,000 documents were retrieved from databases using search queries. Step 2: Of these, about 6000 documents were sorted (i.e., coded) by hand as being on CDR (relevant, blue squares) or not on CDR (irrelevant, red squares). Documents on CDR were additionally described with CDR options (see Fig. 2) and other categories. Steps 3 and 4: The relevance labels and additional categories were used to train machine learning classifiers. Step 5: The trained classifiers were used to extend all labels to the unseen ~64,000 documents. Detailed information on methods can be found in the “Methods” section and Supplementary Methods 3 and 4.
The definitions of each carbon dioxide removal (CDR) option shown in this figure served as the baseline for inclusion. Additional coding guidelines are provided in the Supplementary Methods.
CDR research today comprises only 5% of the overall scientific literature on climate change23, but growth in CDR research is faster than for climate overall. We observe an average annual growth rate of 17% over the past ten years compared to a 12% growth rate for the literature on climate change (Fig. 3).
a Total number of publications per year between 1990 and 2022. Additionally, we note the number of publications released during each Assessment Report (AR) cycle of the Intergovernmental Panel on Climate Change, the latest AR6 considered publications until 2021. b Share of CDR option covered in scientific publications. Multiple options per publication are possible. A more complete list of all counts per option is published in the Supplementary Fig. 1. c Annual growth rate of the scientific literature on CDR, climate change and individual options. Growth rate is only calculated if there were more than 50 publications in total available. Colorblind-friendly versions of the middle and lower panel can be found in the Supplementary Figs. 2 and 3. Source data is provided as a Source Data file.
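As a rough illustration of how an average annual growth rate can be derived from yearly publication counts, the sketch below averages year-on-year growth rates and applies the minimum-size threshold mentioned in the caption (more than 50 publications in total). The exact formula used for Fig. 3 is not stated here, so averaging year-on-year rates is an assumption, and the function name and counts are hypothetical.

```python
from typing import Dict, Optional


def mean_annual_growth(counts: Dict[int, int], min_total: int = 50) -> Optional[float]:
    """Mean year-on-year growth rate of publication counts.

    Returns None when the series has min_total publications or fewer,
    mirroring the "more than 50 publications in total" rule.
    Assumes the years in `counts` are consecutive.
    """
    if sum(counts.values()) <= min_total:
        return None
    years = sorted(counts)
    rates = []
    for prev, curr in zip(years, years[1:]):
        if counts[prev] > 0:  # skip years where growth is undefined
            rates.append(counts[curr] / counts[prev] - 1)
    return sum(rates) / len(rates) if rates else None


# Hypothetical yearly counts (assumption, for illustration only).
growth = mean_annual_growth({2018: 100, 2019: 120, 2020: 150})
print(growth)
```

A compound annual growth rate computed from the first and last year only would be a reasonable alternative; the two measures diverge when growth is volatile.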
Patterns of CDR research are uneven and dominated by biochar studies
The distribution of research across different CDR options is highly concentrated on biochar and land-based methods such as soil carbon sequestration and afforestation/reforestation. Growth rates for most individual CDR options are higher than for the climate change literature as a whole (Fig. 3 and Supplementary Table 1).
Biochar research is covered in 56% of the 28,976 scientific publications on CDR. Considering 2022 only, the share of publications covering biochar even increases to 62%.
With an annual growth rate of 18% in the past 5 years and its large number of publications, biochar is the main driver of the high growth rate of the entire CDR literature. The second largest category, SCS, with 24% of the total literature, is also growing fast at 14% per year in the past 5 years.
Other biological CDR options make up a sizable amount of the CDR literature, such as afforestation/reforestation with 12% of all studies, agroforestry with 9.7% of all studies, coastal wetland (blue carbon) management with 4.7% and landscape restoration, such as peatlands with 3.5%. Growth rates in the past 5 years are generally in the range of 20–22%, except for the long established CDR options such as afforestation/reforestation and forest management with 14% and 15% respectively. Growth in newly emerging research areas tends to be particularly high, as initial literature numbers are low, making each new publication a relatively larger addition to the existing body of work.
BECCS is represented in only 5.6% of all studies on CDR, despite being the most common CDR option in most scenario pathways for meeting the Paris temperature goal37,38,39. DACCS accounts for only 2.8% of all CDR studies. The annual growth rate of BECCS is volatile and the average of the past 5 years is 12%.
Other CDR options are much less represented in the scientific literature: for ocean fertilisation, EW and OAE, we found fewer than 50 studies for each option per year.
CDR research is concentrated in China and OECD countries
We use the first author affiliation to infer the origin of the studies. This approach simplifies the complexities of international collaborations, where authorship, lead roles, and funding often span multiple countries. However, we consider it a valuable proxy for drawing meaningful conclusions about the research origins. With our approach, we find that China is responsible for the largest amount of research on CDR with 6452 studies (30% of all studies where author affiliation is available), followed by the United States (2667 studies, 13%) and the United Kingdom (953 studies, 4%). Only 3.4% of all studies with author affiliation have a first author affiliation from South America and 2.8% come from Africa, see Supplementary Fig. 6.
Though global research trends favour land-based CDR options, specialisation varies. China focuses more on biochar research, Europe on BECCS, and North America, particularly the US, on DACCS in comparison to the average shares per CDR option across all countries. Research on ocean-based CDR options is more predominant in Oceania and North America (Fig. 3). We provide detailed country profiles for CDR research in the Supplementary Fig. 5.
Roughly one third of CDR research refers to specific geographic locations, identified through named entity recognition in titles and abstracts40. Place-based research is important for evaluating CDR implementation in situ, including aspects such as effectiveness of removing CO₂ and environmental or social side-effects. Out of 28,976 studies, 9305 mention a location, of which 74% are countries and 25% are sub-national locations such as federated states, counties or cities. Soil and vegetation-based CDR options feature more place-specific research, with afforestation/reforestation studies at 65%, compared to 33% for enhanced rock weathering, and only 10% for DAC(CS) research (Fig. 4). Further information on location-based research, broken down by region, can be found in Supplementary Figs. 7 and 8.
a Number of studies per country based on first author affiliation. The three highest study counts are added. b We sort the origin of the study into the world regions. For each world region, we compare the percentage difference of the investigated CDR options against all others from the complete dataset. Displayed are only the three highest and the three lowest differences. c Location-based research derived from locations mentioned in title and abstract. Displayed is the share of location-based research in all scientific literature per CDR option. A colorblind-friendly version of panel b can be found in the Supplementary Fig. 4. Source data is provided as a Source Data file.
CDR literature focuses on technology research using experimental methods and modelling
Our ML approach further enabled us to classify CDR research contents along key dimensions. In particular, we used our classifiers to distinguish research methods and the broad area of research. Additionally, we used the journal, in which a publication appeared, to determine academic disciplines in line with the relevant OECD Category scheme41.
We refer to CDR research that aims to understand, design or further develop CDR options, their efficiency and side-effects as “technology research”. As indicated by Supplementary Fig. 10, technology research accounts for about 89% of all studies across all individual CDR options. We refer to survey or focus group research on public perceptions and attitudes to CDR as “public perception” (0.8% of the total), and integrated assessment scenario research as “socio-economic pathways” (9% of the total). We further classify “policy and governance” research, and studies on the “earth system” that evaluate global carbon cycle or land aspects of CDR implementation, see for example42,43, even though these categories remain relatively rare (at 3.8% and 0.6%, respectively).
In general, patterns of research vary across the technology categories. The literature on CDR in general features mainly policy and governance (28% of papers on CDR in the general category) as well as integrated scenario research (45%). We also find larger shares of scenario research for some individual CDR options—particularly for BECCS (31%) and forest-related CDR options (21–22% for Forest Management and Afforestation/Reforestation), which were the first to be implemented in the modelling community38,44.
CDR research is published to a large extent in journals with a natural science or engineering focus and tends to be rooted in experimental and modelling study designs. In particular, 50% of the studies are published in natural sciences, 26% in agricultural sciences and 22% in engineering and technology journals (see Fig. 5). Only 3% of the publications are published in journals with a focus on social science, including economics.
a The number of studies which report on each of the CDR options. One study can report on multiple CDR options. b For each CDR option the share of research fields the studies were published in. This is based on meta-data from the Web of Science and follows the OECD Category scheme41. c For each CDR option the share of scientific method used in the studies as identified by our classifier. One study can use multiple methods, see Supplementary Table 3. Source data is provided as a Source Data file.
Research designs vary substantially across CDR options, but experiments, reviews and modelling studies are most common. Overall, 86% use experimental methods, either laboratory (48%) or field (38%) experiments, driven mainly by research on biochar and soil carbon sequestration. Reviews (21%) and modelling (18%) make up another large proportion. However, certain research designs are more dominant in the literature for specific CDR options. For example, field studies typically make up a substantial share of forest-based CDR options, but also blue carbon and ocean fertilisation, while laboratory experiments, i.e., experiments in a controlled environment45, are dominant for biochar, soil carbon sequestration, but also some engineered CDR options such as DACCS. Interestingly, BECCS studies to date focus strongly on modelling, highlighting their prominent role in climate protection scenario work. Across all CDR options, reviews are widely available—from 11% for forest management up to 32% for OAE, and 34% for the literature on CDR in general (Fig. 4).
IPCC reports differ greatly from scientific literature
Next, we analyse how the research landscape is reflected in the most recent 6th Assessment Report by the IPCC. For this, we extract all citations from the IPCC AR6, all working groups, and identify those studies which are present in the literature on CDR by matching titles. Although it is clear that the IPCC cannot assess all of the large and growing body of available research33, it is essential to understand which main topics are emphasised or overlooked. We also acknowledge that differences between the two literature bodies can arise from various factors and that the main topic distributions should not necessarily align—as Hume remarked, what “is” doesn’t necessarily lead us to what “ought to be”.
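Matching citations to a literature corpus by title is sensitive to differences in case, punctuation and accents. The following minimal sketch shows one way such matching could work, using exact comparison of normalised titles. The normalisation rules, function names and titles are assumptions for illustration and are not necessarily the procedure used in this study.

```python
import re
import unicodedata
from typing import Iterable, List


def normalise_title(title: str) -> str:
    """Lower-case, strip accents and punctuation, and collapse whitespace."""
    text = unicodedata.normalize("NFKD", title)
    # Drop combining marks left over from accent decomposition.
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    # Replace every run of non-alphanumeric characters with a single space.
    text = re.sub(r"[^a-z0-9]+", " ", text.lower())
    return text.strip()


def match_by_title(cited_titles: Iterable[str], corpus_titles: Iterable[str]) -> List[str]:
    """Return the cited titles that also occur in the corpus after normalisation."""
    corpus_keys = {normalise_title(t) for t in corpus_titles}
    return [t for t in cited_titles if normalise_title(t) in corpus_keys]


# Hypothetical titles (assumption, for illustration only).
cited = ["Biochar effects on soil carbon: a field study", "An unrelated study"]
corpus = ["Biochar Effects on Soil Carbon - A Field Study."]
matched = match_by_title(cited, corpus)
print(matched)
```

Exact matching on normalised titles favours precision; truncated or retitled references would be missed and would need fuzzy matching or identifier-based matching instead.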
We find that IPCC assessments are not a broad reflection of attention patterns in the underlying scientific literature on CDR options (Fig. 6). Overall, only a small fraction (2% of the CDR literature) of CDR studies are directly assessed. While the IPCC includes a relatively higher proportion of reviews (19% vs. 15%) and systematic reviews (3% vs. 1%) compared to the overall CDR literature, we believe incorporating even more of these could further enhance its ability to fulfil a stated goal of the IPCC—which is to comprehensively evaluate the available evidence.
IPCC assessments cite a wide range of CDR options, but are predominantly concerned with BECCS (27%)—probably due to its prominence in climate change mitigation scenarios (see Fig. 6)2,46. The major focus on biochar in the research community is not reflected in IPCC citation patterns.
The fact that IPCC assessments have tended to focus on scenarios is underlined by an observed shift from experimental research (86% of CDR research; 10% of IPCC citations) to modelling work (13% of all CDR research; 37% in IPCC citations) and data analysis (10% of all CDR research, 19% in IPCC citations). The focus on technology research in the literature (89% of all CDR research; 44% in IPCC citations) is replaced by much more prominence of scenario work (socio-economic pathways) (9% of all CDR research; 33% in IPCC citations) as well as research on policy and governance (4% of all CDR research; 22% in IPCC citations). All this reflects that IPCC assessments focus on the exploration of alternative scenarios with different climate outcomes, societal development pathways and mixes of mitigation strategies, intended to inform policy development33,47,48.
Shares of CDR options vary across indicators of policy and practice
Finally, we find that the CDR options being researched most intensively are not the ones being most actively deployed, developed or invested in (Fig. 7, ref. 2, Chapter 3,6,7). Again, we do not imply that these distributions should necessarily be similar; rather, we aim to highlight and reflect on the differences between these categories. For example, while CDR research strongly focuses on biochar and soil carbon sequestration, the vast majority of current deployment (2 Gt yr⁻¹ or 99.9%49) is from afforestation and reforestation. Conversely, even though only 2 Mt yr⁻¹ of CO₂ removal is currently delivered by more novel CDR options, mainly BECCS (78%) and biochar (21%)2, these technologies receive an enormous amount of scientific attention or are widely discussed in the scenario literature. Similarly, about 80% of the CDR patents are for BECCS and DACCS2,50. 75% of announced investments in CDR focus on DACCS projects51. In long-term mitigation scenarios that achieve the Paris long-term temperature goals52, mainly BECCS (99%), afforestation (67%) and DACCS (29%) are the CDR options included. There is not a single scenario dealing with biochar or soil carbon sequestration due to a lack of implementation of these CDR options despite their potential co-benefits, such as food security or N₂O emission reduction53.
Data was taken from ref. 2. Source data is provided as a Source Data file.
Discussion
In this article, we provide a comprehensive evidence map of the CDR literature. Our machine learning assisted approach follows a systematic mapping methodology26,27, and automates key labour-intensive parts of the process28. This allows our systematic map to cover the entire research domain around CDR rather than being limited to a niche area of literature due to resource limitations29. As a result, we were able to quantify the CDR research landscape in an unprecedented way. Moreover, the automated classification can also be applied to newly published CDR research, representing a critical step forward in accelerating learning on CDR and providing high-quality evidence syntheses on the topic. This is particularly important as we continue to face a rapidly growing evidence base.
At the heart of our map of CDR research is a classification system trained with about 5300 manually labelled documents that is able to predict not only if a scientific publication is relevant for the evidence map, but also the CDR option, the broad area of research it is situated in, and the research methodology applied. This literature base serves as a foundation for further analysis and can be easily expanded with additional features that provide more detailed descriptions of the scientific literature. In this context, a simple keyword search can enrich the literature landscape across a diverse range of interests. All documents together with their categories can be downloaded from our literature hub: climateliterature.org/#/project/cdrmap.
We find that the CDR literature is 3–4 times larger than previously estimated33,36 when comparing the same time frame. The reason for this is that our machine learning assisted approach enables us to be systematic in our procedure and at the same time achieve both high levels of precision and recall. Previous approaches had either designed precise search strings that lack recall33 or relied on manual tracking of the field, which has simply grown too large36.
While our CDR map represents the most comprehensive work in this area to date, it does not offer a complete portrayal of CDR science. Our search, focused on English-language articles in Web of Science and Scopus, overlooks significant portions of literature in other languages and grey literature, particularly relevant for emerging technologies like BECCS and DACCS. Estimates suggest that Web of Science captures only about 40% of scientific publications54, implying potentially another 50,000 CDR-related publications. Additionally, the large opportunities for CDR functionality that are provided by converting captured CO₂ into long-lived economically viable products (Carbon Utilization Infrastructure, Markets, and Research and Development 2024) have not yet been included in this review but are the subject of ongoing work.
Our machine learning classification system is not perfect and varies in accuracy across tasks. For example, while we are able to predict biochar with an F1-score of 0.98, our classifiers perform much worse when classifying agroforestry. However, our supervised machine learning procedures involve in-depth validation and, as such, we establish transparency about our uncertainty in quantifying the evidence base. This is something rarely provided in manually compiled evidence maps, which are commonly viewed as the gold standard.
Here we confirm previous research33 that the expansion of the scientific literature on CDR is taking place more rapidly than for climate change as a whole. Overall, we find that CDR research is highly concentrated on particular CDR options, specific areas of research as well as research approaches. The CDR literature is dominated by biochar research today—with a geographical centre in China. This development is relatively recent and driven by much higher publication rates than observed for any other CDR option. There could be a number of drivers that explain the large uptake of biochar research in China, including institutional developments (e.g., increased core funding at agricultural universities, publishing incentives, or research grants), strengthening scientific networks (e.g., new societies, journals, project collaborations and exchanges), or a concerted push from the policy sphere (e.g., strategic research funding, support for public-private enterprises). Of course, applied research cannot be abstracted from its surrounding geographic and economic contexts. It is therefore not unexpected to find CDR research niches in different contexts (e.g., biochar in agriculturally productive regions, ocean-based CDR in coastal regions, DACCS and BECCS in industrialised regions).
Patterns of research are also distinctly different from what we observe in policy and practice. In part, this reflects the differing technological readiness levels of each CDR option, which vary from early stage research (e.g., enhanced weathering), to pilots and demonstrations (e.g., DACCS), and full-scale commercialisation (e.g., afforestation/reforestation)55. This may explain why—compared to the available research—patenting and investment activity has been relatively more active for DACCS, where a series of recent demonstrations have taken place. The tendency for scenarios to include a very significant share of BECCS also reflects path dependencies in model development, which already started to implement this technology option in the 2010s. It should be noted, though, that there are active developments to expand the range of CDR options in IAMs56. CDR deployment is also driven by issues like social acceptance, where methods with higher perceived “naturalness” and a longer history of practice (e.g., afforestation/reforestation) have a clear advantage57.
We show that IPCC assessments do not reflect publication patterns in the underlying scientific literature. Systematic mapping efforts can help identify topical areas worthy of focus, and these may need to be adopted into assessment procedures. Indeed, a first practical step is to identify, evaluate and utilise the existing body of reviews and metastudies, which too have been under-cited in the IPCC in favour of a limited set of primary studies.
We identify a few “evidence hubs” where systematic reviews, i.e., a complete and robust assessment of the available literature, would be feasible based on the current body of work. For example, the extensive literature surrounding afforestation policies offers an opportunity for an ex-post analysis to yield insights on the long-term effectiveness and social impacts of these efforts. Updated assessments of CDR costs and potentials are also needed—expanding beyond previous efforts, such as those by ref. 37—to reflect recent advancements in the underlying evidence base. Additionally, there is sufficient evidence to support a systematic review on monitoring, reporting, and verification, as identified by ref. 58, which is instrumental to developing reliable certification schemes for CDR.
Finally, in terms of evidence gaps, we note less of a research focus on ocean alkalinization, EW, and agroforestry. We also observe there are few studies on more novel CDR options, such as DACCS, that are place-specific. More localised research is needed, given that the successful implementation of these methods is often dependent on local geographies (e.g., geological reservoir access) and socio-economic contexts (e.g., social acceptance, energy prices and availability). Additionally, our results point to a need for more research on CDR in the social sciences and humanities, for example, to support evidence-based decision-making on questions of governance and equity. This type of research will be increasingly important as focus shifts towards the implementation of CDR at scale and policy design to support this, as is implied by the ambition of net-zero targets in the context of relatively slow action of mitigation policies.
Methods
Systematic map - protocol
We use an approach assisted by machine learning to provide a comprehensive evidence map of CDR research. We follow the well-established guidelines for systematic mapping25 wherever possible, and adjust them as needed to align with our machine learning approach. We document all steps in a detailed systematic map protocol for transparency and reproducibility45, which is summarised in Fig. 1 and Supplementary Fig. 11.
Document search
We started by developing, for each CDR option, search strings with high levels of recall to make sure that as few scientific articles as possible are missed. The search strings include keywords describing the CDR technology; see Supplementary Information for the full search queries. For long-established CDR options, such as afforestation, we included keywords that make sure the CDR option is evaluated with a focus on carbon sequestration. The search strings were developed iteratively by validating against an independent list of publications on the various CDR options, ensuring that all documents are returned. The validation dataset was extracted from IPCC AR659,60 and 50 randomly selected publications from the CDR bibliography36 published by the Climate Protection and Restoration Initiative. The search queries are available in the Supplementary Table 3. We then ran the final search strings on Web of Science and Scopus on March 28th, 2022 and May 3rd, 2023 and retrieved 75,518 bibliographic records after deduplication. Further information on this procedure and information on the validation dataset is available in Supplementary Table 1 and Supplementary Method 1.
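Because the same publication can be returned by both Web of Science and Scopus, records need to be deduplicated before screening. The sketch below illustrates one plausible approach: records are keyed by DOI where available and otherwise by normalised title plus publication year. The record fields, key scheme and example DOI are hypothetical; the procedure actually used is described in the Supplementary Methods.

```python
import re
from typing import Dict, List


def record_key(record: Dict) -> str:
    """Build a deduplication key: DOI if present, else normalised title and year."""
    doi = (record.get("doi") or "").strip().lower()
    if doi:
        return "doi:" + doi
    title = re.sub(r"[^a-z0-9]+", " ", (record.get("title") or "").lower()).strip()
    return f"title:{title}|{record.get('year')}"


def deduplicate(records: List[Dict]) -> List[Dict]:
    """Keep the first record seen for each key, preserving input order."""
    seen = set()
    unique = []
    for rec in records:
        key = record_key(rec)
        if key not in seen:
            seen.add(key)
            unique.append(rec)
    return unique


# Hypothetical records (assumption, for illustration only).
records = [
    {"source": "wos", "doi": "10.1000/EXAMPLE.1", "title": "Biochar and soil carbon", "year": 2020},
    {"source": "scopus", "doi": "10.1000/example.1", "title": "Biochar and Soil Carbon.", "year": 2020},
    {"source": "scopus", "doi": None, "title": "Enhanced weathering in croplands", "year": 2021},
    {"source": "wos", "doi": "", "title": "Enhanced Weathering in Croplands", "year": 2021},
]
unique = deduplicate(records)
print(len(unique))
```

One limitation of this simple key scheme is that a record carrying a DOI in one database but not in the other would not be merged; a second pass on title and year alone would catch such cases.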
Document relevance through machine learning
In the next step, we work towards precision by developing a machine-learning classifier that distinguishes relevant studies, namely all studies on negative emissions and CDR, from irrelevant ones among the records returned by our queries. We manually screen and annotate a total of 5339 documents (100–600 per CDR option), deciding according to our codebook whether each should be included in the map (the distinction between blue and red squares in Fig. 1). To ensure reproducibility61,62, each document is screened and annotated by two coders as recommended by the relevant guidelines25. We use our annotations to train and validate binary classifiers, which automatically sort documents into two predefined categories, to predict inclusion from the title and abstract of each document. The best performing classifier (F1: 0.91; ROC-AUC: 0.85) is derived from ClimateBERT, a transformer-based pre-trained language model that has been fine-tuned to better represent the domain-specific language used in the climate change context, including in scientific abstracts63. Further details and an explanation of our model validation procedure are available in Supplementary Methods 2 and 3.
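For readers less familiar with the two reported metrics, the following sketch shows how F1 and ROC-AUC can be computed for a binary inclusion classifier from gold labels and predicted scores. It is a plain-Python illustration and not the evaluation code used in the study; the function names and the 0.5 decision threshold in the usage example are assumptions.

```python
def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall for the positive (included) class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0  # convention chosen here when there are no true positives
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def roc_auc(y_true, scores):
    """Probability that a random relevant document outscores a random irrelevant one."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    # Ties between a positive and a negative score count as half a win.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Usage with made-up labels and scores (hypothetical data, not study results).
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.2]
predictions = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold
print(f1_score(labels, predictions), roc_auc(labels, scores))
```

F1 depends on the chosen decision threshold, whereas ROC-AUC summarises ranking quality across all thresholds, which is why the two numbers can differ noticeably for the same model.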
Document classification through machine learning
We further annotated all relevant scientific articles from our manually coded training and validation set along three dimensions: the CDR options covered, the scientific method used, and the broad area of research. The CDR options are Afforestation/Reforestation, Restoration of landscapes/peats, Agroforestry, Soil Carbon Sequestration (SCS), Blue Carbon Management (mangroves, macroalgae, seagrasses, and salt marshes), EW, OAE, Ocean Fertilisation/Artificial Upwelling, Bioenergy Carbon Capture and Sequestration (BECCS), Direct Air Carbon Capture and Sequestration (DACCS), and Biochar; we additionally include a General Literature on CDR category for documents with no focus on a specific technology. The areas of research are technology study, policy & governance, equity, public perception, socio-economic scenarios, and earth system science. Definitions of all CDR methods used to code the documents are shown in Fig. 2. Additional information on how we distinguished the different classes can be found in our coding protocol45. The additional categories are represented in Fig. 1 by the different blue shades for each annotated relevant document. We used these annotations to train three multi-label classifiers for second-stage predictions, and applied them to the documents predicted to be relevant at the first stage. We achieve Macro F1/Macro ROC AUC scores of 0.77/0.87 for the “technology” classifier, 0.69/0.89 for the “methodology” classifier and 0.62/0.77 for the main “area of research” classifier.
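Macro-averaged scores weight every category equally, regardless of how many documents it contains. The sketch below illustrates macro F1 for a multi-label setting in which each document can carry several labels. It is an illustration only: the function names, the example labels and the convention of scoring a label with no true or predicted instances as 0 are assumptions, not details taken from the study.

```python
def label_f1(true_sets, pred_sets, label):
    """F1 for one label, treating it as its own binary problem."""
    tp = sum(1 for t, p in zip(true_sets, pred_sets) if label in t and label in p)
    fp = sum(1 for t, p in zip(true_sets, pred_sets) if label not in t and label in p)
    fn = sum(1 for t, p in zip(true_sets, pred_sets) if label in t and label not in p)
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 0.0  # assumed convention


def macro_f1(true_sets, pred_sets, labels):
    """Unweighted mean of the per-label F1 scores."""
    return sum(label_f1(true_sets, pred_sets, label) for label in labels) / len(labels)


# Usage with made-up annotations (hypothetical data, not study results).
labels = ["Biochar", "BECCS", "DACCS"]
annotated = [{"Biochar"}, {"BECCS", "DACCS"}, {"Biochar"}, {"DACCS"}]
predicted = [{"Biochar"}, {"BECCS"}, {"Biochar", "BECCS"}, {"DACCS"}]
print(round(macro_f1(annotated, predicted, labels), 3))
```

Because every label contributes equally, a rarely studied CDR option with few training examples can pull the macro score down even when the classifier performs well on the large categories.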
Machine learning validation
Throughout this process, we evaluate and validate our methodological choices. We test our ClimateBERT classifications against classifications from DistilBERT64 as well as against a much simpler approach that combines tf-idf encoding with an SGDClassifier with Huber loss65. We choose ClimateBERT because it performs better (see Supplementary Table 3). We optimise classifier performance by tuning the hyperparameters of our model using the Python package RayTune66. Finally, we test the complete training strategy of all classifiers in a threefold cross-validation, which provides comprehensive estimates of how the classifiers perform on the complete dataset (cf. Supplementary Tables 4–6). To estimate confidence intervals for absolute counts, we estimate the True Positive Rate and False Positive Rate from our validation procedure and calculate their confidence intervals using binomial proportion confidence intervals (see Supplementary Method 4).
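As an illustration of a binomial proportion confidence interval for the True Positive Rate and False Positive Rate, the sketch below uses the Wilson score interval. The choice of the Wilson interval is an assumption, since the text above does not state which binomial interval was used (see Supplementary Method 4 for the actual procedure), and the validation counts in the usage example are hypothetical.

```python
import math


def wilson_interval(successes: int, trials: int, z: float = 1.96):
    """Wilson score interval for a binomial proportion (z = 1.96 gives ~95%)."""
    if trials <= 0:
        raise ValueError("trials must be positive")
    p = successes / trials
    denominator = 1 + z * z / trials
    centre = (p + z * z / (2 * trials)) / denominator
    half_width = z * math.sqrt(p * (1 - p) / trials + z * z / (4 * trials * trials)) / denominator
    # Clamp to [0, 1] to guard against floating-point overshoot at the extremes.
    return max(0.0, centre - half_width), min(1.0, centre + half_width)


# Hypothetical validation counts (not taken from the study).
tp, fn = 180, 20   # relevant documents: correctly found vs. missed
fp, tn = 15, 285   # irrelevant documents: wrongly included vs. correctly excluded

tpr_low, tpr_high = wilson_interval(tp, tp + fn)  # True Positive Rate interval
fpr_low, fpr_high = wilson_interval(fp, fp + tn)  # False Positive Rate interval
print(f"TPR 95% CI: {tpr_low:.3f}-{tpr_high:.3f}")
print(f"FPR 95% CI: {fpr_low:.3f}-{fpr_high:.3f}")
```

Unlike the simple normal approximation, the Wilson interval stays within the range from 0 to 1 and behaves sensibly when a rate is close to either bound, which matters for small false positive rates.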
Locations in title and abstract
To identify the locations mentioned in titles and abstracts, we used the Python package Mordecai40.
Data availability
All documents, including their classification, are available for download on our literature hub at climateliterature.org/#/project/cdrmap. The interactive website allows users to search for documents and filter by category. Source data for Figs. 3–7 are provided as a Source Data file. The data generated in this study have been deposited in ref. 67.
Code availability
All code used for training the machine learning models and analysing the data is accessible at https://github.com/mcc-apsis/cdr-map (ref. 68).
References
IPCC. Climate Change 2022: Mitigation of Climate Change. Contribution of Working Group III to the Sixth Assessment Report of the Intergovernmental Panel on Climate Change (Cambridge, UK and New York, NY, USA, 2022).
Smith, S. M. et al. The State of Carbon Dioxide Removal - 1st Edition. 1–108 https://www.stateofcdr.org (2023).
Buck, H. J., Carton, W., Lund, J. F. & Markusson, N. Why residual emissions matter right now. Nat. Clim. Change 13, 351–358 (2023).
Luderer, G. et al. Residual fossil CO2 emissions in 1.5–2 °C pathways. Nat. Clim. Change 8, 626–633 (2018).
Fuhrman, J. et al. Ambitious efforts on residual emissions can reduce CO2 removal and lower peak temperatures in a net-zero future. Environ. Res. Lett. 19, 064012 (2024).
Schleussner, C.-F. et al. Overconfidence in climate overshoot. Nature 634, 366–373 (2024).
Nemet, G. F. et al. Negative emissions—Part 3: innovation and upscaling. Environ. Res. Lett. 13, 063003 (2018).
Houghton, R. A. The global effects of tropical deforestation. Environ. Sci. Technol. 24, 414–422 (1990).
Jarvis, P. G. et al. Atmospheric carbon dioxide and forests. Philos. Trans. R. Soc. Lond. B Biol. Sci. 324, 369–392 (1997).
Nellemann, C. et al. (eds) Blue Carbon: The Role of Healthy Oceans in Binding Carbon: A Rapid Response Assessment (UNEP, 2009).
Coale, K. H. et al. IronEx-I, an in situ iron-enrichment experiment: experimental design, implementation and results. Deep Sea Res. Part II Top. Stud. Oceanogr. 45, 919–945 (1998).
de Ramon N’Yeurt, A., Chynoweth, D. P., Capron, M. E., Stewart, J. R. & Hasan, M. A. Negative carbon via ocean afforestation. Process Saf. Environ. Prot. 90, 467–474 (2012).
Fuss, S. et al. Betting on negative emissions. Nat. Clim. Change 4, 850–853 (2014).
Li, M., Lu, Y. & Huang, M. Evolution patterns of bioenergy with carbon capture and storage (BECCS) from a science mapping perspective. Sci. Total Environ. 766, 144318 (2021).
Tan, R. R. et al. Computing optimal carbon dioxide removal portfolios. Nat. Comput. Sci. 2, 465–466 (2022).
A Research Strategy for Ocean-Based Carbon Dioxide Removal and Sequestration. https://doi.org/10.17226/26278 (National Academies Press, Washington, D.C., 2022).
Oschlies, A. et al. Guide to Best Practices in Ocean Alkalinity Enhancement Research. https://sp.copernicus.org/articles/2-oae2023/, https://doi.org/10.5194/sp-2-oae2023 (2023).
Geden, O., Scott, V. & Palmer, J. Integrating carbon dioxide removal into EU climate policy: prospects for a paradigm shift. WIREs Clim. Change 9, e521 (2018).
Schenuit, F. et al. Carbon Dioxide removal policy in the making: assessing developments in 9 OECD cases. Front. Clim. 3, 638805 (2021).
Bellamy, R. Incentivize negative emissions responsibly. Nat. Energy 3, 532–534 (2018).
Cox, E. & Edwards, N. R. Beyond carbon pricing: policy levers for negative emissions technologies. Clim. Policy 19, 1144–1156 (2019).
Dekker, M. M. et al. Identifying energy model fingerprints in mitigation scenarios. Nat. Energy 8, 1395–1404 (2023).
Callaghan, M. W., Minx, J. C. & Forster, P. M. A topography of climate change research. Nat. Clim. Change 10, 118–123 (2020).
Nunez-Mir, G. C., Iannone, B. V., Pijanowski, B. C., Kong, N. & Fei, S. Automated content analysis: addressing the big literature challenge in ecology and evolution. Methods Ecol. Evol. https://doi.org/10.1111/2041-210X.12602 (2016).
Collaboration for Environmental Evidence. Guidelines for Evidence synthesis in Environmental Management. Version 5.1. (eds Pullin, A. S., Frampton, G. K., Livoreil, B. & Petrokofsky, G.) https://www.environmentalevidence.org/information-for-authors (2022). Accessed 2 Oct 2024.
James, K. L., Randall, N. P. & Haddaway, N. R. A methodology for systematic mapping in environmental sciences. Environ. Evid. https://doi.org/10.1186/s13750-016-0059-6 (2016).
Saran, A. & White, H. Evidence and gap maps: a comparison of different approaches. Campbell. Syst. Rev. https://doi.org/10.4073/cmdp.2018.2 (2018).
Haddaway, N. R. et al. On the use of computer-assistance to facilitate systematic mapping. Campbell. Syst. Rev. 16, e1129 (2020).
Nakagawa, S. et al. Research weaving: visualizing the future of research synthesis A new framework for research synthesis of evidence and influence. Trends Ecol. Evol. https://doi.org/10.1016/j.tree.2018.11.007 (2018).
Callaghan, M. et al. Machine-learning-based evidence and attribution mapping of 100,000 climate impact studies. Nat. Clim. Change 11, 966–972 (2021).
Gonzalez-Rodriguez, D. et al. Elastocapillary instability in mitochondrial fission. Phys. Rev. Lett. 115, 088102 (2015).
Sietsma, A. J., Ford, J. D., Callaghan, M. W. & Minx, J. C. Progress in climate change adaptation research. Environ. Res. Lett. 16, 054038 (2021).
Minx, J. C., Lamb, W. F., Callaghan, M. W., Bornmann, L. & Fuss, S. Fast growing research on negative emissions. Environ. Res. Lett. 12, 035007 (2017).
Oldham, P. et al. Mapping the landscape of climate engineering. Philos. Trans. R. Soc. Math. Phys. Eng. Sci. https://doi.org/10.1098/rsta.2014.0065 (2014).
Smith, S. M. et al. The State of Carbon Dioxide Removal - 2nd Edition. https://osf.io/f85qj/ (2024).
Burns, W. Bibliography: Greenhouse Gas Removal/Negative Emissions Technologies (Climate Protection and Restoration Initiative, 2021).
Fuss, S. et al. Negative emissions—Part 2: costs, potentials and side effects. Environ. Res. Lett. 13, 063002 (2018).
Hilaire, J. et al. Negative emissions and international climate goals—learning from and about mitigation scenarios. Clim. Change https://doi.org/10.1007/s10584-019-02516-4 (2019).
Riahi, K. et al. Mitigation pathways compatible with long-term goals. In IPCC, 2022: Climate Change 2022: Mitigation of Climate Change. Contribution of Working Group III to the Sixth Assessment Report of the Intergovernmental Panel on Climate Change https://doi.org/10.1017/9781009157926.005 (Cambridge University Press, Cambridge, UK and New York, NY, USA, 2022).
Halterman, A. Mordecai: full text geoparsing and event geocoding. J. Open Source Softw. 2, 91 (2017).
Revised Field of Science and Technology (FOS) Classification in the Frascati Manual. https://www.britishcouncil.cl/sites/default/files/oecd_disciplines_british_council.pdf (2007). Accessed 20 Jan 2023.
Keller, D. P. et al. The effects of carbon dioxide removal on the carbon cycle. Curr. Clim. Change Rep. 4, 250–265 (2018).
Werner, C., Lucht, W., Gerten, D. & Kammann, C. Potential of land-neutral negative emissions through biochar sequestration. Earths Future 10, e2021EF002583 (2022).
Minx, J. C. et al. Negative emissions—Part 1: research landscape and synthesis. Environ. Res. Lett. 13, 063001 (2018).
Lück, S. et al. A coding protocol for labeling scientific literature on carbon dioxide removal to train machine learning models. https://doi.org/10.17504/protocols.io.e6nvwqwqwvmk/v1 (2022).
Köberle, A. C. The value of BECCS in IAMs: a review. Curr. Sustain. Energy Rep. 6, 107–115 (2019).
Berrang-Ford, L. et al. Editorial: evidence synthesis for accelerated learning on climate solutions. Campbell Syst. Rev. 16, e1128 (2020).
Minx, J. C., Haddaway, N. R. & Ebi, K. L. Planetary health as a laboratory for enhanced evidence synthesis. Lancet Planet. Health 3, e443–e445 (2019).
Powis, C. M., Smith, S. M., Minx, J. C. & Gasser, T. Quantifying global carbon dioxide removal deployment. Environ. Res. Lett. 18, 024022 (2023).
Kang, J.-N., Zhang, Y.-L. & Chen, W. Delivering negative emissions innovation on the right track: a patent analysis. Renew. Sustain. Energy Rev. 158, 112169 (2022).
Höglund, R. List of known CDR purchases. https://docs.google.com/spreadsheets/d/1BH_B_Df_7e2l6AH8_8a0aK70nlAJXfCTwfyCgxkL5C8 (2022).
Byers, E. et al. AR6 scenarios database. Intergovernmental Panel Clim. Change https://doi.org/10.5281/zenodo.7197970 (2022).
Lehmann, J. et al. Biochar in climate change mitigation. Nat. Geosci. 14, 883–892 (2021).
Khabsa, M. & Giles, C. L. The number of scholarly documents on the public web. PLoS ONE 9, e93949 (2014).
Low, S., Baum, C. M. & Sovacool, B. K. Taking it outside: exploring social opposition to 21 early-stage experiments in radical climate interventions. Energy Res. Soc. Sci. 90, 102594 (2022).
Strefler, J. et al. Carbon dioxide removal technologies are not born equal. Environ. Res. Lett. 16, 074021 (2021).
Osaka, S., Bellamy, R. & Castree, N. Framing “nature-based” solutions to climate change. WIREs Clim. Change 12, e729 (2021).
Lück, S., Mohn, A. & Lamb, W. F. Governance of carbon dioxide removal: an AI-enhanced systematic map of the scientific literature. Front. Clim. 6, 1425971 (2024).
Babiker, M. et al. Cross-sectoral perspectives. In Climate Change 2022: Mitigation of Climate Change. Contribution of Working Group III to the Sixth Assessment Report of the Intergovernmental Panel on Climate Change (eds. Shukla, P. R. et al.) (Cambridge, UK and New York, NY, USA, 2022).
Canadell, J. G. et al. Global Carbon and other Biogeochemical Cycles and Feedbacks. In Climate Change 2021: The Physical Science Basis. Contribution of Working Group I to the Sixth Assessment Report of the Intergovernmental Panel on Climate Change (eds. Masson-Delmotte, V. et al.) (Cambridge, UK and New York, NY, USA, 2021).
Müller-Hansen, F., Callaghan, M. W. & Minx, J. C. Text as big data: develop codes of practice for rigorous computational text analysis in energy social science. Energy Res. Soc. Sci. 70, 101691 (2020).
Peng, R. D. Reproducible research in computational science. Science https://doi.org/10.1126/science.1213847 (2011).
Webersinke, N., Kraus, M., Bingler, J. & Leippold, M. ClimateBERT: a pretrained language model for climate-related text. Preprint at https://doi.org/10.48550/arXiv.2212.13631 (2022).
Sanh, V., Debut, L., Chaumond, J. & Wolf, T. DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter. Preprint at https://doi.org/10.48550/arXiv.1910.01108 (2020).
Zhang, T. Solving large scale linear prediction problems using stochastic gradient descent algorithms. In Proceedings of the Twenty-First International Conference on Machine Learning 116 https://doi.org/10.1145/1015330.1015332 (Association for Computing Machinery, New York, NY, USA, 2004).
Liaw, R. et al. Tune: a research platform for distributed model selection and training. Preprint at https://doi.org/10.48550/arXiv.1807.05118 (2018).
Lück, S. Datasets used in “Scientific literature on carbon dioxide removal revealed as much larger through AI-enhanced systematic mapping” https://doi.org/10.5281/zenodo.15574572 (2025).
Lück, S. Scientific literature on carbon dioxide removal much larger than previously suggested: insights from an AI-enhanced systematic map (published). Zenodo https://doi.org/10.5281/zenodo.15576703 (2025).
Acknowledgements
We thank Christiane Hamann, Doménica Michelle Jaramillo Sánchez, Ronja Kelch, Fariha Mawla, Fabian Metz, Leon Stephan and David Verdugo-Raab for their effort in coding most of the documents. S.L., M.C., F.M.H., W.L., T.R., M.G. were supported by the ERC-2020-SyG “GENIE” (grant ID 951542). I.S., S.F., T.R. and J.M. received further funding from the German Ministry of Research and Education under the CDR-SynTra project (01LS2101F), C.K. and J.H. under the CDRterra PyMiCCS project (Grant Refs: 01LS2109C and 01LS2109A). S.M.S. was supported by the CO2RE Hub, funded by the Natural Environment Research Council (Grant Ref: NE/V013106/1). D.P.K. was supported by the European Union’s Horizon 2020 Research and Innovation Program under grant 869357 (the OceanNETs project) and the German Ministry of Research and Education CDRmare projects RETAKE, sea4soCiety, and ASMASYS. P.R. is supported by the UK’s Industrial Decarbonization Research and Innovation Centre (EP/V027050/1). F.K. and W.R. received support from the EC Horizon Europe project UPTAKE (Project # 101081521). D.T. and M.B. received funding from the German Ministry of Research and Education under the BioNET project (01LS2107A). M.J.G. is also affiliated with Pacific Northwest National Laboratory, which did not provide specific support for this paper.
Funding
Open Access funding enabled and organized by Projekt DEAL.
Author information
Contributions
S.L., S.F. and J.M. conceptualised the work. S.L., M.C. and J.C.M. contributed to methodology. S.L. did the investigation and visualisation. S.F., J.C.M., M.C. and T.R. supervised the work. J.M. and S.L. wrote the original draft. M.C., M.B., A.C., S.F., M.G., J.H., C.K., D.P.K., F.K., W.L., N.D., F.H., G.M., B.P., P.R., T.R., W.R., P.S., I.S., S.M.S., D.T., T.G.T., M.S., and V.S. reviewed and edited the draft.
Ethics declarations
Competing interests
The authors declare no competing interests.
Peer review
Peer review information
Nature Communications thanks Tom Terlouw, who co-reviewed with Maria Myridinas, and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. A peer review file is available.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Lück, S., Callaghan, M., Borchers, M. et al. Scientific literature on carbon dioxide removal revealed as much larger through AI-enhanced systematic mapping. Nat Commun 16, 6632 (2025). https://doi.org/10.1038/s41467-025-61485-8