Background & Summary

The use of chemical and mineral fertilizers has grown nearly 10-fold over the last sixty years1, contributing decisively to the increase in crop and livestock production over the same period, driven by the growing global food and feed demand of an expanding world economy2. At the same time, excessive and non-optimal use of fertilizers has created, through spill-over flows from agricultural fields, serious environmental problems that can affect the health of ecosystems and people at all scales, from local soil and water pollution, to regional eutrophication hotspots, to marine dead zones at the confluence of major rivers draining important agricultural areas3,4,5,6. The dual goal of ensuring food supply to meet global demand while reversing and reducing environmental damage is a major challenge for humanity and the planet, one that is foundational to the 2030 Sustainable Development Agenda6,7 and the Global Biodiversity Framework8, specifically in relation to the need for efficient use of fertilizers to achieve productive and sustainable agriculture.

Two global datasets, FAO1,9,10 and IFA11, currently provide rich information on nitrogen (N), phosphorus (P), and potassium (K) applications for agriculture, with annually updated country-level statistics for the 1961–2022 period. More limited data on crop-specific application rates are also available12. This information is a recognized global reference, facilitating analyses of fertilizer use in agriculture and its trends at country, regional and global scales, as demonstrated by dozens of published papers13, international reports14, sustainability indices15 and planetary boundary science16.

At the same time, studies concerned with local or regional issues often require more detailed, subnational-scale information to assess the interactions of fertilizer use with critical covariates such as climatic conditions, soil properties and water flows, ecosystem and crop distribution, farm management typology, infrastructure and population data. To address these needs, global spatial fertilizer maps have begun to emerge in the literature17,18,19,20,21,22,23,24, largely in the context of informing global biogeochemical models and earth system science. These products are useful steps in refining information from national to subnational and grid-level data, though they suffer from a number of important limitations. One is that these new maps were typically produced by spatializing already existing national-level information, without incorporating more detailed published data and subnational information from national statistical offices. Another is that the production of such maps requires significant amounts of data and computing resources for both development and validation, so that the existing products have largely been one-off efforts, lacking the coordination needed to facilitate continuous improvement and updates. Indeed, the most widely used geospatial dataset to date, providing application rates of N, P, and K by crop species25 (hereafter referred to as MFM, standing for Mueller’s et al. Fertilizers Maps), is limited to data for the year 2003. Significant changes in agricultural land and fertilizer use in the last 20 years9, coupled with momentous changes in computing power and storage space, suggest that the time is now ripe for a major update of the currently available products.

Here we present the results of a major new effort in data fusion to produce NPKGRIDS, an updated dataset of global gridded application rates of inorganic fertilizers by main plant nutrients: nitrogen (N), phosphorus (P2O5), and potassium (K2O), by crop species, for the year 2020. NPKGRIDS includes the fertilizer application rates of 173 crops at a global spatial resolution of 0.05° (approximately 5.6 km at the equator). The development of NPKGRIDS adopted a data fusion approach to integrate crop mask information recently made available in CROPGRIDS26,27 with other relevant published data sources, as follows. First, we searched for and collected the available peer-reviewed and national sources of crop-specific fertilizer use data, selecting eight datasets with information specifying individual crops or aggregated crop groups in either georeferenced or tabulated formats. We then selected the best-fit dataset for each crop and subnational unit, using the same data fusion optimization process and quality scoring system as CROPGRIDS. Published national statistics of total applied mass from FAO, IFA and national statistical offices were subsequently used for benchmarking NPKGRIDS.

Methods

We surveyed and collected georeferenced and tabulated datasets reporting the application rates or applied amounts of N, P and K fertilizers to individual crops at national and/or subnational levels. We only collected datasets from peer-reviewed and national sources for data reliability. We next elaborated these datasets following the workflow depicted in Fig. 1, which includes three main steps: Step 1) harmonization of input datasets into tabular format at the level of subnational units; Step 2) determination of endogenous data quality indicators; and Step 3) global spatialization of fertilizer application rates.

Fig. 1

Workflow of the development of NPKGRIDS. Step 1: Harmonization of input datasets into tabular format; Step 2: Determination of endogenous data quality indicators; Step 3: Global spatialization of fertilizer application rates; and Step 4: Validation. MFM: Mueller’s et al.25 Fertilizers Maps. MRF: Monfreda et al.34 dataset. GAUL: Global Administrative Unit Layers dataset.

Input and corollary data sources

The starting point for NPKGRIDS was CROPGRIDS26,27, a recently developed georeferenced dataset of crop maps detailing crop location and harvested area. We then researched the available peer-reviewed literature and official national statistics for georeferenced and tabulated fertilizer use datasets specifying individual crops and/or aggregated crop groups. We included only datasets with data vintage more recent than 2003, that is, the latest temporal coverage of MFM25, and with crop names matching the FAO Indicative Crop Classification (ICC)28. We excluded datasets that were non-crop-specific or containing aggregated crops but without further specification of component crops. The collected datasets provided the mass and/or application rates of total nitrogen (N), total phosphorus (P or P2O5), and total potassium (K or K2O) derived from straight and/or compound fertilizer products. Out of these, we selected eight datasets for N and seven for P2O5 and K2O (Table 1 and Supplementary Table 1). Amongst the selected datasets, the periodic Fertilizer Use By Crop (FUBC)29 includes crop-specific and aggregated crop groups across the period 2016–2018 for 63 countries. We separated it into two datasets, one listing only individual (IDV) crops (FUBC18-IDV) and the other listing only aggregated (AGG) crop groups (FUBC18-AGG). The Historic Fertilizer Use By Crop (HFUBC) dataset12 combines all fertilizer use by crop data for individual crops and crop groups from IFA and FAO from 1978 to 2018 for 111 countries. Only 12 individual crops in 65 countries from 2006 to 2018 from HFUBC were used in this work. Note that the FUBC18 and HFUBC are national-resolution datasets. Four National Statistical Offices (NSOs) datasets for the United States of America30 (US), Belarus31 (BY), the United Kingdom32 (UK), and Australia33 (AU) providing subnational-resolution crop-specific fertilizer data were also included. These eight datasets were used as inputs to construct NPKGRIDS (Table 1).

Table 1 List of input datasets used to construct NPKGRIDS.

Additional, corollary datasets were also used, namely to assist with calculations and spatialization of NPKGRIDS. Specifically, two georeferenced datasets of global crop maps were used to inform crop location and harvested area, i.e., the CROPGRIDS26,27 dataset providing maps for 173 crops at 0.05° in 2020 and the Monfreda et al.34 dataset (hereafter called MRF from the initials of the original authors) providing maps for 175 crops at 0.0833° circa 2000, both using FAO crop species nomenclature. Whenever the selected datasets did not provide fertilizer application rate but only total applied mass, we used national-level crop harvested area publicly available from either NSOs, i.e., CROP-AU35 and CROP-BY36, or FAOSTAT37, to estimate fertilizer application rates as mass per unit of crop harvested area. We used the FAO Global Administrative Unit Layers (GAUL) dataset to identify country and regional (subnational unit) boundaries38 (Table 2).

Table 2 List of corollary datasets used to construct and benchmark NPKGRIDS.

Data harmonization (Step 1)

The eight input datasets (Table 1) were harmonized to a common tabular format for N, P2O5, and K2O application rates in each crop expressed as mass applied per unit crop harvested area. The tabular resolution is at the finest scale of each dataset, i.e., subnational (level 1) for MFM, US, BY, AU, UK and national (level 0) for FUBC18-IDV, FUBC18-AGG, and HFUBC.

For the georeferenced MFM dataset, we first tabulated the application rates using the GAUL level 1 mask at the dataset original resolution, i.e., 0.0833° (~10 km at the equator). In subnational units with missing data of fertilizer application rates, we gap-filled the missing information using the national weighted average application rate FMFM [kg ha−1] for fertilizer n, crop i and country j calculated as:

$${F}_{{\rm{MFM}}}(n,i,j)=\frac{{\sum }_{r{\in }j}[\,{f}_{{\rm{MFM}}}(n,i,j,r)\cdot {A}_{{\rm{MRF}}}(i,j,r)\,]}{{\sum }_{r{\in }j}{A}_{{\rm{MRF}}}(i,j,r)}$$
(1)

where fMFM(n,i,j,r) in [kg ha−1] is the available application rate of fertilizer n for crop i in subnational unit r of country j in the MFM dataset, and AMRF is the corresponding harvested area obtained from the MRF dataset.
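As an illustration only, the gap-filling in Eq. (1) can be sketched in a few lines of Python. The function names, the dictionary layout and the example values are hypothetical, and the sketch assumes that the sums in Eq. (1) run over the subnational units for which an MFM rate is available.

```python
def national_weighted_rate(rates, areas):
    """Area-weighted national mean rate (Eq. 1), in kg/ha.

    rates: dict unit -> MFM application rate (kg/ha), None where missing
    areas: dict unit -> MRF harvested area (ha)
    Only units with an available rate enter the sums (an assumption).
    """
    numerator = 0.0
    denominator = 0.0
    for unit, rate in rates.items():
        if rate is None:
            continue
        area = areas.get(unit, 0.0)
        numerator += rate * area
        denominator += area
    return numerator / denominator if denominator > 0 else None


def gap_fill(rates, areas):
    """Replace missing subnational rates with the national weighted mean."""
    national = national_weighted_rate(rates, areas)
    return {unit: (national if rate is None else rate)
            for unit, rate in rates.items()}


# Hypothetical country with three subnational units; unit C has no MFM rate.
rates = {"A": 100.0, "B": 50.0, "C": None}
areas = {"A": 300.0, "B": 100.0, "C": 40.0}
filled = gap_fill(rates, areas)
```

In this toy case the national mean is (100 × 300 + 50 × 100) / 400 = 87.5 kg ha−1, which is the value assigned to unit C.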

For all other tabular datasets, the harmonization process consisted of converting the variables to application rates expressed as mass of applied N, P2O5, and K2O per unit crop harvested area. In cases where the datasets only provided information on applied mass, we calculated the application rate using crop-specific harvested areas of the corresponding year sourced from the relevant corollary datasets (Table 2). Specifically, crop harvested area from FAOSTAT was used for HFUBC, FUBC18-IDV and FUBC18-AGG, while CROP-AU and CROP-BY were used for the AU and BY datasets, respectively. For the US dataset, which provided application rates per unit fertilized area and the percentage of harvested area being fertilized, we calculated the application rates for the entire harvested area of each crop by multiplying the two. For the BY dataset, P and K mass were multiplied by 2.29 and 1.20, respectively, to convert them to P2O5 and K2O. In the UK dataset, the application rates of some crops varied across seasons. In such cases, lacking intra-annual detail within our product, we used the season-averaged application rate to infer the annual average fertilizer application rate by crop. For the AU dataset and some countries in FUBC18-IDV and FUBC18-AGG with data spanning two calendar years, we assigned the reference year to the first calendar year. For datasets that provided the list of crops within a crop group, i.e., FUBC18-AGG and UK, the group application rate was assigned to all component crops in that group as per the aggregations in Supplementary Table S3.
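The unit conversions described above are simple enough to sketch in code. The following Python fragment is an illustration under stated assumptions, not the processing code used for NPKGRIDS: the function names and example numbers are hypothetical, while the factors 2.29 and 1.20 are those quoted in the text.

```python
P_TO_P2O5 = 2.29   # factor quoted in the text for the BY dataset
K_TO_K2O = 1.20    # factor quoted in the text for the BY dataset


def rate_from_mass(mass_kg, harvested_area_ha):
    """Application rate (kg/ha) from total applied mass and harvested area."""
    return mass_kg / harvested_area_ha


def us_rate_whole_area(rate_on_fertilized_kg_ha, percent_fertilized):
    """Rate over the entire harvested area, given the rate on fertilized
    land and the percentage of harvested area that received fertilizer."""
    return rate_on_fertilized_kg_ha * percent_fertilized / 100.0


def by_oxide_rates(p_kg_ha, k_kg_ha):
    """Convert elemental P and K rates to P2O5 and K2O rates."""
    return p_kg_ha * P_TO_P2O5, k_kg_ha * K_TO_K2O


# Hypothetical values for illustration.
rate = rate_from_mass(5000.0, 100.0)            # 50 kg/ha
us_rate = us_rate_whole_area(160.0, 75.0)       # 120 kg/ha
p2o5_rate, k2o_rate = by_oxide_rates(10.0, 10.0)
```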

Other georeferenced and tabular corollary and benchmarking datasets (Table 2) were also harmonized to the same data format and administrative unit levels as the input datasets.

Endogenous data quality indicators (Step 2)

We designed a multi-criteria ranking scheme to determine the best-fit value to represent fertilizer application rates for specific crops in subnational units where multiple sources were available across the eight selected input datasets. The ranking was based on three endogenous data quality indicators: Qc, crop specification; Qr, data resolution; and Qy, synchrony. Each indicator was assigned values between zero (lowest quality) and one (highest quality). For each dataset, the values of the indicators can vary across different crops and subnational units.

The Qc indicator specified whether the fertilizer data were specific to individual crops or to crop aggregations:

$$\begin{array}{ccc}{Q}_{c} & = & \left\{\begin{array}{cc}1 & \text{if crop} \mbox{-} \text{specific}\\ 0.5 & \text{if crop} \mbox{-} \text{aggregated}\end{array}\right.\end{array}$$
(2)

Datasets including both crop-specific and crop-aggregated data (e.g., UK) will have variable Qc values across different crops, with a higher rank for individual crops.

The Qr indicator ranked the administrative resolution of a dataset, with a higher rank given to datasets with a finer resolution as follows:

$${Q}_{r}=\left\{\begin{array}{c}\begin{array}{cc}1 & \text{if subnational}\\ 0.75 & \text{if national and subnational}\\ 0.5 & \text{if national}\end{array}\end{array}\right.$$
(3)

The Qy indicator rated the synchrony level between the reference year Yr of an input dataset and the reference year of NPKGRIDS, which was set to the period 2015–2020, henceforth referred to as ‘circa 2020’, and was defined as

$$\begin{array}{ccc}{Q}_{y} & = & \left\{\begin{array}{cc}\frac{{Y}_{r}-2000}{2015-2000} & \text{if}\,{Y}_{r} < 2015\\ 1 & \text{if}\,2015\le {Y}_{r}\le 2020\end{array}\right.\end{array}$$
(4)

Qy increases as \({Y}_{r}\) approaches the 2015–2020 period and could vary in response to a wide range of reference years \({Y}_{r}\) within the same dataset e.g., HFUBC and US.

The endogenous data quality indicators defined above are summarized in Table 3 for all datasets used to compile NPKGRIDS. Operationally, we associated an average endogenous quality to each dataset k, crop i and subnational unit r as

$${Q}_{k,i,r}=\frac{{\left({Q}_{c}+{Q}_{r}+{Q}_{y}\right)}_{k,i,r}}{3}$$
(5)
Table 3 Value ranges of endogenous data quality indicators of all input datasets.
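For readers who wish to reproduce the scoring, Eqs. (2) to (5) can be sketched as follows. This is a minimal illustration: the function names and the string labels for the resolution classes ("subnational", "mixed", "national") are hypothetical, and the sketch raises an error for reference years after 2020 because Eq. (4) does not define Qy there.

```python
def q_crop(is_crop_specific):
    """Eq. (2): 1 for crop-specific data, 0.5 for crop aggregations."""
    return 1.0 if is_crop_specific else 0.5


def q_resolution(level):
    """Eq. (3): 'mixed' is a hypothetical label for datasets reporting
    both national and subnational data."""
    return {"subnational": 1.0, "mixed": 0.75, "national": 0.5}[level]


def q_synchrony(reference_year):
    """Eq. (4): linear ramp from 2000 to 2015, then 1 up to 2020."""
    if reference_year < 2015:
        return (reference_year - 2000) / (2015 - 2000)
    if reference_year <= 2020:
        return 1.0
    raise ValueError("Eq. (4) does not define Qy for years after 2020")


def endogenous_quality(is_crop_specific, level, reference_year):
    """Eq. (5): arithmetic mean of the three indicators."""
    return (q_crop(is_crop_specific)
            + q_resolution(level)
            + q_synchrony(reference_year)) / 3.0


# Hypothetical examples: a national crop-group record from 2018 and a
# subnational crop-specific record from 2009.
q_national_group = endogenous_quality(False, "national", 2018)
q_subnational_crop = endogenous_quality(True, "subnational", 2009)
```

The first example evaluates to (0.5 + 0.5 + 1) / 3 ≈ 0.67 and the second to (1 + 1 + 0.6) / 3 ≈ 0.87, so the older but finer and crop-specific record would be preferred.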

Global spatialization of fertilizer application rates (Step 3)

We assembled global georeferenced application rates of N, P2O5 and K2O for individual crops following the algorithm in Fig. 2. First, we disaggregated the crop-specific national application rates of the three fertilizers in HFUBC, FUBC18-IDV and FUBC18-AGG into subnational application rates using the proportional allocations from national to subnational level calculated from MFM, where MFM data were available. This step leads to the calculation of the crop-specific application rate fk(n,i,j,r) for a dataset k of fertilizer n on crop i in subnational unit r of country j as

$${f}_{k}(n,i,j,r)=\alpha (n,i,j)\cdot {f}_{\text{MFM}}\left(n,i,j,r\right)$$
(6)

where fMFM is the application rate of the corresponding crop and subnational unit in MFM and α is a scaling factor defined as

$$\alpha (n,i,j)=\frac{{F}_{k}(n,i,j)\cdot {\sum }_{r|(i,j)}{A}_{{\rm{CR}}}(i,j,r)}{{\sum }_{r|(n,i,j)}[\,{f}_{{\rm{MFM}}}(n,i,j,r)\cdot {A}_{{\rm{CR}}}(i,j,r)\,]}$$
(7)

where Fk(n,i,j) is the national-level application rate of fertilizer n in dataset k for crop i and country j, and ACR is the corresponding crop harvested area in CROPGRIDS. In Eq. (7), we assumed that these proportions α were unchanged between 2000 and 2020. This is true only if such geographical differences were assumed to depend largely on agri-meteorological differences rather than management practices, or alternatively that geographical differences in the latter had remained similar over the two time periods.
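The disaggregation in Eqs. (6) and (7) can be illustrated with a short Python sketch. The function name and example values are hypothetical, and the sketch assumes that both sums in Eq. (7) run over the same set of subnational units, namely those with both an MFM rate and a CROPGRIDS area.

```python
def disaggregate_national_rate(national_rate, mfm_rates, cropgrids_areas):
    """Scale MFM subnational rates so that their area-weighted mean
    equals the national rate F_k (Eqs. 6 and 7).

    national_rate:   national application rate F_k (kg/ha)
    mfm_rates:       dict unit -> MFM rate f_MFM (kg/ha)
    cropgrids_areas: dict unit -> CROPGRIDS harvested area A_CR (ha)
    """
    units = [u for u in mfm_rates if u in cropgrids_areas]
    total_area = sum(cropgrids_areas[u] for u in units)
    weighted = sum(mfm_rates[u] * cropgrids_areas[u] for u in units)
    if weighted == 0:
        return None  # no usable MFM pattern for this crop and country
    alpha = national_rate * total_area / weighted        # Eq. (7)
    return {u: alpha * mfm_rates[u] for u in units}      # Eq. (6)


# Hypothetical country with two subnational units.
mfm = {"A": 100.0, "B": 50.0}
areas = {"A": 300.0, "B": 100.0}
sub_rates = disaggregate_national_rate(120.0, mfm, areas)

# The area-weighted mean of the result recovers the national rate.
recovered = (sum(sub_rates[u] * areas[u] for u in sub_rates)
             / sum(areas.values()))
```

By construction, the scaling preserves the relative subnational pattern of MFM (unit A remains at twice the rate of unit B) while matching the more recent national value.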

Fig. 2

Algorithm for the assemblage of global maps of crop-specific fertilizer application rates. Refer to Tables 1, 2 for the names of the datasets.

For each fertilizer n applied to crop i in subnational unit r, we tested if the application rate was available from multiple datasets. If only one dataset k was available, the chosen application rate is f(n,i,r)  =  fk(n,i,r) (Fig. 2). If multiple datasets were available, we selected the best-fit dataset kbest, which has the highest endogenous quality Qk,i,r defined in Eq. (5), and thus, f(n,i,r)  =  \({f}_{{k}_{{best}}}\)(n,i,r). If two datasets have equal Qk,i,r, the dataset with the most recent reference year was chosen as kbest. Alternatively, if these datasets have the same reference year, \({f}_{{k}_{{best}}}\left(n,i,r\right)\) was calculated as the average value of all datasets with equal Qk,i,r and reference year. If no datasets were available, we performed gap filling by first checking if data were available in the neighbouring subnational units. Specifically, if the application rates of fertilizer n on crop i were available in w bordering subnational units, the area-weighted average fertilizer application rate favg(n,i,w) was computed over the w bordering subnational units as

$$f(n,i,r)={f}_{avg}(n,i,w)=\frac{{\sum }_{w|(n,i)}\,f(n,i,w)\cdot {n}_{w}}{{\sum }_{w|(n,i)}{n}_{w}}$$
(8)

with nw being the number of grid cells along the shared border. If no application rate for fertilizer n on crop i was available in neighbouring subnational units, we estimated f(n,i,r) from application rates on similar crops based on three criteria defined by FAO39: (a) classification (i.e., cereals, pulses, nuts, fruits and berries, spices, permanent oil-bearing crops, temporary oil-bearing crops, fodder crops, fibre crops, vegetables, and other permanent crops); (b) lifespan (i.e., temporary or permanent); and (c) stem type (i.e., herbaceous, shrub or tree; see Supplementary Table 4). A crop i is considered similar to crops c if they share at least two of the three abovementioned criteria. If the application rates of fertilizer n on similar crops c were available within the subnational unit r (Fig. 2), we calculated f(n,i,r)  =  favg(n,c,r), where favg(n,c,r) is the area-weighted average fertilizer application rate across c similar crops in subnational unit r, such that

$$f(n,i,r)={f}_{avg}(n,c,r)=\frac{{\sum }_{c|(n,r)}\,f(n,c,r)\cdot {A}_{CR}(c,r)}{{\sum }_{c|(n,r)}{A}_{CR}(c,r)}$$
(9)

If there is no similar crop within the subnational unit r (Fig. 2), we computed f(n,i,r)  =  favg(n,c,g), where favg(n,c,g) is the area-weighted average application rate of c similar crops globally across all subnational units g, such that

$$f(n,i,r)={f}_{avg}(n,c,g)=\frac{{\sum }_{c|(n,g)}\,f(n,c,g)\cdot {A}_{CR}(c,g)}{{\sum }_{c|(n,g)}{A}_{CR}(c,g)}$$
(10)
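The selection and gap-filling cascade described above (Fig. 2, Eqs. 8 to 10) can be sketched as follows. This is an illustrative outline rather than the code used to build NPKGRIDS: all names are hypothetical, candidate records are assumed to carry a rate, an endogenous quality and a reference year, and each gap-filling level is assumed to receive a list of (rate, weight) pairs.

```python
def select_best(candidates):
    """Pick the rate of the highest-quality dataset; break ties by the most
    recent reference year, then by averaging the remaining tied rates."""
    if not candidates:
        return None
    best_q = max(c["quality"] for c in candidates)
    top = [c for c in candidates if c["quality"] == best_q]
    latest = max(c["year"] for c in top)
    tied = [c for c in top if c["year"] == latest]
    return sum(c["rate"] for c in tied) / len(tied)


def weighted_average(pairs):
    """Weighted mean of (rate, weight) pairs; None if there is no weight."""
    total = sum(w for _, w in pairs)
    if total <= 0:
        return None
    return sum(r * w for r, w in pairs) / total


def fill_rate(candidates, neighbours, similar_local, similar_global):
    """Return (rate, source label) following the order in the text:
    datasets, bordering units (Eq. 8), similar crops in the same unit
    (Eq. 9), similar crops globally (Eq. 10)."""
    rate = select_best(candidates)
    if rate is not None:
        return rate, "dataset"
    for label, pairs in (("neighbours", neighbours),
                         ("similar_local", similar_local),
                         ("similar_global", similar_global)):
        rate = weighted_average(pairs)
        if rate is not None:
            return rate, label
    return None, "none"


# Hypothetical inputs: two datasets tie on quality and year.
candidates = [
    {"rate": 100.0, "quality": 0.8, "year": 2018},
    {"rate": 120.0, "quality": 0.8, "year": 2018},
    {"rate": 60.0, "quality": 0.7, "year": 2020},
]
chosen = select_best(candidates)
filled = fill_rate([], [], [(80.0, 3.0), (40.0, 1.0)], [(10.0, 1.0)])
```

Here the weights stand for the number of shared-border grid cells in Eq. (8) and for CROPGRIDS harvested areas in Eqs. (9) and (10).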

Finally, to construct the global georeferenced maps of fertilizer application rates by nutrients and crops, the subnational crop-specific fertilizer application rates were uniformly spatialized over the grid cells hosting crop i within that subnational unit using crop masks from CROPGRIDS. Example maps of N, P2O5 and K2O application rates for cotton, and the corresponding overall data quality and data sources used to construct the maps are shown in Fig. 3.
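The uniform spatialization step can be pictured with a toy grid. The sketch below is a simplified illustration: the function name, the nested-list grid representation and the use of `None` for cells outside any subnational unit are hypothetical, and the value assigned to land cells without the crop (0.0) is an assumption; only the flag −1 for water cells follows the convention of the distributed files.

```python
def spatialize(unit_ids, crop_mask, unit_rates, water_flag=-1.0):
    """Assign each subnational rate uniformly to the cells hosting the crop.

    unit_ids:   2-D list of subnational unit identifiers (None for water)
    crop_mask:  2-D list of booleans, True where the crop is present
    unit_rates: dict unit identifier -> application rate (kg/ha)
    """
    grid = []
    for id_row, mask_row in zip(unit_ids, crop_mask):
        row = []
        for unit, has_crop in zip(id_row, mask_row):
            if unit is None:
                row.append(water_flag)          # ocean or water cell
            elif has_crop and unit in unit_rates:
                row.append(unit_rates[unit])    # uniform rate within unit
            else:
                row.append(0.0)                 # assumed value: crop absent
        grid.append(row)
    return grid


# Hypothetical 2 x 3 grid with two subnational units and one water cell.
unit_ids = [["A", "A", None],
            ["B", "B", "B"]]
crop_mask = [[True, False, False],
             [False, True, True]]
rate_map = spatialize(unit_ids, crop_mask, {"A": 90.0, "B": 45.0})
```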

Fig. 3

Example maps distributed in NPKGRIDS data for cotton. From left to right columns: N, P2O5, and K2O; from top to bottom rows: fertilizer application rate, data quality, and data source.

Data Records

The NPKGRIDS dataset distributes global georeferenced maps of N, P2O5, and K2O fertilizer application rates for 173 crops (refer to Supplementary Table 4 for the list of crops) for circa 2020 at a resolution of 0.05° (about 5.6 km at the equator), with a bounding box of −180° to 180° longitude and −90° to 90° latitude in the WGS-84 coordinate system. The georeferenced maps are distributed as NetCDF files, where grid cells containing ocean/water are marked as “-1”. Files included in the dataset are described in Table 4. The NPKGRIDS dataset is available for public download from the figshare repository40 at https://doi.org/10.6084/m9.figshare.24616050. The data for P and K fertilizers are distributed as oxide-based application rates. These can be converted to elemental application rates using the following conversions: 1 kg of P2O5 is equivalent to 0.436 kg of P, and 1 kg of K2O is equivalent to 0.83 kg of K.
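A minimal sketch of the oxide-to-elemental conversion is given below, assuming the application rates have already been read from a NetCDF file into a flat list of values. The function name and example values are hypothetical; the conversion factors and the −1 water flag are those stated above.

```python
P2O5_TO_P = 0.436   # kg of P per kg of P2O5
K2O_TO_K = 0.83     # kg of K per kg of K2O
WATER_FLAG = -1     # value marking ocean/water cells in the maps


def oxide_to_elemental(values, factor):
    """Convert oxide-based rates to elemental rates, leaving the
    water flag untouched so that it is not mistaken for a rate."""
    return [v if v == WATER_FLAG else v * factor for v in values]


# Hypothetical P2O5 rates (kg/ha) for four grid cells, one of them water.
p2o5_rates = [100.0, 50.0, -1, 0.0]
p_rates = oxide_to_elemental(p2o5_rates, P2O5_TO_P)
```

Skipping the flagged cells matters in practice: multiplying −1 by a conversion factor would produce a spurious negative rate.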

Table 4 NPKGRIDS data distribution files and variables.

Technical Validation

Validation of NPKGRIDS with national-level data from FAOSTAT and IFASTAT

Lacking additional datasets with fertilizers by crop data beyond those already used herein, we evaluated NPKGRIDS data using national-level total applications of N, P2O5, and K2O fertilizers provided by FAOSTAT41 (160 countries) and IFA11 (110 countries) (Table 1). To this end, we first calculated the total national-level applied mass M(n,j) of fertilizer n in country j estimated by NPKGRIDS as

$$M(n,j)={\sum }_{({p}{,}{i})|(n,{j})}{{A}}_{{\rm{CR}}}(p,{i}{,}{j})\cdot {f}(n,p,{i}{,}{j})$$
(11)

where ACR(p,i,j) is the harvested area of crop i in grid cell p of country j in CROPGRIDS26 and f is the corresponding application rate of fertilizer n in NPKGRIDS. The country boundaries were determined based on the GAUL38 dataset (level 0). We then compared M(n,j) against the corresponding fertilizer use reported in FAOSTAT and IFASTAT, MFAO and MIFA, respectively, averaged over the 2015–2020 period. These comparisons were characterized using the coefficient of determination R2 (analogous to Nash-Sutcliffe efficiency coefficient), the concordance correlation coefficient (CCC), and the normalized root mean squared errors (NRMSE), expressed as

$${\text{R}}_{x}^{2}(n)=1-\frac{\sum _{j}{\left({O}_{x}(n,j)-E(n,j)\right)}^{2}}{\sum _{j}{\left({O}_{x}(n,j)-\bar{{O}_{x}}(n)\right)}^{2}}$$
(12)
$${{\rm{CCC}}}_{x}(n)=\frac{2\rho \left(n\right){\sigma }_{{O}_{x}}\left(n\right){\sigma }_{E}(n)}{{{\sigma }_{{O}_{x}}\left(n\right)}^{2}+{{\sigma }_{E}(n)}^{2}+{[\bar{{O}_{x}}(n)-\bar{E}(n)]}^{2}}$$
(13)
$${{\rm{NRMSE}}}_{x}(n)=\frac{\sqrt{\frac{\sum _{j}{\left[{M}_{x}(n,j)-M(n,j)\right]}^{2}}{{n}_{p}}}}{[{M}_{{\rm{x}},\max }\left(n\right)-{M}_{x,\min }\left(n\right)]}$$
(14)

where \({O}_{x}\) represents the logarithm of either MFAO or MIFA and \(E\) represents the logarithm of the national-level applied mass (M) calculated from NPKGRIDS. \(\bar{{O}_{x}}\) and \(\bar{E}\) are the corresponding means across all countries, \({{\sigma }_{{O}_{x}}}^{2}\) and \({{\sigma }_{E}}^{2}\) are the corresponding variances, and \(\rho \) is the Pearson correlation coefficient between Ox and \(E\). Mx represents either MFAO or MIFA, \({M}_{x,\max }\) and \({M}_{x,\min }\) are the corresponding maximum and minimum fertilizer masses across all countries, and \({n}_{p}\) is the number of data points.
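The three metrics in Eqs. (12) to (14) can be computed with the Python standard library alone, as sketched below. The function names and example values are hypothetical. The sketch assumes strictly positive masses (so that logarithms exist), population variances, and base-10 logarithms; the choice of base does not affect R2 or CCC because it rescales numerator and denominator equally.

```python
import math


def _mean(values):
    return sum(values) / len(values)


def r2_log(reference, estimate):
    """Eq. (12) on log-transformed masses."""
    o = [math.log10(v) for v in reference]
    e = [math.log10(v) for v in estimate]
    o_bar = _mean(o)
    ss_res = sum((oi - ei) ** 2 for oi, ei in zip(o, e))
    ss_tot = sum((oi - o_bar) ** 2 for oi in o)
    return 1.0 - ss_res / ss_tot


def ccc_log(reference, estimate):
    """Eq. (13) on log-transformed masses; rho*sigma_O*sigma_E is
    computed directly as the covariance."""
    o = [math.log10(v) for v in reference]
    e = [math.log10(v) for v in estimate]
    o_bar, e_bar = _mean(o), _mean(e)
    var_o = _mean([(oi - o_bar) ** 2 for oi in o])
    var_e = _mean([(ei - e_bar) ** 2 for ei in e])
    cov = _mean([(oi - o_bar) * (ei - e_bar) for oi, ei in zip(o, e)])
    return 2.0 * cov / (var_o + var_e + (o_bar - e_bar) ** 2)


def nrmse(reference, estimate):
    """Eq. (14): RMSE on untransformed masses, normalized by the range
    of the reference masses."""
    mse = _mean([(m - x) ** 2 for x, m in zip(reference, estimate)])
    return math.sqrt(mse) / (max(reference) - min(reference))


# Hypothetical national masses (tonnes) for three countries.
reference = [10.0, 100.0, 1000.0]
estimate = [20.0, 100.0, 1000.0]
scores = (r2_log(reference, estimate),
          ccc_log(reference, estimate),
          nrmse(reference, estimate))
```

Because NRMSE is normalized by the full range of national masses, which spans several orders of magnitude, it is dominated by the largest consumers and tends to be small even when individual small countries disagree substantially.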

In NPKGRIDS, the global total N applied was 100 million tonnes, approximately 10% lower than the world estimates reported by FAOSTAT and IFASTAT for the year 2020, which stood at 110 and 112 million tonnes, respectively. At national level (Fig. 4, left column), the N applied mass calculated using NPKGRIDS matched relatively well with FAOSTAT (R2  =  0.76, CCC  =  0.89 and NRMSE = 0.01) and reasonably well with IFASTAT (R2 = 0.66, CCC = 0.87 and NRMSE = 0.01). In the comparison against FAOSTAT data, underestimation of N application was mostly identified in Africa, such as the Democratic Republic of Congo, Namibia, and Madagascar. NPKGRIDS consistently overestimated the N application in Iraq, Syria, and Jordan when comparing against FAOSTAT and IFASTAT data.

Fig. 4

Comparison of fertilizer applied mass between NPKGRIDS and FAOSTAT (top row) and IFASTAT (bottom row) for N (left column), P2O5 (middle column), and K2O (right column). Each marker in the scatter plots represents a country and the black lines show the 1:1 ratio.

For phosphorus, NPKGRIDS estimated a global applied mass of 46 million tonnes, closely aligning with FAOSTAT’s and IFASTAT’s estimates for 2020, which were 48 and 49 million tonnes, respectively. The national-level comparisons for the total use of P2O5 had the strongest correlations with FAOSTAT (R2 = 0.82, CCC = 0.91 and NRMSE = 0.02) and IFASTAT (R2 = 0.70, CCC = 0.88 and NRMSE = 0.01, Fig. 4, middle column). Overall, and similarly to N data, discrepancies of P application data between NPKGRIDS and data from FAOSTAT and IFASTAT were more pronounced in Africa and in Middle Eastern countries.

For potassium, NPKGRIDS reported a global application of 40 million tonnes, matching well the global estimates by FAOSTAT and IFASTAT, which were 39 and 41 million tonnes, respectively. At the same time, the national-level comparisons of total K2O application showed less alignment with the FAO/IFA estimates (Fig. 4, right column), with lower correlations for both FAOSTAT (R2 = 0.68, CCC = 0.84 and NRMSE = 0.01) and IFASTAT (R2 = 0.50, CCC = 0.77 and NRMSE = 0.01). NPKGRIDS tended to overestimate K applied mass in North Africa and West Asia.

Validation of NPKGRIDS with national and subnational data from NSOs

We obtained non-crop-specific total applied quantities of N, P2O5, and K2O at national and subnational levels for 37 countries and 166 subnational units from 2006 to 2020, including 32 countries in Europe42, India43, Pakistan44, China45, Iran46, and Sri Lanka47 (Table 2, Supplementary Table 2). Of the 37 countries, 11 countries provided subnational data while 26 countries provided only national-level data. Only 5 countries, with a total of 99 subnational units, provided K2O data. We calculated the 2015–2020 averages for all NSOs data, with the exception of Iran, for which the latest available data is from 2006. We aggregated pixel-level data in NPKGRIDS to national and subnational total applied masses of N, P2O5 and K2O following Eq. 11, with j being a country (level 0) or subnational (level 1) unit defined in the GAUL38 administrative unit boundaries. The quality of comparisons between NPKGRIDS and NSOs data was quantified using R2 (Eq. 12), CCC (Eq. 13) and NRMSE (Eq. 14).

The comparison of national and subnational levels N applied mass between NPKGRIDS and NSOs showed a relatively good agreement with R2 = 0.80, CCC = 0.90 and NRMSE = 0.03 (Fig. 5), while estimates for P2O5 and K2O were weaker, with R2 values of 0.74 and 0.75 against NSOs data, respectively.

Fig. 5

Comparison of national and subnational-level fertilizer masses used on all crops between NPKGRIDS and National Statistical Offices (NSOs). Total applied masses of (a) N, (b) P2O5, and (c) K2O. Coloured markers refer to various NSOs: EU (European Union), LK (Sri Lanka), PK (Pakistan), IR (Iran), IN (India), and CN (China). Black lines show the 1:1 ratio.

The comparison for 32 countries against data from the statistical office of the European Union (EUROSTAT)42 showed good alignment at both national and subnational levels, with a few exceptions. Specifically, total use of N and P2O5 fertilizers was significantly underestimated in Iceland and Ireland, and total use of N was significantly underestimated in Malta. This underestimation was likely due to the high uncertainty in crop harvested area reported in CROPGRIDS for these countries and, for Ireland specifically, to uncertainty in fertilizer use on meadows and pastures. In Iceland, only potatoes were mapped in NPKGRIDS. In contrast, nutrient application in China was slightly overestimated.

Limitations and uncertainty

NPKGRIDS inherits the uncertainties and errors of its input datasets, namely the source fertilizer datasets and the CROPGRIDS dataset used to spatially allocate the tabulated fertilizer application rates. Uncertainties can arise from errors in, or missing reporting of, fertilization amounts and crop area data submitted for national and international reporting. For example, MFM encountered data limitations in many lower- and middle-income countries and observed more anomalies in P and K fertilization data than in N data. CROPGRIDS, in turn, was constructed by reconciling multiple data sources, including surveys, remote sensing, and models, each of which has uncertainties that propagate into NPKGRIDS.

The spatialization of national and subnational-level data to grid cells implemented in NPKGRIDS introduces further uncertainty. For instance, the spatialization of national data (e.g., HFUBC, FUBC18-IDV, FUBC18-AGG) assumes that the relative ratio of fertilizer usage within a country follows the same patterns observed in MFM (Eq. 7), ignoring potential relative changes in cropping practices that may have occurred across different subnational units within a country. Additionally, for information taken directly from MFM, changes in fertilizer usage that may have occurred in those regions over the past 20 years are not considered.

Finally, NPKGRIDS excluded some small countries and territories due to constraints in spatial resolution, including Falkland, Faroe Islands, French Southern and Antarctic Territories (SAT), Heard Island, Isle of Man, Kingman Reef, Kiribati, Ma’tan al-Sarra, Mayotte, Netherlands Antilles, Palau, Réunion, Saint Pierre, South Georgia, Svalbard, and Virgin Islands.

Data quality of NPKGRIDS

To quantify the underlying uncertainty, we computed a data quality indicator at subnational unit level based on endogenous quality indicators and comparisons against FAOSTAT and IFASTAT data. The overall data quality \(Q(n,i,j,r)\) of NPKGRIDS for nutrient n (i.e., N, P2O5, and K2O), crop i in subnational unit r in country j is computed as

$$Q(n,i,j,r)=\frac{{Q}_{\text{k}}(n,i,j,r)+{Q}_{\text{FAO}}(n,j)+{Q}_{\text{IFA}}(n,j)}{3}$$
(15)

where Qk is the endogenous quality of the chosen dataset calculated using Eq. (5), and the qualities of benchmarking QFAO and QIFA against FAOSTAT41 and IFASTAT11 datasets are defined as

$${Q}_{\text{x}}(n,j)=1-\min \left\{1,\frac{\left|M(n,j)-{\text{M}}_{x}(n,j)\right|}{\,{\text{M}}_{\text{x}}(n,j)}\right\}$$
(16)

with x being either FAO or IFA and Qx having values between 0 (low quality) and 1 (high quality). For those subnational units where the application rates were gap-filled, we assigned zero to the corresponding Qk. Maps of data quality are distributed along with NPKGRIDS dataset. Examples of data quality maps for cotton are shown in Fig. 3 (second row).
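Eqs. (15) and (16) can be illustrated with the following Python sketch. The function names and example values are hypothetical, and the sketch assumes that both FAOSTAT and IFASTAT totals are available and positive for the country in question; the text does not state how countries missing from one benchmark are handled.

```python
def benchmark_quality(estimated_mass, reference_mass):
    """Eq. (16): 1 minus the relative error against a benchmark,
    with the error capped at 1 so that the score stays in [0, 1]."""
    relative_error = abs(estimated_mass - reference_mass) / reference_mass
    return 1.0 - min(1.0, relative_error)


def overall_quality(q_endogenous, estimated_mass, fao_mass, ifa_mass,
                    gap_filled=False):
    """Eq. (15): mean of the endogenous quality and the two benchmark
    qualities; gap-filled units receive an endogenous quality of zero."""
    q_k = 0.0 if gap_filled else q_endogenous
    return (q_k
            + benchmark_quality(estimated_mass, fao_mass)
            + benchmark_quality(estimated_mass, ifa_mass)) / 3.0


# Hypothetical country: NPKGRIDS total 90, FAOSTAT 100, IFASTAT 120
# (same mass units), endogenous quality 0.9 for the chosen dataset.
q_direct = overall_quality(0.9, 90.0, 100.0, 120.0)
q_gap_filled = overall_quality(0.9, 90.0, 100.0, 120.0, gap_filled=True)
```

In this example the benchmark qualities are 0.90 (FAOSTAT) and 0.75 (IFASTAT), giving an overall quality of 0.85 for a directly sourced rate and 0.55 for a gap-filled one.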

Usage Notes

All georeferenced maps distributed in the NPKGRIDS dataset40 are formatted as standard NetCDF4 files. Various programming languages (e.g., MATLAB, Python, Julia, R) and software packages (ArcGIS, QGIS, Panoply) can be used to read and analyse these files. The NPKGRIDS dataset includes the same crops as the CROPGRIDS26 dataset, which follow the naming system used by FAO28.