Introduction

Multifunctional urban ecological infrastructure (UEI) is a key component of sustainable cities, but systematically measuring benefits both within and across types of UEI is challenging given the complexity and heterogeneity of urban spaces1,2,3. Urban agriculture exemplifies an important category of UEI, with over 60 million ha of global cropland located within cities4, or approximately 3-4% of global total cropland. Food production and recreation are notable benefits across these croplands, although access to space, labor, and inputs (e.g., water and nutrients) limits the global potential of these practices to meet urban food needs5,6. Differing access to these resources also translates to a large diversity of practices and realized benefits among and within cities7,8,9. In fact, diverse practices can be seen as desirable from an agroecological perspective, which prioritizes cultivation options based on local context10,11. Urban agriculture can support multiple other benefits, including mental health, biodiversity, nutrient (re)cycling, stormwater attenuation through increased precipitation infiltration, and heat mitigation; this makes it an easy practice to encourage, but the provision level of these benefits is highly variable and uncertain7,12,13.

Multifunctionality does not always mean a win-win for the delivery of all potential benefits. When prioritizing one benefit (e.g., stormwater infiltration) or trying to minimize the use of one resource (e.g., water), there may be trade-offs with other benefits or resources. For example, urban vegetable farms in Sydney, Australia, had higher yields than their rural counterparts, but resource use (e.g., nutrient inputs) was less efficient14. From a greenhouse gas emission perspective, urban vegetable production can out-perform conventional production for some crops but, on average, fares worse in terms of emissions per amount of food produced15; poorer performance is in part due to urban agricultural projects making physical infrastructure investments (e.g., raised beds) which end up being used for only a short amount of time (as projects are often ephemeral due to development pressure). Limited training/support and different priorities among gardeners may also contribute to inefficient resource use16. Finally, many urban agricultural practitioners favor recycled sources of inputs such as nutrients, which is a benefit from a resource perspective. Still, on average, the application rates of these recycled inputs often exceed the annual amount of nitrogen (N) and phosphorus (P) harvested in crops17. Repeated over-application leads to accumulation in the soil, especially of P, which can, in turn, pose a risk to water quality.

Within cities, nutrient inputs in other types of green spaces can contribute to local water quality impairment. For instance, lawn N fertilization still dominates urban watershed inputs in Minneapolis-St. Paul (MSP), USA, but lawn P fertilization does not, because the state of Minnesota has banned P fertilizers on lawns to improve water quality in nutrient-sensitive lakes18. Although urban agriculture usually represents a much smaller land footprint than lawns, urban farms and gardens are likely to receive far higher areal input rates of N and P. If soils in urban farms and gardens cannot retain excess P, or denitrify excess N19, from years of inputs that exceed crop demand, then nutrient losses from urban agriculture could pose a risk to water quality. This risk is amplified as the area of urban land under cultivation expands20. However, rates of nutrient export from urban agriculture are poorly understood, as few studies have measured runoff, erosion, and leaching of nutrients from urban agriculture17.

There are numerous reasons why, even when not measured, there are concerns around risks of water quality impairment related to urban agriculture17. Composts produced in gardens, or purchased by gardeners, are often poorly characterized, both in terms of initial nutrient content and how they may release organically bound nutrients over time, making it hard to know how much to apply. These organic inputs generally have lower N:P ratios than crop needs21. Therefore, it is not uncommon that annual input rates exceed what can be harvested in crops (Fig. 1a for garden balance conceptualization). In addition, N and P transformation, and retention capacity, in garden soils are poorly characterized; some soils are likely net sinks, while others have exceeded their retention capacity. Finally, N and P exported from urban agricultural systems may have a disproportionate impact on urban water quality due to altered hydrology in cities. High cover of impervious surfaces, stormwater connections, and disturbed soils creates a context of limited opportunities for nutrient retention and removal along hydrologic pathways22,23. As such, although the spatial extent of urban agriculture may be small relative to rural agriculture, nutrient export may still adversely affect important waterbodies in and around cities.

Fig. 1: Main urban agriculture nutrient flows and leachate measurements.

a Conception of nutrient mass balance in gardens. Imports (left-hand side) of nitrogen (N) and phosphorus (P) are often organic in nature, and quite diverse. Once applied to soils these inputs, in addition to on-site nutrient recycling through composting practices, are taken up by crops. Intentional exports of N and P (right-hand side) leave the system through intentional harvest, and unintentional export (bottom) can occur via leaching. Losses (unintentional exports) are also possible via runoff and erosion (N and P), and atmospheric pathways (for N, e.g., denitrification) but are not quantified in our work. The system boundaries are the garden plot (middle of the image) including crops and soil (30 cm down), and assume a one growing-season time step. If imports are larger than intentional exports there is accumulation of N and P in the system and/or an increased risk of unintentional losses. If intentional imports and exports match, they are balanced. If imports are smaller than intentional exports, there is mining of N and P soil pool and/or potentially lower yields/exports over time. Hays include straw and other types of mulch or organic soil coverings. b Sampling scheme common across the three studies. This figure was created in https://BioRender.com.
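
The three balance outcomes described in panel a can be sketched as a small classification rule. This is only an illustrative sketch, not the authors' calculation: the function name, the `tolerance` band used to decide when imports and exports "match", and the example harvest value are all assumptions introduced here.

```python
def classify_balance(imports_kg_ha, harvest_kg_ha, tolerance=0.05):
    """Classify a one-season garden balance for N or P.

    imports_kg_ha: intentional inputs (compost, fertilizer, mulch, ...).
    harvest_kg_ha: intentional export through harvest.
    tolerance: hypothetical relative band within which imports and
               exports are treated as matching (not from the paper).
    Unintentional losses (leaching, runoff, erosion, gaseous N) are
    deliberately left out, as in the conceptual figure.
    """
    surplus = imports_kg_ha - harvest_kg_ha
    if abs(surplus) <= tolerance * max(imports_kg_ha, harvest_kg_ha):
        return "balanced", surplus
    if surplus > 0:
        return "accumulation and/or loss risk", surplus
    return "mining of soil pool", surplus


# Illustrative numbers only: 42 kg P/ha applied, a made-up 10 kg P/ha harvested.
label, surplus = classify_balance(42.0, 10.0)
print(label, surplus)
```

A positive surplus does not say where the nutrient ends up; it only flags that soil accumulation or unintentional loss is possible.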

To maximize the multifunctional benefits of urban agriculture, careful nutrient management is needed, but there is limited field data available to make recommendations. Here, we present empirical data from three research projects that used a standardized protocol to measure nutrient leaching in urban agricultural systems. The teams collected leachate weekly via zero-tension lysimeters at 30 cm below the soil surface (Fig. 1b), filtered samples, and determined nitrate (NO3--N) and phosphate (PO43--P) concentrations. Nutrient inputs were quantified for each study year and plot (or garden), with those in the USA being pre-determined, and those in Sweden obtained via in-depth interviews. Soil properties were also collected for individual plots each year (Supplementary Data 1). The studies exist on a continuum from controlled experiments to in-situ measurements of gardener practices with:

  • A 7-year controlled experiment with different compost types and input levels in a university research garden in Saint Paul, Minnesota, USA, in which initial soil conditions were uniform, and water inputs (from rainfall and irrigation) were uniform across treatments.

  • Three seasons of experimental compost inputs to plots on four urban farms and community gardens across MSP, in which background soil conditions and water inputs varied across farms. We refer to this setup as a semi-observational study.

  • Three seasons of leachate observations in garden plots across four allotment/community garden areas in Linköping, Sweden, in which inputs were documented but not controlled, and background soil conditions and water inputs varied across garden plots.

Using these three set-ups, we present data on weekly and growing season (June to mid-October) nitrate and phosphate in leachate, annual total N and P inputs, and soil properties using standard measures. By comparing these datasets, we assess whether the range of measured nutrient leaching rates from controlled experiments corresponds to real-world conditions (which could have a broader range of management practices). By comparing across sites in different climate zones, we also examine how climate differences may affect nutrient leaching from compost-amended gardens, as well as how nutrient circularity and leaching risk are juxtaposed in these environments.

Results and discussion

Annual input differences do not match leachate outputs

Our observational study supports previous work, showing that gardeners favor organic inputs and that annual nutrient inputs vary greatly across gardens. Compost products were the largest contributor to annual inputs in most gardens (Fig. 2), consistent with urban agricultural projects studied across cities17. The median P application (42 kg P ha−1) and standard deviation (77 kg P ha−1) were lower in our observational study than in other studies that used similar survey methods, including urban agricultural farms in the Netherlands (median = 61 kg P ha−1, SD = 198 kg P ha−1 24), Montreal, Canada (median = 117 kg P ha−1, SD = 829 kg P ha−1 25), and MSP (median = 300 kg P ha−1 26). The same was true for N application rates. Across studies that have looked at nutrient balances in gardens, the larger the survey size, the higher the means and standard deviations, which could explain why our study values were lower. Importantly, the range of annual N and P inputs surveyed in the observational study was relatively well captured by the controlled application ranges (Fig. 2).

Fig. 2: Annual nitrogen (N) and phosphorus (P) inputs in observational gardens (Sweden) benchmarked for experimental input levels (lines).

Each bar represents one garden in one year (abbreviations on the X-axis show that some gardens were in the study for more than one year). The height of each bar is the cumulative input for N (a) and P (b), with the color breakdown corresponding to types of inputs. The order of bars is ascending from left to right for total N input, and is kept for P (so N and P match by garden). Dotted lines represent experimental (USA) input levels where text colors match Fig. 4. The experimental treatment input levels account for both compost and additional synthetic N applied to meet crop demand. That is, assuming 20% of compost N is available for crop uptake in a year, the treatments added synthetic N to ensure crop demand was met (average crop N demand 145 kg/ha for the crops selected). This means that the total N inputs depicted are higher than crop demand even if they are designed to meet crop demand in a given year.
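
The top-up logic in this caption can be written out as a short sketch, which also shows why total N input ends up above crop demand. The 20% availability and the 145 kg/ha demand come from the caption; the function name and the compost rate in the example are hypothetical, and the actual treatment design may differ in detail.

```python
def total_n_input(compost_n_kg_ha, crop_demand_kg_ha=145.0, available_fraction=0.20):
    """Return (synthetic_n, total_n) in kg N/ha for one season.

    Assumes only `available_fraction` of compost N is plant-available in
    the year of application, and that synthetic N fills whatever part of
    crop demand the available compost N does not cover.
    """
    available_compost_n = available_fraction * compost_n_kg_ha
    synthetic_n = max(0.0, crop_demand_kg_ha - available_compost_n)
    return synthetic_n, compost_n_kg_ha + synthetic_n


# Hypothetical compost rate of 200 kg N/ha: about 40 kg is counted as available,
# so roughly 105 kg synthetic N is added and total input is about 305 kg N/ha.
synthetic, total = total_n_input(200.0)
print(round(synthetic, 1), round(total, 1))
```

Because the non-available 80% of compost N is still applied, the total always exceeds demand whenever any compost is used and demand is not already covered by the available fraction.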

Although the median growing season leachate collected for phosphate and nitrate was higher for plots that received inputs than for those that did not (Fig. 3, Supplementary Fig. 1), annual inputs were not a good predictor of seasonal leaching across our three studies. The leaching differences between input and no-input were smallest in the controlled study with the median for N-input plots being 53% and P-inputs 69% higher than plots with no inputs, whereas the difference was almost 3 times higher for P in the semi-observational study. Though differences were not statistically significant in the observational study, the median leachate in plots with inputs was 2.4 times higher for N and 1.8 times higher for P. Even in plots that received no input for one or more years, nutrients often leached into the lysimeters across study sites, sometimes even showing a similar range (Fig. 3c, e), while some lysimeters occasionally contained no water (Supplementary Fig. 2). Overall higher inputs did not strongly correlate with higher growing season leachate (Fig. 2, Supplementary Fig. 3, and Supplementary Fig. 4). Although there is a positive trend between inputs and leachate, for PO43--P especially, regression models violated assumptions with patterns in the residuals.

Fig. 3

Leaching of nitrogen (N) as nitrate (NO3--N) (a–c) and phosphorus (P) as phosphate (PO43--P) (d, e) with and without nutrient inputs to gardens. Violin plots of growing season leachate, where the top panel shows N and the bottom shows P. Y-axis values are on a log base 10 scale. The box and whisker plot inside each shaded area shows the median (middle horizontal line), the interquartile range (the box), 1.5 times the interquartile range (vertical lines), and outliers (dots). The width of the shaded colored area represents the density of data points for each given value. The number of data points differs among datasets because the nature of each research project was different: the controlled study had 168 no-input and 728 input plot-year points for both N and P; the semi-observational study had 44 plot-year points with no input and 47 with inputs; and the observational dataset had 8 plot-year points with no N input and 85 with N inputs, and 9 without P inputs and 84 with P inputs. A non-parametric Wilcoxon rank-sum test shows significant differences between no-input and input groups for the semi-observational and controlled projects (significance level via p-values in e, c, f). Colors match Supplementary Figs. 1, 3, 4, and 5.

Such a lack of fit points to other factors playing an important role in understanding the pattern of leaching, even when accounting for study as a random factor in the models. Precipitation differences also did not closely match leachate difference patterns (Supplementary Fig. 5). The lack of correlation between annual input differences and same-year leaching is not entirely surprising27. Crop uptake, especially for plants capable of luxury uptake, may contribute to this disconnect28,29. For N, atmospheric loss pathways, including denitrification, may further the disconnect between inputs and leachate30,31. There are multiple other chemical and physical processes that decouple inputs from same-season leachate fluxes; notably the sorption of inorganic P and ammonium, as well as organic pools taking several years to fully mineralize32,33. This overall lack of correlation has been observed at the watershed level, in part because soil storage capacity creates a buffer that can later contribute to legacy losses via leaching, runoff, and erosion34,35,36.

Although inputs alone may not be a good predictor of leaching, quantifying inputs and system outputs (Fig. 1) is important in characterizing the net social and environmental benefits of compost-based urban agriculture. For instance, tracking nutrient inputs and soil content is key to maintaining high yields, which can contribute to food security. Similarly, to get heat island mitigation benefits, vegetation needs to be sufficiently lush and thus have access to adequate amounts of nutrients and water (via irrigation or rainfall), which thus require tracking. Conversely, high water or nutrient input application rates, or large pools of available N and P in soils, could contribute to water quality impairment20. Stormwater retention and infiltration may be an important ecosystem service of outdoor UA, but our data show that nutrient export via leachate may be a common occurrence and a potential tradeoff, although direct impacts on local water bodies depend on opportunities for retention along hydrologic flow paths.

Cumulative P input effects may explain part of the variation but not all

When plant-available soil P was higher, concentrations of P in leachate were often higher as well (Fig. 4). A linear regression analysis revealed a significant correlation between plant-available soil P and PO43--P leachate concentrations in the semi-observational research gardens (r2 = 0.56, p < 0.01). In controlled and observational gardens this positive relationship is also present; however, the residuals violated model assumptions, so we do not report these models. Nevertheless, such a pattern supports the idea that legacy P contributes to the leaching observed, and thus to the disconnect with annual inputs. In line with this, a review by Rupp et al. 37 reported that both P concentrations in lysimeter leachate and leaching have been positively correlated with the contents of soil P (i.e., legacy P) extracted by NaHCO3 (Olsen-P) and acid oxalate. Other studies also report correlations between P extracted with other methods (e.g., double lactate extract and the Mehlich-3 P test) and P in lysimeter leachates, though the relation is not unambiguous. Given the more diverse pathways by which N can leave soils, we would not expect a strong correlation between the N content of soil sampled once a year and leachate concentrations.

Fig. 4: Average growing season concentration of phosphate (PO43--P) collected in lysimeters versus the plot soil P level in that same growing season.

P tests for each study (each panel) are different and represent the suggested test for that region. P-AL and Bray-P are used for acidic soils, and Olsen-P for calcareous/alkaline soils64. Similarly, the vertical grey line in each panel represents regional recommendations. Each dot represents one plot in a given study year. Colors in (a) match Figs. 2 and 5 (input treatments). In panels b and c, shapes represent farms and gardens, and the colors match Fig. 3. Given shared site history, one would expect soil P levels to be more similar within one farm (b) or garden area (c), which is why these were used as symbols. The black line in panel b is a linear regression line (p < 0.01).

Across our three studies, soil plant-available P was above the threshold for when national fertilization guidelines recommend halting the use of P inputs to nearly all agricultural row crops (Fig. 4). Such guidelines aim to balance agronomic benefits (costs of fertilizers vs relative yield gains) and environmental protection (minimize risk of loss towards water), and thus are not a perfect indicator of soil saturation and when a system would shift from buffer to legacy source.

Still, given the difference in leachate nutrient fluxes between control and treatment was small relative to the magnitude of leachate in the no-input control treatments (Fig. 3), it seems possible that the majority of leachate came from nutrient inputs applied in previous years, at least for P. The leachate flux in the no-input controls in the controlled research gardens is the result of soil amendments added prior to the start of the experiment, indicating long-term slow mineralization of this soil organic matter. In addition, given the input types used were mostly organic, it is highly likely soil nutrient stocks could build up over time (Supplementary Fig. 6) and have a cumulative impact. In the observational study, the oldest site (i.e. the longest history of urban cultivation, Fig. 4 lower panel blue triangles) had the highest soil P plots, but also the biggest range of soil P of all sites studied. Conversely, the youngest site (Fig. 4c, blue circles) had the lowest soil P, the most clustered distribution, and lowest leachate P concentrations. These patterns in real-world settings support that cultivation length and local soil characteristics affect soil and water quality observations, but that specific management practices (nutrient input types but likely also other practices), especially over a longer period of time, also contribute to the variability observed.

A temporal perspective helps uncover potential drivers of leaching

Depending on the type of input used over time, the controlled study showed increased leaching, while no-input plots eventually decreased in the amount of N and P leached. Leachate NO3--N flux in the controlled study plots was higher for all compost treatments relative to no-input controls during years 2-5 (Fig. 5a). During year 1, NO3--N leachate flux was low relative to controls for three of the compost treatments, and during years 6-7, there was little difference in NO3--N export as leachate between compost-amended plots and controls.

All compost treatments in the controlled study received similar total N inputs, based on the assumption that 20% of compost N would become plant-available each year. Therefore, compost-amended plots received approximately 5x more N inputs relative to N uptake by crops each year, although we did not conduct a complete N mass-balance to determine the fate of these inputs. However, it is notable that NO3--N leachate flux during years 2-5 corresponds to approximately 50% of annual total N inputs for the low-input compost treatments, and approximately 20% of annual total N inputs for the high-input compost treatments. A simulation of garden soil nutrient dynamics showed that compost N contributed significantly to leachate N for three years following compost application20, suggesting that the relatively high NO3--N leachate from compost treatments in years 2-5 was likely the result of mineralization of organic N that was added in earlier years. It is interesting that there were no consistent differences observed in NO3--N leachate flux between high and low compost inputs, and between the plant-based compost and manure-based compost. The manure-based compost has a higher C:N ratio (33.7 compared to 16.1), but the high-input manure compost treatment was also characterized by high microbial respiration rates and high N-mining soil microbial enzyme activity32.

Leachate PO43--P flux in the controlled study showed consistent differences between high- and low-input treatments, after year 1 (Fig. 5c). The high-input manure treatment was associated with higher PO43--P leachate fluxes compared to the high-input plant compost treatment, despite the fact that the plant compost treatment led to higher levels of plant-available soil P (Fig. 4a and ref.38). Low-input compost treatments (targeted for crop P demand) had PO43--P leachate fluxes similar in magnitude to the no-input control treatment. The semi-observational study used the same low-input compost treatment and did not show more leaching than the reference sites (Fig. 5b). This shared pattern supports the idea that targeted compost applications are not likely to be a problem when it comes to P leaching losses.

Fig. 5: Difference in leaching between control (no input) and treatment over time.

The height of each bar is the average leachate per season for lysimeters in a given treatment (color) minus the average leachate per growing season for plots that received no inputs. For the controlled study (a, c) each bar is the average of 20 lysimeters (per treatment) minus the average of 24 no-input lysimeters. For the semi-observational study (b), in year 1, the bar is the average of 8 lysimeters getting the low-input compost treatment minus the average of 10 no-input lysimeters. For year 2, there were 9 treatment and 9 control lysimeters, and in year 3, there were 5 treatment and 6 control lysimeters.
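
As a minimal sketch of the bar-height calculation described in this caption (treatment mean minus no-input mean for one season), assuming per-lysimeter seasonal fluxes are already available as plain lists; the function name and the numbers are illustrative, not study data.

```python
from statistics import mean


def bar_height(treatment_fluxes, control_fluxes):
    """Mean seasonal leachate flux of a treatment minus the mean of the
    no-input lysimeters for the same season (same units as the inputs)."""
    return mean(treatment_fluxes) - mean(control_fluxes)


# Made-up seasonal fluxes (kg/ha) for two treatment and two control lysimeters.
print(bar_height([3.0, 5.0], [1.0, 2.0]))  # 4.0 - 1.5 = 2.5
```

A negative bar therefore means the treatment lysimeters leached less, on average, than the no-input lysimeters in that season.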

Missing information at different temporal and spatial levels may help complete the story

In addition to showing larger leachate differences among inputs over time in the controlled study (Fig. 5), collecting data over multiple years can help integrate the impact of weather on nutrient cycling in gardens, an important factor given the effect of extreme weather on nutrient cycling in agroecosystems39,40. For instance, the 7-year study was able to capture growing seasons ranging in precipitation from 16.6 to 55.2 cm; years 5 and 6 were drought years, leading to low leachate volumes with elevated PO43--P concentrations for some compost treatments38. Flood events were not captured but would also be important, as climate change is predicted to increase the frequency and magnitude of precipitation-related extreme events, and thus nutrient losses toward waters41,42,43, especially in cities where impervious surfaces dominate44. Year-round monitoring can also improve estimates of loss and identify key periods of loss. In Sweden, we measured weekly leaching year-round for 2.5 years in a subset of 8 lysimeters and found that over 50% of N, P, and water leaching happened in the 6-month non-growing seasons (Supplementary Fig. 7). Periods of snowmelt were particularly important, although differences among plots in total leachate per year were still large (Supplementary Fig. 8). Some of these differences may be attributed to non-nutrient-related management practices such as soil covering45. Harvest was only quantified in the MSP controlled study, and was found to be much less variable across treatments compared to leachate fluxes. The ratio of cumulative harvested P to cumulative leachate P was highest (32.3) in the synthetic fertilizer treatment, and lowest (3.1) in the high-input manure compost treatment38.

From a spatial and systems perspective, focusing only on leaching likely underestimates total losses from gardens across studies, but the placement of the lysimeters may have overestimated the N and P values that could find their way to ground and surface water via leaching. We did not quantify denitrification, which would have decreased the amount of N available for leaching and would vary not only with the type of input used, but also with the timing and placement of soil amendments. Higher soil drainage can decrease the importance of denitrification, although denitrification can still occur at micro-sites in aerobic soils; given that the studies had different soil properties (notably good drainage in the controlled and semi-observational studies), this loss pathway would have played different roles. In urban agriculture on vacant land in Buffalo, New York, USA, denitrification potential was not higher in food gardens than in reseeded grass or intact vacant lots, although N soil content was higher, which increases the risk of N runoff, especially in winter19. Still, limitations in our understanding of the placement and timing of inputs are particularly acute for our observational study.

We collected information on total nutrient inputs per year for each gardener's whole garden, and did not account specifically for what was applied on the soil directly above the lysimeters. N and P can reach water via runoff and erosion processes, which we did not measure. Across our study sites, there was minimal sloping, which may minimize losses via these two pathways, but they cannot be discounted without proper field studies. Finally, the relatively shallow placement of the lysimeters (30 cm below the surface) cannot account for sequestration and processing happening lower down in the soil column, before leachate reaches groundwater, especially for N46,47,48. Still, many plots in the observational study seemed to be over a shallow water table with distinct clay layers, given that lysimeters were submerged during spring installation. The placement of the lysimeters themselves can also create preferential flow paths for water, leading to water either bypassing the lysimeters or funneling into them, reducing the quality of data, although we assume this process to be random across lysimeters and not a source of systematic bias.

Minimizing loss risks and maximizing nutrient recycling

Our three studies focused on urban agricultural practices as they exist now, but these can be changed. The controlled study already hints at the fact that combining organic with inorganic inputs could reduce the risk of leaching. That is, applying lower levels of organic amendments to match P needs, given the N:P ratio of recycled materials, and supplementing with inorganic N can ensure high yields while minimizing soil P build-up and the risk of P leaching. The integration of drip irrigation, soil conservation practices, as well as agroforestry and perennials could alter water flows, nutrient needs, and the interplay between water and nutrients49,50. Many of these transitions, however, require longer-term investments (and such infrastructure also comes with environmental costs), making land tenure, consistent advisory services, and monitoring important underpinnings for success15,17,51,52.

Methods

Data used in this paper (see Supplementary Data 1) came from three studies, varying along a controlled to observational spectrum (Supplementary Table 1), where the set-up and protocols for leachate sampling were similar (Fig. 1b) but nutrient inputs, length of study, and context varied38,45,53. Both MSP and Linköping experience a temperate climate with four distinct seasons, with MSP having more extreme temperatures and more precipitation. The most controlled study started in 2017 using raised beds where measurements and garden upkeep were done weekly by undergraduate students over the growing season38. The semi-observational study was done at four garden/farm locations where a sub-plot was used to test the low-input plant compost application from the controlled MSP research garden study, in addition to a ‘grower's choice’ treatment, and was part of a broader project on urban agroecology (e.g., ref. 53). The observational study began in 2020, documenting existing practices in urban agricultural spaces by interviewing practitioners annually to calculate N and P inputs, and installing 2–4 lysimeters in their plots45. In some cases, gardeners only participated in the study for 1 year, while others stayed for all three years.

We do not present full mass balances for gardens, instead focusing on intentional inputs and one loss pathway: leaching. Intentional inputs to plots were known either by their treatment level, or by conducting detailed interviews with gardeners and farmers which included taking pictures of all purchased products to determine N and P content, and using literature estimates when necessary25,45. We do not account for unintentional inputs such as atmospheric deposition or wild animal excreta. Because the observational study could not quantify intentional nutrient exports from harvest (as it is gardeners who eat the produce), we cannot explore the relationship between leaching and this other export pathway. We also did not measure runoff and erosion, or other unintentional losses such as insect damage.

In general, across all projects, water was collected weekly from each zero-tension lysimeter using 60 ml syringes during the growing season to determine water volume and to take samples for N and P concentration analyses (Fig. 1b shows placement and lysimeter components, which are also described in ref. 45). Samples were filtered at a pore size of 0.45 μm (polyethersulfone syringe filters) before measuring nutrient concentrations. The two MSP studies used the same hand-held instruments, a Low Range Phosphate Portable Photometer (Hanna Instruments, HI96713) and a YSI Pro Series Nitrate Ion-Selective Electrode, to measure concentrations after sample dilution. The Swedish study used bench-top equipment, a Nanocolor spectrophotometer UV/VIS (Macherey-Nagel, Germany) to measure PO43--P and a 30 Compact IC Flex ion chromatograph (Metrohm) to measure NO3--N, also following the dilution of samples to be within equipment ranges. Soil samples were collected at least once a year across sites, gathering information on texture as well as nutrient content using locally relevant agronomic tests (Supplementary Table 2). See Supplementary Table 1 for more details about sampling and equipment across the studies, as well as Small et al. 38 and Sieczko et al. 45 for analytical details. All data processing was done in RStudio (R Core Team 1.3.959) using the plotly, broom, dplyr, data.table, RColorBrewer, scales, stringr, rstatix and reticulate packages54,55,56,57,58,59,60,61,62.

To standardize data and facilitate comparison as much as possible, while still documenting variability, we selected a subset of weekly concentrations and water volumes that roughly matched across studies. Specifically, we used cutoff dates of June 1 to October 15 for each season, with two exceptions related to fieldwork constraints. In year 7 of the MSP research garden project (2023), the last sampling happened at the end of September, meaning the seasonal cumulative numbers for that year are slight underestimates. In year 3 of the Linköping garden study (2022), we had to cut data off on August 17 because of data loss, which also underestimates seasonal leachate.

To calculate weekly fluxes, we multiplied the water volume by the concentration of each nutrient for each lysimeter. In all datasets, there were several weeks where the volume of leachate was insufficient to conduct NO₃⁻-N and/or PO₄³⁻-P analyses. In these cases, we used the average concentration for a given plot (i.e. lysimeter) and season to calculate N and P leaching for the missing weeks, multiplying it by that week's actual (low) water volume. To minimize their influence on our results, we omitted these interpolated weeks from all analyses and data presentations other than seasonal fluxes.
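The gap-filling rule described above can be sketched as follows. This is an illustration in Python only (the study's processing was done in R); the function name `fill_missing_concentrations`, the record layout, and the example values are hypothetical, and the units are assumed to be liters for volume and mg per liter for concentration.

```python
from statistics import mean

def fill_missing_concentrations(weeks):
    """Weekly fluxes (mg) for one lysimeter and one season.

    `weeks` is a list of (volume_l, conc_mg_per_l) tuples; the concentration
    is None when the sample was too small to analyze (assumed encoding).
    Returns a list of (flux_mg, interpolated) tuples.
    """
    measured = [conc for _, conc in weeks if conc is not None]
    if not measured:
        raise ValueError("no measured concentration in this season")
    season_mean = mean(measured)  # plot-and-season average concentration
    fluxes = []
    for volume_l, conc in weeks:
        interpolated = conc is None
        used = season_mean if interpolated else conc
        # The week's actual (low) volume is always used, even when the
        # concentration is filled in.
        fluxes.append((volume_l * used, interpolated))
    return fluxes

# Hypothetical season: two analyzed weeks and two low-volume weeks.
weeks = [(0.50, 4.0), (0.02, None), (0.80, 6.0), (0.01, None)]
fluxes = fill_missing_concentrations(weeks)
```

Keeping the `interpolated` flag next to each flux makes it straightforward to drop gap-filled weeks from every analysis except the seasonal totals.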

We combined collected weekly water volume and concentration data (including interpolated weeks) to calculate leaching fluxes for each lysimeter/plot for each growing season in each study (Eq. 1).

$$F_{l,s,y}=\sum_{\text{week}\,1}^{\text{week}\,n}\text{concentration}_{\text{N or P}}\times \text{Volume}$$
(1)

Where F stands for the seasonal leaching flux of NO₃⁻-N or PO₄³⁻-P for lysimeter l, study s, and year y. We sum the weekly fluxes, each calculated by multiplying concentration by volume. To standardize across lysimeters, as two different sizes were used (Supplementary Table 1), we also divided these seasonal fluxes by funnel area and expressed leaching in kilograms per hectare per growing season.
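As a worked illustration of Eq. 1 and the area normalization, the sketch below sums weekly fluxes and converts the result to kilograms per hectare. It is written in Python for illustration only; the function name, the example fluxes, and the funnel area of 0.05 m² are hypothetical (the actual funnel sizes are given in Supplementary Table 1), and weekly fluxes are assumed to be in milligrams.

```python
def seasonal_flux_kg_per_ha(weekly_flux_mg, funnel_area_m2):
    """Eq. 1 followed by area normalization.

    Sums weekly fluxes (mg) for one lysimeter and season, divides by the
    funnel area (m^2), and converts mg per m^2 to kg per ha.
    """
    if funnel_area_m2 <= 0:
        raise ValueError("funnel area must be positive")
    total_mg = sum(weekly_flux_mg)          # Eq. 1: sum over weeks
    mg_per_m2 = total_mg / funnel_area_m2   # standardize by funnel size
    # 1 mg/m^2 = 1e-6 kg / 1e-4 ha = 0.01 kg/ha
    return mg_per_m2 * 0.01

# Hypothetical weekly fluxes (mg) and funnel area (m^2).
flux = seasonal_flux_kg_per_ha([2.0, 0.1, 4.8, 0.05], funnel_area_m2=0.05)
```

With these illustrative numbers the season total is 6.95 mg over 0.05 m², or 139 mg per m², which corresponds to about 1.39 kg per hectare.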

To avoid underestimating weekly fluxes where there was water in lysimeters, but not enough to do concentration analyses, we assumed average N and P concentrations per season and plot for missing weeks. Given that this assumption is only used for low-flow weeks, the relative importance of these additions was expected to be low. For the controlled study, however, interpolated weeks represented on average 69.8% of nitrate and 35.6% of phosphate estimated fluxes per plot (Eqs. 2 and 3). When considering the importance of interpolated weeks for the full flux of nutrients measured for all lysimeters combined per growing season over all years, they represent only 5.1% of seasonal nitrate fluxes and 7.5% of phosphate fluxes (Eq. 4). The first set of numbers thus seems high because there were many lysimeters with very little water in them over the years. In the observational study, these calculated values represented 16.4% for nitrate and 11.1% for phosphate on average per season per plot. The highest cumulative effect of the assumption on seasonal study totals was also relatively small, at 6.6% of P and 6.1% of N fluxes. In the semi-observational study, the missing weeks contributed on average 26.7% of P fluxes per season per plot, with a cumulative contribution of only 8.4%. Although less frequent, there were also weeks across studies where more than 1 liter of water was collected. Because the lysimeter collection bottle held 1 liter, and saturated soil above the lysimeter kept refilling it as water was collected, we could only report a maximum of 1 l and know that these weeks are underestimates.

$$I_{l,s,y}=\frac{F_{l,s,y}-F_{l,s,y}^{*}}{F_{l,s,y}}\times 100$$
(2)

Where I stands for the percentage of the flux made up by interpolated data and * stands for a seasonal flux as calculated with Eq. 1 but only for weeks where there was enough water to do N and P measurements.

$$I_{s}=\frac{\sum\limits_{1,1,s}^{n,n,s} I_{l,s,y}}{n_{\text{lysimeter seasonal fluxes}}}$$
(3)

Where I_s stands for the average percentage of a flux made up by interpolated data for lysimeters in study s. n varies by study with the number of years and the number of lysimeters per study year.

$$\mathit{IT}_{s,y}=\frac{F_{s,y}-F_{s,y}^{*}}{F_{s,y}}$$
(4)

Where IT stands for the interpolated share of the total seasonal nutrient flux for a study site per year. Here, instead of taking the average per lysimeter year (Eq. 3), we use the cumulative flux of all lysimeters in a given study per year. This metric is less sensitive to the fact that lysimeters which have little water will by definition have more interpolated data. We report this metric per study year, as opposed to an average across years, to be able to identify outliers.
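The contrast between the per-lysimeter average (Eqs. 2 and 3) and the cumulative metric (Eq. 4) can be illustrated with a small sketch. It is written in Python for illustration only; the function name and the three (F, F*) pairs are hypothetical and in arbitrary flux units, and the Eq. 4 value is expressed here as a percentage for comparability with Eq. 2.

```python
def interpolated_share(total, measured_only):
    """Percent of a flux that comes from interpolated weeks (form of Eq. 2)."""
    if total <= 0:
        raise ValueError("total flux must be positive")
    return (total - measured_only) / total * 100

# Hypothetical (F, F*) pairs for one study year: two nearly dry lysimeters
# with mostly interpolated flux, and one wet lysimeter with little.
pairs = [(1.0, 0.2), (2.0, 0.5), (100.0, 98.0)]

# Eq. 2: share per lysimeter; Eq. 3: unweighted mean of those shares.
per_lysimeter = [interpolated_share(f, f_star) for f, f_star in pairs]
mean_share = sum(per_lysimeter) / len(per_lysimeter)

# Eq. 4 (as a percent): share computed on the summed fluxes, so each
# lysimeter is weighted by how much it actually leached.
total_f = sum(f for f, _ in pairs)
total_f_star = sum(f_star for _, f_star in pairs)
cumulative_share = interpolated_share(total_f, total_f_star)
```

In this example the per-lysimeter shares are 80%, 75%, and 2%, giving a mean of about 52%, while the cumulative share is only about 4%, because the dry lysimeters contribute very little to the total flux.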

We calculated summary statistics (e.g. median) for leachate and soil properties and then examined whether leaching was related to input amount and type (Figs. 3, 5, Supplementary Fig. 3, Supplementary Fig. 4, Supplementary Fig. 6), soil properties (Fig. 4), and precipitation (Supplementary Fig. 5). The three studies differed in the number of lysimeters and years, as well as in crops and treatments; one must use caution in looking for relationships without accounting for these differences. In addition, the semi-observational and observational studies did not have true replicates. As such, we relied on pattern descriptions and only a few statistical tests, setting the significance threshold at p < 0.05.

More specifically, we ran non-parametric Wilcoxon rank-sum tests to look at the difference between plots with inputs and no inputs (Fig. 3). Following Zuur et al.63, we used a systematic approach to the application of regressions to look at how input levels (kg per season) and seasonal leachate fluxes were related. For N, P, and N:P input ratio, we compared three models to evaluate whether study was an important factor to consider, and selected the model with the lowest Akaike Information Criterion (AIC) value as an indicator of the best model while avoiding overfitting. The three models were run using restricted maximum likelihood (REML). The first model in all cases was a generalized least squares (GLS) model without random effects, equivalent to a simple linear regression, with input quantity per season as the independent variable. The dependent variable was the average leachate flux per season for each treatment and crop combination (controlled and semi-observational studies) or gardener (observational study), to account for the asymmetry between the number of different inputs and the number of lysimeters. In other words, we accounted for replication in the two MSP studies, and for our coarsest data resolution in the Swedish study. The second model was a linear mixed effects model with study as a random effect for the intercept, and the third model included study as a random effect for both intercept and slope. Below is an example of R code for the three models being compared. We ran such models on both log base 10 and non-transformed leachate data.

library(mgcv); library(nlme)

Data$City <- factor(Data$City)

# Model 1: GLS without random effects
B1 <- gls(N_NO3_flux_per_season_g_m2 ~ 1 + N_input_g_per_m2, method = "REML", data = Data)

# Model 2: study (City) as a random intercept
B2 <- lme(N_NO3_flux_per_season_g_m2 ~ 1 + N_input_g_per_m2, data = Data, random = ~1 | City, method = "REML")

# Model 3: study (City) as a random intercept and slope
B3 <- lme(N_NO3_flux_per_season_g_m2 ~ 1 + N_input_g_per_m2, data = Data, random = ~1 + N_input_g_per_m2 | City, method = "REML")

AIC(B1, B2, B3)

If there were patterns in the residuals of a model (evaluated by visual inspection of Residuals vs Fitted, Normal Q-Q, Scale-Location, and Residuals vs Leverage plots), we deemed it inappropriate to report the significance of the relationship between explanatory parameters and leachate fluxes.

To explore the relationship between soil P and leachate PO₄³⁻-P concentrations, we ran linear regressions for each study separately (Fig. 4), looking for patterns in the residuals as this would violate model assumptions, making statistical results inappropriate for reporting. Finally, to look at the potential impact of input type over time in the controlled and semi-observational studies, we calculated the difference between the average control (no-input) plot leaching and the average leaching of lysimeters under a particular treatment (Fig. 5).
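The control comparison can be sketched as below, in Python for illustration only. The function name, the treatment label, and all values are hypothetical. The sign convention (treatment minus control, so that positive values indicate more leaching than the no-input plots) is an assumption, as the text does not state the direction of the difference.

```python
from statistics import mean

def leaching_difference(treatment_fluxes, control_fluxes):
    """Mean treatment leaching minus mean control (no-input) leaching.

    Sign convention is assumed: positive means the treatment leached
    more than the control plots.
    """
    return mean(treatment_fluxes) - mean(control_fluxes)

# Hypothetical seasonal fluxes per lysimeter, keyed by study year.
control_by_year = {1: [2.0, 4.0], 2: [1.0, 3.0]}
compost_by_year = {1: [5.0, 7.0], 2: [6.0, 10.0]}

difference_by_year = {
    year: leaching_difference(compost_by_year[year], control_by_year[year])
    for year in control_by_year
}
# difference_by_year == {1: 3.0, 2: 6.0}
```

Computing the difference separately for each year keeps the time dimension, so a growing gap between treatment and control would show up as an increasing sequence of values.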