Abstract
Developing scalable, context-specific sustainability assessments is a major challenge in agroecosystem management. This study presents the Agroecosystem Sustainability Index (ASI), a novel tool integrating environmental, economic, and social indicators for comprehensive agricultural sustainability assessment. Using county-level data from the U.S. Corn Belt (1997–2022), we applied advanced statistical methods, dimensionality reduction, and weighting schemes. A hybrid approach combining factor analytic weighting and weighted sum deviation proved most reliable. The ASI scores range from 1 to 100, with higher scores indicating greater sustainability. It identified three distinct grower patterns: “Optimizers,” “Emerging Conservationists,” and “Challenged Growers,” enabling targeted sustainability strategies. The ASI offers a robust, adaptable method applicable across diverse agricultural landscapes, supporting targeted interventions to enhance sustainability globally.
Introduction
Sustainable management of agroecosystems is a critical global challenge due to the complex interactions among environmental, economic, and social factors that influence agricultural productivity and ecosystem health. Developing scalable and context-specific sustainability assessments that accurately capture these multidimensional dynamics remains a major obstacle, particularly for large and heterogeneous agricultural regions such as the U.S. Corn Belt. Traditional assessment methods often fall short in integrating diverse data sources across spatial and temporal scales, limiting their ability to inform targeted and effective sustainability interventions.
Developing a comprehensive composite index to assess the sustainability of agroecosystems is challenging due to the complex, multi-layered nature of agricultural systems. Current methods struggle to capture the multiple dimensions of sustainability in large regions such as the U.S. Corn Belt, particularly in integrating environmental, economic, and social factors across time and space. Previous approaches, including composite indices, spatially explicit assessments, and machine learning techniques, have encountered limitations in data integration, resolution, and capturing the complexity of agroecosystems1,2.
To support interpretation and communication, we developed three grower personas based on indicator dynamics from 1997 to 2022: “Optimizers” (high-performing, profitability-driven), “Emerging Conservationists” (moderately profitable, adapting to environmental constraints), and “Challenged Growers” (low-performing, resource-constrained). The “Optimizers” label reflects farms prioritizing economic performance through high input use and productivity, while acknowledging environmental trade-offs, particularly in emissions management. In contrast, the “Emerging Conservationists” represent farms transitioning toward conservation-oriented practices; although they currently exhibit higher greenhouse gas emissions, their trajectory signals growing environmental stewardship expected to improve over time. These personas are relative descriptors that compare clusters’ characteristics within the study context, highlighting distinct sustainability strategies and development pathways across counties.
Our methodology for creating the Agroecosystem Sustainability Index (ASI) integrates theory-guided data science (TGDS) principles throughout the workflow, consistent with the recommendations of the US National Research Council3,4. Expert knowledge informed indicator selection during data pre-processing to prioritize agronomically relevant variables and exclude spurious ones. Feature selection combines statistical criteria with sustainability theory to retain meaningful indicators representing environmental, economic, and social dimensions. TGDS also guided the construction of the composite index—particularly in evaluating weighting schemes and aggregation methods—ensuring the inclusion of theoretically grounded sustainability benchmarks. Finally, cluster interpretation was informed by established agroecological and socio-economic concepts, enabling the definition of actionable grower personas. This integration of expert insight with data-driven methods enhances both robustness and interpretability, ensuring the ASI reflects complex agroecosystem dynamics accurately and supports spatially explicit and temporally dynamic assessments5,6.
We synthesized datasets from various sources including the National Agricultural Statistics Service (NASS), Economic Research Service (ERS), European Center for Medium-Range Weather Forecasts (ECMWF), Environmental Protection Agency (EPA), and International Soil Reference and Information Center (ISRIC), covering a wide range of agricultural, environmental, and socio-economic factors1,7,8. Our analytical framework combines statistical and machine learning approaches to uncover hidden patterns and nonlinear interactions in agroecosystems9,10. Rigorous data pre-processing, dimensionality reduction, and pattern recognition procedures ensure methodological robustness and transparency2,11.
In constructing the composite index, different weighting schemes and aggregation methods were evaluated, building on the work of the Joint Research Center12. Our hybrid approach combines factor analytic weighting with the weighted sum of variance of sustainability benchmarks, validated by sensitivity analysis and cross-validation.
The novelty of our approach lies in the integration of TGDS principles that combine expert knowledge with statistical methods for comprehensive assessment. This methodology captures the complex interactions between sustainability dimensions and provides a nuanced understanding of agroecosystem functioning.
The resulting ASI is a robust, adaptable tool applicable to different agricultural landscapes and provides a basis for targeted action to improve agricultural sustainability worldwide. An accompanying paper13 demonstrates its practical application in the U.S. Corn Belt region, making an important contribution to the ongoing dialog on agricultural sustainability.
Results
Pre-processing of data and selection of indicators for the development of ASI
In developing the Agroecosystem Sustainability Index (ASI), we used a comprehensive approach to data processing and cluster analysis. We evaluated several normality improvement techniques, including Yeo-Johnson, Box-Cox, quantile, and rank-based inverse normal transformations (INT). As shown in Supplementary Tables 1 and 2, the INT method was the most effective, based on improved statistical metrics and higher p-values14,15,16. This finding aligns with previous studies emphasizing the robustness of rank-based normalization methods17,18, confirming its suitability for our dataset.
To further ensure data comparability, we compared Min-Max normalization and Z-score normalization across multiple criteria (Supplementary Fig. 1 and Supplementary Table 3). Z-score normalization slightly enhanced random forest model performance and range consistency, whereas Min–Max normalization minimized outliers and improved clustering performance. Given their respective strengths, both methods were retained for further sensitivity analysis, allowing us to assess their influence on ASI robustness.
Our feature reduction procedure included the removal of redundant indicators and the application of principal component analysis (PCA). We validated our selection by analyzing the loadings of the metrics on the principal components to ensure that the retained metrics contributed substantially to explaining the data variance. This step was crucial to streamline the dataset while preserving multidimensional sustainability information. The resulting streamlined set of metrics that are critical for representing different aspects of agroecosystem sustainability is shown in Supplementary Table 4.
Together, these preprocessing steps—normalization, standardization, and indicator selection—formed a robust analytical base for the subsequent development of the ASI, ensuring both statistical soundness and interpretability.
Clustering analysis reveals distinct patterns of agroecosystem sustainability across Midwest and Great Plains counties
To explore sustainability patterns across counties, we employed both agglomerative and K-means clustering using Min–Max and Z-score scaled data. The elbow method indicated that a three-cluster solution was optimal. Clustering performance was evaluated using several indices, including the Silhouette score, Davies-Bouldin score, Calinski-Harabasz index, and random forest accuracy (Supplementary Table 5). K-means clustering combined with Min–Max normalization outperformed the other configurations (Supplementary Figs. 2 and 3). The Calinski-Harabasz index reached its highest value (1672) under this configuration, supporting its selection for downstream analysis. Cluster structures were visualized using PCA (Supplementary Fig. 4), and we identified cluster characteristics based on 1997–2022 indicator dynamics (Supplementary Figs. 5 and 6).
Based on the analysis, we identified three distinct clusters of growers:
- “Optimizers”: These growers exhibit the highest profitability and invest heavily in agricultural inputs and technology. They effectively diversify income sources, prioritize economic returns, and apply some conservation practices such as no-till farming and cover crops. However, they face trade-offs in environmental performance, especially in emission management.
- “Emerging Conservationists”: Moderately profitable growers operating under the harshest environmental constraints (e.g., poor soils, low rainfall, rugged topography). Despite producing the highest emissions, they are increasingly adopting conservation practices, particularly reduced tillage.
- “Challenged Growers”: This group faces the greatest challenges, with low profitability and limited adoption of modern practices. Although they benefit from more favorable environmental conditions than other groups, these are not sufficient to overcome their economic and agricultural disadvantages.
These clusters were mapped across 13 states (Supplementary Fig. 7), revealing geographic patterns: “Optimizers” dominate the Corn Belt, while “Emerging Conservationists” are more frequent in the Great Plains. These findings suggest that sustainability performance is closely tied to both biophysical and socioeconomic factors. Future development of agroecosystem sustainability indices will allow for more in-depth analysis of the factors underlying these regional patterns.
Comparative analysis of weighting methods for ASI—insights from Kendall’s tau correlations
To evaluate different weighting strategies, we used Kendall’s tau to compare rankings from PCA, FA, equal weighting, and expert judgment. Statistical methods (PCA, PCA1, FA) showed strong internal consistency, suggesting a coherent underlying data structure. In contrast, equal weighting and expert judgment revealed lower correlations with statistical methods, reflecting divergent priorities (Supplementary Fig. 8). Specifically:
- Statistical methods emphasized economic livelihoods and farm management indicators (Fig. 1), aligning with previous research19.
- Equal weighting provided balanced importance across a wide range of indicators, offering transparency but less precision20.
- Expert judgment highlighted indicators related to management practices and farmer characteristics, capturing more nuanced sustainability dimensions11.
Fig. 1: The figure displays the top-ranked indicators identified by four weighting methods: Principal Component Analysis (PCA), Factor Analysis (FA), equal weighting, and expert judgment. Each bar represents the relative importance or ranking of key indicators within each method.
These variations demonstrate the complex, multi-criteria nature of agroecosystem sustainability and the importance of context-specific weighting choices.
Sensitivity analysis and methodological assessment of the ASI
To understand how methodological choices affect ASI results, we used Sobol’s variance-based sensitivity analysis21. Weighting emerged as the most influential factor (ST = 1.027), followed by normalization (ST = 0.984) and aggregation (ST = 0.904), although aggregation had negligible first-order effects (S1 ≈ 0), indicating its effect arises through interactions (Supplementary Fig. 9).
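As an illustration of how first-order (S1) and total-order (ST) Sobol indices are estimated, the following minimal sketch applies the standard Saltelli and Jansen Monte Carlo estimators to a toy model. The model, the function name `sobol_indices`, and the uniform input ranges are assumptions made for illustration and are not the ASI pipeline itself. The toy model is chosen so that two inputs act only through their interaction, which reproduces the pattern described above: a first-order index near zero alongside a clearly non-zero total-order index.

```python
import random

def sobol_indices(model, n_inputs, n_samples=20000, seed=42):
    """Estimate first-order and total-order Sobol indices by Monte Carlo.

    Inputs are drawn independently from Uniform(-1, 1) (an assumption of
    this sketch). Uses the Saltelli (2010) first-order estimator and the
    Jansen total-order estimator.
    """
    rng = random.Random(seed)

    def draw():
        return [rng.uniform(-1.0, 1.0) for _ in range(n_inputs)]

    A = [draw() for _ in range(n_samples)]
    B = [draw() for _ in range(n_samples)]
    f_A = [model(row) for row in A]
    f_B = [model(row) for row in B]

    pooled = f_A + f_B
    mean_all = sum(pooled) / len(pooled)
    variance = sum((v - mean_all) ** 2 for v in pooled) / len(pooled)

    first, total = [], []
    for i in range(n_inputs):
        # A with column i taken from B: isolates the effect of input i
        f_AB = [model(a[:i] + [b[i]] + a[i + 1:]) for a, b in zip(A, B)]
        s1 = sum(fb * (fab - fa) for fa, fb, fab in zip(f_A, f_B, f_AB))
        st = sum((fa - fab) ** 2 for fa, fab in zip(f_A, f_AB))
        first.append(s1 / n_samples / variance)
        total.append(0.5 * st / n_samples / variance)
    return first, total

def toy_model(x):
    # x[0] acts additively; x[1] and x[2] act only through their product
    return x[0] + x[1] * x[2]

S1, ST = sobol_indices(toy_model, n_inputs=3)
```

For this toy model the analytical values are S1 = (0.75, 0, 0) and ST = (0.75, 0.25, 0.25). Because the indices are sampling estimates, values can fall slightly outside the theoretical [0, 1] range when the sample is finite.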
Further assessments revealed considerable variation across ASI designs (Supplementary Fig. 10). WSM and WSD methods provided consistent mean values (~50, SD 15–17), while MOM showed greater variability, particularly under Z-score normalization. GMM proved especially sensitive to normalization choice.
Supplementary Fig. 11 illustrates how weighting schemes influenced index values under various aggregation methods. Expert-based weights often produced higher ASI values, particularly under MOM.
Based on these results, WSM and WSD emerge as the most stable and interpretable methods. MOM may be preferable when the identification of extreme values is of interest. These findings reinforce the need for careful selection of method combinations tailored to research goals.
Comprehensive analysis of ASI methods: sensitivity, uncertainty and cross-validation
To confirm the robustness of ASI construction methods, we complemented the sensitivity analysis with Monte Carlo simulations and cross-validation. These analyses evaluated the impact of design variations on index stability and accuracy.
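A Monte Carlo stability check of this kind can be sketched as follows. The sketch assumes a plain weighted sum as a stand-in for the aggregation step, a uniform relative perturbation of the weights, and hypothetical indicator values; none of these choices are taken from the study, and the names `weighted_sum` and `monte_carlo_index` are illustrative only.

```python
import random
from statistics import mean, stdev

def weighted_sum(indicators, weights):
    """Aggregate [0, 1]-scaled indicators into a 0-100 score."""
    return 100.0 * sum(w * x for w, x in zip(weights, indicators))

def monte_carlo_index(indicators, base_weights, n_draws=5000, noise=0.2, seed=1):
    """Return the mean and standard deviation of the score under weight noise."""
    rng = random.Random(seed)
    scores = []
    for _ in range(n_draws):
        # Perturb each weight by up to +/- noise (relative), then renormalize
        perturbed = [w * (1.0 + rng.uniform(-noise, noise)) for w in base_weights]
        total = sum(perturbed)
        normalized = [w / total for w in perturbed]
        scores.append(weighted_sum(indicators, normalized))
    return mean(scores), stdev(scores)

county = [0.8, 0.4, 0.6]    # hypothetical indicators, already scaled to [0, 1]
weights = [0.5, 0.3, 0.2]   # hypothetical baseline weights summing to 1
mc_mean, mc_sd = monte_carlo_index(county, weights)
```

A small standard deviation relative to the mean indicates that the score for this county is insensitive to moderate uncertainty in the weights; repeating the procedure for each construction method allows their stability to be compared.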
Supplementary Fig. 12 shows that WSD_Expert_judgement had the highest sensitivity values, both first-order (S1 = 0.057609) and total-order (ST = 0.053486). Yet the relatively small differences between all methods confirm the difficulty of identifying a universally “best” approach, consistent with findings in ref. 20.
Variance-based sensitivity and coefficient of variation analyses (Fig. 2) confirmed that WSD_FA was among the most stable and reliable methods, with a low standard deviation and strong performance under uncertainty. Monte Carlo simulations (Supplementary Figs. 13, 14, and 15) also highlighted WSD_FA’s reliability (mean = 58.44; SD = 13.22), while cross-validation showed it had the lowest MSE variability.
Fig. 2: Variance-based sensitivity and coefficient of variation analyses of ASI construction methods are shown, indicating the relative stability and reliability of each method under uncertainty. Results include Monte Carlo simulation outputs (mean and standard deviation) and cross-validation error variability metrics, highlighting differences in performance across methods.
While WSD_FA stands out as a balanced and dependable method, the strong performance of multiple methods supports the idea of combining approaches based on theoretical, empirical, and policy considerations22,23.
Comparative analysis of weighting methods for agroecosystem sustainability indices
This study examines different weighting methods for agroecosystem sustainability indices, comparing principal component analysis (PCA), factor analysis (FA), equal weighting, and expert judgment. Kendall’s tau rank correlation is used to assess the consistency of indicator rankings between these methods.
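To make the comparison concrete, the sketch below computes Kendall’s tau between indicator weights produced by different schemes. The weight vectors are hypothetical values chosen for illustration, and the function implements the simple tau-a variant, which assumes no tied values; library implementations such as `scipy.stats.kendalltau` handle ties as well.

```python
from itertools import combinations

def kendall_tau(x, y):
    """Kendall's tau-a for two equally long score lists without ties."""
    concordant = discordant = 0
    for i, j in combinations(range(len(x)), 2):
        # A pair is concordant when both lists order items i and j the same way
        sign = (x[i] - x[j]) * (y[i] - y[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / n_pairs

# Hypothetical weights for the same five indicators under three schemes
pca_weights = [0.30, 0.25, 0.20, 0.15, 0.10]
fa_weights = [0.28, 0.27, 0.18, 0.17, 0.10]
expert_weights = [0.10, 0.30, 0.15, 0.25, 0.20]

tau_pca_fa = kendall_tau(pca_weights, fa_weights)          # identical ordering
tau_pca_expert = kendall_tau(pca_weights, expert_weights)  # divergent ordering
```

In this toy example the two statistical schemes rank the indicators identically (tau = 1), while the expert weights yield a weakly negative tau, mirroring the kind of divergence reported between statistical and judgment-based methods.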
The analysis shows strong correlations between the statistical methods (PCA, PCA1 and FA), indicating a consistent data structure. PCA1 and FA show the strongest correlation, indicating similar indicator rankings despite possible differences in sign (Supplementary Fig. 8). In contrast, equal weighting and expert judgement show a weak correlation with the statistical methods, indicating different prioritization approaches.
Statistical methods consistently prioritize economic livelihoods and farm management indicators (Fig. 1), aligning with previous findings that emphasize quantifiable economic factors19. Indicators such as AG_Herbicide, Commodity_Sales, and Crop_Insurance are often given a high weighting. PCA focuses on a mix of economic and management indicators, while PCA1 and FA show slight bias towards government programs and asset indicators. Equal weighting assigns uniform importance to a broader range of indicators, valued for its transparency and simplicity20. Expert judgment highlights farmer characteristics and farm management, capturing contextualized sustainability signals11.
These findings reinforce the multi-layered nature of agroecosystem sustainability. Statistical methods capture quantifiable components, while expert judgment includes nuanced insights, highlighting the complementarity of approaches. As also noted in ref. 1, relying on a single method may underrepresent dimensions crucial for local realities.
The study concludes that a hybrid approach that incorporates multiple weighting methods may provide a more robust and inclusive assessment of sustainability. Future research should include sensitivity and uncertainty analyses to synthesize these perspectives and reduce method-related bias24.
Mapping the sustainability of agroecosystems in the US Corn Belt: a twenty-five-year analysis using the weighted sum-distance factor analysis method
We analyzed agroecosystem sustainability in the U.S. Corn Belt using the Agroecosystem Sustainability Index (ASI), constructed with the weighted sum-distance factor analysis (WSD FA) method (Fig. 3). We examined sustainability trends across three regional sustainability patterns, “Optimizers”, “Emerging Conservationists” and “Challenged Growers”, from 1997 to 2022 (Supplementary Fig. 16 and Supplementary Tables 6, 7, 8 and 9).
Fig. 3: Violin plots depict the distribution and density of ASI values across the US Corn Belt from 1995 to 2020. The plots show temporal variation and data spread for each year, highlighting changes in agroecosystem sustainability over the study period.
The WSD FA method was selected for its superior performance across evaluation criteria. It had the lowest sensitivity index (0.0363) in variance-based analysis, high stability of the coefficient of variation (ranked 4th), and minimal mean square error variability (SD = 3.15) in cross-validation. Monte Carlo simulations confirmed its robustness, yielding a mean ASI of 58.44 (range: 45.23–71.66). This method integrates factor analysis with distance-based aggregation, aligning with established benchmarking practices25,26.
Geographic differences between states revealed that “Optimizers” consistently achieved the highest ASI scores (50.13 to 80.09), indicating superior sustainability performance, while “Challenged Growers” had the lowest scores (24.46 to 45.35) and showed no statistically significant improvement over time. These observed patterns are consistent with findings from the US Midwest, where diversified systems and crop rotations have been associated with improved agroecosystem performance and long-term soil health27,28. Conversely, the persistent challenges among “Challenged Growers” align with studies highlighting barriers to adopting sustainable practices in more input-intensive or monoculture-dominated systems29,30,31.
Discussion
These findings confirm that sustainable agriculture is a complex, multidimensional challenge requiring trade-offs between economic performance, environmental integrity, and social equity. The ASI framework developed here supports tailored strategies for each sustainability cluster and demonstrates how data-driven tools can guide evidence-based decision-making.
By combining robust statistical techniques with real-world data, ASI offers a flexible and scalable platform to support sustainability planning, monitor progress, and inform targeted interventions. It represents an important step toward operationalizing sustainability in agricultural landscapes.
This study introduces a novel and rigorous framework for constructing the Agroecosystem Sustainability Index (ASI), designed to capture the complexity and multidimensionality of sustainability across agricultural landscapes. By integrating diverse data sources with advanced statistical and machine learning techniques, including robust normalization, dimensionality reduction, clustering, and sensitivity analysis, we developed an index that is not only methodologically sound but also adaptable to diverse agroecological and socio-economic contexts.
The ASI revealed meaningful spatial and temporal trends, exposing inequalities and transformations in farming systems over the past 25 years. By distinguishing farmer typologies and tracking their sustainability trajectories, our approach provides actionable insights for policymakers, researchers, and practitioners aiming to foster more resilient, equitable, and sustainable agroecosystems.
Beyond technical rigor, this index offers a strategic tool for evidence-based decision-making, capable of guiding investments, informing policy reforms, and monitoring progress toward national and global sustainability goals. Future research should explore dynamic indicator integration, stakeholder-driven weighting, and predictive analytics to enhance its responsiveness and relevance in a rapidly changing agricultural landscape.
In short, the ASI lays a critical foundation for moving from measurement to meaningful transformation in agroecosystem sustainability.
Methods
Multi-scale data integration and workflow
To construct the Agroecosystem Sustainability Index (ASI), we systematically selected indicators through a structured review of established sustainability assessment frameworks in agriculture (e.g., FAO’s SAFA guidelines, USDA indicators, and peer-reviewed agroecological indices). This approach ensured that the ASI was grounded in internationally recognized principles and aligned with existing methodologies.
Indicators were chosen based on four main criteria: (i) theoretical alignment with the ecological, economic, and management dimensions of sustainability; (ii) capacity to capture spatial variation in key agroecosystem functions at the county scale; (iii) availability across a consistent temporal window (1997–2022) to enable longitudinal assessment; and (iv) sufficient data resolution and quality for integration across all U.S. agricultural counties. These criteria ensured that selected indicators were not only conceptually sound but also operationally feasible for a national-scale assessment.
Although the ASI includes social indicators, these are primarily limited to farmer demographics and farm structural characteristics. More complex community-level dimensions of social sustainability—such as land tenure security beyond ownership, access to public services, and community cohesion—could not be fully addressed due to the lack of disaggregated, publicly available data. This represents a recognized limitation and an area for future enhancement as more granular social datasets become accessible.
While most datasets were available at the county level, certain indicators, such as greenhouse gas (GHG) emissions, were only available at the state level. To maintain the ASI’s county-level resolution, we applied proportional disaggregation methods that allocated state-level values to counties based on cropland area and agricultural activity. Although this introduced some spatial smoothing, the disaggregated indicators were selected for their relatively stable distribution across counties, minimizing distortion and preserving the integrity of spatial interpretation.
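The proportional allocation described above can be sketched as follows. This minimal example uses cropland area as the only allocation key (the study also considered agricultural activity), and the county names, areas, and state total are hypothetical.

```python
def disaggregate_state_total(state_total, county_cropland):
    """Split a state-level total across counties in proportion to cropland area."""
    total_area = sum(county_cropland.values())
    if total_area <= 0:
        raise ValueError("total cropland area must be positive")
    return {
        county: state_total * area / total_area
        for county, area in county_cropland.items()
    }

# Hypothetical cropland areas (hectares) for three counties in one state
cropland_ha = {"County A": 50_000, "County B": 30_000, "County C": 20_000}

# Hypothetical state-level emissions total, in arbitrary units
county_emissions = disaggregate_state_total(1_000.0, cropland_ha)
```

By construction the county values sum back to the state total, so the disaggregation redistributes the reported quantity without creating or losing any of it.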
We used Python to visualize datasets, analyze indicator trends, and evaluate the consistency and coverage of data across time and space. Time-series plots at county and state levels helped detect anomalies, assess data completeness, and validate indicator behavior. We emphasized county-level data due to its direct relevance for understanding agroecosystem performance. This quality control step was essential to ensure the reliability of the index inputs and improve the robustness of the results.
Despite the challenges of data availability, the use of multi-scale data is widely recognized in land use science and spatial modeling. This study adopted a county-focused approach, leveraging the most detailed publicly available aggregation of farm-level information in the U.S. We selected a five-year temporal resolution from 1997 to 2022, striking a balance between data continuity and the ability to track long-term sustainability trends. This design enabled a nuanced understanding of spatial and temporal dynamics in agricultural systems across the Corn Belt region.
To support indicator analysis and index construction, we developed an integrated data processing workflow (Fig. 4) that links raw datasets to actionable insights. This framework addresses common challenges in large-scale environmental data science, including data cleaning, normalization, dimensionality reduction, and unsupervised learning. By combining these techniques in a structured and reproducible pipeline, we facilitated efficient data exploration, reduced analytic fragmentation, and improved transparency.
Fig. 4: The schematic illustrates the stepwise process starting with data cleaning and normalization, followed by dimensionality reduction and clustering methods, leading to the identification and interpretation of agroecosystem sustainability patterns.
This systematic workflow reveals hidden patterns and relationships in complex, multi-layered datasets, contributing to a more comprehensive and empirically grounded understanding of agroecosystem sustainability. It also enhances the interpretability and reliability of the ASI, supporting future efforts in sustainable agriculture monitoring and decision-making.
Data profiling and quality improvement in time series analysis
Our analytical process began with a comprehensive data profiling phase, designed to assess the structure, distribution, and completeness of the dataset before applying advanced statistical techniques. This initial stage combined descriptive statistics, visualization tools, and exploratory data analysis to detect underlying patterns, identify outliers, and flag potential anomalies that could influence the robustness of the Agroecosystem Sustainability Index (ASI).
To systematically evaluate data reliability, we assessed quality using six key criteria: accuracy, completeness, consistency, conformity, integrity, and duplication (Supplementary Table 10). These metrics allowed us to quantify the strengths and weaknesses of each dataset and prioritize necessary cleaning steps. Overall data quality indices ranged from 83% to 99%, with completeness presenting the most frequent limitation—particularly for historical or spatially disaggregated variables.
To address missing values, we employed K-Nearest Neighbors (KNN) imputation, a non-parametric method that estimates missing entries based on the Euclidean distance between similar data points. This approach was selected for its effectiveness in preserving local data structures and minimizing distortion in multivariate datasets. In time series where only one year of data was missing, we applied constant interpolation, while linear interpolation was used when two or more consecutive years were absent. These techniques ensured temporal continuity without artificially inflating trends or patterns.
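The idea behind KNN imputation can be illustrated with a small standard-library sketch. The number of neighbors (`k=2`), the rescaling of distances by the number of shared columns, and the toy data are assumptions of this example rather than settings reported in the study; in practice a library routine such as scikit-learn's `KNNImputer` would typically be applied to scaled data.

```python
import math

def knn_impute(rows, k=2):
    """Fill None entries with the column mean over the k nearest rows.

    Distance is Euclidean over the columns that both rows have observed,
    divided by the number of shared columns so that rows with different
    amounts of missing data remain comparable.
    """
    def distance(a, b):
        shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
        if not shared:
            return math.inf
        return math.sqrt(sum((x - y) ** 2 for x, y in shared) / len(shared))

    filled = [list(row) for row in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            # Candidate donors: other rows that observed column j
            donors = [
                (distance(row, other), other[j])
                for m, other in enumerate(rows)
                if m != i and other[j] is not None
            ]
            donors.sort(key=lambda d: d[0])
            nearest = [v for _, v in donors[:k]]
            if nearest:  # leave the gap if no donor observed this column
                filled[i][j] = sum(nearest) / len(nearest)
    return filled

# Toy county-by-indicator table with one missing entry
data = [
    [1.0, 2.0, 3.0],
    [1.1, 2.1, None],
    [0.9, 1.9, 2.8],
    [8.0, 9.0, 10.0],
]
completed = knn_impute(data, k=2)
```

Here the missing value is filled from the two rows that resemble it most closely, while the dissimilar fourth row does not contribute, which is how the method preserves local data structure.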
Finally, to ensure comparability across indicators and counties, we standardized data using agricultural area estimates from the USDA Cropland Data Layer (CDL). This spatial normalization step allowed us to express all metrics relative to actual cropland extent, enhancing the consistency and interpretability of comparisons within and across counties in the ASI framework.
Data normalization and geographic integrity in spatial analysis at the county level
To enhance data normality, we applied four transformation methods: Yeo-Johnson transformation, Box-Cox transformation with constant addition, quantile transformation, and rank-based inverse normal transformation (INT). These methods were selected because they address different types of non-normality and improve the suitability of data for parametric analyses. The effectiveness of each transformation was evaluated using the Shapiro-Wilk test, confirming significant improvements in normality which, in turn, strengthened the reliability of subsequent analyses.
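For reference, a rank-based inverse normal transformation can be sketched in a few lines. The Blom offset (c = 3/8) and the use of average ranks for ties are assumptions of this example, since the study does not state which variant was applied; the input values are hypothetical.

```python
from statistics import NormalDist

def rank_inverse_normal(values, c=3.0 / 8.0):
    """Rank-based inverse normal transformation with average ranks for ties."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    pos = 0
    while pos < n:
        end = pos
        # Extend the run while the next sorted value ties with the current one
        while end + 1 < n and values[order[end + 1]] == values[order[pos]]:
            end += 1
        average_rank = (pos + end) / 2.0 + 1.0  # ranks are 1-based
        for idx in order[pos:end + 1]:
            ranks[idx] = average_rank
        pos = end + 1
    normal = NormalDist()
    # Map each rank to a probability strictly inside (0, 1), then to a z-value
    return [normal.inv_cdf((r - c) / (n - 2.0 * c + 1.0)) for r in ranks]

skewed = [0.2, 0.5, 1.0, 3.0, 40.0]   # hypothetical right-skewed indicator
transformed = rank_inverse_normal(skewed)
```

Because only the ranks enter the transformation, the extreme value 40.0 is mapped to a moderate positive z-value rather than dominating the scale. Normality of the result can then be checked with a test such as `scipy.stats.shapiro`.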
Outliers were deliberately retained rather than removed, to preserve the full geographic variability and maintain the integrity of county-level patterns. This decision aligns with best practices in spatial data analysis, recognizing that extreme values often reflect genuine local conditions rather than errors. Retaining outliers allows for a more accurate understanding of regional disparities and enhances the ability to capture important variations within the agroecosystem32,33,34,35.
Normalization of the data
We applied both Min–Max normalization and z-score standardization to prepare the data for analysis. Min–Max normalization rescales the data to a fixed range, which is particularly useful for distance-based algorithms like K-Means clustering but is sensitive to extreme values (outliers). In contrast, z-score standardization centers data around a mean of zero and scales it by the standard deviation, preserving the original data distribution and reducing the influence of outliers36,37.
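The two rescaling rules can be written compactly as below. This is a minimal sketch using the population standard deviation (an assumption; a sample standard deviation is equally common) and a hypothetical indicator containing one extreme value.

```python
from statistics import mean, pstdev

def min_max(values):
    """Rescale values linearly to the [0, 1] range."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def z_score(values):
    """Center values on zero and scale by the population standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]

yields = [2.0, 4.0, 6.0, 8.0, 30.0]   # hypothetical indicator with one outlier
scaled = min_max(yields)
standardized = z_score(yields)
```

In the Min–Max output the outlier occupies the value 1.0 and compresses the remaining observations toward the lower end of the range, which illustrates the sensitivity to extreme values noted above.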
To determine the optimal method, we conducted a comprehensive sensitivity analysis comparing the two techniques based on several criteria: consistency of value ranges, retention of outliers, preservation of variable correlations, performance in machine learning models (Random Forest), clustering quality (K-Means), and similarity of distributions assessed via Kolmogorov-Smirnov tests. This thorough evaluation aimed to select the normalization approach that best balances data comparability and robustness, ensuring reliable outcomes in downstream analyses like clustering and predictive modeling.
Reducing the data dimension
To manage the high dimensionality of our agroecosystem sustainability panel dataset, we applied a systematic dimensionality reduction approach. First, we used Pearson correlation coefficient analysis with a 75% threshold to identify and remove highly correlated variables, reducing redundancy and preventing multicollinearity issues38. Next, we performed principal component analysis (PCA) to further reduce dimensionality by transforming the data into uncorrelated components. The optimal number of principal components was selected based on eigenvalues and the cumulative explained variance, ensuring that the retained components captured most of the dataset’s variability while simplifying the analysis39. This combined strategy improved data interpretability without significant loss of information.
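The correlation-based screening step can be sketched as follows. The greedy rule (keep an indicator unless its absolute correlation with an already retained one exceeds the threshold) is one reasonable reading of the procedure and is an assumption of this example, as are the indicator names and values.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equally long lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def drop_correlated(columns, threshold=0.75):
    """Greedily keep indicators whose |r| with every kept one is <= threshold."""
    kept = []
    for name in columns:
        if all(abs(pearson(columns[name], columns[k])) <= threshold for k in kept):
            kept.append(name)
    return kept

# Hypothetical indicators observed in five counties
indicators = {
    "fertilizer_expense": [1.0, 2.0, 3.0, 4.0, 5.0],
    "chemical_expense": [1.1, 2.0, 3.2, 3.9, 5.1],   # nearly collinear with the first
    "cover_crop_share": [0.3, 0.1, 0.4, 0.1, 0.5],
}
retained = drop_correlated(indicators)
```

Note that the outcome of a greedy filter depends on the order in which indicators are visited, so in practice the ordering (for example, by theoretical priority) should be chosen deliberately.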
Clustering and feature analysis
We applied both agglomerative clustering and K-means clustering to the county-level metrics derived from the reduced indicator dataset to capture different clustering dynamics. Agglomerative clustering builds a hierarchy of clusters visualized through dendrograms, allowing flexibility in exploring cluster structures without requiring a predetermined number of clusters. In contrast, K-means clustering partitions data into a fixed number of clusters determined by the elbow method, providing computational efficiency and clear cluster membership assignments for large datasets. To objectively evaluate clustering quality and select the optimal method, we used Silhouette Score, Davies-Bouldin Index, and Calinski-Harabasz Score, which respectively measure cluster cohesion, separation, and overall validity.
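The K-means step and the elbow heuristic can be illustrated with a compact standard-library sketch. The deterministic farthest-first seeding and the toy two-dimensional points are assumptions made so that the example is reproducible; they are not the study's configuration. In practice, scikit-learn's `KMeans` together with `silhouette_score`, `davies_bouldin_score`, and `calinski_harabasz_score` from `sklearn.metrics` would cover the same ground.

```python
def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def farthest_first(points, k):
    """Deterministic seeding: start at the first point, then repeatedly add
    the point farthest from all centroids chosen so far."""
    centroids = [list(points[0])]
    while len(centroids) < k:
        nxt = max(points, key=lambda p: min(squared_distance(p, c) for c in centroids))
        centroids.append(list(nxt))
    return centroids

def kmeans(points, k, max_iter=100):
    """Lloyd's algorithm; returns (labels, centroids, inertia)."""
    centroids = farthest_first(points, k)
    labels = None
    for _ in range(max_iter):
        # Assignment step: index of the nearest centroid for every point
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: each centroid becomes the mean of its members
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster empties
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    inertia = sum(squared_distance(p, centroids[lab]) for p, lab in zip(points, labels))
    return labels, centroids, inertia

# Toy data: three well-separated groups of Min-Max scaled points
points = [
    (0.1, 0.1), (0.2, 0.1), (0.1, 0.2),
    (0.9, 0.9), (0.8, 0.9), (0.9, 0.8),
    (0.1, 0.9), (0.2, 0.8), (0.1, 0.8),
]

# Elbow heuristic: within-cluster sum of squares for k = 1..4
inertia_by_k = {k: kmeans(points, k)[2] for k in range(1, 5)}
labels, centroids, _ = kmeans(points, 3)
```

Plotting `inertia_by_k` against k shows a sharp drop up to k = 3 and little improvement afterwards, which is the "elbow" used to choose the number of clusters.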
To understand the key variables driving cluster differences, we applied random forest classification, which offers robust performance metrics and feature importance ranking. We then used PCA for dimensionality reduction to visually represent clusters in two dimensions, enhancing interpretability with colored dots, distribution ellipses, and principal axis direction lines.
This integrated analytical and visualization strategy enabled a thorough characterization of clusters, facilitating meaningful labeling and interpretation. By “humanizing” the clusters, we gained deeper insights into their sustainability significance, supporting a more nuanced and actionable assessment.
Construction of the ASI: from multi-method synthesis to sensitivity-driven refinement
The ASI was developed through a systematic process exploring various weighting schemes and aggregation methods to capture different compensation behaviors among indicators. This approach was essential to ensure the composite index robustly reflects the multidimensional and complex nature of agroecosystem sustainability. Figure 5 illustrates the methodological framework, highlighting key steps and their interrelationships to provide a clear overview of the process. Providing detailed methodological descriptions enhances transparency and reproducibility, enabling readers to fully grasp the structure, rationale, and decision-making behind the ASI development.
Fig. 5: This figure presents an overview of the main stages involved in constructing the Agroecosystem Sustainability Index (ASI), emphasizing how each step connects to the next. Detailed methodological descriptions support transparency and reproducibility, helping readers understand the structure, rationale, and choices made throughout the ASI development process.
Weighting as a source of disputes
The development process of the ASI involved careful consideration of weighting methods, recognizing their significant influence on the resulting composite index within a benchmarking framework. We examined five different approaches: Structured Process Expert Elicitation, Equal Weights, and three variants of Principal Component Analysis (PCA)/Factor Analysis.
Structured expert elicitation was used to systematically capture expert opinions, ensuring informed and context-relevant weighting. Equal weighting was applied due to the prior reduction of highly correlated characteristics in our dataset, which increased transparency and helped avoid double counting40. PCA and factor analysis were chosen for their strength in revealing hidden data structures and relationships among indicators39,41.
PCA was applied in two distinct ways: first, by using the loadings of the first principal component as weights42, capturing the dominant pattern of variability; and second, by using cumulative loadings of components explaining 95% of the variance43, thereby encompassing a broader range of significant factors44,45.
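The two PCA-based weighting variants might look like the sketch below. The text does not give the exact formulas, so both are assumptions: variant 1 normalizes the absolute loadings of the first component, and variant 2 sums squared loadings over the components needed to reach 95% of the variance, each scaled by its explained-variance share. The data are synthetic.

```python
import numpy as np

def pca_weights(X, variance_target=0.95):
    """Return (first-component weights, cumulative-loading weights), each summing to 1."""
    eigvals, eigvecs = np.linalg.eigh(np.corrcoef(X, rowvar=False))
    order = np.argsort(eigvals)[::-1]          # sort components by descending variance
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    explained = eigvals / eigvals.sum()

    # Variant 1: absolute loadings on the first principal component.
    w_pc1 = np.abs(eigvecs[:, 0])
    w_pc1 = w_pc1 / w_pc1.sum()

    # Variant 2: squared loadings of the leading m components, weighted by
    # explained variance, where m is the smallest count reaching the target.
    m = int(np.searchsorted(np.cumsum(explained), variance_target) + 1)
    w_cum = (eigvecs[:, :m] ** 2 * explained[:m]).sum(axis=1)
    w_cum = w_cum / w_cum.sum()
    return w_pc1, w_cum

rng = np.random.default_rng(0)
latent = rng.normal(size=(150, 1))
# Four synthetic indicators: two share a latent driver, two are independent.
X = np.column_stack([latent[:, 0] + 0.5 * rng.normal(size=150),
                     latent[:, 0] + 0.5 * rng.normal(size=150),
                     rng.normal(size=150),
                     rng.normal(size=150)])
w_pc1, w_cum = pca_weights(X)
print(w_pc1.round(3), w_cum.round(3))
```

Taking absolute values in variant 1 discards the sign of each loading, which is acceptable only when indicators have already been oriented so that higher values mean greater sustainability.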
A stepwise assessment was conducted to evaluate weighting and aggregation procedures. Kendall tau rank correlation measured the consistency of indicator rankings across methods, while comparisons of top-ranked indicators highlighted how different weighting strategies influenced index outcomes. This incremental testing exposed the sensitivity and convergence or divergence across computational approaches, offering insight into the robustness of the ASI. Figure 5 presents the methodological framework, illustrating the relationships between weighting strategies and their integration into the overall ASI development process.
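For readers unfamiliar with the rank comparison, the following sketch computes Kendall's tau for two score vectors in plain Python. It implements the tau-a form, which assumes no ties; the two score lists are invented for illustration.

```python
from itertools import combinations

def kendall_tau(a, b):
    """Kendall tau-a: (concordant - discordant) / number of pairs, assuming no ties."""
    concordant = discordant = 0
    for i, j in combinations(range(len(a)), 2):
        s = (a[i] - a[j]) * (b[i] - b[j])
        if s > 0:
            concordant += 1
        elif s < 0:
            discordant += 1
    n_pairs = len(a) * (len(a) - 1) // 2
    return (concordant - discordant) / n_pairs

# Hypothetical scores for the same five items under two weighting schemes.
scores_equal = [0.90, 0.70, 0.50, 0.30, 0.10]
scores_pca = [0.80, 0.90, 0.40, 0.35, 0.20]

tau = kendall_tau(scores_equal, scores_pca)
print(tau)  # 0.8: only the first two items swap rank
```

A value near 1 indicates that two schemes order the items almost identically, while values near 0 indicate little agreement.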
Multi-criteria aggregation methods for assessing the sustainability of agroecosystems
Assessing the sustainability of agroecosystems requires a balanced approach that equally considers environmental, economic, and social dimensions, acknowledging their interdependence46. A holistic assessment is critical to avoid situations where poor performance in one dimension is masked by better outcomes in others, which can lead to misleading conclusions about overall sustainability47,48. To effectively address this complexity, we employed multicriteria analysis (MCA) methods, applying four common approaches that differ in their compensatory behaviors. This strategy aligns with core principles of agroecosystem sustainability and enables us to evaluate the sensitivity of the model to trade-offs among indicators and to explore how different balancing mechanisms affect the overall assessment.
- Minimum Operator or Bottleneck Method (MOM): The sustainability index is the minimum value of all indicator values. This ensures that poor performance on one indicator or aspect cannot be compensated for by good performance on others. This method identifies the weakest link or the most critical point that determines the overall sustainability index49. The formula is given in Eq. (1):
where x1, x2, …, xn are the normalized indicator values and w1, w2, …, wn are the weights for each indicator.
- Weighted Sum Method (additive) (WSM): This method is fully compensatory, meaning that good performance on one indicator can completely offset poor performance on another, depending on the relative weights assigned. The weights represent the relative importance or priority assigned to each indicator. By including weights, the method allows stakeholders or experts to reflect their preferences or priorities in the aggregation process50,51. The formula is given in Eq. (2):
where x1, x2, …, xn are the normalized indicator values and w1, w2, …, wn are the weights for each indicator.
- Weighted sum of deviations (WSD) method: Because it is only partially compensatory, the weighted sum of deviations method is well suited to complex sustainability assessments in which some trade-offs are acceptable but full compensation between sustainability aspects is not desirable. It uses a weighted linear combination of deviations from ideal reference points, so that the ASI reflects the overall deviation from a desired sustainable state20,26,52. Reference points are tailored to each indicator’s contribution to sustainability and are selected from three decades of historical benchmark data: maximum values for indicators that contribute positively, minimum values for indicators with negative impacts, and median values for balanced factors. This three-step methodology (identifying reference values, determining the corresponding transformed values, and applying them in aggregation) improves interpretability, consistency, flexibility, and adaptability (see Supplementary Table 11 for more details). By integrating historical minimum, maximum, and median values as reference points, the method remains consistent with established sustainability assessment practice. The resulting ASI captures the complex nature of agroecosystem sustainability while ensuring transparency and reproducibility, balancing scientific rigor and practical applicability in agricultural sustainability assessment53,54. The formula is given in Eq. (3):
where xi are the normalized indicator values, ri are the normalized reference values and wi are the weights for each indicator.
- Geometric mean method (GMM): The geometric mean method allows partial compensation among indicators and takes all indicators into account in the calculation. Compared with the arithmetic mean, it reduces the compensation effect and yields a lower sustainability index when indicator values differ greatly from one another. This method considers trade-offs and emphasizes the balanced performance of all indicators55,56. The formula is given in Eq. (4):
where x1, x2, …, xn are the normalized indicator values and w1, w2, …, wn are the weights for each indicator.
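The four aggregation rules can be compared side by side with a small sketch. The functional forms below are common textbook versions and are assumptions here, not reproductions of Eqs. (1) to (4): MOM is taken as the unweighted minimum, as the prose describes, and WSD as one minus the weighted absolute deviation from the reference values so that higher scores remain better. The indicator values, weights, and reference points are invented.

```python
import math

def mom(x):
    """Minimum operator: the weakest indicator sets the score (non-compensatory)."""
    return min(x)

def wsm(x, w):
    """Weighted sum: sum_i w_i * x_i (fully compensatory)."""
    return sum(wi * xi for wi, xi in zip(w, x))

def wsd(x, w, r):
    """Weighted sum of deviations, assumed form: 1 - sum_i w_i * |x_i - r_i|."""
    return 1.0 - sum(wi * abs(xi - ri) for wi, xi, ri in zip(w, x, r))

def gmm(x, w):
    """Weighted geometric mean: prod_i x_i ** w_i (partially compensatory)."""
    return math.prod(xi ** wi for wi, xi in zip(w, x))

# Hypothetical normalized indicators, weights summing to 1, and reference points.
x = [0.9, 0.4, 0.6]
w = [0.5, 0.3, 0.2]
r = [1.0, 1.0, 0.5]   # e.g. maximum, maximum, and median as references

mom_val, wsm_val, wsd_val, gmm_val = mom(x), wsm(x, w), wsd(x, w, r), gmm(x, w)
print(mom_val, round(wsm_val, 3), round(wsd_val, 3), round(gmm_val, 3))
```

With these inputs the minimum operator returns 0.4, the weighted sum about 0.69, and the geometric mean falls between the two, which illustrates the ordering from non-compensatory to fully compensatory aggregation.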
Sensitivity analysis for the ASI construction
To ensure the robustness and reliability of the ASI, we conducted complementary methodological and data sensitivity analyses. These analyses aimed to understand how uncertainties in data and methodological choices propagate through the index construction process and to identify the most robust approach for calculating the ASI.
Given the potential influence of subjective decisions in composite index construction, we performed a comprehensive evaluation focusing on the interactions between aggregation, weighting, and normalization procedures. This multi-layered assessment tested the robustness of the ASI under varying methodological scenarios.
We applied a Sobol’ sensitivity analysis—a variance-based method that decomposes the total variance of ASI results—to quantify the relative importance of each methodological factor. This technique provides global sensitivity indices, including first order and total effect measures, allowing us to pinpoint the methodological decisions that most significantly impact ASI variability.
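A compact way to see what first-order and total-effect indices measure is to estimate them for a toy model with a known answer. The sketch below uses the Saltelli (first-order) and Jansen (total-effect) Monte Carlo estimators with NumPy; the toy function, the uniform inputs, and the sample size are assumptions for illustration and do not represent the ASI pipeline, where the inputs would encode the choice of aggregation, weighting, and normalization.

```python
import numpy as np

def sobol_indices(model, n_inputs, n_samples=100_000, seed=0):
    """Estimate first-order and total-effect Sobol' indices for inputs uniform on [0, 1]."""
    rng = np.random.default_rng(seed)
    A = rng.random((n_samples, n_inputs))
    B = rng.random((n_samples, n_inputs))
    yA, yB = model(A), model(B)
    pooled = np.concatenate([yA, yB])
    mean, var = pooled.mean(), pooled.var()
    yA, yB = yA - mean, yB - mean          # centering reduces estimator noise
    first, total = [], []
    for i in range(n_inputs):
        AB = A.copy()
        AB[:, i] = B[:, i]                 # A with column i taken from B
        yAB = model(AB) - mean
        first.append(np.mean(yB * (yAB - yA)) / var)          # Saltelli estimator
        total.append(0.5 * np.mean((yA - yAB) ** 2) / var)    # Jansen estimator
    return np.array(first), np.array(total)

def toy_index(U):
    """Hypothetical model: additive in the first two inputs, third input inert."""
    return U[:, 0] + 2.0 * U[:, 1]

S1, ST = sobol_indices(toy_index, n_inputs=3)
print(S1.round(2), ST.round(2))
```

For this additive toy model the analytical first-order indices are 0.2, 0.8, and 0, and the total-effect indices coincide with them because there are no interactions; a gap between the two in a real analysis signals interaction between methodological choices.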
To visualize the results, we used Kernel Density Estimation (KDE) plots to show index value distributions across different method combinations, including aggregation methods (WSM, MOM, WSD, GMM), weighting schemes (PCA, PCA1, FA, Equal, Expert Judgment), and normalization techniques (Min–Max, Z-score). Additionally, boxplots were employed to directly compare outcomes from various aggregation strategies.
This thorough sensitivity analysis enhances the credibility and applicability of the ASI for assessing agroecosystem sustainability in diverse contexts and offers valuable insights for future methodological refinements and practical applications.
Analytical framework for optimizing the ASI design
In developing the ASI, we employed a multi-layered analytical strategy to identify optimal index construction methods, focusing on weighting and aggregation techniques. Our objective was to build a methodological framework that accurately reflects agroecosystem sustainability while ensuring the stability and reliability of the results. This analytical process, illustrated in Fig. 6, involved a rigorous assessment of sensitivity, uncertainty, and robustness across various techniques.
Fig. 6: This flowchart outlines the analytical process designed to build a robust and reliable framework for evaluating agroecosystem sustainability. It highlights the steps taken to assess sensitivity, uncertainty, and robustness across multiple techniques, ensuring the accuracy and stability of the results.
We used two complementary methods in our sensitivity analysis: the Sobol’ method and a variance-based analysis approach. The Sobol’ method quantified how each technique responds to changes in input parameters, whereas the variance-based approach calculated the mean and variance of the index values, derived sensitivity indices, and visualized the results. Together, these methods provided comprehensive insights into method sensitivities and their roles in capturing overall variability.
To quantify uncertainty, we performed a Monte Carlo simulation with 10,000 iterations, randomly selecting calculation methods and introducing simulated data noise. From this, key statistics—means, standard deviations, and confidence intervals—were derived and visualized via histograms and confidence interval plots.
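The simulation loop can be sketched as below. Only the iteration count comes from the text; the two candidate aggregation rules, the Gaussian noise level of 0.02, the indicator values, and the percentile-based 95% interval are assumptions for the example.

```python
import numpy as np

def weighted_sum(x, w):
    return float(np.dot(w, x))

def geometric_mean(x, w):
    return float(np.prod(x ** w))

rng = np.random.default_rng(42)
x = np.array([0.9, 0.4, 0.6])      # hypothetical normalized indicators
w = np.array([0.5, 0.3, 0.2])      # hypothetical weights summing to 1
methods = [weighted_sum, geometric_mean]

n_iter = 10_000
draws = np.empty(n_iter)
for t in range(n_iter):
    method = methods[rng.integers(len(methods))]           # random method choice
    noisy = np.clip(x + rng.normal(0.0, 0.02, size=x.size), 1e-6, 1.0)  # data noise
    draws[t] = method(noisy, w)

mean, sd = draws.mean(), draws.std(ddof=1)
ci_low, ci_high = np.percentile(draws, [2.5, 97.5])        # empirical 95% interval
print(round(mean, 3), round(sd, 3), round(ci_low, 3), round(ci_high, 3))
```

Because the method is drawn at random in each iteration, the spread of the draws reflects both data noise and methodological disagreement, which is why the resulting distribution can be multimodal.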
Robustness was evaluated using a 5-fold cross-validation procedure. For each fold and method, mean squared error (MSE) and R² values were computed, and stability was quantified as the standard deviation of these metrics across folds. Results were displayed with scatter plots and bar charts, facilitating comparison of methods based on average performance and stability.
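A minimal sketch of fold-wise MSE and R² with their across-fold standard deviation follows. The text does not state which quantity is predicted in each fold, so the setup here is an assumption: an ordinary least-squares model reproduces a synthetic index from synthetic indicators.

```python
import numpy as np

def kfold_scores(X, y, k=5, seed=0):
    """Per-fold MSE and R^2 of an ordinary least-squares fit with intercept."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(y))
    mse, r2 = [], []
    for fold in np.array_split(idx, k):
        train = np.setdiff1d(idx, fold)
        X_train = np.column_stack([np.ones(len(train)), X[train]])
        X_test = np.column_stack([np.ones(len(fold)), X[fold]])
        beta, *_ = np.linalg.lstsq(X_train, y[train], rcond=None)
        resid = y[fold] - X_test @ beta
        mse.append(np.mean(resid ** 2))
        r2.append(1.0 - np.sum(resid ** 2) / np.sum((y[fold] - y[fold].mean()) ** 2))
    return np.array(mse), np.array(r2)

# Synthetic indicators and an index built from them with a little noise.
rng = np.random.default_rng(3)
X = rng.random((100, 3))
y = X @ np.array([0.5, 0.3, 0.2]) + rng.normal(0.0, 0.01, size=100)

mse, r2 = kfold_scores(X, y)
stability = {"mse_sd": mse.std(ddof=1), "r2_sd": r2.std(ddof=1)}
print(mse.mean(), r2.mean(), stability)
```

Lower standard deviations across folds indicate a more stable method, which is the sense in which stability is compared alongside average performance.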
This comprehensive analytical framework enabled the identification of the most reliable and consistent ASI calculation approaches, ensuring a scientifically rigorous and practically applicable methodology for assessing agroecosystem sustainability.
ASI trends and patterns
In our final section, we present the ASI results through a comprehensive, multi-layered analysis (Supplementary Fig. 15). We begin with a temporal trend assessment, tracking percentage changes over a 25-year period in three farmer typologies: “Optimizers,” “Emerging Conservationists,” and “Challenged Growers.”
Geographic variation in ASI values was explored using state-level heat maps that highlight spatial patterns by farmer type and over time. Descriptive statistical analysis provided summary metrics, including mean, median, standard deviation, range, and distribution shape, for each typology and for the overall dataset. To illustrate the ASI distributions, violin plots were used across six time points (1997, 2002, 2007, 2012, 2017, and 2022), incorporating both general and cluster-specific median lines to facilitate comparisons.
Temporal differences within each farmer typology were further analyzed through a one-way ANOVA to test for overall differences across years. This was followed by paired t-tests for year-to-year comparisons and Tukey’s HSD test to assess multiple group differences. Statistical significance was evaluated at different levels, ranging from highly significant (p < 0.001) to non-significant (p ≥ 0.05), with results detailed in Supplementary Tables 7, 8, and 9.
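The F statistic behind the one-way ANOVA can be computed by hand, as in the sketch below. The ASI scores for the three years are invented; in practice a statistics library would also supply the p-value and the follow-up paired and Tukey comparisons.

```python
def one_way_anova_f(groups):
    """F = (between-group mean square) / (within-group mean square)."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

# Hypothetical ASI scores for one typology in three survey years.
asi_by_year = {
    1997: [41.0, 42.0, 43.0],
    2007: [42.0, 43.0, 44.0],
    2017: [45.0, 46.0, 47.0],
}
f_stat = one_way_anova_f(list(asi_by_year.values()))
print(f_stat)  # 13.0 with 2 and 6 degrees of freedom
```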
This integrated analysis provides a nuanced understanding of ASI trends, spatial-temporal dynamics, and statistically significant changes across farmer profiles, offering valuable insights into the evolution of agricultural sustainability over time.
Data availability
Publicly available datasets were analyzed in this study. These data can be found here: https://quickstats.nass.usda.gov/; https://www.ers.usda.gov/; https://data.isric.org/geonetwork/srv/eng/catalog.search#/home; https://www.ecmwf.int/en/forecasts/datasets.
Code availability
The Python code used to generate the results is available from the corresponding author upon reasonable request. The analysis was conducted using Python version 3.9, employing libraries such as pandas, numpy, and scikit-learn. Key parameters for dimensionality reduction, weighting schemes, and aggregation methods are documented in the Methods section to ensure reproducibility.
References
Talukder, B., Blay‑Palmer, A., Vanloon, G. W. & Hipel, K. W. Towards complexity of agricultural sustainability assessment: main issues and concerns. Environ. Sustain. Indic. 6, 100038 (2020).
Salas‑Zapata, W. A. & Ortiz‑Muñoz, S. M. Analysis of meanings of the concept of sustainability. Sustain. Dev. 27, 153–161 (2019).
National Research Council. Toward Sustainable Agricultural Systems in the 21st Century (Natl Acad. Press, Washington, DC, 2010).
Karpatne, A. et al. Theory-guided data science: a new paradigm for scientific discovery from data. IEEE Trans. Knowl. Data Eng. 29, 2318–2331 (2017).
Bechini, L. & Castoldi, N. On-farm monitoring of economic and environmental performances of cropping systems: results of a 2-year study at the field scale in northern Italy. Ecol. Indic. 9, 1096–1113 (2009).
Spangler, K., Burchfield, E. K. & Schumacher, B. Past and current dynamics of US agricultural land use and policy. Front. Sustain. Food Syst. 4, 98 (2020).
Lichtfouse, E. et al. Agronomy for sustainable agriculture: a review. Agron. Sustain. Dev. 29, 1–6 (2009).
Robertson, G. P. et al. Farming for ecosystem services: an ecological approach to production agriculture. BioScience 64, 404–415 (2014).
Bennett, E. M., Peterson, G. D. & Gordon, L. J. Linking biodiversity, ecosystem services, and human well‑being: three challenges for designing research for sustainability. Curr. Opin. Environ. Sustain. 14, 76–85 (2015).
Chopin, P. et al. Avenues for improving farming sustainability assessment with upgraded tools, sustainability framing and indicators: a review. Agron. Sustain. Dev. 41, 1–20 (2021).
Binder, C. R., Feola, G. & Steinberger, J. K. Considering the normative, systemic and procedural dimensions in indicator-based sustainability assessments in agriculture. Environ. Impact Assess. Rev. 30, 71–81 (2010).
Joint Research Centre–European Commission. Handbook on Constructing Composite Indicators: Methodology and User Guide (OECD, Paris, 2008).
Mühlematter, D. et al. A quarter-century of data show that agroecosystem sustainability in the US Corn Belt is shaped by environmental, economic and social factors. npj Sustain. Agric. (submitted) (2025).
Osborne, J. W. Improving your data transformations: applying the Box–Cox transformation. Prac. Assess. Res. Eval. 15, 12 (2010).
Ghasemi, A. & Zahediasl, S. Normality tests for statistical analysis: a guide for non‑statisticians. Int. J. Endocrinol. Metab. 10, e486 (2012).
Razali, N. M. & Wah, Y. B. Power comparisons of Shapiro–Wilk, Kolmogorov–Smirnov, Lilliefors and Anderson–Darling tests. J. Stat. Model. Anal. 2, 21–33 (2011).
Beasley, T. M., Erickson, S. & Allison, D. B. Rank‑based inverse normal transformations are increasingly used, but are they merited?. Behav. Genet. 39, 580–595 (2009).
van der Waerden, B. L. Order tests for the two‑sample problem and their power. Indag. Math. 14, 453–458 (1952).
Gómez‑Limón, J. A. & Sanchez‑Fernandez, G. Empirical evaluation of agricultural sustainability using composite indicators. Ecol. Econ. 69, 1062–1075 (2010).
Gan, X. et al. When to use what: methods for weighting and aggregating sustainability indicators. Ecol. Indic. 81, 491–502 (2017).
Sobol’, I. M. Sensitivity estimates for nonlinear mathematical models. Math. Model. Comput. Exp. 1, 407–414 (1993).
Saisana, M., Saltelli, A. & Tarantola, S. Uncertainty and sensitivity analysis techniques as tools for the quality assessment of composite indicators. J. R. Stat. Soc. A Stat. Soc. 168, 307–323 (2005).
Greco, S., Ishizaka, A., Tasiou, M. & Torrisi, G. On the methodological framework of composite indices: a review of the issues of weighting, aggregation, and robustness. Soc. Indic. Res. 141, 61–85 (2019).
Sala, S., Ciuffo, B. & Nijkamp, P. A systemic framework for sustainability assessment. Ecol. Econ. 119, 314–325 (2015).
Becker, W., Saisana, M., Paruolo, P. & Vandecasteele, I. Weights and importance in composite indicators: closing the gap. Ecol. Indic. 80, 12–22 (2017).
Singh, R. K., Murty, H. R., Gupta, S. K. & Dikshit, A. K. An overview of sustainability assessment methodologies. Ecol. Indic. 15, 281–299 (2012).
Liebman, M., Helmers, M. J., Schulte, L. A. & Chase, C. A. Using biodiversity to link agricultural productivity with environmental quality: results from three field experiments in Iowa. Renew. Agric. Food Syst. 28, 115–128 (2013).
Gentry, L. F., Ruffo, M. L. & Below, F. E. Identifying factors controlling the continuous corn yield penalty. Agron. J. 105, 295–303 (2013).
Schlegel, A. J. et al. Long‑term tillage on yield and water use of grain sorghum and winter wheat. Agron. J. 110, 269–280 (2018).
Grassini, P., Thorburn, J., Burr, C. & Cassman, K. G. High‑yield irrigated maize in the western U.S. Corn Belt: I. On‑farm yield, yield potential, and impact of agronomic practices. Field Crops Res. 120, 142–150 (2011).
Prokopy, L. S. et al. Adoption of agricultural conservation practices in the United States: evidence from 35 years of quantitative literature. J. Soil Water Conserv. 74, 520–534 (2019).
Osborne, J. W. & Overbay, A. The power of outliers (and why researchers should always check for them). Prac. Assess. Res. Eval. 9, 6 (2004).
Leys, C. et al. How to classify, detect, and manage univariate and multivariate outliers, with emphasis on pre‑registration. Int. Rev. Soc. Psychol. 32, 5 (2019).
Anselin, L. Local indicators of spatial association—LISA. Geogr. Anal. 27, 93–115 (1995).
Fotheringham, A. S. & Rogerson, P. A. (eds) The SAGE Handbook of Spatial Analysis (SAGE, London, 2008).
Aksoy, S. & Haralick, R. M. Feature normalization and likelihood‑based similarity measures for image retrieval. Pattern Recognit. Lett. 22, 563–582 (2001).
Jain, A., Nandakumar, K. & Ross, A. Score normalization in multimodal biometric systems. Pattern Recognit. 38, 2270–2285 (2005).
Guyon, I. & Elisseeff, A. An introduction to variable and feature selection. J. Mach. Learn. Res. 3, 1157–1182 (2003).
Jolliffe, I. T. & Cadima, J. Principal component analysis: a review and recent developments. Philos. Trans. R. Soc. A 374, 20150202 (2016).
Manly, B. F. Multivariate Statistical Methods: A Primer (CRC Press, Boca Raton, 1994).
Abdi, H. & Williams, L. J. Principal component analysis. Wiley Interdiscip. Rev. Comput. Stat. 2, 433–459 (2010).
Wold, S., Esbensen, K. & Geladi, P. Principal component analysis. Chemom. Intell. Lab. Syst. 2, 37–52 (1987).
Jackson, J. E. Stopping rules in principal components analysis: a comparison of heuristical and statistical approaches. Ecology 74, 2204–2214 (1993).
Peres‑Neto, P. R., Jackson, D. A. & Somers, K. M. How many principal components? Stopping rules for determining the number of non‑trivial axes revisited. Comput. Stat. Data Anal. 49, 974–997 (2005).
Zou, H., Hastie, T. & Tibshirani, R. Sparse principal component analysis. J. Comput. Graph. Stat. 15, 265–286 (2006).
United Nations. Transforming our world: the 2030 Agenda for Sustainable Development (UN, New York, 2015).
Gasparatos, A., El‑Haram, M. & Horner, M. A. A critical review of reductionist approaches for assessing the progress towards sustainability. Environ. Impact Assess. Rev. 28, 286–311 (2008).
Phillis, Y. A. & Andriantiatsaholiniaina, L. A. Sustainability: an ill‑defined concept and its assessment using fuzzy logic. Ecol. Econ. 37, 435–456 (2001).
Gómez‑Limón, J. A. & Sanchez, G. Agricultural sustainability composite indicators: empirical evaluation. Ecol. Econ. 69, 1062–1075 (2010).
Bachev, H. Sustainability level of Bulgarian farms. Bulg. J. Agric. Sci. 23, 1–13 (2017).
Galié, A. et al. The Women’s empowerment in livestock index. Soc. Indic. Res. 142, 799–825 (2020).
Ness, B., Urbel‑Piirsalu, E., Anderberg, S. & Olsson, L. Categorising tools for sustainability assessment. Ecol. Econ. 60, 498–508 (2007).
Böhringer, C. & Jochem, P. E. Measuring the immeasurable—a survey of sustainability indices. Ecol. Econ. 63, 1–8 (2007).
Mori, K. & Christodoulou, A. Review of sustainability indices and indicators: towards a new City Sustainability Index (CSI). Environ. Impact Assess. Rev. 32, 94–106 (2012).
Sajasi, Z., Shobeiri, S. M., Karami, E. & Salmani, M. Sustainability assessment of rural tourism in Iran: a composite index approach. Tour. Plan. Dev. 16, 637–662 (2019).
Zahm, F. et al. Assessing farm sustainability with the IDEA method: from the concept of agriculture sustainability to case studies on farms. Sustain. Dev. 16, 271–281 (2008).
Acknowledgements
This work was funded by Syngenta Crop Protection through the authors’ employment. The authors would like to thank colleagues across various business units at Syngenta for their valuable feedback and insightful comments throughout the development of this manuscript.
Author information
Contributions
D.M., S.J.M., and M.N. jointly developed the research concept and defined the research questions and methodology. D.M. conducted the data analysis, created the visualizations, and wrote the first draft of the manuscript. M.N. reviewed and substantially restructured the manuscript. All authors contributed to discussions throughout the study and reviewed the final version of the manuscript.
Ethics declarations
Competing interests
All authors were employed by Syngenta during the time of the study. The authors declare no competing interests and no related patents or patent applications.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Mühlematter, D.J., Maund, S.J. & Nina, M. Agroecosystem sustainability index ASI for measuring environmental and socioeconomic sustainability. npj Sustain. Agric. 3, 51 (2025). https://doi.org/10.1038/s44264-025-00095-9
DOI: https://doi.org/10.1038/s44264-025-00095-9