Abstract
This study evaluates the relationship between soil erosion and 12 environmental and anthropogenic variables to identify erosion-susceptible areas in the Bistrita River basin (Romania), using machine learning algorithms (MLA). Three supervised classification algorithms, Support Vector Machine (SVM), K-Nearest Neighbors (K-NN) and Random Forests (RF), were trained on 4761 sets of values, validated on 1191, and tested on 1488 to ensure that the models could be applied in practice. The performance of each model was evaluated using Mean Squared Error (MSE), Root Mean Square Error (RMSE), Mean Absolute Error (MAE), and R-squared (R2). The results were represented spatially in 5 erosion susceptibility classes (very high, high, moderate, low, and very low) delimited with the natural breaks method. The best trained MLA (RF with R2 = 0.67) was used to simulate three scenarios for 2050: 1) the current deforestation trend (2001–2023) continues and precipitation decreases, on average, by 0.5 mm/y; 2) the trend is reversed and the forested areas of 2001 are restored; or 3) the sub-Carpathian area, the most affected by erosion, is afforested by > 50,000 ha. In this last scenario, the model showed that the areas with very high erosion (23.2% of the surface area) shift into the high and moderate classes, but the costs of planting and maintaining the forested areas are estimated at approximately 775 million euros, which may represent a serious limitation in achieving this goal. In the Subcarpathian area, where erosion is substantial, simulations showed that afforestation of at least 50 ha/km2 significantly reduces erosion. Each 10 ha increase above this value reduces erosion by 1 t/ha/year, so that at 90 ha/km2 erosion falls to 1 t/ha/year.
Developing scenarios to assess erosion susceptibility with machine learning (ML) models, in which different environmental and anthropogenic variables are increased or decreased, could help local authorities and land administrators improve their ecosystem management.
Introduction
Regardless of the cause, soil erosion is one of the main problems facing humanity in this century, especially in the context of intensive agricultural land use and current climate change1,2. Erosion is a natural process, but if it exceeds certain limits, it can lead to the removal of significant areas of land from agricultural production and to sediment accumulation in river channels and reservoirs. It is estimated that between 2015 and 2030 more than 20 million hectares of European agricultural land are at risk of abandonment, the main causes being related to agricultural land management practices3,4 and wind erosion; 5% of European agricultural land, the equivalent of 0.10 million km2, is affected by this destructive phenomenon5. As a result, the degradation of agricultural land represents a risk to the population’s food security6. The balance between soil loss and the rate of soil compaction is very fragile and can be destroyed quickly7,8. For this reason, estimating the areas affected by soil erosion is a concern worldwide, especially within the European Union (https://esdac.jrc.ec.europa.eu/). This concern is reflected in the establishment, in 2006, of the European Soil Data Centre (ESDAC)9, which aims to monitor the quality and main problems facing the soil resources of Europe and beyond10,11,12.
Soil erosion threatens global food security, as it directly affects soil fertility because of nutrient loss13 and, consequently, agricultural productivity14. From a socio-economic point of view, soil loss affects agri-food markets and the economic sector, negatively impacting the population located in disadvantaged areas, such as those in the tropics, where the soil erosion rate exceeds 100% of its recovery capacity15. In the European Union alone, the loss of crop productivity is estimated at an annual cost of 1.25 billion euros, equivalent to 0.43% of the 12 million hectares affected by erosion, of which 155 million euros of gross domestic product (GDP) is lost16. The increase in deforested areas and the expansion of agricultural land led to a 2.5% increase in degraded land worldwide between 2001 and 201217,18.
Climate warming scenarios combined with a population increase to over 9 billion by 2050 highlight the need to secure food resources by increasing agricultural production by 60–70%, up to 100%19. As the global population grows, so does the demand for food, which implies high water consumption and a reduction in forest cover20, with an annual average decline of 5.2 million hectares.
In these conditions, knowing the total amount of soil lost through erosion is a priority. Several methods can be used to estimate soil loss: empirical/physical models such as the Universal Soil Loss Equation (USLE)21, the Revised Universal Soil Loss Equation (RUSLE)22,23, and the modified Mediterranean Desertification and Land Use (MEDALUS) model24, or MLA, which have been used in studies at a global level25,26,27,28 or a local level29. The advantage of applying such calculations to smaller areas is that the analysis has a finer resolution, going beyond the general global-scale parameters by using data recorded at the microscale.
Currently, the development of MLA, part of the broad field of Artificial Intelligence (AI), aims to produce algorithms and models for data analysis and classification that can make predictions and provide solutions for large volumes of data and complex natural processes. ML covers both supervised learning (classification and regression) and unsupervised learning (clustering and dimensionality reduction); SVM, RF and K-NN can be used as both classification and regression methods30,31,32,33,34 and can highlight the environmental relationships or the spatiotemporal evolution of various complex natural processes. ML models are successfully used in multiple fields of activity32,33, with numerous applications in geoscience, including the identification of groundwater potential34 and of susceptibility to flash floods35,36,37, gully erosion38, landslides39, or soil erosion40. Because ML looks for patterns in existing data without trying to explain physical processes, even when the data structure is not linear, it offers the opportunity to avoid or reduce the cost of measuring difficult-to-measure parameters and to use variables that are available from national monitoring networks (such as precipitation records) or can be obtained from satellite images (forested areas, land use, etc.).
Identifying the areas vulnerable to erosion as efficiently as possible and forecasting the evolution of potentially affected surfaces under different scenarios can help prevent and reduce vulnerability at much lower cost than would be possible without early warning. ML algorithms can provide conclusive results both on the current distribution of erosion in a given area41 and on which areas will be most affected and what the future trend will be42.
In Romania, the estimation of general soil erosion has been an important research topic since the 1970s43,44 when USLE methods began to be applied to specific areas of Romania, resulting in a locally developed Romanian Soil Erosion Model (ROMSEM)45. To date, customized RUSLE models have been applied in several studies in Romania, especially in the mountainous and hilly areas46,47,48. Most of these studies have evaluated the degree of land degradation and soil loss47,48,49,50, but ML has only been used in areas of the Subcarpathians, based on the RF algorithm and satellite images51, as well as on the dynamics of forest biomass concerning climate change52.
The purpose of our research is to evaluate the areas of susceptibility to soil erosion in the Bistrita River basin, Romania, based on the following steps: a) identification of the natural and anthropogenic factors that favor the acceleration of soil erosion and land degradation; b) application of the statistical analysis to establish the correlations between variables; c) utilization of the MLA i.e., SVM, K-NN and RF to establish the areas susceptible to erosion; d) forecasting a scenario for the year 2050 using the best trained model in the case of changing forested areas (reducing or increasing forested areas) and reducing average annual precipitation by 0.5 mm/y; e) forecast of the susceptibility of soils to erosion in the Subcarpathian area, the most affected by erosion, in the scenario in which its afforestation would be significant (minimum 50 ha of forest/km2).
Methods
Study area
The Bistrița River basin is located in the northeastern part of Romania, at the contact between the Eastern Carpathians and the Moldavian Subcarpathians (Fig. 1). The main river draining the basin is the Bistrița River, which has a total length of 283 km53, while the basin itself covers an area of 7045.17 km2. The basin includes two land units: the mountain unit, which takes up 81% of the basin area, and the Subcarpathian unit, which takes up the remaining 19%. The basin ranges in altitude from 2245 m in the Rodna Mountains to 133 m at the confluence with the Siret River, of which the Bistrita is a tributary. The Bistrița basin occupies 2.9% of the total area of Romania and 16.4% of the area of the Siret River basin.
Fig. 1. Location of the Bistrița River Basin within Romania. (a) Relationship with the main landforms; (b) population density in the localities of the Bistrița Basin; (c) land use in 2018. Map produced using ArcGIS 10.8 (https://desktop.arcgis.com/), using open sources (vector and raster data from https://www.openstreetmap.org/#map=7/45.996/24.981; https://earthexplorer.usgs.gov; https://geo-spatial.org and http://insse.ro).
From a geological point of view, the Bistrița basin overlaps, from NW to SE, formations spanning a long geological interval, from the Precambrian to the Holocene. This geological variety, along with the large elevation range, is also expressed in the diverse morphology of the two sectors of the basin. In the mountain sector (in which the Bistrița flows for 200 km, the longest mountain course of any river in Romania), glacial morphology is present in the Rodna Mountains54,55, alongside periglacial relief and the petrographic and structural relief of the Ceahlău and Tarcău Mountains, while fluvial relief is well represented in all the mountain massifs. The Subcarpathian relief is characterized especially by the series of fluvial terraces developed on both banks of the Bistrița56, on which many human settlements have developed.
The Bistrita River basin presents a diversity of erosional processes, determined by its fragmented, varied relief and its heterogeneous substrate. The predominant forms of erosion depend on the vegetation covering the land and on lithology (shales, marls, or clays) and include splash and sheet erosion, rill and gully erosion, and bank erosion. Areas developed on crystalline or volcanic structures are more stable due to their structural strength. The dominant soils include cambisols, luvisols, and umbrisols (Soil Atlas of Europe, SAE). Cambisols and luvisols are common in mountain basins, with medium to fine texture and moderate drainage capacity, but are vulnerable to scouring on deforested or arable slopes. The general climate is temperate-continental, specific to mountainous and hilly areas, with mean annual temperatures decreasing from 9.5 °C at Bacău (174 m) to 2 °C at Rarău (1,220 m) and −0.5 °C at Călimani (2,011 m). Average precipitation decreases from about 800 mm per year in the mountainous west to less than 600 mm in the Subcarpathians due to continental influences.
Significant changes in land use occurred starting in the 1950s and 1960s, during the construction of the reservoirs that make up the Bistrița hydropower system (Izvorul Muntelui Lake, built between 1950 and 1960, Bâtca Doamnei Lake in 1962, and Lilieci Lake in 1966). As of 2023, the 272 localities within the Bistrița basin are grouped into 110 territorial administrative units (TAUs) in 7 counties (http://www.dpfbl.mdrap.ro/populatie_uat-uri.html), concentrating approximately 840,000 inhabitants. Administratively, the largest areas belong to Neamț and Suceava counties (47% and 31% of the entire basin area, respectively). Anthropogenic pressure is unevenly distributed within the basin. The average population density is 122 inhabitants/km2; it is higher in the main valleys and in the Subcarpathians, and lower on the slopes and in the mountainous area. The highest densities are recorded in the major cities, Piatra Neamț and Bacău, with 1,409 and 4,484 inhabitants/km2, respectively (Fig. 1b).
This inequality is also reflected in the current land cover, which highlights the reduction of forest areas in favor of those occupied by infrastructure and agricultural land, especially in valleys, depressions, and in the sub-Carpathian space (Fig. 1c), according to Corine Land Cover (CLC) 2018 data (https://land.copernicus.eu/en/products/corine-land-cover).
Statistics and machine learning regressions
To create the database used in the assessment of erosion susceptibility in the Bistrita River basin, a fishnet with a cell size of 1 km2 was created in ArcGIS 10.8. This resolution was used because of limitations in computing power and because 1 km2 was judged sufficient for the purpose of the study. The fishnet was clipped to the watershed limits, resulting in 6640 polygons with a surface of 1 km2 and 800 polygons with a surface of < 1 km2. For each polygon (n = 7440), the average values were extracted or calculated for the 12 variables considered as determinants: precipitation, rainfall erosivity (R factor), slope length and steepness (LS factor), forest area, Topographic Wetness Index (TWI), slope, road density, profile curvature, drainage density, plan curvature, lithology, and exposure.
To determine the morphometric parameters of the watershed, the SRTM Digital Elevation Model (DEM) (https://earthexplorer.usgs.gov/ accessed on 1 December 2023), with a resolution of 30 m, was combined with 1:25,000 topographic maps (Stereo 70 map projection). The data were processed to bring them into the same projection system. Precipitation, forest area, road, and lithology data were extracted from national databases (the Romanian daily gridded climatic dataset (ROCADA) provided by the National Meteorology Administration; National Forest Inventory of Romania http://roifn.ro/site/rezultate-ifn-2/; Open Street Map (OSM) https://www.openstreetmap.org/#map=6/46.00/24.98; Geological Institute of Romania maps 1:200,000).
In addition to the quantitative parameters, resulting from calculations or measurements, qualitative variables such as lithology or slope orientation were quantified through reclassification, being assigned values from 1 to 5. The estimated soil erosion in the Bistrita River basin was taken from ESDAC, based on the RUSLE equation57 (Eq. 1):

$$E = R \times K \times C \times LS \times P \qquad (1)$$

where E is the annual average soil loss (t ha−1 yr−1), R the rainfall erosivity factor (MJ mm ha−1 h−1 yr−1), K the soil erodibility factor (t ha h ha−1 MJ−1 mm−1), C the cover-management factor (dimensionless), LS the slope length and slope steepness factor (dimensionless), and P the support practices factor (dimensionless). The statistical analyses and the MLA were run in JASP 0.19.1.0. Given that the selected variables had different scales, the data were normalized. Normalization (Min–Max scaling) rescales data to a fixed range and is suitable for algorithms sensitive to the magnitude of features. This preprocessing step allowed a fair comparison between features and improved algorithm performance.
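The Min–Max scaling described above can be sketched in a few lines. This is only an illustration in Python, not the JASP procedure used in the study: the function name `min_max_scale` and the target range [0, 1] are assumptions, and the sample values are hypothetical cells spanning the precipitation (574–702 mm) and slope (0–33°) ranges reported for the basin.

```python
def min_max_scale(values):
    """Rescale a list of numbers linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant feature carries no information; map every value to 0.0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical cell values for two variables measured on very different scales.
precipitation_mm = [574.0, 638.0, 702.0]
slope_deg = [0.0, 16.5, 33.0]

scaled_precip = min_max_scale(precipitation_mm)
scaled_slope = min_max_scale(slope_deg)
print(scaled_precip, scaled_slope)
```

After scaling, both variables lie on the same 0 to 1 range, so a distance-based method such as K-NN is not dominated by the variable with the largest raw magnitude.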
SVM is a supervised learning algorithm used for data classification and regression, proposed by V. Vapnik (1995)58. It separates the analyzed data into different classes, based on a pattern, by identifying a hyperplane (a line in 2D or a plane in 3D)58. The main advantage of SVM is the large amount of data that can be analyzed simultaneously.
K-NN is a simple and very efficient algorithm30, which works on the principle that similar objects have closer positions to each other than different objects. Starting from this, the algorithm calculates for each given pixel the distance to neighboring pixels with similar properties, based on given characteristics, classifying them into classes of elements with similar properties. Two elements are important in the calculation method, i.e., the measurement of the distance between two pixels, which is the Euclidean distance between them, and the parameter k that determines how many neighbors will be chosen for the algorithm. The appropriate choice of the k factor is essential for the functioning of the model.
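The two ingredients named above, the Euclidean distance and the parameter k, can be made concrete with a minimal sketch. It assumes the regression variant of K-NN (the prediction is the mean of the k nearest training targets), which is an assumption rather than the documented JASP configuration; the function name `knn_predict` and all feature and erosion values are hypothetical.

```python
import math


def knn_predict(train_X, train_y, query, k=3):
    """Predict a value for `query` as the mean target of its k nearest
    training points, using Euclidean distance in feature space."""
    if not 1 <= k <= len(train_X):
        raise ValueError("k must be between 1 and the number of training points")
    # Pair each training target with its distance to the query point.
    distances = [(math.dist(x, query), y) for x, y in zip(train_X, train_y)]
    distances.sort(key=lambda pair: pair[0])
    nearest = distances[:k]
    return sum(y for _, y in nearest) / k


# Hypothetical normalized features (forest share, slope) and erosion values.
train_X = [(0.9, 0.2), (0.8, 0.3), (0.2, 0.7), (0.1, 0.9)]
train_y = [0.5, 0.7, 3.0, 4.0]

forested_cell = knn_predict(train_X, train_y, (0.85, 0.25), k=2)
bare_cell = knn_predict(train_X, train_y, (0.15, 0.8), k=2)
print(forested_cell, bare_cell)
```

With a small k the prediction follows local detail and is sensitive to noise; with a large k it is smoother but blurs local contrasts, which is why the choice of k matters.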
RF also belongs to the supervised learning algorithms. It is an extension of the decision tree algorithm and works by building several smaller decision trees and training them on different subsamples of the dataset. In this way, overfitting (overtraining) and repeated testing of the same values are avoided, which makes the method usable for large datasets with very good accuracy59.
The 12 variables were selected based on their established relationships with erosion, on statistical analysis, and on principal component analysis (PCA); they have also been successfully used in similar studies2,9,12,18,20,25. Although land use/land cover (LULC) has been shown to influence erosion strongly, we included only forest cover, because this variable has been shown to have the strongest and most consistent influence on erosion worldwide and because the number of classes had to be limited to make a practical 2050 forecast. The 7440 values for each of the 12 variables were randomly divided into 3 groups: 4761 values were used to train the models, 1191 to validate them, and 1488 to test them and to ensure that they could be applied in practice.
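A random three-way split with these subset sizes can be sketched as follows. The sizes are those reported above (roughly 64%, 16% and 20% of the 7440 cells); the function name `split_indices` and the fixed seed are assumptions made for reproducibility of the illustration and do not describe how JASP draws its samples.

```python
import random


def split_indices(n_samples, n_train, n_validation, n_test, seed=0):
    """Shuffle the sample indices and cut them into three disjoint subsets."""
    if n_train + n_validation + n_test != n_samples:
        raise ValueError("subset sizes must add up to n_samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # seeded so the split is repeatable
    train = indices[:n_train]
    validation = indices[n_train:n_train + n_validation]
    test = indices[n_train + n_validation:]
    return train, validation, test


train_idx, validation_idx, test_idx = split_indices(7440, 4761, 1191, 1488)
print(len(train_idx), len(validation_idx), len(test_idx))
```

Keeping the test subset untouched until the end is what allows its error to be read as an estimate of performance on unseen cells.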
To compare the accuracy and efficiency of the three methods used (SVM, RF and K-NN) the errors were measured using various statistical indices according to the next equations (Eq. 2–6):
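The four indices named in the abstract (MSE, RMSE, MAE and R2) have standard definitions, which the sketch below implements for a pair of observed and predicted series. It assumes those textbook definitions, with R2 computed as one minus the ratio of residual to total sum of squares; the function name `regression_metrics` and the sample values are hypothetical.

```python
import math


def regression_metrics(observed, predicted):
    """Return MSE, RMSE, MAE and R2 for two equally long numeric sequences."""
    n = len(observed)
    if n == 0 or n != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    errors = [o - p for o, p in zip(observed, predicted)]
    ss_res = sum(e * e for e in errors)          # residual sum of squares
    mse = ss_res / n
    mae = sum(abs(e) for e in errors) / n
    mean_obs = sum(observed) / n
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)  # total sum of squares
    # R2 is undefined when the observations have no variance.
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan")
    return {"MSE": mse, "RMSE": math.sqrt(mse), "MAE": mae, "R2": r2}


# Hypothetical erosion values: three perfect predictions and one large miss.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.0, 2.0, 3.0, 6.0]
metrics = regression_metrics(observed, predicted)
print(metrics)
```

Because MSE and RMSE square the errors, a single large miss weighs more heavily on them than on MAE, which is one reason to report several indices side by side.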
Erosion susceptibility in the Bistrita River basin was represented using 5 classes (very high, high, moderate, low, and very low), delimited with the natural breaks method. This method arranges the values into classes so that the internal variation within each class is as low as possible and the differences between classes are as large as possible. It is well suited to our study, which is based on complex and heterogeneous datasets, unlike the Equal Intervals method, which groups the values into classes of equal width without taking into account the internal variation and the density of the data, and can therefore leave some classes devoid of values59,60.
To predict the evolution of deforested areas to 2050, we analysed deforestation data from Global Forest Watch for the 2001–2023 period, combined with data from the National Forest Inventory of Romania (http://roifn.ro/site/rezultate-ifn-2/) and the National Forest Strategy 2030. The costs of afforestation and forest maintenance for hill and mountain areas were estimated according to the National Recovery and Resilience Plan (PNRR) (Component C2: Forests and Biodiversity Protection) (https://www.mmediu.ro/app/webroot/uploads/files/SFN%2030%20FINAL.docx). Spatio-temporal changes in precipitation, and the prediction for 2050, were derived from the ROCADA climatic dataset provided by the National Meteorology Administration60.
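The idea behind natural breaks, minimizing the variation inside each class, can be illustrated with a small dynamic-programming sketch in the spirit of the Fisher-Jenks approach. It is not the ArcGIS implementation: the function name `natural_breaks` is an assumption, the sample values are hypothetical, and the cubic-time search is only practical for small inputs.

```python
def natural_breaks(values, n_classes):
    """Return the upper bound of each class for the partition of the sorted
    values that minimizes the total within-class sum of squared deviations."""
    data = sorted(values)
    n = len(data)
    if not 1 <= n_classes <= n:
        raise ValueError("n_classes must be between 1 and the number of values")

    # Prefix sums give the squared deviation of any slice in constant time.
    prefix, prefix_sq = [0.0], [0.0]
    for v in data:
        prefix.append(prefix[-1] + v)
        prefix_sq.append(prefix_sq[-1] + v * v)

    def ssd(i, j):
        """Sum of squared deviations from the mean for data[i:j] (j > i)."""
        s = prefix[j] - prefix[i]
        return (prefix_sq[j] - prefix_sq[i]) - s * s / (j - i)

    inf = float("inf")
    # cost[c][j]: best total deviation when data[:j] is cut into c classes.
    cost = [[inf] * (n + 1) for _ in range(n_classes + 1)]
    start = [[0] * (n + 1) for _ in range(n_classes + 1)]
    cost[0][0] = 0.0
    for c in range(1, n_classes + 1):
        for j in range(c, n + 1):
            for i in range(c - 1, j):  # the last class is data[i:j]
                candidate = cost[c - 1][i] + ssd(i, j)
                if candidate < cost[c][j]:
                    cost[c][j] = candidate
                    start[c][j] = i

    # Walk back through the stored cut points to recover the class bounds.
    bounds, j = [], n
    for c in range(n_classes, 0, -1):
        bounds.append(data[j - 1])
        j = start[c][j]
    return bounds[::-1]


# Hypothetical erosion values forming three visible groups.
sample = [1.0, 1.2, 0.9, 5.0, 5.3, 4.8, 9.9, 10.2]
breaks = natural_breaks(sample, 3)
print(breaks)
```

An equal-interval scheme applied to the same sample would place its limits at fixed steps between 0.9 and 10.2 regardless of where the values cluster, which is how empty or nearly empty classes arise.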
Results
Evaluation of the environmental and anthropogenic variables
Plan curvature represents the average curvature of the contour lines on the surface of a pixel and measures the rate of change of the direction of runoff on a horizontal plane, in practice the concentration of runoff on the slope. The average values per km2 are between −0.24 and 0.5, with negative values characterizing surfaces with convergent runoff and positive values characterizing surfaces with divergent runoff. Negative values are characteristic of surfaces with accelerated runoff, being more frequent on interfluves, and positive values are characteristic of surfaces with decelerated runoff, with maxima in minor riverbeds. High values, above 0.3, have a more pronounced distribution in the Subcarpathian sector along the Bistrita River as well as in the perimeter of the Izvoru Muntelui reservoir (Fig. 2a).
Fig. 2. Maps of the Bistrita River basin showing plan curvature (a), profile curvature (b), drainage density (c), slope (d), slope aspect (e), TWI (f), road density (g), forest surface (h), precipitation (i), R factor (j), LS factor (k), and lithology (l). Map produced using ArcGIS 10.8, using open sources (vector and raster data from https://www.openstreetmap.org/#map=7/45.996/24.981; https://earthexplorer.usgs.gov; https://geo-spatial.org).
The density of the hydrographic network was calculated based on the drainage network extracted from the DEM and gave values ranging between 0 and 5.75 km/km2, with the highest values in depression areas and confluence sectors. This indicator highlights the evolution and degree of fragmentation of the relief, surface runoff, and torrentiality, because it is the ratio of the length of all river segments, both with permanent runoff and especially with temporary runoff, to the surface of the Bistrita River basin (Fig. 2c). The average value of 1.23 km/km2 reflects the position of the basin predominantly in the Carpathian area (81%), where the hardness of the consolidated rocks (metamorphic rocks such as mica schists and phyllites, or sedimentary rocks such as sandstones and conglomerates) is inversely related to the density of relief fragmentation. The transition to the Subcarpathian sector, where loessoid, sand, gravel, and marly clay deposits predominate, brings a higher degree of fragmentation because of the accentuated erosion and dislocation of soil particles. Thus, in the Subcarpathian sector the density of relief fragmentation is high, over 5 km/km2 in the area downstream of the city of Buhuși. These high values are also a consequence of the change in land use, the expansion of urban areas, uncontrolled deforestation, and agricultural practices that favor erosion processes.
The slope within the Bistrița basin is between 0 and 33°, the high values being a consequence of the hard rocks predominantly found in the mountainous area (Fig. 2d). Slope gradient influences liquid and sediment runoff, gravitational displacement processes, and the amount of solar radiation received. In the Eastern Carpathians, slopes frequently exceed 20° over extensive, compact areas associated with the hard metamorphic strip of mica schists and phyllites that occupies the central area of the Bistrița basin (NE flanks of the Giumalău and Pietrosu, Bîrnaru, and Grintiesul Mic massifs). Thus, the right slope of the Bistrița is steeper and exposed to erosion processes as a result of its eastern exposure, which facilitates insolation, although this effect is dampened by the reduced sunlight time caused by the solar angle on such steeply inclined slopes. High slope values also characterize the strip of Carpathian sandy flysch, represented in the Ceahlău and Tarcău Mountains, within which the slopes show a rapid transition between the mountainous and the depressional relief, represented by the Bicaz Depression. Areas with slopes exceeding 25° indicate torrential valleys, where water concentrates on the slope in a shorter time, and are particularly characteristic of the slopes in the spring areas. In the Moldavian Subcarpathians, on the other hand, slope values decrease to below 10°, with slopes below 3° prevailing in the Bistrița corridor as well as in the lower part of the slopes of the secondary valleys. Low slope values favor alluvial deposition and reduce erosion. Slope values below 10° are also found on the quasi-horizontal surfaces of the intra-mountainous depressions: Dorna Cândreni, Șaru Dornei, Drăgoiasa, or Bilbor.
Slope orientation is important both for vegetation vitality and density (Singh, 2018), and therefore indirectly for slope stability, and for current morphodynamics. Because the Bistrița basin lies in the northern hemisphere, quasi-horizontal surfaces can be expected to be the least vulnerable to erosion, followed by slopes oriented to the north, while southern slopes are the most exposed to insolation and therefore the most vulnerable. A higher proportion of slopes oriented to the south and southeast can be observed throughout the Bistrița River basin (Fig. 2e).
The pattern of slope orientation derives from the stage of development of the hydrographic network, which has a general NW–SE drainage direction. This has imposed a predominantly NE–SW arrangement of the tributaries located on the left bank and a reverse flow direction, from SW to NE, of those located on the right bank of the basin. The gorge sector created by the Bicaz River determines a southern orientation of the Ceahlău Massif, extending to the Neamț Mountains, which have a southern fall along the Bicaz Depression.
On the southern slopes, the direct exposure to solar radiation determines increased evaporation and reduced vegetation due to moisture limitation, which creates a high morphodynamic potential in these areas, making them more prone to erosion and processes of breaking the cohesion of the rocks. The shaded and colder northern slopes occupy the smallest percentages, and are generally found in areas located in the proximity of quasi-horizontal surfaces, at the level of the Șaru Dornei and Drăgoiasa Depressions, to the right of the Negrișoaia stream, and at the level of Lake Izvoru Muntelui.
TWI (Topographic Wetness Index) expresses the terrain-determined balance between the water supply of the watershed and the local drainage. High values of TWI are found in areas prone to over-wetting because of the stagnation of runoff, which can increase the instability of geological deposits and soil (Fig. 2f). Excess water can also accumulate, generating a range of erosional processes, from gullies to torrents. On the other hand, the greater water availability can have positive consequences for the vegetation, which can be more vigorous in these areas61,62. Therefore, this parameter must be used in association with other parameters, the most important being the slope. Within the Bistrița River basin, the index values range between 4.4 in secondary basins and on interfluves, where the higher slope prevents water retention and generates accelerated erosion, and 14.6 in the area of the Izvorul Muntelui reservoir and in the Subcarpathian area, where water retention favors the occurrence of areas with excessive humidity.
A correlation can be observed between TWI and slope: slope values below 10° are associated with an increase in TWI, due to lower rates of throughflow. In the Eastern Carpathians, high TWI values are found in the depression areas of Dorna Cândreni, Șaru Dornei, Bilbor, and Borsec. Overall, however, high TWI values are most concentrated in the Subcarpathian region, on the left bank of the Bistrița River, where the slopes drop below 3°. There, high TWI values superimposed on clayey rocks increase the susceptibility to erosion, as a result of soil impermeability and wetting of the superficial soil layer.
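The link between gentle slopes and high TWI follows directly from the usual definition of the index, TWI = ln(a / tan β), where a is the specific catchment area and β the local slope. The sketch below assumes that standard formulation, which may differ in detail from the GIS tool used for Fig. 2f; the function name `topographic_wetness_index`, the minimum-slope guard and the sample inputs are hypothetical.

```python
import math


def topographic_wetness_index(specific_catchment_area_m, slope_deg,
                              min_slope_deg=0.1):
    """Compute ln(a / tan(beta)) for one cell.

    specific_catchment_area_m: upslope area per unit contour width (m).
    slope_deg: local slope in degrees; clamped to min_slope_deg so that
    flat cells do not cause a division by zero.
    """
    slope = max(slope_deg, min_slope_deg)
    return math.log(specific_catchment_area_m / math.tan(math.radians(slope)))


# Same hypothetical contributing area, two different slopes.
steep_cell = topographic_wetness_index(100.0, 25.0)
gentle_cell = topographic_wetness_index(100.0, 3.0)
print(round(steep_cell, 2), round(gentle_cell, 2))
```

For a fixed contributing area, lowering the slope from 25° to 3° raises the index by about two units, which matches the pattern of higher TWI in depressions and along the Bistrița corridor.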
The road density was calculated based on all existing roads on Open Street Map (https://www.openstreetmap.org/#map=6/46.00/24.98), which includes all categories of roads, including those within localities. The resulting values ranged between 0 and 39.4 km/km2 (Fig. 2g). The highest road network densities were found in localities, especially in cities. While most surfaces in the mountainous areas have a road density of 0, logging roads are present in some mountainous areas and destabilize the relief because of their rudimentary construction and their use by heavy machinery that transports logs by dragging. Moreover, once exploitation ceases, most logging roads are abandoned without conservation measures having been applied.
The main road arteries in the Bistrita basin are the national road, DN15, which crosses the Carpathians, and the European roads E58, E85, and E578. In addition, there is a network of national, county, and municipal roads that connect the main settlements, resulting in important road junctions in almost all the cities of the Bistrita basin (Bacău, Piatra Neamț, Bicaz, Borsec, and Vatra Dornei).
The forest cover share of each km2, extracted from the Corine Land Cover 2018 (https://land.copernicus.eu/en/products/corine-land-cover), shows that the least forested areas (below 0.2 km2/km2) are found mainly in the Subcarpathian area and in mountainous depressions, where forests have been gradually replaced by agricultural land, inhabited areas or various economic activities (exploitation of useful minerals, expansion of tourist infrastructure, etc.) (Fig. 2h). The basin as a whole has an average forested area of 0.59 km2/km2, with lower values in the high mountain areas (occupied by alpine grasslands) and the Subcarpathian area (dominated by agricultural land). At the level of the administrative units, the counties with the highest percentage of forested areas within the Bistrița River basin are Neamț County (62%) and Suceava County (64%) (https://insse.ro). The dominant tree species are deciduous, followed by conifers. Not coincidentally, given these patterns, in 2022, the mentioned counties occupied the first and third positions in the total quantity of wood harvested in Romania, with 10.4% and 6%, respectively.
During the period 2001–2023, forested areas in the study area consistently declined: the total area deforested over this period was more than 58,000 ha, while the area reforested was only 7,700 ha. The largest annual loss was recorded in 2007 (9,279 ha) and the smallest in 2003 (777 ha). Along with the reduction of forested areas, there is also a sharp fragmentation of forest stands, which increases the risk of windthrow during periods of strong wind and reduces the efficiency of water retention in the soil. Even if the general trend of deforestation is slightly downward, in the absence of serious reforestation measures the total forested area in the Bistrita River basin will continue to decline, accelerating slope degradation and intensifying soil erosion.
The average annual precipitation ranges from 574 mm at the confluence of the Bistrița and the Siret to 702 mm in the mountain range (Fig. 2i). Precipitation influences the processes that shape slopes and river channels through its effect on the duration and volume of surface runoff and on river discharge, respectively. The highest precipitation values are recorded in the high regions of the Rodna Massif at over 1900 m and the Călimani Massif at over 1850 m, where the main tributaries of the Bistrița arise (the Bila, Putreda, Lala, and Dorna). Precipitation values of over 700 mm are also found in the central part of the Bistrița basin, within the mountain range of the Ceahlău Massif. As the altitudes decrease, in the valleys of the Bistrița River and its main tributaries (Dorna, Neagra Bicaz, and Tarcău), precipitation decreases below 600 mm. Temporal trends in precipitation are also evident in the study area, as for the southeastern region of Europe generally60, with an average decline of approximately 0.5 mm/y compared to the period 1961–201363.
The R factor is a multiannual parameter that quantifies the kinetic energy and intensity of precipitation (MJ mm ha−1 h−1 yr−1)11, and has values ranging from 771 to 1016 (Fig. 2j). Rainfall erosivity influences fluvial modeling and slope destabilization, especially during long-lasting or torrential precipitation. The minimum values of the R factor typically occur within the Subcarpathian region, and the maximum values are found only on the high mountain peaks near the Pârâului Sec stream or the locality of Crucea (1711 m), on the right side of the Bistrița River. In the Bistrița Subcarpathians, the R factor is generally below 850, the exception being the southern extremity, along the Bistrița valley corridor. In the urban areas of this valley corridor, the absence of vegetation favors the appearance of “heat islands”, which, combined with atmospheric pollution (an increase in condensation nuclei), lead to higher values of 850–900 and an intensification of extreme climatic phenomena, with implications for soil erosion in small areas.
The LS factor (Slope length and Steepness) is a relevant morphometric parameter for analyzing water runoff on slopes and for determining the degree of erosion, because it expresses the influence of relief configuration in the Universal Soil Loss Equation23. The values of the LS factor, calculated based on the ratio between the slope and the length of the slopes64, are between 0 and 70.8, with the average value being 31, which indicates a more significant predisposition to slope erosion. LS factor values tend to decline from the NW to the SE (Fig. 2k), in step with the reduction in altitude, while values over 40 are recorded around the settlements of Borșa, Șanț, and Rodna. The highest LS factor values are found in the localities of Șaru Dornei and Bilbor, reflecting short, steep slopes, which give the watercourses a torrential character. In contrast, in the Moldavian Subcarpathians, the low slopes and altitudes of 300–400 m result in LS factor values below 2.
The lithology of the basin has been classified in values ranging from 1 to 5 depending on the cohesion of the rocks and their resistance to erosion (Fig. 2l). The geological complexity of the region can be simplified by identifying, within the central sector, a belt composed mainly of metamorphic formations and old, stable magmatites with high resistance to erosion. This belt is bordered to the east and west by formations more prone to erosion: either friable sedimentary deposits belonging to the Triassic-Holocene interval, or Neogene magmatites, composed predominantly of deposits of the last stratovolcanoes of the Neogene Carpathian magmatism. With respect to the drainage network, the hard strata occur at high elevations, exceeding 1100 m, which is why there is a close relationship between the hardness of the rocks and the presence of torrential streams. Erosion resistance generates different hydrological behavior within the drainage network, with the smallest streams having long longitudinal profiles, steep slopes, and, consequently, high runoff velocities. On the other hand, the sedimentary rocks in the Moldavian Subcarpathian area favor shallow erosion, with deluvial and proluvial-type movements resulting from the attack of physical–mechanical agents on the rock.
Based on the variables used, the RUSLE model indicates that the Bistrița River basin can be expected to experience annual soil losses ranging between 0 and 114 t/ha/y (Fig. 3), and at the 1 km2 grid level, the average values reach a maximum of 26 t/ha. The largest area of the basin has values below 1 t/ha/year (65%), followed by the 1–5 t/ha/year class, which represents 19%. Values above 15 t/ha/year occupy only 2% of the entire basin (almost 10,000 ha), while very high values appear in isolation, associated with specific extractive activities. Examples are Bicazu Ardelean, where limestone quarrying has created slope instability throughout the basin, and Poiana Teiului, upstream of Lake Izvorul Muntelui, where deforestation and very rugged slopes have combined to produce very high rates of soil loss. Areas with soil losses between 5 and 15 t/ha/year make up 14% of the study area, and are typically found within the Subcarpathians and the large intra-mountainous depressions, i.e., Dornelor, Tulgheș, Bilbor, and Borsec.
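For orientation, the RUSLE estimate discussed above is conventionally written as a product of factors. The expression below is the standard textbook form, not a formula quoted from this study: only R and LS are described in this section, and the symbols K, C and P follow the usual convention and are assumptions here.

```latex
A = R \cdot K \cdot LS \cdot C \cdot P
```

Here A is the mean annual soil loss (t ha−1 yr−1), R the rainfall erosivity factor, K the soil erodibility factor, LS the slope length and steepness factor, C the cover-management factor and P the support practice factor. Because the factors multiply, a cell with a low value for any one factor (for example, a low C under dense forest) yields a low soil loss even where R and LS are high.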
Soil losses through erosion in the Bistrita River basin according to RUSLE. Map produced using ArcGIS 10.8, using open sources (vector and raster data from https://www.openstreetmap.org/#map=7/45.996/24.981; https://earthexplorer.usgs.gov; https://geo-spatial.org).
The immediate impact of erosion in the mentioned areas is related to the removal of large areas of land from agricultural use, the degradation of road infrastructure, especially secondary infrastructure, the establishment of permanent erosion processes (gullies, torrents), and, in the long term, the triggering of landslides that transform large areas into unproductive lands.
Statistical analyses and machine learning algorithms
Correlation analysis was used to find redundancy, to reduce multicollinearity in regression and MLA, and to group highly correlated variables into components for the PCA. We found that precipitation is strongly correlated with the LS-factor (0.816) and moderately correlated with the R-factor (0.691), reflecting the expected environmental interactions between erosion susceptibility and rainfall intensity (Fig. 4). Thus, the strong relationship between LS-factor, precipitation, and R-factor suggests that rainfall intensity significantly influences erosion and landform stability.
The TWI correlates positively with lithology (0.432), suggesting that geological formations influence wetness levels. TWI is also negatively correlated with slope (− 0.729), showing that flatter areas tend to accumulate more moisture. The negative correlation between forest areas and road density suggests deforestation pressures in urbanized or densely populated areas.
Stream length and road length are strongly correlated, as expected, with drainage density and road density, respectively. To avoid redundancy, the former were excluded from the following analysis.
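The redundancy screening described above can be sketched as a pairwise Pearson check. This is a minimal illustration and not the authors' workflow: the 0.9 threshold, the function names and the four sample cells are hypothetical, and the study does not state which cutoff was used.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def redundant_pairs(columns, threshold=0.9):
    """Return (name_a, name_b, r) for every pair with |r| >= threshold."""
    flagged = []
    for a, b in combinations(sorted(columns), 2):
        r = pearson(columns[a], columns[b])
        if abs(r) >= threshold:
            flagged.append((a, b, round(r, 3)))
    return flagged


# Hypothetical values for four 1 km2 cells (not data from the study).
cells = {
    "stream_length": [1.0, 2.0, 3.0, 4.0],
    "drainage_density": [1.1, 2.0, 3.2, 3.9],
    "forest_share": [0.9, 0.2, 0.6, 0.4],
}
print(redundant_pairs(cells))
```

With these toy values only the stream length and drainage density pair is flagged, which mirrors the kind of pair the text reports as redundant; one member of each flagged pair would then be dropped before PCA and model training.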
PCA was conducted to simplify the remaining set of 12 variables and to create new, uncorrelated features (principal components). PCA was based on the correlation matrix of the data, using the oblique rotation method promax. The first three components explain 59% of the variance in the dataset (Table 1).
The first rotated component (RC1) emphasizes the erosion-related characteristics and has a strong positive association with variables precipitation, LS-factor, and R-factor. The second rotated component (RC2) mainly captures land use and environmental patterns (e.g., forested area, road density, TWI), while the last two components (RC3 and RC4) are related to terrain morphology and drainage characteristics.
In Fig. 5, the arrows going from the variables to the principal components represent the loading of the variable on the component. Red shows a negative loading, and green represents a positive loading. The wider the arrows, the higher the loading.
Starting from the 12 variables, the three MLAs (SVM, K-NN and RF) generated a data series on the basis of which we delimited areas with different degrees of susceptibility to erosion, from very low to very high erosion potential. The performance evaluation of the three models shows that RF (R2 = 0.67) performs best, followed by K-NN (R2 = 0.58) and SVM (R2 = 0.44) (Table 2). The same order holds for the error measures (MSE, RMSE and MAE): RF has the lowest RMSE (1.49), followed by K-NN (1.73) and SVM (2.03). Figure 6 compares the RUSLE values in the test set with those predicted by SVM, K-NN, and RF.
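The four performance measures named above can be computed from paired observed and predicted values as in the following minimal sketch. The function name and the sample values are hypothetical; the definitions are the standard ones and are assumed, not confirmed, to match the software used in the study.

```python
from math import sqrt


def regression_metrics(observed, predicted):
    """Return MSE, RMSE, MAE and R2 for paired observed/predicted values."""
    n = len(observed)
    errors = [p - o for o, p in zip(observed, predicted)]
    ss_res = sum(e * e for e in errors)            # residual sum of squares
    mse = ss_res / n
    mae = sum(abs(e) for e in errors) / n
    mean_obs = sum(observed) / n
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)  # total sum of squares
    return {"MSE": mse, "RMSE": sqrt(mse), "MAE": mae, "R2": 1 - ss_res / ss_tot}


# Hypothetical soil-loss values in t/ha/y (not data from the study).
rusle = [0.5, 1.0, 4.0, 6.5, 12.0]
model = [0.8, 1.5, 3.0, 7.0, 10.0]
metrics = regression_metrics(rusle, model)
print({name: round(value, 3) for name, value in metrics.items()})
```

Since RMSE is the square root of MSE, the RMSE of 1.49 reported for RF corresponds to an MSE of about 2.22, provided both were computed on the same data.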
Training on the data sets (n = 4761) showed that some variables are more important for prediction, and that removing them would decrease accuracy. Mean dropout loss indicates the relative significance of each variable used in predicting the susceptibility to soil erosion in the Bistrita River basin. In Table 3, forest area has the highest importance for all 3 models, followed in similar proportions by LS factor, precipitation, or TWI, while plan curvature and profile curvature have the lowest importance.
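The sample sizes given in the abstract (4761 training, 1191 validation, 1488 test) are consistent with holding out 20% of the 7440 samples for testing and then 20% of the remainder for validation, each rounded up. The study does not describe the procedure, so the sketch below is an inference from those numbers; the function name, the seed and the rounding rule are assumptions.

```python
import random


def two_stage_split(n_samples, test_percent=20, validation_percent=20, seed=0):
    """Shuffle indices, hold out a test set, then a validation set from the rest."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Ceiling division on integers avoids floating-point rounding surprises.
    n_test = (n_samples * test_percent + 99) // 100
    test, remainder = indices[:n_test], indices[n_test:]
    n_validation = (len(remainder) * validation_percent + 99) // 100
    validation, train = remainder[:n_validation], remainder[n_validation:]
    return train, validation, test


train, validation, test = two_stage_split(4761 + 1191 + 1488)
print(len(train), len(validation), len(test))
```

Keeping the test indices untouched until the final evaluation is what allows the reported test metrics to be read as an estimate of performance on unseen cells.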
Mean decrease accuracy (MDA) for the RF model measures the average reduction in model performance when one of the variables is permuted after training but before making predictions (Fig. 7a,b). Because it considers the entire dataset and compares the accuracy on the modified test set with the initial accuracy, this indicator provides an assessment of how a variable influences the model’s performance. Analysis of the residuals (Fig. 7c) for the RF model indicated a significant overestimation of erosion values on < 1% of the Bistrita River basin, where erosion is > 15 t/ha/y, while for 69% of the area, where erosion is low (< 1 t/ha/y), the values are underestimated, on average, by 57%. These results are consistent with the hydrological role of forests in reducing erosion, with forested areas being less affected by land degradation processes and ecosystems being more resilient to change.
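The permutation idea behind these importance measures can be illustrated with a minimal sketch. It is not the study's implementation: the loss (RMSE increase), the number of repeats, the stand-in `toy_model` and the random cells are all hypothetical, chosen only to show that shuffling an influential variable degrades predictions while shuffling an ignored one does not.

```python
import random
from math import sqrt


def rmse(observed, predicted):
    """Root mean square error of paired values."""
    n = len(observed)
    return sqrt(sum((p - o) ** 2 for o, p in zip(observed, predicted)) / n)


def permutation_importance(predict, rows, target, n_repeats=20, seed=0):
    """Mean RMSE increase when each feature column is shuffled in turn."""
    rng = random.Random(seed)
    baseline = rmse(target, [predict(row) for row in rows])
    importance = {}
    for name in rows[0]:
        increases = []
        for _ in range(n_repeats):
            shuffled = [row[name] for row in rows]
            rng.shuffle(shuffled)
            # Copy every row, replacing only the permuted feature.
            permuted = [dict(row, **{name: value}) for row, value in zip(rows, shuffled)]
            increases.append(rmse(target, [predict(row) for row in permuted]) - baseline)
        importance[name] = sum(increases) / n_repeats
    return importance


def toy_model(row):
    """Hypothetical fitted model: erosion falls with forest share, ignores curvature."""
    return 10.0 * (1.0 - row["forest_share"])


# Hypothetical cells (not data from the study).
cell_rng = random.Random(1)
rows = [{"forest_share": cell_rng.random(), "plan_curvature": cell_rng.random()} for _ in range(200)]
target = [toy_model(row) for row in rows]
print(permutation_importance(toy_model, rows, target))
```

In this toy setting the forest variable receives a positive score and plan curvature a score of zero, the same qualitative ordering the text reports for the trained models.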
The spatial distribution of the values derived from all three applied models shows a clear separation of the mountain and Subcarpathian areas in terms of susceptibility to erosion (Fig. 9). Thus, RF assigns 8% of the surface area of the entire basin to areas with very high erosion potential, but if the analysis is carried out at the level of the two units (mountain and Subcarpathian), only 5% of the mountain area falls into this category, while for the Subcarpathian sector the value is 23%. Similarly, values generated by K-NN average 7% for the entire basin (4% in the mountain area and 20% in the Subcarpathian area). In contrast, SVM identifies areas with maximum erosion potential roughly double that of the other models, at 16% of the entire basin (10% in the mountain area and 44% in the Subcarpathian area).
Areas categorized as having very low erosion potential occupy the greatest percentage of the entire basin regardless of the model applied (41% RF, 30% SVM, 43% K-NN), with this pattern strongest in the mountainous area (47% RF, 34% SVM, 50% K-NN) and weakest (between 9 and 15%) in the Subcarpathians. This demonstrates the lower importance of geomorphological parameters relative to the share of forests.
RF generated the most balanced distribution of areas across the five erosion potential categories, whereas the distribution generated by SVM was skewed towards higher erosion potential and the distribution generated by K-NN was skewed towards the low erosion potential categories (Fig. 8).
Spatial distribution of erosion susceptibility in the Bistrita River Basin using (a) SVM; (b) RF; (c) K-NN regressions, and surface distribution in (d) the Bistrita River Basin; (e) the Carpathian area; (f) the Subcarpathian area. Map produced using ArcGIS 10.8, using open sources (vector and raster data from https://www.openstreetmap.org/#map=7/45.996/24.981; https://earthexplorer.usgs.gov; https://geo-spatial.org).
The analysis of the two areas (mountain and Subcarpathian) highlights the contrasting character of the areas exposed to erosion in direct relation to the forest areas. Thus, areas with low erosion potential dominate the Carpathians (Fig. 8e), while the Subcarpathian area (Fig. 8f) is dominated by areas prone to erosion because of intensive deforestation. However, because the two units occupy unequal shares of the basin (81% mountain and 19% Subcarpathian), areas with very low and low erosion potential predominate in the Bistrita River basin as a whole.
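The effect of the unequal sector shares can be checked with a short area-weighted calculation. The sketch below uses the rounded percentages reported in the text for the very high class; the dictionary and function names are hypothetical, and small deviations from the published basin totals are expected because the inputs are rounded.

```python
# Sector shares of the basin area, as stated in the text.
SECTOR_SHARE = {"mountain": 0.81, "subcarpathian": 0.19}

# Percentage of each sector in the very high class, as reported per model.
VERY_HIGH_BY_SECTOR = {
    "RF": {"mountain": 5, "subcarpathian": 23},
    "K-NN": {"mountain": 4, "subcarpathian": 20},
    "SVM": {"mountain": 10, "subcarpathian": 44},
}


def basin_percentage(by_sector):
    """Area-weighted percentage for the whole basin."""
    return sum(SECTOR_SHARE[sector] * value for sector, value in by_sector.items())


for model, by_sector in VERY_HIGH_BY_SECTOR.items():
    print(model, round(basin_percentage(by_sector), 1))
```

The weighted values (roughly 8.4, 7.0 and 16.5) agree with the basin-wide 8%, 7% and 16% quoted for RF, K-NN and SVM, which shows how the large, well-forested mountain sector dilutes the high Subcarpathian percentages.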
In all three applied models (SVM, K-NN and RF), several areas characterized by very high erosion potential are distinguished. First of all, the Subcarpathians appear as a distinct unit where, based on the sharp decrease in forested areas, the friable lithology, and the predominantly agricultural use of the land, the slopes have become vulnerable to erosion. Second, the expansion of urban areas after 1990 resulted in the replacement of forests by residential spaces and the relocation of industrial units to the new outskirts of the city. Thus, the high, forested mountain space (including 21.8 km2 of virgin and quasi-virgin forests—https://www.mmediu.ro/articol/catalogue-national-al-padurilor-virgine-si-cvasivirgine/6233), difficult to access with low erosion values, is very well differentiated from the valley depressions and corridors where human settlements are located with high (4–6 t/ha/y) or very high erosion values (> 6 t/ha/y). Finally, in the intramountain depressions (Vatra Dornei, Borsec, Bilbor, and Tulgheș) and along the main hydrographic arteries, especially at the contact with the mountain space, the expansion of settlements and the diversification of economic activities also resulted in clearing forests, which led to the exposure of the soil to the direct action of precipitation and accelerated erosion processes.
Discussions
MLA aims to develop algorithms and models for the analysis and classification of large quantities of geo-environmental and anthropogenic data, and manages to make predictions and provide solutions in the context of complex natural processes. Many developed models are able to identify areas susceptible to flooding35,36, landslides2,5,37 or soil erosion29,40 with high efficiency. From the analysis of the 3 models applied in our research (SVM, K-NN and RF), it appears that RF (R2 = 0.67) is the algorithm that most accurately identifies areas susceptible to erosion. This result is similar to those obtained in other studies that use RF65,66 to assess susceptibility to soil erosion, but it is less accurate than results from studies using the 3 models as land cover classifiers. Also, previous studies that use other MLAs for soil erosion susceptibility indicate a higher prediction accuracy67,68.
In Romania, previous studies have evaluated the degree of land degradation and soil loss using various methods69,70,71,72,73, including RUSLE47,48,49,50, but the use of ML was limited to the Subcarpathians22 and to the dynamics of forest biomass concerning climate change in Romania5. Various studies have also used complex algorithms to establish erosion rates, using other predictors such as air temperature, soil type73,74, and indices obtained from satellite images (Normalized Difference Vegetation Index, NDVI)75; integrating these predictors into various ML models could improve the results obtained.
In our research, compared to K-NN and SVM, the RF model reduces overestimation at both the maximum and minimum ends of the range, although some studies report suboptimal predictive accuracy as a limitation76,77. K-NN tends to skew predictions towards the minimum values, while SVM assigns higher values to the entire area. All three models provide a spatially comprehensive image of the distribution of areas affected by erosion, but with different degrees of accuracy.
RF regression showed that of the 12 variables, forest area and precipitation have a high importance in the prediction accuracy, while morphometric parameters, such as relief fragmentation and curvature in plan and profile, have a lower importance in the applied algorithms, being the expression of the result of erosion rather than factors that influenced it.
In Romania, many studies have addressed the issue of reducing forested areas, and reductions in forest area were found in all these studies. For the period 2000–2014, for example, in the Apuseni Mountains, the deforested area was 3.8%78, while in the Bucegi mountains, the area was 2.9%79, and in the northern group of the Eastern Carpathians the reduction was 6.75% (46,995.5 hectares were deforested)80. In addition to the reduction, there was also substantial fragmentation of forest areas81.
Increasing the rate of afforestation in Romania is a priority both in the National Recovery and Resilience Plan (PNRR) (Component C2: Forests and Biodiversity Protection) and in the National Forest Strategy 2030 (https://www.mmediu.ro/app/webroot/uploads/files/SFN%2030%20FINAL.docx).
According to the National Forest Inventory (http://roifn.ro/site/rezultate-ifn-2/), Romania has a forest area of 7 million ha. This area represents 29% of the total area of the country, which is below the EU average of 39%. The strategy aims to expand the forest extent, considering that forest areas include land of at least 0.25 ha, with trees reaching a minimum height of 5 m at maturity. However, during the period 2001–2023, Romania lost 443 thousand ha of forest vegetation due to fires and 0.1% due to deforestation (Global Forest Watch).
The PNRR also aimed to plant at least 25,000 ha of new forest vegetation by December 31, 2023, and to extend this figure to 56,700 ha by June 30, 2026. The purpose of this state aid scheme is to restore forest affected by fires, extreme weather phenomena, plant pathogen infestations, and other risk phenomena. The estimated costs are 14,900 euros/ha for hill regions (3,831 for the establishment of oak forests and 11,069 for maintenance costs granted in the first 6 years) and 9,658 euros/ha for mountain regions (3,987 for establishment and 5,671 for maintenance). Based on these considerations, and because numerous studies16,82 identify an increase in degraded land areas and financial losses for 2050, we have made a forecast regarding the evolution of areas susceptible to erosion in the Bistrita River basin using the best-trained model (RF) in 3 scenarios: a) maintaining the current rate (2001–2023) of reduction of forested areas (− 20%) (Fig. 9); b) a 10% increase in forested areas (minimum 10 ha/km2); c) a 50% increase in forested areas in the Subcarpathian area (most affected by erosion) (minimum 50 ha/km2). In all 3 scenarios, a decrease in average annual precipitation of 0.5 mm/y was considered, according to the estimates made by Pravalie et al. (2019)63.
In both scenarios of reduction/increase (− 20%; + 10%) of forested areas, both in percentage and in absolute areas, there are no significant changes in the areas affected by erosion, because there is a significant difference in the degree of forest cover between the mountain and the Subcarpathian area (Fig. 10). The average erosion per km2 in the Subcarpathians (3.6 t/ha/y) is about double that of the mountain area (1.9 t/ha/y), a result of the low degree of forestation and the high density of the hydrographic network on friable sedimentary rocks, even though slopes there are about one third of those in the mountain area.
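The per-hectare figures above can be combined into a total cost estimate as in the sketch below. The 52,045 ha area is the Subcarpathian afforestation figure given in the Conclusions; applying the hill-region rate to all of it, and the function and dictionary names, are assumptions of this illustration.

```python
# Per-hectare costs in euros, as quoted in the text.
COSTS = {
    "hill": {"establishment": 3_831, "maintenance": 11_069},
    "mountain": {"establishment": 3_987, "maintenance": 5_671},
}


def cost_per_ha(region):
    """Establishment plus maintenance cost for one hectare, in euros."""
    return sum(COSTS[region].values())


def afforestation_cost(area_ha, region):
    """Total cost in euros of planting and maintaining area_ha hectares."""
    return area_ha * cost_per_ha(region)


# Assumption: the whole Subcarpathian area is costed at the hill-region rate.
total = afforestation_cost(52_045, "hill")
print(cost_per_ha("hill"), cost_per_ha("mountain"), f"{total / 1e6:.1f} million euros")
```

Under this assumption the total comes to roughly 775 million euros, which is in line with the cost quoted for the third scenario.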
Soil erosion potential using Random Forest in the scenario of an increase (+ 10%) and decrease (− 20%) of forested areas. Map produced using ArcGIS 10.8, using open sources (vector and raster data from https://www.openstreetmap.org/#map=7/45.996/24.981; https://earthexplorer.usgs.gov; https://geo-spatial.org).
The increase of forested areas (+ 10%) causes a slight improvement in the very low class (+ 2.15%) while in the remaining classes, a slight redistribution between classes occurs (0.15–1.35%). For the scenario of decreasing forested areas (− 20%), the redistribution occurs in the opposite direction. Even if the decrease in forested areas (− 20%) is double compared to the afforestation scenario (+ 10%), the differences between the erosion classes are not significant because the increase/decrease in forested areas was uniform for each km2. Thus, since the degree of afforestation is asymmetrical (Carpathians > 70%; Subcarpathians < 30%), applying a decrease or increase of only 10–20% does not significantly change the degree of afforestation or deforestation per km2. Under these conditions, without a notable change in the degree of afforestation/deforestation (± 50%), there is no significant redistribution of potential erosion values and reclassification of erosion potential.
The spatial analysis did not identify a pattern in the distribution of areas with various degrees of erosion in the Carpathians: geological diversity is great, the layers are arranged perpendicular to the direction of river flow, and slope gradients are high, but these variables are compensated by the very high degree of afforestation. The only areas where erosion is very active are those in contact with depressions and slopes devoid of vegetation and subject to pressure from anthropogenic activities.
In the Subcarpathians, however, a direct correlation is observed between slope and degree of afforestation, under conditions of geological uniformity, and we can identify a pattern regarding erosion potential. Thus, a zoning of erosion is identified depending on slope distributed over 3 categories: slopes < 5°, slopes 5–8°, and slopes > 9°.
In the category of areas with low slopes (< 5°), located in the main river floodplains, erosion is reduced, and each degree of slope contributes, on average, 1.5 t/ha/y. At the same time, for areas with slopes > 9°, where the degree of afforestation is high (> 50%), potential erosion values are similar to those in areas with slopes < 5°. The highest erosion values are recorded in the range of slopes 5–8°, with values over 6 t/ha/y, which can reach over 20 t/ha/y in restricted areas.
In the third scenario, in which the forested areas in the Subcarpathians would increase significantly and an area almost equal to the entire area deforested in the Bistrita River basin in the interval 2001–2023 (> 50,000 ha) would be planted uniformly, a significant reduction in erosion is observed: the very high class (Fig. 11) is no longer present and those values are redistributed to the moderate and high classes. Thus, in the Subcarpathian area, the maximum values of potential erosion would not exceed 5 t/ha/y, and in areas with more than 50 ha/km2 of forest, each additional 10 ha of forest would reduce erosion by 1 t/ha/y, resulting in erosion below 1 t/ha/y once forest extent reaches 90 ha/km2. However, afforestation and maintenance of such a large area in just one geographical subunit involves high costs (> 775 million euros according to the PNRR), and there are also limitations in the application of this strategy due to the type of land ownership.
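The relationship described for the third scenario can be written as a simple linear rule. This is a reading of the figures stated in the text (an upper value of 5 t/ha/y at 50 ha/km2 and a drop of 1 t/ha/y per additional 10 ha), not the RF model itself; the symbols E and f are introduced here only for illustration.

```latex
E(f) \approx 5 - \frac{f - 50}{10}, \qquad 50 \le f \le 90,
\qquad\text{so that}\qquad E(90) \approx 5 - \frac{40}{10} = 1
```

Here E is the maximum potential erosion (t/ha/y) and f is the forested area per cell (ha/km2). The value of about 1 t/ha/y at 90 ha/km2 matches the threshold reported for that forest extent, which suggests the stated figures are internally consistent.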
Estimating susceptibility to erosion in the Subcarpathian area using RF under current afforestation conditions and in the scenario of increasing afforested areas to 50 ha/km2. Map produced using ArcGIS 10.8, using open sources (vector and raster data from https://www.openstreetmap.org/#map=7/45.996/24.981; https://earthexplorer.usgs.gov; https://geo-spatial.org).
The use of ML in studies on soil erosion in Romania will help to provide a clearer picture of the future by showing the influence of each element on the intensity of erosion, as well as the total amount of soil moved by erosion.
The limits of the present study are given by the fact that both the increase (+ 10% in the entire river basin and + 50% in the Subcarpathians) and the decrease in forested areas (− 20%) were applied uniformly across the entire area, except for areas already 100% forested or those without forests. In the future, local studies at a microscale level could take into account the differentiated degree of forestation, slope, land use or type of ownership (public/private). Even if urban areas are more difficult to “forest”, the presence of green spaces, parks, or green belts around cities82 brings multiple benefits, not only by reducing erosion but also by improving air quality83.
In the long term, the increase in forested areas does entail high financial costs associated with maintenance and adequate management. Moreover, an additional risk factor for these areas is global warming, which is forecast to bring more heat waves, less precipitation, and more storm events84, which can cause losses of forested areas through windthrow, significant vegetation fires, or the drying of the forest due to drought and pests85,86,87,88,89.
Conclusions
The assessment of the susceptibility of the Bistrița watershed to erosion based on supervised learning algorithms (SVM, K-NN, RF) highlights the importance of knowing the main natural and anthropogenic factors that favor or limit degradation processes.
Conversion from natural forestland to bare soil increases the magnitude of soil erosion, and even if most of the sediment produced initially remains near the site in unstable deposits, depending on the slopes, it may be remobilized and conveyed further downslope by subsequent erosional events.
Our analysis highlighted the role of forested areas in limiting erosion processes, stabilizing the soil through roots, and reducing the effect of raindrops, with forest area being the most important variable in all 3 MLAs. Also, TWI and precipitation are factors that favor erosion because they influence humidity and water runoff at the soil level. The results of the 3 models (SVM, K-NN and RF) showed that the Subcarpathian sector is the most prone to erosion, because of the reduction of forested areas and the expansion of urban and agricultural areas. These factors contribute to erosion rates that can exceed 20 t/ha/y. The RF algorithm proved to have the best accuracy of the predicted results (R2 = 0.67) in estimating soil erosion, due to the averaging of the influence of the maximum and minimum variables. The unequal extents of the Carpathian (81%) and the Subcarpathian (19%) sectors within the entire basin and the high degree of afforestation in the mountainous sector (> 70%) result in the majority of the surfaces in the basin being in the very low and low erosion categories and only 8% of the basin being in the high erosion categories. However, the differentiated analysis showed that in the Subcarpathian sector, 44% of the surface falls into the very high and high categories, compared to only 13% in the Carpathian sector.
RF, the best-trained model in our research, was used to forecast the evolution of erosion for 2050 under 3 scenarios: a continuing deforestation trend (resulting in a forest decline of 20%), a reversal of the current trend (a forest gain of 10%), and an increase of 52,045 ha of forest in the Subcarpathians. In each scenario, precipitation would be reduced by 0.5 mm/y. In the case of a reduction/increase (− 20%; + 10%) of forested areas, at the level of the entire Bistrita River basin, susceptibility to erosion does not change significantly, with only a slight redistribution (0.15–1.35%) between classes and ± 2.15% in the very low class.
In the Subcarpathian area, a direct correlation is observed between slope and degree of afforestation, and we can identify a pattern regarding erosion potential. Thus, a zoning of erosion is identified depending on slope, distributed over 3 categories: slopes < 5°, slopes 5–8°, and slopes > 9°. The highest erosion values are recorded in the range of slopes 5–8°, with values over 6 t/ha/y. If the forested area increases significantly in the Subcarpathians, erosion is significantly reduced and the very high class disappears; above 50 ha/km2, each additional 10 ha of forest reduces erosion by 1 t/ha/y, until, at 90 ha/km2 of forest, erosion is below 1 t/ha/y.
However, these high values of the degree of afforestation would require excessive costs (> 775 million euros), and practical application is limited due to the different types of ownership and the current way of using the land. Developing scenarios to assess susceptibility to erosion using ML in various physical-geographical units in Romania, under conditions of increase/decrease in forest-covered areas and changing climatic conditions, could help local authorities or various administrators to improve their management of ecosystems, sustainably exploit resources, and contribute to the achievement of national development strategies.
Data availability
The data are available in Supplementary Material.
References
Turner, B. L., Menendez, H. M., Gates, R., Tedeschi, L. O. & Atzori, A. S. System Dynamics Modeling for Agricultural and Natural Resource Management Issues: Review of Some Past Cases and Forecasting Future Roles. Resources 5(4), 40. https://doi.org/10.3390/resources5040040 (2016).
Hitouri, S. et al. Gully erosion mapping susceptibility in a Mediterranean environment: A hybrid decision-making model. Int. Soil Water Conservation Res. 12(2), 279–297. https://doi.org/10.1016/j.iswcr.2023.09.008 (2024).
Sanderman, J., Hengl, T. & Fiske, G. J. Soil carbon debt of 12,000 years of human land use. Proc. Natl. Acad. Sci. U.S.A. 114(36), 9575–9580. https://doi.org/10.1073/pnas.1706103114 (2017).
Perpiña Castillo, C., Kavalov, B., Diogo, V., Jacobs-Crisioni, C., Batista e Silva, F & Lavalle C. JRC113718, European Commission (2018). Web: https://ec.europa.eu/jrc/en/luisa, https://urban.jrc.ec.europa.eu/t-board
Prăvălie, R. et al. A unifying modelling of multiple land degradation pathways in Europe. Nat. Commun. 15, 3862. https://doi.org/10.1038/s41467-024-48252-x (2024).
Pozza, L. E. & Field, D. J. The science of Soil Security and Food Security. Soil Security 1, 100002. https://doi.org/10.1016/j.soisec.2020.100002 (2020).
Verheijen, F. G. A., Jones, R. J. A., Rickson, R. J. & Smith, C. J. Tolerable versus actual soil erosion rates in Europe. Earth Sci. Rev. https://doi.org/10.1016/j.earscirev.2009.02.003 (2009).
Bui, E. N., Hancock, G. J. & Wilkinson, S. N. ‘Tolerable’ hillslope soil erosion rates in Australia: Linking science and policy. Agri. Ecosyst. Environ. 144(1), 136–149. https://doi.org/10.1016/j.agee.2011.07.022 (2011).
Panagos, P. et al. European Soil Data Centre 2.0: Soil data and knowledge in support of the EU policies. European J. Soil Sci. 73(6), e13315. https://doi.org/10.1111/ejss.13315 (2022).
Panagos, P., Karydas, C., Ballabio, C., Borrelli, P. & Meusburger K. Advances in soil erosion modelling through remote sensing data availability at European scale. Proceedings of SPIE 9229, Second International Conference on Remote Sensing and Geoinformation of the Environment (RSCy2014), 92290I, (2014), https://doi.org/10.1117/12.2066383
Panagos, P. et al. Rainfall erosivity in Europe. Sci. Total Environ. 511, 801–814. https://doi.org/10.1016/j.scitotenv.2015.01.008 (2015).
Das, S. et al. GloRESatE: A dataset for global rainfall erosivity derived from multi-source data. Sci. Data 11, 926. https://doi.org/10.1038/s41597-024-03756-5 (2024).
Ma, K., Zhang, Y., Ruan, M., Guo, J. & Chai, T. Land subsidence in a coal mining area reduced soil fertility and led to soil degradation in arid and semi-arid regions. Int. J. Environ. Res. Public Health 16(20), 3929. https://doi.org/10.3390/ijerph16203929 (2019).
Alewell, C. et al. Global phosphorus shortage will be aggravated by soil erosion. Nat. Commun. 11(1), 4546. https://doi.org/10.1038/s41467-020-18326-7 (2020).
Vanacker, V., Molina, A., Govers, G., Poesen, J. & Deckers, J. Spatial variation of suspended sediment concentrations in a tropical Andean river system: The Paute River, southern Ecuador. Geomorphology 87(1–2), 53–67. https://doi.org/10.1016/j.geomorph.2006.06.042 (2007).
Panagos, P. et al. Cost of agricultural productivity loss due to soil erosion in the European Union: From direct cost evaluation approaches to the use of macroeconomic models. Land Degrad. Dev. 29(3), 471–484. https://doi.org/10.1002/ldr.2879 (2018).
Pereira, S. et al. A landslide risk index for municipal land use planning in Portugal. Sci. Total Environ. 735, 139463. https://doi.org/10.1016/j.scitotenv.2020.139463 (2020).
Borrelli, P. et al. An assessment of the global impact of 21st century land use change on soil erosion. Nat. Commun. 8(1), 1–13. https://doi.org/10.1038/s41467-017-02142-7 (2017).
Tester, M. & Langridge, P. Breeding technologies to increase crop production in a changing world. Science 327(5967), 818–822 (2010).
Prăvălie, R. et al. Arable lands under the pressure of multiple land degradation processes. A Global Perspective. Environ. Res. 194, 110697. https://doi.org/10.1016/j.envres.2020.110697 (2021).
Wischmeier, W. H. & Smith, D. D. Predicting Rainfall Erosion Losses: A Guide to Conservation Planning. USDA Agricultural Handbook No. 537 (Maryland, 1978).
Borrelli, P., Panagos, P., Märker, M., Modugno, S. & Schütte, B. Assessment of the impacts of clear-cutting on soil loss by water erosion in Italian forests: First comprehensive monitoring and modelling approach. CATENA 149(3), 770–781. https://doi.org/10.1016/j.catena.2016.02.017 (2017).
Panagos, P. et al. The new assessment of soil loss by water erosion in Europe. Environ. Sci. Policy 54, 438–447. https://doi.org/10.1016/j.envsci.2015.08.012 (2015).
Abuzaid, A. S. & Abdelatif, A. D. Assessment of desertification using modified MEDALUS model in the north Nile Delta, Egypt. Geoderma 405, 115400. https://doi.org/10.1016/j.geoderma.2021.115400 (2022).
Panagos, P. et al. Projections of soil loss by water erosion in Europe by 2050. Environ. Sci. Policy 124, 380–392. https://doi.org/10.1016/j.envsci.2021.07.012 (2021).
Biddoccu, M. et al. Evaluation of soil erosion risk and identification of soil cover and management factor (C) for RUSLE in European vineyards with different soil management. Int. Soil Water Conservation Res. 8(4), 337–353. https://doi.org/10.1016/j.iswcr.2020.07.003 (2020).
Zhang, L. et al. Assessment of soil erosion dynamics and implications for sustainable land management: A case study using the RUSLE model. Int. J. Sediment Res. https://doi.org/10.1016/j.ijsrc.2024.12.001 (2024).
Ghavami, M. S., Ayoubi, S., Khaleghpanah, N., Mosaddeghi, M. R. & Gohari, A. Soil loss estimation using RUSLE model: Comparison of conventional and digital soil data at watershed scale in central Iran. Soil Tillage Res. 244, 106238. https://doi.org/10.1016/j.still.2024.106238 (2024).
Minea, G. et al. Designing grazing susceptibility to land degradation index (GSLDI) in hilly areas. Sci. Rep. 12(1), 9393 (2022).
Vandana, N. & Bhatia, N. Survey of Nearest Neighbor Techniques. Int. J. Comput. Sci. Inform. Security https://doi.org/10.48550/arXiv.1007.0085 (2010).
Souza, D. V., Nievola, J. C., Corte, A. P. D. & Sanquetta, C. R. K-nearest neighbor and linear regression in the prediction of the artificial form factor. Floresta 50(3), 1669–1678. https://doi.org/10.5380/rf.v50i3.65720 (2020).
Steinwart, I. & Christmann, A. Support Vector Machines (Information Science and Statistics, Springer Science & Business Media, 2008). ISBN 0387772421, 9780387772424.
Udousoro, I. C. Machine Learning: A Review. Semiconductor Science and Information Devices 2(2). https://ojs.bilpublishing.com/index.php/ssid (2020).
Nguyen, H. D., Nguyen, Q. H., Dang, D. K., Nguyen, T. G., Truong, Q. H., Nguyen, V. H. & Petrisor, A. I. Integrated machine learning and remote sensing for groundwater potential mapping in the Mekong Delta in Vietnam. Acta Geophys. 1–19 (2024).
Ha, M. C. et al. Machine learning and remote sensing application for extreme climate evaluation: Example of flood susceptibility in the Hue Province, Central Vietnam Region. Water 14(10), 1617 (2022).
Diaconu, D. C. et al. Developing flood mapping procedure through optimized machine learning techniques. Case study: Prahova river basin, Romania. J. Hydrol. Reg. Stud. 54, 101892. https://doi.org/10.1016/j.ejrh.2024.101892 (2024).
Islam, A. R. M. T. et al. Enhancing flood susceptibility mapping in Meghna River basin by introducing ensemble Naive Bayes with stacking algorithms. Geomatics, Natural Hazards Risk 16(1), 2464049 (2025).
Bammou, Y. et al. Machine learning models for gully erosion susceptibility assessment in the Tensift catchment, Haouz Plain, Morocco for sustainable development. J. Afr. Earth Sc. 213, 105229. https://doi.org/10.1016/j.jafrearsci.2024.105229 (2024).
Nguyen, D. D., Tiep, N. V., Bui, Q. A. T., Le, H. V., Prakash, I., Costache, R., ... & Pham, B. T. Landslide Susceptibility Mapping Using RBFN-Based Ensemble Machine Learning Models. CMES-Comput. Model. Eng. Sci. 142(1) (2025).
Folharini, S. et al. Soil Erosion Quantification using Machine Learning in Sub-Watersheds of Northern Portugal. Hydrology 10, 7. https://doi.org/10.3390/hydrology10010007 (2023).
Mohammady, M. & Davudirad, A. Gully Erosion Susceptibility Assessment Using Different Machine Learning Algorithms: A Case Study of Shazand Watershed in Iran. Environ. Model. Assess. https://doi.org/10.1007/s10666-023-09910-4 (2023).
Gianinetto, M. et al. Future Scenarios of Soil Erosion in the Alps under Climate Change and Land Cover Transformations Simulated with Automatic Machine Learning. Climate 8(2), 28. https://doi.org/10.3390/cli8020028 (2020).
Moțoc, M., Munteanu, S., Băloiu, V., Stănescu, P. & Mihai, G. Eroziunea solului si metodele de combatere (Edit. Ceres, București, 1975).
Moțoc, M., Stanescu, P. & Taloescu, I. Metode de estimare a eroziunii totale si efluente pe bazine hidrografice mici (ICPA, București, 1979).
Popa, N. Sheet and Rill Erosion. In Landform Dynamics and Evolution in Romania (eds Radoane, M. & Vespremeanu-Stroe, A.) (Springer Geography, Springer, Cham, 2017). https://doi.org/10.1007/978-3-319-32589-7_15
Patriche, C. V. Soil Erosion Modelling. In Landform Dynamics and Evolution in Romania (eds Radoane, M. & Vespremeanu-Stroe, A.) (Springer Geography, Springer, Cham, 2017). https://doi.org/10.1007/978-3-319-32589-7_17
Patriche, C. V. Applying RUSLE for soil erosion estimation in Romania under current and future climate scenarios. Geoderma Reg. 34, e00687. https://doi.org/10.1016/j.geodrs.2023.e00687 (2023).
Nistor, C., Săvulescu, I., Ioana-Toroimac, G. & Carablaisă, S. Exploring soil erosion and reservoir sedimentation through the RUSLE model and bathymetric survey. Int. Soil Water Conservation Res. https://doi.org/10.1016/j.iswcr.2024.10.005 (2024).
Niacșu, L., Ioniță, I., Samoila, C., Grigoras, G. & Blebea-Apostu, A. M. Land degradation and soil conservation measures in the Moldavian Plateau, Eastern Romania: A case study from the Racova Catchment. Water 13(20), 2877. https://doi.org/10.3390/w13202877 (2021).
Prăvălie, R., Săvulescu, I., Patriche, C., Dumitraşcu, M. & Bandoc, G. Spatial assessment of land degradation sensitive areas in southwestern Romania using modified MEDALUS method. CATENA 153, 114–130. https://doi.org/10.1016/j.catena.2017.02.011 (2017).
Vîrghileanu, M., Săvulescu, I., Mihai, B. A., Bizdadea, C. G. & Paraschiv, M. G. RUSLE-based scenarios for sustainable soil management: Case studies from Romanian Subcarpathians. Eur. J. Soil Sci. 75(4), e13526. https://doi.org/10.1111/ejss.13526 (2024).
Prăvălie, R. et al. Modelling forest biomass dynamics in relation to climate change in Romania using complex data and machine learning algorithms. Stochastic Environ. Res. Risk Assessment 37(5), 1669–1695 (2023).
Donisă, I. Geomorfologia văii Bistriței (Edit. Academiei RSR, București, 1968).
Sîrcu, I. Munții Rodnei (Edit. Academiei Republicii Socialiste Romania, București, 1978).
Popescu, R., Urdea, P. & Vespremeanu-Stroe, A. Deglaciation History of High Massifs from the Romanian Carpathians: Towards an Integrated View. In Landform Dynamics and Evolution in Romania (eds Radoane, M. & Vespremeanu-Stroe, A.) (Springer Geography, Springer, Cham, 2017). https://doi.org/10.1007/978-3-319-32589-7_5
Donisă, I. & Poghirc, P. Valea Bistriţei (Edit. Ştiinţifică, Bucureşti, 1968).
Panagos, P. et al. Estimating the soil erosion cover-management factor at European scale. Land Use Policy 48C, 38–50. https://doi.org/10.1016/j.landusepol.2015.05.021 (2015).
Vapnik, V. N. The Nature of Statistical Learning Theory (Information Science and Statistics, Springer, New York, NY, 2000). https://doi.org/10.1007/978-1-4757-3264-1
Salman, H. A., Kalakech, A. & Steiti, A. Random Forest Algorithm Overview. Babylonian Journal of Machine Learning 2024, 69–79. https://doi.org/10.58496/BJML/2024/007 (2024).
Cheval, S., Dumitrescu, A. & Birsan, M. V. Variability of the aridity in the South-Eastern Europe over 1961–2050. CATENA 151, 74–86 (2017).
Pastor, I. et al. Morphometric Analysis and Prioritization of Sub-Watersheds Located in Heterogeneous Geographical Units—Case Study: The Buzău River Basin. Sustainability 16(17), 7567. https://doi.org/10.3390/su16177567 (2024).
Dunea, D. et al. Effects of riparian vegetation on evapotranspiration processes and water quality of small plain streams. Ecohydrol. Hydrobiol. 21(4), 629–640 (2021).
Prăvălie, R. et al. Spatio-temporal changes of the climatic water balance in Romania as a response to precipitation and reference evapotranspiration trends during 1961–2013. CATENA 172, 295–312 (2019).
Moore, I. D. & Burch, G. J. Physical Basis of the Length Slope Factor in the Universal Soil Loss Equation. Soil Sci. Soc. Am. 50, 1294–1298. https://doi.org/10.2136/sssaj1986.03615995005000050042x (1986).
Sharma, Y. et al. Soil loss estimation and susceptibility analysis using RUSLE and random forest algorithm: A case study of Nainital district, India. Spat. Inf. Res. 33, 21. https://doi.org/10.1007/s41324-025-00620-5 (2025).
Adugna, T., Xu, W. & Fan, J. Comparison of Random Forest and Support Vector Machine Classifiers for Regional Land Cover Mapping Using Coarse Resolution FY-3C Images. Remote Sens. 14, 574. https://doi.org/10.3390/rs14030574 (2022).
Kundu, M., Ghosh, A. & Zafor, M. A. Evaluating the integrated performance and effectiveness of RUSLE through machine learning algorithm on soil erosion susceptibility in tropical plateau basin, India. J. Sediment. Environ. 9, 665–693. https://doi.org/10.1007/s43217-024-00181-x (2024).
Hildemann, M., Pebesma, E. & Verstegen, J. A. Multi-objective allocation optimization of soil conservation measures under data uncertainty. Environ. Manage. 72(5), 959–977 (2023).
Brânduș, C. & Găman, C. Aspects regarding the evolution of slope processes in the Izvoru Alb – Bicaz territory (Neamţ County) during 2005–2014. Revista de geomorfologie 16, 55–60 (2014).
Roșu, C. E. & Irimuș, I. A. The context of the geomorphological processes in urban area of Piatra-Neamț City. Riscuri și Catastrofe XIII, 14(1) (2014).
Muscalu, I. M. Soil erosion and landslides in Izvorul Muntelui Lake area. Annals of the University of Craiova, Series Geography 18 (new series) (2017).
Muscalu, I. M. Forms and processes of soil degradation and impacts on agriculture in Lake Basin Izvorul Muntelui. Analele Universităţii din Oradea, Seria Geografie XXVIII, 1(June), 72–81 (2018).
Codru, I. C. & Niacșu, L. Landslide susceptibility assessment on the left side of the Izvorul Muntelui Lake bank, Romania. Present Environ. Sustain. Dev. https://doi.org/10.47743/pesd2022161001 (2022).
Sestras, P. et al. Soil Erosion Assessment Using the Intensity of Erosion and Outflow Model by Estimating Sediment Yield: Case Study in River Basins with Different Characteristics from Cluj County, Romania. Appl. Sci. 13, 9481. https://doi.org/10.3390/app13169481 (2023).
Vîrghileanu, M. R. & Mihai, B. A. Abordare integrată bazată pe tehnici de teledetecție și GIS pentru cartografierea și modelarea eroziunii solului in România. Contract PD 18/2022, Raport științific final. https://rs4soil.unibuc.ro/?page_id=149 (2024).
Bui, D., Tsangaratos, P., Tien, N., Liem, N. & Trinh, P. Comparing the prediction performance of a Deep Learning Neural Network model with conventional machine learning models in landslide susceptibility assessment. CATENA 188, 104426. https://doi.org/10.1016/j.catena.2019.104426 (2020).
Gautam, S. et al. Prediction of Various Sizes of Particles in Deep Opencast Copper Mine Using Recurrent Neural Network: A Machine Learning Approach. J. Inst. Eng. India Ser. A 103, 283–294. https://doi.org/10.1007/s40030-021-00589-y (2022).
Andronache, I. et al. Dynamics of Forest Fragmentation and Connectivity Using Particle and Fractal Analysis. Sci. Rep. 9, 12228. https://doi.org/10.1038/s41598-019-48277-z (2019).
Diaconu, D. C. et al. Using fractal fragmentation and compaction index in analysis of the deforestation process in Bucegi Mountains Group, Romania. Carpathian J. Earth Environ. Sci. 14(2), 431–438 (2019).
Drăghici, C. C., Andronache, I., Ahammer, H., Peptenatu, D., Pintilii, R. D., Ciobotaru, A. M., Simion, A. G., Dobrea, R. C., Diaconu, D. C., Vișan, M. C. & Papuc, R. M. Spatial evolution of forest areas in the northern Carpathian Mountains of Romania. Acta Montanistica Slovaca 22(2) (2017).
Diaconu, D. C., Andronache, I., Gruia, A. R., Bazac, T. & Băloi, A. M. Evaluation of forest loss data using fractal algorithms: case study Eastern Carpathians-Romania. Front. For. Glob. Change. 7, 1432739. https://doi.org/10.3389/ffgc.2024.1432739 (2024).
Kabisch, N. et al. Nature-based solutions to climate change mitigation and adaptation in urban areas: perspectives on indicators, knowledge gaps, barriers, and opportunities for action. Ecol. Soc. 21(2), 39. https://doi.org/10.5751/ES-08373-210239 (2016).
Curtis, P., Slay, C., Harris, N., Tyukavina, A. & Hansen, M. Classifying drivers of global forest loss. Science 361, 1108–1111. https://doi.org/10.1126/science.aau3445 (2018).
Jactel, H., Koricheva, J. & Castagneyrol, B. Responses of forest insect pests to climate change: not so simple. Current Opinion in Insect Sci. 35, 103–108. https://doi.org/10.1016/j.cois.2019.07.010 (2019).
Giannico, V. et al. Green spaces, quality of life, and citizen perception in European cities. Environ. Res. 196, 110922. https://doi.org/10.1016/j.envres.2021.110922 (2021).
Portmann, R. et al. Global forestation and deforestation affect remote climate via adjusted atmosphere and ocean circulation. Nat. Commun. 13, 5569. https://doi.org/10.1038/s41467-022-33279-9 (2022).
Gajendiran, K., Kandasamy, S. & Narayanan, M. Influences of wildfire on the forest ecosystem and climate change: A comprehensive study. Environ. Res. 240(2), 117537. https://doi.org/10.1016/j.envres.2023.117537 (2024).
Lecina-Diaz, J., Martínez-Vilalta, J., Alvarez, A., Vayreda, J. & Retana, J. Assessing the Risk of Losing Forest Ecosystem Services Due to Wildfires. Ecosystems 24(7), 1687–1701. https://doi.org/10.1007/s10021-021-00611-1 (2021).
Albert, M. et al. Quantifying the effect of persistent dryer climates on forest productivity and implications for forest planning: a case study in northern Germany. For. Ecosyst. 5, 33. https://doi.org/10.1186/s40663-018-0152-0 (2018).
Funding
This research was supported by a grant from the Ministry of Research, Innovation and Digitization, CNCS/CCCDI-UEFISCDI, project number COFUND-DUT-FEED4FOOD, within PNCDI IV (project title: Vulnerable communities fostering innovation and governance of sustainable building systems in European cities), through project number 50/2024. These results also received logistic support from the CNFIS-FDI-2025-F-0421 project, financed by the Romanian Ministry of Education and Research.
Contributions
All authors contributed equally to the preparation of this paper. All authors have read and agreed to the published version of the manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Muratoreanu, G., Dunea, D., Tanislav, D. et al. Soil erosion susceptibility assessment in the Bistrița River basin (Romania) using machine learning algorithms and GIS. Sci Rep 15, 44381 (2025). https://doi.org/10.1038/s41598-025-28184-2
DOI: https://doi.org/10.1038/s41598-025-28184-2