Introduction

Invasive alien species (IAS) pose a significant threat to biodiversity in the Anthropocene, alongside agriculture, deforestation, habitat fragmentation, and climate change1. These species can cause extensive habitat transformation and ecosystem disruption, negatively affecting both natural and human-modified environments2. IAS often outcompete native flora and fauna, reduce biodiversity, and alter ecosystem processes such as nutrient cycling and energy flows3,4. One such invasive genus is Solidago (goldenrods), particularly Solidago canadensis and Solidago gigantea, which have become widespread across Europe due to their aggressive dispersal strategies, prolific seed production, and vegetative reproduction5,6,7. Goldenrods severely impact native biodiversity and ecosystem services. Their dense monocultures displace native plant communities, degrade habitats for pollinators, and reduce agricultural productivity5,6,8. In Poland and Central Europe, they now dominate large portions of post-agricultural land, particularly abandoned fields, where they may occupy up to 70% of the area6,9. Despite being considered invasive in 49 countries, goldenrods are still cultivated by beekeepers for honey production, complicating efforts to control their spread2,10. In Kampinos National Park (KNP) in Poland, the invasion of goldenrods presents a pressing challenge for conservation and land management, particularly in areas recovering from agricultural abandonment11. Early detection and accurate monitoring are essential for targeted management of these species before they become widespread and difficult to control.

Remote sensing has emerged as a powerful tool for ecological monitoring and invasive species detection, offering efficient, repeatable, and large-scale observation capabilities8,9. Botanists studying goldenrod ecology and eradication, as well as practitioners who manage invasions, plan actions, and make decisions, should be aware that goldenrod can be easily detected in satellite images. This approach is much less time-consuming and more cost-effective than in-situ inventory.

Both passive (e.g., multispectral and hyperspectral) and active (e.g., radar, LiDAR) remote sensing systems have been used to monitor invasive plant species. Among them, multispectral sensors have proven especially useful due to their sensitivity to vegetation spectral characteristics. However, mapping species of the Solidago genus using remote sensing has been addressed in only a few studies, most of which have relied exclusively on optical data from the visible and reflective infrared spectra.

The detection of Solidago altissima was studied by Ishii and Washitani10 as well as Rizaludin Mahmud et al.11. In the former case, the authors applied generalized linear regression models to high-resolution airborne hyperspectral data collected in spring, achieving a prediction accuracy of 85% for goldenrod mapping. In the latter study, detailed field spectroscopy data acquired during the flowering period enabled the development of a goldenrod-specific vegetation index. This index was then applied to a WorldView-2 image from autumn, with classification performed using a maximum likelihood classifier and thresholding. Notably, both studies relied on single-date remote sensing imagery. Koco et al.12 and Zagajewski et al.13 mapped occurrences of Solidago gigantea and Solidago canadensis. Koco et al. used a GNSS system and unmanned aerial vehicle (UAV) imagery to assess goldenrod spread over four consecutive years. Although precise plant positioning was achieved, automated classification of UAV multispectral images resulted in very low accuracy (< 50%). In contrast, Zagajewski et al. carried out a more advanced study using multitemporal Sentinel-2 and PlanetScope data representing late summer and autumn. The classification, performed with Random Forest (RF) and support vector machine (SVM) algorithms, yielded high overall accuracies (mostly above 90%), with the best results obtained using Sentinel-2 and RF. Sentinel-2 imagery was also used by Chadoulis et al.14 to map Solidago gigantea in a riparian environment. In this case, classification was based on phenological metrics derived from multitemporal data, using RF, and achieved approximately 80% accuracy. Additional studies attempted to distinguish goldenrods from other invasive or expansive species using airborne hyperspectral data in combination with RF, SVM, or both algorithms15,16,17. 
These approaches achieved varying mapping accuracies depending on the composition of the training dataset (specifically, the proportion of the target class within samples) and its size. In several cases, high accuracies above 90% were reported when sufficient training samples were available.

Collectively, these studies confirm the high potential of remote sensing for accurate identification and classification of goldenrod species. However, many of the cited works relied on airborne data, which presents scalability challenges for broad-area mapping. Additionally, the research areas were often limited in spatial extent, which may hinder scalability of their findings. Satellite-based approaches offer a more practical solution for large-scale, operational monitoring. Among the available sensors, Sentinel-2 and PlanetScope stand out due to their high-resolution multispectral capabilities, making automated detection of goldenrod invasions more feasible than ever. Sentinel-2, as part of the Copernicus Programme, provides free, high-resolution (10–60 m) multispectral data with frequent revisit cycles and wide swath coverage, making it suitable for large-scale ecological assessments12. PlanetScope, a commercial constellation, offers near-daily data at approximately 3-meter spatial resolution, allowing for detailed, site-specific analysis13. The complementary characteristics of these systems, spatial resolution, temporal frequency, and availability, make them highly promising for detecting invasive species across heterogeneous landscapes. Importantly, both provide regular and systematic acquisitions that enable temporal analysis of plant phenology, which is particularly relevant in the case of goldenrods.

Despite these opportunities, few studies have compared Sentinel-2 and PlanetScope for goldenrod detection or examined how different machine learning algorithms perform on multitemporal datasets. Furthermore, many investigations prioritize algorithmic or sensor comparisons rather than integrating phenological information and considering operational needs for invasive species management.

The aim of this study is twofold. First, we address this research gap by evaluating the performance of two machine learning algorithms, RF and one-class support vector machine (OCSVM), for detecting Solidago spp. using Sentinel-2 and PlanetScope imagery. RF is widely recognized for its robustness to overfitting and its ability to process high-dimensional datasets18,19; thus, it is used in our study as a benchmark method. OCSVM is particularly effective when training data are available for only one class, a common limitation in IAS mapping due to difficulties in obtaining reliable ground truth for different land cover types20,21. Moreover, when tested for defect prediction, OCSVM did not outperform RF but proved superior to binary classifiers22. Second, this study investigates the role of phenological timing in classification performance. Goldenrod patches often remain visible even into autumn and winter due to their tall, persistent biomass, which can be green or dry. Therefore, we assess detection performance across the entire growing season, from early spring to late autumn, using multitemporal imagery. Our overarching objective is to evaluate how spatial and temporal resolution, classification strategy, and image timing affect mapping accuracy, with a focus on practical implications for goldenrod monitoring and control.

By focusing on Solidago invasion in an ecologically diverse and well-documented site in Poland, this study contributes to the development of scalable, remote sensing–based monitoring frameworks. It also provides valuable insights for optimizing the choice of satellite data, phenological windows, and classification approaches for improved ecological management and conservation planning.

Materials

Study taxon and area

Canadian goldenrod (Solidago canadensis L.) and giant goldenrod (S. gigantea Aiton) are two highly invasive plant species native to North America that have successfully colonized vast areas of European land23. Due to their similar morphology and habitat preferences within their secondary range24, we considered these species together in this study and refer to them as goldenrods25. In Central Europe, goldenrods are known to form dense single-species patches that sometimes cover hundreds of hectares5,6 (Fig. 1). Despite their relatively late emergence in spring, goldenrods significantly affect vegetation throughout the growing season due to their rapid growth, high biomass accumulation, substantial litter production, and shading potential, reaching heights of up to 2 meters5,6,26. Additionally, goldenrods release allelopathic chemicals that inhibit the growth of other plant species27,28,29. Goldenrod shoot density typically ranges from 150 to 200 per square meter, with some stands exceeding 300 shoots per square meter30. A single goldenrod stem can produce over 10,000 seeds31,32, which exhibit high vigor and maintain viability even under diverse temperature and moisture conditions31. Approximately 50% of goldenrod seeds successfully germinate33. After initial seed establishment, goldenrods primarily reproduce vegetatively through rhizomes, forming dense clonal patches that comprise up to 90% of the total plant cover in abandoned fields5,30.

Fig. 1

A vast meadow colonized densely by goldenrod.

Our study on goldenrod mapping was conducted in the Mazovia Province in central Poland (Fig. 2), specifically in the area west of Warsaw, encompassing the entire KNP and its surroundings (SW corner: 52.26°N, 20.28°E; NE corner: 52.42°N, 20.92°E). The study area spans 18 km × 43 km, covering a total area of 774 km². It includes a variety of land use and land cover (LULC) types, predominantly vegetated ones such as forests, grasslands, and croplands, as well as settlements. The site was chosen because it represents a mixture of natural ecosystems including protected areas, human-altered environments, and transitional areas.

Over the past centuries, the KNP area has been significantly influenced by human activity. Originally dominated by forests, the land was deforested, drained, and finally converted into farmlands and meadows34. Kampinos National Park was established in 1959 to protect its unique forest, wetland, and dune ecosystems, as well as its species diversity and historical heritage. Mandatory land acquisition began in 1975, leading to the abandonment of extensive agricultural activities in the area. This shift allowed ecological processes, such as secondary succession, to take place, which facilitated the encroachment of invasive species onto abandoned rural areas. Goldenrods have notably invaded these fallow and abandoned lands34. Among other forms of nature conservation, the KNP area is also designated as a Special Area of Conservation and Special Protection Area, Puszcza Kampinoska (PLC140001), under the Natura 2000 network, as part of the Birds and Habitats Directives.

Fig. 2

Illustration of the analyzed area covering the Kampinos National Park and the nearby area north-west of Warsaw. An orthophotomap from the Geoportal WMS service is used as the map background.

Satellite imagery

Our research explored the feasibility of recognizing Solidago spp. based on two sets of multispectral imagery from the Sentinel-2 and PlanetScope constellations. Sentinel-2 offers multispectral data with spatial resolutions ranging from 10 m to 60 m, covering visible to shortwave infrared wavelengths (Table 1)35. An additional advantage of Sentinel-2 is its ability to provide multitemporal observations, enhancing the likelihood of obtaining cloud-free images. In mid-latitude regions of Europe, the revisit period ranges from two to three days. Sentinel-2 images downloaded from CREODIAS36 represent the year 2022, corresponding to the reference period for the in-situ data collection. For our study, cloud-free satellite images were selected from the Level-2A product, which provides bottom-of-atmosphere reflectance data. The selected image data covered several months of the growing season, including spring, summer, and autumn (Table 2). For images with limited, sporadic cloud cover, cloudy pixels were replaced with non-cloudy pixels from the nearest available dates (Table 2). Spectral bands of varying resolutions were resampled to a 10 m pixel size using the nearest-neighbor method.
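The nearest-neighbor resampling step can be illustrated with a minimal sketch. It assumes that the coarse and fine grids are aligned and that the resolution ratio is an integer (2 for 20 m to 10 m), in which case nearest-neighbor resampling reduces to pixel replication. The function name and the sample values are hypothetical and are not taken from the processing chain used in the study.

```python
def upsample_nearest(band, factor):
    """Upsample a 2-D grid by an integer factor using nearest-neighbour.

    Each source pixel is repeated `factor` times along rows and columns,
    so a 20 m band with factor=2 lands on a 10 m grid.
    """
    out = []
    for row in band:
        # Repeat every value `factor` times along the row.
        expanded = [value for value in row for _ in range(factor)]
        # Repeat the expanded row `factor` times (copied so rows stay independent).
        out.extend(list(expanded) for _ in range(factor))
    return out


# Hypothetical 2 x 2 tile of 20 m reflectance values.
band_20m = [[0.10, 0.20],
            [0.30, 0.40]]
band_10m = upsample_nearest(band_20m, 2)
```

Because the values are only copied, never interpolated, the original reflectances are preserved, which is the usual reason for choosing this method before classification.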

PlanetScope is a satellite constellation developed and operated by Planet Labs, Inc37. The constellation comprises 130 small CubeSat 3U satellites, a number that continues to grow37. It is the only currently operating constellation that enables the daily acquisition of high-resolution satellite imagery for the entire world’s land surface. The latest third generation PlanetScope satellite, SuperDove, acquires multispectral data across eight spectral bands, six of which align with Sentinel-2 channels (Table 1). PlanetScope images selected for this study represent cloud-free bottom-of-atmosphere surface reflectance (Level-3B) at a spatial resolution of three meters. The PlanetScope data acquisition dates correspond to those of Sentinel-2, except for autumn data acquired in October for PlanetScope and November for Sentinel satellites (Table 2).

Table 1 Spectral characteristics of Sentinel-2 and PlanetScope data. Bands in italics are those that are interoperable between the two sensing systems.
Table 2 Acquisition dates (YYYY.MM.DD) of satellite images used in the study. Dates in parentheses indicate additional data used to replace cloudy pixels.

Reference data

Accurate reference data are essential for training and validating remote sensing-based classification models, particularly when mapping specific plant taxa such as invasive Solidago spp. In this study, we developed a reference dataset using two main sources: the national Topographic Objects Database (Baza Danych Obiektów Topograficznych, BDOT)38 and in-situ field surveys conducted during the 2022 growing season (August–November). We used BDOT polygons to identify and delineate non-Solidago LULC classes including buildings, roads, grasslands, croplands, forest stands, and water bodies (Table 3). These polygons were visually verified and confirmed using high-resolution orthophotos and PlanetScope imagery from the same period. To account for possible temporal mismatches and classification ambiguity, grasslands and croplands were additionally confirmed via field visits. Reference data for Solidago spp. were collected only during field surveys, in which polygons with full cover of Solidago spp. were recorded in SW Maps software for smartphones (SW Maps, version 2.9.1.1 (06.07.2022), https://aviyaantech.com/swmaps/index.html). The reference polygons for all analyzed LULC classes were collected to represent homogeneous cover with a minimum area of 20 m × 20 m, ensuring that each polygon could be accurately represented by a single pixel in both the PlanetScope and Sentinel-2 images. These reference polygons were designed to provide high-quality samples for training machine learning classifiers and validating classification results. The homogeneity of the cover within each sample (pixel) allowed the same sets of samples to be used for the classification of both satellite datasets.

Next, all polygons (obtained from BDOT and from field surveys) were overlaid on the Sentinel-2 pixel grid (10 m × 10 m). The final sets of reference polygons were generated by selecting only those pixels that were 100% covered by a polygon of a given LULC class.

To minimize the potential influence of autocorrelation on classification and accuracy assessment, reference polygons for all classes were divided into two groups: training and validation. This division ensured an unbiased accuracy assessment by preventing samples from a reference polygon dedicated to training the classifiers from being used in the validation process. Finally, to prepare the sets of training and validation samples, a stratified random sampling method was applied within the reference polygons. An area of 10 m × 10 m, corresponding to a Sentinel-2 pixel, was set as a single sample. Details regarding the number of samples used during the training phase are provided in Section “Random forest” for the Random Forest algorithm and in Section “One-class classifier” for the OCSVM algorithm.
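The polygon-level separation of training and validation data can be sketched as follows for a single class. This is only an illustration under stated assumptions: the validation fraction, the random seeds, the function names, and the toy polygon contents are all hypothetical, since the text does not specify how polygons were assigned to the two groups.

```python
import random


def split_polygons(polygon_ids, validation_fraction, seed=0):
    """Assign whole polygons to training or validation, never both."""
    ids = sorted(polygon_ids)
    rng = random.Random(seed)
    rng.shuffle(ids)
    n_val = round(len(ids) * validation_fraction)
    return set(ids[n_val:]), set(ids[:n_val])


def sample_pixels(pixels_by_polygon, polygons, n_samples, seed=0):
    """Draw up to n_samples pixels at random from the given polygons only."""
    pool = [(pid, px) for pid in sorted(polygons) for px in pixels_by_polygon[pid]]
    rng = random.Random(seed)
    return rng.sample(pool, min(n_samples, len(pool)))


# Hypothetical data for one class: polygon id -> 10 m pixels fully inside it.
pixels_by_polygon = {pid: [(pid, i) for i in range(4)] for pid in range(10)}
train_polys, val_polys = split_polygons(pixels_by_polygon, validation_fraction=0.3)
train = sample_pixels(pixels_by_polygon, train_polys, 12, seed=1)
val = sample_pixels(pixels_by_polygon, val_polys, 6, seed=2)
```

Repeating this per class gives a stratified sample in which no validation pixel shares a polygon with a training pixel, which is the property the split is meant to guarantee.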

Table 3 Description of the land use and land cover (LULC) classes of interest from the Topographic Objects Database (BDOT) used for the preparation of reference data.

For the validation samples, the class proportions were estimated from the BDOT dataset, with the assumption that the Solidago spp. class constituted one-third of the grassland class. Due to the limited number of Solidago polygons, the number of Solidago samples was set at 500, and the numbers of samples for the remaining classes were adjusted accordingly, based on their class coverage in the BDOT database. The estimated number of validation samples for each LULC class is listed in Table 4.

Table 4 Analyzed LULC classes and number of validation samples.

Methods

Classification procedure

Two machine learning algorithms were tested for the identification of Solidago spp.: RF and OCSVM. The characteristics of these algorithms, their usage descriptions, and the applied parameters are provided in the following subsections.

For both algorithms and image data sources, various combinations of classification features were utilized. This included the classification of spectral bands from single and multiple images within the prepared time series, as well as vegetation indices and statistics derived from these features. The selected indices were among the most commonly used, facilitating the analysis of different vegetation characteristics39 (Table 5).

Table 5 Vegetation indices used in the classification of Solidago spp., including their equations and the spectral bands utilized.

In summary, four types of features were used:

  • Spectral reflectance data from ten bands of Sentinel-2 and eight bands of PlanetScope. Classifications on Sentinel-2 were performed only on bands with 10 and 20 m spatial resolutions (strategy 1), excluding the 60 m spatial resolution bands due to their limited usefulness for land surface analyses48,49. The exception was the direct comparison of Sentinel-2 and PlanetScope data, where all corresponding bands, including band 1 (coastal blue) from Sentinel-2, were used (strategy 2). PlanetScope imagery was analyzed at its native 3 m spatial resolution.

  • Vegetation indices.

  • Multitemporal statistical parameters from spectral bands: minimum, maximum, mean, median, and standard deviation values from five dates.

  • Multitemporal statistical parameters from vegetation indices: minimum, maximum, mean, median, and standard deviation values from five dates.
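The multitemporal statistical features listed above can be sketched for a single pixel and a single feature (one band or one index) observed on five dates. The function name and the example values are hypothetical, and the use of the population standard deviation is an assumption, since the text does not say which estimator was applied.

```python
import statistics


def temporal_stats(values):
    """Return the five per-pixel statistics over a time series of one feature."""
    return {
        "min": min(values),
        "max": max(values),
        "mean": statistics.fmean(values),
        "median": statistics.median(values),
        # Assumption: population standard deviation (divides by n, not n - 1).
        "std": statistics.pstdev(values),
    }


# Hypothetical reflectance of one pixel in one band on the five dates.
series = [0.1, 0.2, 0.3, 0.4, 0.5]
stats = temporal_stats(series)
```

Applied to every band or index, this turns five observations per feature into five summary features per pixel.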

Different combinations of these features were used to design classifications for 17 scenarios for each satellite dataset, as summarized in Table 6.

Table 6 Overview of classification feature combinations used in various classification scenarios.

Finally, to directly compare the satellite datasets, an additional test was performed (strategy 2). This test involved classifying Solidago spp. using the RF classifier on the interoperable spectral bands of Sentinel-2 and PlanetScope (six bands). For each of the five data acquisition dates, classification was performed on the equivalent bands from both sensing systems. Bands 1 and 2 correspond between both sensors. Band 3 from Sentinel-2 corresponds to band 4 from PlanetScope and band 4 to band 6. Finally, band 5 and 8a of Sentinel-2 sensors are equivalent to bands 7 and 8 of PlanetScope data, respectively (Table 1).
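The band pairing described above can be written down as a small lookup table. The sketch below is illustrative only: the band labels (for example "B8A") and the dictionary-based pixel representation are assumptions made for the example, not the data structures used in the study.

```python
# Sentinel-2 band -> PlanetScope (SuperDove) band, as paired in the text.
S2_TO_PLANETSCOPE = {
    "B1": "B1",
    "B2": "B2",
    "B3": "B4",
    "B4": "B6",
    "B5": "B7",
    "B8A": "B8",
}


def select_interoperable(pixel, sensor):
    """Keep only the six interoperable bands, ordered by Sentinel-2 band."""
    if sensor == "sentinel2":
        names = list(S2_TO_PLANETSCOPE)
    elif sensor == "planetscope":
        names = list(S2_TO_PLANETSCOPE.values())
    else:
        raise ValueError(f"unknown sensor: {sensor}")
    return [pixel[name] for name in names]


# Hypothetical PlanetScope pixel with all eight bands.
ps_pixel = {f"B{i}": i / 10 for i in range(1, 9)}
features = select_interoperable(ps_pixel, "planetscope")
```

Keeping the same band order for both sensors makes the two six-band feature vectors directly comparable.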

Random forest

RF is a well-tested and frequently used machine learning algorithm for various classification tasks, particularly in Earth observation (EO) analyses18,19,50,51,52.

For the RF classification, two main parameters were set. The first was the number of decision trees that built the forest (500). Such a high number of trees may be time-consuming but ensures the repeatability of results in terms of accuracy51. The second parameter determined the number of features selected for each split in the tree. The square root of the number of input features was selected to optimize the classification computation. Based on previous research indicating improved performance of the RF classifier with a larger number of training samples13,16, we used 1000 samples per class for all RF classification tests. These samples were randomly selected from the previously prepared reference datasets and represented eight different LULC classes.
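The settings above can be collected in a short sketch. It does not train a forest; it only shows how the square-root rule translates into the number of candidate features per split and how many training samples the stated design implies. Rounding the square root down is an assumption (implementations differ), and the constant and function names are hypothetical.

```python
import math
import random

N_TREES = 500             # number of decision trees, as stated in the text
SAMPLES_PER_CLASS = 1000  # training samples per class
N_CLASSES = 8             # Solidago plus seven other LULC classes


def features_per_split(n_features):
    """Square-root rule for the number of candidate features at each split."""
    # Assumption: the square root is rounded down, with a minimum of one.
    return max(1, int(math.sqrt(n_features)))


def draw_candidates(feature_names, rng):
    """Randomly pick the candidate features examined at one split."""
    return rng.sample(feature_names, features_per_split(len(feature_names)))


bands = [f"band_{i}" for i in range(10)]  # e.g. ten Sentinel-2 bands
candidates = draw_candidates(bands, random.Random(0))
total_training_samples = SAMPLES_PER_CLASS * N_CLASSES
```

With ten input bands, each split therefore examines three randomly drawn bands, and the full training set comprises 8000 samples.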

One-class classifier

The second classifier used was the OCSVM algorithm20, a machine-learning technique for anomaly detection. The OCSVM models a one-class problem to identify unusual and infrequent occurrences in a dataset. It constructs a hyperplane to maximize the margin around the majority data points, effectively isolating potential outliers or anomalies. This method is particularly useful for unbalanced datasets with a large proportion of normal cases relative to anomalies. The OCSVM has a variety of applications, including fraud detection, network security, and quality control, due to its ability to classify instances as either normal or abnormal. In satellite data classification, OCSVM has been successfully applied, for example, for building detection with Sentinel-2 images21 or sea ice detection on Sentinel-1 SAR images53. Unlike multiclass classifiers such as RF, it does not require sample collection for classes of no interest. For instance, in the current tests using RF, samples from seven additional classes were used.

The performance of the OCSVM classifier can be optimized by adjusting two specific parameters, γ (gamma) and ν (nu). γ controls the width of the Gaussian radial basis function kernel, while ν describes the fraction of outliers in the training set. To determine the optimal combination of both parameter values, a grid search was conducted as applied by Krupinski et al.21. This search tested 15 γ values and 10 ν values, resulting in a total of 150 combinations. The applied parameter values are listed in Table 7.
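The structure of such a grid search can be sketched as below. The γ and ν values in the sketch are placeholders chosen only to give 15 and 10 entries; the values actually used are those in Table 7. Selecting the best combination by the highest F1-score is likewise an assumption made for illustration, and the function name is hypothetical.

```python
from itertools import product

# Placeholder values only; the values actually used are those in Table 7.
gammas = [2.0 ** k for k in range(-10, 5)]        # 15 hypothetical gamma values
nus = [round(0.05 * k, 2) for k in range(1, 11)]  # 10 hypothetical nu values

# Every (gamma, nu) pair is one OCSVM configuration to train and evaluate.
grid = list(product(gammas, nus))


def best_combination(f1_by_combination):
    """Return the (gamma, nu) pair with the highest F1-score."""
    return max(f1_by_combination, key=f1_by_combination.get)
```

In use, each pair in `grid` would be passed to the classifier, its F1-score on the test set stored in a dictionary keyed by the pair, and `best_combination` called on that dictionary.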

Table 7 List of values used in the grid search of OCSVM classifications.

In all 17 scenarios (Table 6), Solidago was classified using each possible combination of γ and ν parameters. The same procedure was applied for Sentinel-2 and PlanetScope imagery.

Each classification process was divided into two steps. First, parameter tuning was conducted using a training set (500 Solidago samples only) and a testing set (7500 samples: 500 Solidago and 7000 non-Solidago) for each of the 150 parameter combinations within each scenario. This step aimed to identify the optimal parameter combination.

Figure 3 illustrates the parameter tuning procedure for one of the 17 scenarios, showing 150 OCSVM classifications in the form of an accuracy board. Each rectangle on the board displays accuracy metrics for a combination of OCSVM parameters. Values in red indicate the F1-score of the Solidago class, and values in blue indicate the producer’s (top of rectangle) and user’s (bottom of rectangle) accuracy. The green line marks the best result for that scenario. The board also shows the values of the γ and ν parameters on the X and Y axes.

Fig. 3

Example of grid search results: accuracy board for the classification of the Sentinel-2 dataset (the first step of the analysis, parameter tuning).

The second step is an accuracy assessment based on a validation set. Although the training of the OCSVM classifier required only samples from one class, non-Solidago samples were also required to perform the complex accuracy assessment. In this phase 5230 samples (500 from Solidago and 4730 from non-Solidago) were used, as presented in Table 4.

Results

Comparison of classification results

The accuracy of Solidago spp. classification was assessed using the standard remote sensing approach of computing an accuracy matrix and deriving accuracy measures, including the F1-score and user’s, producer’s, and overall accuracy. We mainly used the F1-score as an indicator of classification quality because it represents the balance between the user’s and producer’s accuracies for each analyzed class. Figures 4 and 6 present the F1-score values obtained for the Solidago spp. class in 17 classification scenarios using two types of satellite images (Sentinel-2 and PlanetScope) for the RF and OCSVM classifiers, respectively. The accuracy measures of the remaining (opponent) classes were of little interest in this study.
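For reference, the accuracy measures named above can be computed from the counts of an accuracy matrix as in the following sketch. It treats the problem as Solidago versus everything else, which is a simplification for the multiclass RF case; the function name and the counts are hypothetical and are not results from this study.

```python
def solidago_accuracy(tp, fp, fn, tn):
    """Accuracy measures for the target class from accuracy-matrix counts.

    tp: Solidago samples mapped as Solidago
    fp: non-Solidago samples mapped as Solidago (commission errors)
    fn: Solidago samples mapped as non-Solidago (omission errors)
    tn: non-Solidago samples mapped as non-Solidago
    """
    users = tp / (tp + fp)      # user's accuracy (precision)
    producers = tp / (tp + fn)  # producer's accuracy (recall)
    # F1-score: harmonic mean of user's and producer's accuracy.
    f1 = 2 * users * producers / (users + producers)
    overall = (tp + tn) / (tp + fp + fn + tn)
    return {"users": users, "producers": producers, "f1": f1, "overall": overall}


# Hypothetical counts for 500 Solidago and 4730 non-Solidago validation samples.
measures = solidago_accuracy(tp=450, fp=50, fn=50, tn=4680)
```

Because the non-Solidago samples dominate the validation set, overall accuracy can stay high even when many Solidago samples are missed, which is one reason to report the F1-score of the target class.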

Random forest classification

For the RF classifier, the results (F1-scores) demonstrated relatively high accuracy across all scenarios and for both imagery sources. The highest F1-score of 0.98 was achieved using all possible combinations of Sentinel-2 data and their derivatives (scenario 17). In contrast, the lowest score of 0.71 was observed with the classification based on a single date (June) of PlanetScope data (Fig. 4; Table 8).

Sentinel-2 data generally yielded more consistent accuracy across different dates, except for the September image, which had the lowest values. PlanetScope data provided more differentiated results between the analyzed dates, with the June data providing the weakest result. For both datasets, autumn imagery (October or November) was crucial for achieving a high classification accuracy for Solidago spp. Additionally, F1-scores were higher for Sentinel-2 data than for PlanetScope data, with differences ranging from approximately 1 to 16%, depending on the scenario (Fig. 4). Overall accuracy for all scenarios was also higher for Sentinel-2 classifications (Table 8), reaching over 0.98 in the best scenario (17), compared to slightly lower values for PlanetScope data in the same scenario (0.97).

Scenarios utilizing multitemporal data generally resulted in higher accuracy scores compared to those based on single-date images. Most F1-scores exceeded 0.9, except for scenario 15, which used yearly statistics of vegetation indices and yielded the lowest values of all classifications utilizing multitemporal data (0.87 and 0.8 for Sentinel-2 and PlanetScope data, respectively). The best results for the Sentinel-2 data were obtained using all possible data sources (scenario 17, F1-score of 0.98), while the optimal classification for PlanetScope data was achieved with spectral bands from multitemporal imagery (scenario 11, F1-score of 0.97).

Fig. 4

Graph representing F1-scores of the Solidago class for 17 classification scenarios using Random Forest classifier (strategy 1).

Table 8 Accuracy measures for the Solidago class obtained from the random forest classification scenarios used in strategy 1.

Figure 5 shows the F1-score for Solidago spp. classification using the RF algorithm performed on a reduced number of spectral bands (strategy 2). Only the bands that correspond between the two sensing systems were used. The analysis did not reveal obvious trends or advantages of either system for mapping Solidago spp., and the resulting F1-score values differed by a maximum of 5.6% (Table 9). For August and September, Sentinel-2 data yielded higher F1-scores, while for the three remaining dates, the PlanetScope system achieved better results. The highest F1-score was achieved with the PlanetScope data for the image from October (0.87), and the best result for Sentinel-2 was only slightly worse for the image from August (0.86). The lowest values for both sensors were observed for June, with PlanetScope scoring 0.71 and Sentinel-2 scoring 0.68. Both sensors demonstrated low user’s accuracy of less than 0.6 (Table 9).

Fig. 5

Graph of F1-score values for the RF classification of Solidago spp. based only on the corresponding spectral bands.

Table 9 Accuracy measures derived for the Solidago class in the random forest classification at different acquisition dates in strategy 2.

Feature importance

As part of the RF classification, Gini importance was calculated, expressing the mean decrease in impurity19. Such measures can be considered indications of the importance of the features used in the classification, though they should be interpreted with caution54. The Gini index was compared for classifications based on the spectral bands for each acquisition date. It was not analyzed for classifications using multitemporal data or spectral indices because of the extensive combination of classification features. Tables 10 and 11 present the relevance of particular bands of the Sentinel-2 and PlanetScope data, respectively, from the classifications in which all available spectral bands were used. For Sentinel-2 data, the SWIR bands B11 and B12 were consistently the most informative for all separate acquisition dates used in the study, followed by the NIR band (B8). Band B5, representing the red edge part of the electromagnetic spectrum, was the least important, considering the mean Gini index values from all images (dates). In the classifications based on PlanetScope imagery, the NIR band (B8) outperformed the other channels, whereas B1, the coastal blue band, was the least useful. In the second classification strategy (Tables 12 and 13), in which only the corresponding bands of both sensing systems were used, the NIR bands were found to be the most crucial for both EO datasets. The B3 band (green) of Sentinel-2 and B1 band of PlanetScope were the least informative.
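The quantity behind Gini importance can be shown with a minimal sketch of the impurity decrease at a single tree node. The importance of a feature is then obtained by accumulating such decreases over all splits that use the feature and averaging over the trees; that aggregation is omitted here. The function names and the class labels in the example are hypothetical.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels: 1 - sum(p_k ** 2)."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def impurity_decrease(parent, left, right):
    """Weighted drop in Gini impurity produced by one split."""
    n = len(parent)
    children = len(left) / n * gini(left) + len(right) / n * gini(right)
    return gini(parent) - children


# Hypothetical node: a threshold on one band separates the classes perfectly.
parent = ["solidago"] * 4 + ["forest"] * 4
decrease = impurity_decrease(parent, parent[:4], parent[4:])
```

A band that often produces large decreases, as the SWIR bands apparently do for Sentinel-2, ends up with a high mean value in the importance tables.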

Table 10 Classification feature relevance expressed by Gini importance for RF classifications using Sentinel-2 imagery. The highest mean value is bolded and underlined, while the lowest is in italics.
Table 11 Classification feature relevance expressed by Gini importance for RF classifications using PlanetScope imagery. The highest mean value is bolded and underlined, while the lowest is in italics.
Table 12 Classification feature relevance expressed by Gini importance for RF classifications using Sentinel-2 imagery based on spectral bands that correspond to PlanetScope data. The highest mean value is bolded and underlined, while the lowest is in italics.
Table 13 Classification feature relevance expressed by Gini importance for RF classifications using PlanetScope imagery based on spectral bands that correspond to Sentinel-2 data. The highest mean value is bolded and underlined, while the lowest is in italics.

OCSVM classification

The highest accuracy of the OCSVM classification, measured by F1-score, was 0.9 for Sentinel-2 and 0.91 for PlanetScope data (Table 14). For the Sentinel-2 data, the best accuracy was achieved for scenario 16, which utilized statistics from the spectral bands and vegetation indices across five dates. The lowest result for the Sentinel-2 data (F1-score of 0.7) was obtained for the September image combined with the VIs (scenario 8). The highest accuracy for PlanetScope data was obtained for the combined use of spectral bands from all dates and their derivatives (scenario 17) (Fig. 6). The lowest accuracy for PlanetScope was obtained with the spectral bands from the August image (0.55). With only a single Sentinel-2 image used, the highest F1-score of 0.86 was achieved from June data, where spectral bands were combined with vegetation indices (scenario 4). For PlanetScope, the best single-date result was obtained for the March image (0.81), based purely on the spectral bands.

Fig. 6 Graph illustrating F1-score values of the Solidago class for 17 classification scenarios using OCSVM classifier (strategy 1).

Table 14 Accuracy measures derived for the Solidago class for the OCSVM classification scenarios used in this study (strategy 1).

A comparison of both classifiers shows that RF enables more accurate classification than the OCSVM across the available datasets. The differences in the F1-score values for particular scenarios ranged from 1 to 28%. In all scenarios, the Sentinel-2 data yielded higher accuracies with the RF classifier. For PlanetScope, RF outperformed OCSVM in all scenarios except scenarios 3 and 4. For both classifiers, the Sentinel-2 image from September provided the weakest results. Similarly, for both algorithms, multitemporal datasets (scenarios 11–17) tended to yield more accurate results.
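As a reminder of what is being compared, the F1-score of the Solidago class is the harmonic mean of the user's accuracy (UA, precision) and the producer's accuracy (PA, recall). The sketch below computes it from confusion counts and expresses the gap between two F1-scores in percentage points. The counts are hypothetical; the two F1 values (0.98 for RF and 0.90 for OCSVM) are the best results reported in this study.

```python
def f1_score(tp, fp, fn):
    """F1-score of the target class from true positives, false positives and false negatives."""
    ua = tp / (tp + fp)  # user's accuracy (precision)
    pa = tp / (tp + fn)  # producer's accuracy (recall)
    return 2 * ua * pa / (ua + pa)


# Hypothetical confusion counts for the Solidago class.
f1 = f1_score(tp=90, fp=10, fn=30)

# Best F1-scores reported in the study for each classifier.
best_rf, best_ocsvm = 0.98, 0.90
diff_pp = round((best_rf - best_ocsvm) * 100)  # difference in percentage points

print(f"F1 = {f1:.3f}, RF minus OCSVM = {diff_pp} percentage points")
```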

Solidago maps

Classification scenarios that performed best for both classifiers and datasets were used to prepare maps of the extent of goldenrods within the KNP. Figure 7 shows the distribution of Solidago spp. derived from Sentinel-2 and PlanetScope imagery using the RF algorithm. The maps resulting from the RF classification presented a very similar distribution of Solidago spp. in both datasets, with patches of different sizes scattered across the analyzed area. Slightly greater fragmentation is visible in the map based on PlanetScope data (Fig. 7b), particularly in the western part, which may originate from the different classification results and pixel sizes of the datasets used. From the perspective of the entire study area, good differentiation was observed between Solidago spp. and forested areas (as seen on the orthophoto map in the background). Most Solidago spp. patches occur in grasslands or agricultural areas.

Fig. 7 Maps of the study area with classified extent of Solidago spp. using the RF classifier; (a) classification of Sentinel-2 data (scenario 17) and (b) classification of PlanetScope imagery (scenario 11). AOI 1 and AOI 2 indicate locations of subsets presented in Fig. 9.

The maps representing the extent of Solidago spp. derived from the OCSVM classifier and the two datasets (Fig. 8) were even more similar to each other than those derived from the RF classifier (Fig. 7). It can also be observed that the plant extent was larger in the OCSVM classification maps than in the RF maps. To some extent, this may result from the lower classification accuracy of the OCSVM results as compared to RF, with a slightly larger overestimation of the Solidago extent (Tables 8 and 14). Area computations support this observation. The area covered by Solidago spp. classified by OCSVM on Sentinel-2 data (9895 ha) was 2180 ha larger than that derived from RF. Similarly, the area estimated from PlanetScope imagery and OCSVM (10650 ha) exceeded the RF result by 4250 ha. In general, the distribution of Solidago spp. on the maps was similar between the classification algorithms and satellite datasets, with the invasive plants occurring predominantly in grassland or agricultural land. Figure 9 illustrates subsets of classification results for two areas (AOIs) for both satellite datasets and two classification algorithms. The illustrations present differences in the classification results and allow comparison with the orthophoto (Fig. 9a and f).
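The area figures above can be cross-checked with simple arithmetic. The sketch below derives the RF areas implied by the reported OCSVM areas and differences, and shows how a pixel count converts to hectares. The helper `area_ha` and the pixel counts are illustrative assumptions; the 10 m pixel size for Sentinel-2 is assumed here, while the 3 m PlanetScope resolution is the one stated in this paper.

```python
def area_ha(pixel_count, pixel_size_m):
    """Mapped area in hectares for square pixels of side `pixel_size_m` (1 ha = 10,000 m^2)."""
    return pixel_count * pixel_size_m ** 2 / 10_000


# Areas reported in the text (ha): OCSVM extent and its excess over the RF extent.
ocsvm_s2, excess_s2 = 9895, 2180   # Sentinel-2
ocsvm_ps, excess_ps = 10650, 4250  # PlanetScope

# RF areas implied by those figures.
rf_s2 = ocsvm_s2 - excess_s2
rf_ps = ocsvm_ps - excess_ps
print(f"Implied RF area: Sentinel-2 {rf_s2} ha, PlanetScope {rf_ps} ha")

# Hypothetical pixel counts, only to illustrate the unit conversion.
print(area_ha(1_000_000, 10))  # one million 10 m pixels
print(area_ha(1_000, 3))       # one thousand 3 m pixels
```

Pixel-counted areas of this kind inherit the commission and omission errors of the map, so they describe the mapped extent rather than an unbiased estimate of the invaded area.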

Fig. 8 Maps of the study area with classified extent of Solidago spp. with OCSVM classifier; (a) classification of Sentinel-2 data (scenario 16) and (b) classification of PlanetScope data (scenario 17).

Fig. 9 Subsets of Solidago spp. classification in two locations (AOI 1 and AOI 2 indicated in Fig. 7) representing results from different classifiers and datasets; (a, f) orthophotomap from Geoportal WMS service of the selected AOI 1 and AOI 2 (also used as a background in all subset maps); (b) classification of Sentinel-2 data with RF algorithm in AOI 1; (c) classification of Sentinel-2 data with OCSVM algorithm in AOI 1; (d) classification of PlanetScope data with RF algorithm in AOI 1; (e) classification of PlanetScope data with OCSVM algorithm in AOI 1; (g) classification of Sentinel-2 data with RF algorithm in AOI 2; (h) classification of Sentinel-2 data with OCSVM algorithm in AOI 2; (i) classification of PlanetScope data with RF algorithm in AOI 2; (j) classification of PlanetScope data with OCSVM algorithm in AOI 2.

Discussion

This study demonstrates the applicability of remote sensing and machine learning techniques for detecting and monitoring Solidago spp. in a real-world conservation setting. Unlike many previous studies that focus on algorithm performance in controlled or small-scale settings, our research evaluated classification accuracy for a well-established IAS across a large, heterogeneous protected area. Using operationally available multitemporal satellite data, we integrated phenological insights with multi-sensor observations to map Solidago spp. within KNP, Poland, a landscape experiencing increasing invasion pressure. The results confirmed that the RF classifier outperformed OCSVM across most scenarios, particularly when multitemporal data were included.

RF’s ability to handle high-dimensional input and minimize overfitting has been demonstrated in many remote sensing applications18,19, and this study confirms its strength in the context of seasonal mapping of invasive plants. In contrast, the OCSVM classifier showed greater variability and tended to overestimate the presence of Solidago spp., especially when based on early- or mid-season data. This is consistent with known limitations of one-class classifiers, which lack information on background heterogeneity20,21. The analysis also provides practical insight into the relative utility of Sentinel-2 and PlanetScope imagery. Sentinel-2 consistently achieved higher classification accuracy, particularly in multitemporal scenarios, and its broader spectral range (SWIR bands) proved valuable for detecting phenological shifts in Solidago spp. populations. Bands B11 and B12 were repeatedly identified as key features in RF classifications. These results support some previous work55, but our study adds the important observation that these spectral advantages translate into measurable differences in seasonal detection performance, especially in autumn. Importantly, Sentinel-2 is freely available, making it a more feasible option for national-scale IAS monitoring. Although PlanetScope data produced slightly lower accuracies, its finer spatial resolution (3 m) allowed for more detailed detection of fragmented or narrow Solidago patches that might be missed at Sentinel-2’s coarser resolution. This has implications for targeted control efforts or site-scale habitat restoration. However, PlanetScope’s commercial licensing presents a significant barrier to wide-scale use, particularly for public agencies or conservation non-governmental organizations with limited budgets. These findings highlight that the practical trade-off between spatial detail and cost must be considered, and that Sentinel-2 data alone may be sufficient for most management applications when classification is optimized with temporal and spectral information.

One of the novel contributions of this study is the detailed assessment of phenological timing in classification accuracy. While phenology is frequently cited as important in vegetation studies9,56, our work quantifies how seasonal differences affect the spectral separability of Solidago spp. The highest classification accuracy was achieved using autumn imagery (October–November), surpassing even the flowering period in August–September. This outcome contradicts some earlier findings13,16 that emphasized the blooming phase as the optimal detection window. Instead, our results suggest that the senescent stage, when seed heads are prominent and co-occurring vegetation is declining, offers better spectral distinction for Solidago. These findings provide operational guidance on when to acquire imagery for mapping this species. Additionally, March and June images yielded high classification accuracy under some scenarios. Early spring imagery likely captured the dry biomass from the previous season, which remains visible before native vegetation regenerates. Meanwhile, June corresponds to a rapid biomass accumulation phase of Solidago spp., producing strong spectral signals in the NIR and red-edge bands. This finding underscores the importance of understanding species-specific growth stages rather than assuming that general vegetation vigor (e.g., NDVI peaks) aligns with optimal detection. Future monitoring efforts would benefit from aligning image acquisition with known phenological milestones of target IAS.

Another important finding concerns the relative utility of vegetation indices. While indices like NDVI and NDRE are widely used to enhance vegetation classification, their addition in this study did not consistently improve classification accuracy. In fact, several scenarios showed slightly lower performance when VIs were included, compared to using raw spectral bands alone. This suggests that the spectral signature of Solidago spp., especially when tracked across multiple dates, is already sufficiently distinct, and that the computational effort of calculating additional indices may be unnecessary in this case. These results align with recent observations that spectral richness and multitemporal statistics can outperform indices under certain conditions17. The visual comparison of classification maps between sensors and classifiers revealed that both systems captured a similar spatial pattern of Solidago spp. distribution, although PlanetScope-based classifications showed slightly more fragmentation. OCSVM-derived maps tended to overestimate species extent, highlighting the need for careful interpretation when using one-class methods without extensive field validation. While pixel-level accuracy metrics (F1-score, UA, PA) are commonly used, future work could consider incorporating area-adjusted accuracy assessments57,58 to better estimate the actual extent of invasion. This is especially relevant when classification outputs are used to quantify infested area for planning eradication or restoration measures.
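To illustrate what an area-adjusted assessment involves, the sketch below applies a stratified estimator of the kind commonly used for this purpose: the reference-class proportions observed in each map stratum are weighted by that stratum's share of the mapped area. All numbers are hypothetical, the function name `adjusted_area` is ours, and whether the cited works use exactly this form is an assumption.

```python
def adjusted_area(mapped_area_ha, confusion, target):
    """Error-adjusted area of `target` from a stratified reference sample.

    mapped_area_ha: {map_class: mapped area in ha}
    confusion:      {map_class: {reference_class: number of sample units}}
    """
    total_area = sum(mapped_area_ha.values())
    proportion = 0.0
    for map_class, area in mapped_area_ha.items():
        row = confusion[map_class]
        n_i = sum(row.values())          # sample units drawn in this map stratum
        weight = area / total_area       # stratum share of the mapped area
        proportion += weight * row.get(target, 0) / n_i
    return total_area * proportion


# Hypothetical map: 200 ha mapped as Solidago, 800 ha as other classes.
mapped = {"solidago": 200.0, "other": 800.0}
# Hypothetical reference sample of 100 units per map stratum.
sample = {
    "solidago": {"solidago": 90, "other": 10},  # 10 commission errors
    "other": {"solidago": 5, "other": 95},      # 5 omission errors
}

estimate = adjusted_area(mapped, sample, "solidago")
print(f"Mapped: {mapped['solidago']:.0f} ha, adjusted: {estimate:.0f} ha")
```

In this hypothetical case the adjusted area exceeds the mapped area because omissions in the large "other" stratum outweigh commissions in the small Solidago stratum; with an overestimating classifier the adjustment would go the other way.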

It is worth noting that other open satellite data sources, such as Sentinel-1, provide a different type of information (Synthetic Aperture Radar) about objects on the Earth’s surface. These data could potentially enhance the performance of algorithms for Solidago spp. detection by contributing complementary features. However, verifying this hypothesis requires further research, and thus represents a promising direction for developing more advanced and accurate goldenrod classification methods. Notably, our literature review did not identify studies that have explored the use of Sentinel-1 data for Solidago spp. detection.

Practical recommendations

The results have broader implications for conservation planning and invasive species management. First, the study confirms that combining multitemporal satellite imagery with machine learning provides a viable, scalable approach to IAS monitoring. In areas like KNP, where goldenrods are widespread and expanding, timely and accurate maps are essential for prioritizing management actions. Second, the identification of optimal detection periods enables more efficient monitoring campaigns by aligning image acquisition with phenological windows. Finally, the methods tested here should be transferable to other invasive plant species with similar growth dynamics, especially those dominating disturbed or abandoned land. In conclusion, this study provides a practical and ecologically relevant framework for detecting and monitoring Solidago spp. using remote sensing and machine learning. The integration of phenological knowledge, multitemporal imagery, and sensor comparison advances the field beyond algorithm benchmarking toward real-world application. The findings can support environmental agencies and land managers in making informed decisions about where and when to act against Solidago invasions, while also contributing to the development of more effective remote sensing tools for biodiversity conservation.

Based on our findings, we recommend using multitemporal Sentinel-2 data (in our study, 5 images from March to November) for Solidago spp. mapping, especially for large-scale monitoring. For finer-scale early detection or site-specific interventions, PlanetScope data may offer additional spatial detail; however, its cost limits its applicability for broad-scale use. The highest classification accuracy was achieved when training was performed on stratified samples representing various land cover and land use classes. However, acquiring such comprehensive reference data is often time- and cost-intensive. If only Solidago spp. samples are available, a decrease in classification accuracy can be expected, from an F1-score of 0.98 (using a Random Forest classifier) to 0.90 (using a One-Class SVM classifier). When using only a single image, the maximum achievable F1-scores were 0.90 (RF) and 0.85 (OCSVM), respectively.

Our results indicate that the timing of image acquisition throughout the season plays a more critical role in detection performance than the use of vegetation indices. These insights can support protected area managers, ecologists, and policymakers in designing cost-effective and timely invasive species monitoring strategies.

Conclusion

This study demonstrates the practical potential of combining satellite-based remote sensing and machine learning techniques to support the monitoring of invasive goldenrod (Solidago spp.) in Central Europe. By integrating multitemporal imagery from Sentinel-2 and PlanetScope with two classification approaches, RF and OCSVM, we provide a comparative assessment of methods suited to large-area and fine-scale detection of this ecologically impactful species. Our findings highlight the importance of selecting appropriate phenological windows for image acquisition. Contrary to common assumptions, autumn (October–November) emerged as the most effective period for Solidago detection, likely due to increased spectral contrast with native vegetation. While PlanetScope’s higher spatial resolution improved the detail of local mapping, Sentinel-2 consistently delivered superior classification accuracy, especially when using its shortwave infrared bands and multitemporal data, making it a cost-effective tool for regional monitoring. Among classifiers, RF outperformed OCSVM in nearly all scenarios, offering greater stability and accuracy when sufficient reference data were available. OCSVM, while useful in data-limited contexts, tended to overestimate species presence and requires careful tuning. From a management perspective, this study supports the operational use of Sentinel-2 for broad-scale Solidago monitoring and suggests that combining spatial and temporal features can improve detection outcomes without relying heavily on complex vegetation indices. These results can inform the design of early-warning systems and targeted control efforts in protected and invaded areas. The approach might be adaptable to other invasive plant species with distinct seasonal growth patterns, contributing to scalable, satellite-based solutions for biodiversity conservation and ecological restoration.