Introduction

The Southern Ocean (SO) is one of the most remote regions of the planet and has a pivotal role in the global climate system1. It is responsible for up to 40–50% of the oceanic uptake of carbon dioxide (CO2)2,3 and up to 75% of the uptake of excess heat4,5. The SO also supports global marine productivity by transporting nutrient-rich deep waters, known as Antarctic Bottom Waters (AABW)6,7, to the surface, which are subsequently exported to the lower latitudes, promoting primary productivity8,9 in those areas. The dynamics of this polar ecosystem are dominated by the Antarctic Circumpolar Current (ACC) and its frontal systems, which serve as transition zones where distinct water masses converge and diverge, resulting in notable changes in temperature and salinity6,10. The Southern Ocean gyres, created by forces arising from the Earth’s rotation and global wind patterns7,11, are another key circulation feature in this region. These gyres, located in the Ross Sea and in the Weddell Sea, act as important dynamic links between the ACC and the Antarctic coastline, facilitating the exchange of water masses7,11 and contributing to the thermohaline circulation11. By redistributing cold, dense waters from polar regions to the global abyssal ocean, the gyres regulate the flow of relatively warmer ACC waters towards the Antarctic margins while ensuring the movement of colder, denser waters away from the coast. This mechanism also enables the transport of AABW away from its formation areas, integrating them into the broader Southern Ocean overturning circulation and contributing to global deep-water ventilation7.

In addition to its circulation patterns, the SO is shaped by a pronounced polar seasonality, characterised by large changes in sea ice coverage, wind intensity and temperature. During the austral winter, a sharp decrease in irradiance and water temperature occurs, accompanied by increased mixed layer depth, nutrient concentration and sea ice extent6, making it the least productive season, whereas the austral summer is characterised by a surge in primary production. However, phytoplankton growth during the austral summer can still be limited by the availability of iron, a crucial micronutrient required for their development6,12. The highest marine primary production occurs in shallow coastal waters, driven by increased light availability and the input of dissolved iron from melting ice during spring-summer6,13. The interplay of seasonal factors in the SO, such as sea ice extent, wind patterns, irradiance, and temperature, can also be highly variable across different regions (e.g., Weddell Sea, Ross Sea, Amundsen Sea). These regional differences contribute to the formation of distinct biogeographic zones, each characterised by unique physical, chemical, and biological properties6.

Among these regions, the Ross Sea stands out. Located in the Western Pacific sector of the SO (longitude: 120°E to 120°W and latitude: 60°S to 85°S; Fig. 1)6,10,14,15, it has the largest continental shelf in Antarctica, exhibits high phytoplankton biomass, and includes diverse subsystems, such as polynyas and marginal ice zones. This region’s physical and chemical characteristics are strongly influenced by katabatic winds and cyclones or low-pressure systems11,16,17. Katabatic winds are downslope winds that descend from the Transantarctic Mountains across the southern and western edges of the Ross Ice Shelf and the Victoria Land coast, flowing down to the coastal ocean, while cyclones make the Ross Sea one of the most active cyclogenetic regions in the world11,16,17. A clear example of the joint influence of these elements is the occurrence of the Terra Nova Bay polynya (TNBP) and the Ross Sea polynya (RSP) (Fig. 1), near the central Ross Ice Shelf11. Polynyas are ice-free areas surrounded by sea ice, formed by strong winds pushing sea ice away and exposing open water18. These areas play a crucial role in bottom water formation11 by facilitating the rapid formation of sea ice, which increases salinity and water density, causing the surface waters to sink and contributing to global ocean circulation19,20.

Fig. 1: Location of the Ross Sea, highlighted by a red rectangle, within the context of the Southern Ocean.

The left plot presents the Southern Ocean. The right plot presents a close-up of the Ross Sea, detailing key regional features, including the Ross Sea Polynya (RSP) and Terra Nova Bay Polynya (TNBP). Both plots also showcase the bathymetry (depth in metres).

In recent decades, the Ross Sea has experienced physical and chemical changes, including a shallowing of the mixed layer21,22, a rise in atmospheric temperature22,23 and a decrease in average sea ice coverage in the Ross Sea polynya22,24. These changes have profound effects on the region’s biological communities, including phytoplankton25.

Phytoplankton communities are fundamental to the marine ecosystem of the SO, forming the foundation of the Antarctic food web. They are a primary food source for Antarctic krill (Euphausia superba), a keystone species in the region26,27. Moreover, phytoplankton play a key role in the global carbon cycle. By fixing large amounts of CO2, they contribute to the sequestration of carbon as organic matter sinks to the ocean floor as “marine snow”, trapping CO2 in the deep ocean for extended periods of time27,28,29. This process not only helps regulate atmospheric CO2 levels but also sustains marine ecosystems by providing a continuous supply of organic material. Phytoplankton are widely distributed and abundant across the Ross Sea, yet their distribution is highly influenced by changes in their physical and chemical environment6,22,28. For example, in the oceanic/offshore waters of the Ross Sea (north of 65°S), phytoplankton abundance is generally low and varies depending on vertical mixing, sea ice coverage, and iron availability11,18. In contrast, the continental shelf is characterized by high phytoplankton production, as macronutrients (nitrate, phosphate, and silicate) are rarely depleted from surface water during the growing season11. Nevertheless, phytoplankton growth can still be limited by certain conditions, such as iron depletion, low temperature, and suboptimal irradiance11.

Understanding the dynamics of phytoplankton communities is crucial for assessing ecosystem productivity. To study these communities, both in-situ and remote sensing methodologies can be used. In-situ sampling in remote areas such as the Ross Sea is mostly limited to research expeditions. Weather conditions including strong winds, heavy snowfall, and extreme cold might impede equipment deployment, jeopardize researcher safety, and limit the duration and frequency of data collection. Consequently, these limitations frequently result in data gaps, making it difficult to obtain continuous and comprehensive datasets, which are essential to understand the dynamics of phytoplankton communities30,31.

In contrast, ocean colour remote sensing offers substantial advantages, providing continuous monitoring with broad temporal and spatial coverage. As such, it is a cost-effective and efficient tool for analysing phytoplankton biomass patterns, promoting a better understanding of ocean ecology32,33. Additionally, remote sensing facilitates long-term analysis (e.g., 20+ years), helping address the gaps inherent to in-situ sampling. However, ocean colour remote sensing data in the SO are largely limited to the months between October and April, when sunlight is more prevalent in this region, and can underestimate local chlorophyll a (chl-a) concentration, presenting challenges to data accuracy22,34.

Despite these challenges, remote sensing studies have already revealed valuable insights into the dynamics and phenology of the phytoplankton blooms in the Ross Sea, including their seasonal and spatial patterns in polynya areas, as well as their strong dependence on sea ice conditions22,25. These studies observed that phytoplankton dynamics in the Ross Sea Polynya (RSP) are heavily influenced by atmospheric circulation and sea ice decline, suggesting a potentially greater biological carbon uptake than previously estimated. Such findings emphasise the importance of conducting further research into phytoplankton dynamics throughout the Ross Sea and their important role in the carbon cycle under the threat of climate change. However, most studies are restricted to limited timeframes or specific regions like the RSP, overlooking the broader spatial and temporal variability across the entire Ross Sea, thereby limiting our understanding of long-term trends and interannual variability.

By analysing the full spatial coverage of the Ross Sea over a longer period (1998–2021) of remote sensing data, this study addresses critical knowledge gaps by providing a more comprehensive understanding of phytoplankton dynamics across the region. The primary goal of this research was to further understand how regional phytoplankton biomass has changed over the past decades, using remote sensing data from 1998 to 2021 (nearly 24 years of data). To achieve this goal, three specific objectives were delineated: (i) investigate the spatial-temporal variation of chl-a concentrations; (ii) analyse phytoplankton bloom phenology changes over the 24 years; (iii) assess how the abiotic parameters influenced chl-a variability.

This study highlights the complex interplay of environmental factors influencing phytoplankton dynamics (including its phenology) in the Ross Sea region. This approach, which distinguishes between oceanic, intermediate and coastal zones, provides a framework for understanding regional variations in productivity. This study offers a valuable perspective for comparing contrasting Antarctic environments and contributes to our understanding of how changes in the Southern Ocean impact global ocean processes.

Results

Spatial and temporal variation of phytoplankton biomass

Chl-a concentrations in the Ross Sea (Fig. 2a) ranged from less than 0.3 mg m⁻³ in low-productivity oceanic regions to over 3 mg m⁻³ in highly productive coastal regions. The most productive areas were located near the coastal zones of Victoria Land and Cape Colbeck. In contrast, the lowest chl-a concentrations were observed in the most offshore waters around 60°S. Notably, in the region between ~65°S and 73°S, the chl-a concentrations were highly variable, with values around 1 mg m⁻³.

Fig. 2: Satellite-based chlorophyll-a (chl-a) data for the Ross Sea during the growing season (September to April) from 1998 to 2022.

a Average chl-a concentration (mg m⁻³), where cooler colours indicate lower concentrations, and warmer colours represent higher concentrations. Key geographical locations along the coast, including Cape Adare, Terra Nova Bay, Victoria Land, the Ross Ice Shelf, and Cape Colbeck, are highlighted. b Pixel-wise chl-a linear trend (mg m⁻³ year⁻¹) over the same period, showing the annual change in biomass. Positive values indicate an increase, while negative values indicate a decrease (considering only values with a significant p-value (p < 0.05)).

The overall chl-a trend from 1998 to 2022 was not uniform across the Ross Sea (Fig. 2b), ranging from −0.100 mg m⁻³ year⁻¹ (indicating a decrease in chl-a over time) to 0.100 mg m⁻³ year⁻¹ (indicating an increase in chl-a). The most significant increasing trends were observed in areas near the coastline, where polynyas are typically located, and off the Balleny Islands (longitude: 160°E to 165°W; latitude: 66°S to 67°S). Significant positive trends were also observed in the western offshore area around a latitude of 65°S. Conversely, declining chl-a trends were observed in the offshore region further north of the Balleny Islands and in coastal waters near Cape Adare and Cape Colbeck. In the transition zone, between the most coastal and offshore waters, no significant trend was detected, which might be related to insufficient data in this region.
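The pixel-wise trend described above (a linear regression slope per pixel, retained only where p < 0.05) can be illustrated with a minimal sketch. The function names `linear_trend` and `trend_map` are hypothetical, and the significance test is an assumption of this sketch: instead of computing a p-value, it compares the slope's t statistic against the two-sided 5% critical value of Student's t for n − 2 degrees of freedom (2.074 for 24 seasons), which is an equivalent decision rule for a fixed series length.

```python
import math


def linear_trend(years, values, t_crit=2.074):
    """Least-squares slope of values against years, plus a significance flag.

    t_crit is the two-sided 5% critical value of Student's t for n - 2
    degrees of freedom (2.074 for n = 24); it is an assumption of this
    sketch and must be changed if the series length changes.
    """
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Sum of squared residuals around the fitted line.
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(years, values))
    if sse == 0.0:
        # Perfect fit: any non-zero slope is trivially significant.
        return slope, slope != 0.0
    std_err = math.sqrt(sse / (n - 2) / sxx)
    return slope, abs(slope / std_err) > t_crit


def trend_map(years, pixel_series):
    """Return one slope per pixel, or None where the trend is not significant."""
    result = []
    for series in pixel_series:
        slope, significant = linear_trend(years, series)
        result.append(slope if significant else None)
    return result
```

Here each entry of `pixel_series` would hold one growing-season mean chl-a value per year for a single pixel. Pixels returning `None` correspond to areas left blank in a significance-masked trend map.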

Trends in abiotic variables

The analysis of the linear trends between 1998 and 2021 for the abiotic variables revealed distinct spatial patterns. In particular, sea ice concentration, Sea Surface Temperature (SST), mixed layer depth, and sea surface salinity exhibited areas with statistically significant trends.

The trend in sea ice concentration (Fig. 3a) was predominantly negative (up to −1.5% year⁻¹), indicating a decrease in sea ice cover. However, a few very limited areas also exhibited a positive trend (up to 1.5% year⁻¹), suggesting localized increases in sea ice coverage in the RSP and off Cape Colbeck.

Fig. 3: Pixel-wise linear trends from 1998 to 2021 for different abiotic variables.

a Sea ice concentration (% year⁻¹); b Sea surface temperature (SST, °C year⁻¹); c Mixed layer depth (m year⁻¹); d Sea surface salinity (year⁻¹) in the Ross Sea. The trends were calculated using the slope of the linear regression over time, considering only values with a significant p-value (p < 0.05). For each year, only the growing cycle (September–April) was considered. Trends greater than 0 indicate a positive trend (increase in the abiotic variable), while trends below 0 indicate a negative trend (decrease in the abiotic variable).

SST (Fig. 3b) showed a predominantly positive trend (up to 0.06 °C year⁻¹), indicating an increase in ocean temperatures, especially around TNBP, off Cape Colbeck and in offshore waters north of 65°S. The trend analysis for mixed layer depth (Fig. 3c) revealed a spatially heterogeneous pattern, ranging from −3 m year⁻¹ (indicating a shallowing of the mixed layer) to 3 m year⁻¹ (indicating deepening). The negative trends were observed in the offshore area north of 65°S, as well as near Cape Adare and Cape Colbeck. In contrast, the positive trends were found in the transitional zone around 65°S, as well as in the RSP and above Cape Colbeck, north of 75°S.

Finally, the trend in sea surface salinity (Fig. 3d) was primarily negative, with salinity declining at a rate of up to 0.02 year⁻¹, indicating freshening of surface waters across the Ross Sea.

Phytoplankton bloom phenology

In the offshore regions around 60°S-65°S, the Bloom Start (BStart; Fig. 4a) mostly occurred in November and ended (BEnd, Fig. 4b) around January, with a duration of ~8–10 weeks (BDuration; Fig. 4c), exhibiting a bloom area (BArea; Fig. 4d) of 4–6 mg m⁻³. An intermediate area, around 65°S to 70°S, was characterised by lower biomass, associated with short and late blooms that started in January and ended in February-March, with a duration of 6–8 weeks. The coastal Ross Sea could be divided into two areas: (i) the area off Victoria Land that exhibited phytoplankton blooms that began in January and ended in March, with a duration of 6 weeks, and (ii) the area in the Ross Sea polynya that exhibited phytoplankton blooms that started earlier, in November-December, with a duration of 8–10 weeks. In addition, coastal blooms in the Ross Sea accumulated high biomasses, reaching up to 14 mg m⁻³.

Fig. 4: Average values of annual phenological metrics obtained for the growing cycle (September to April) from 1998 to 2022 in the Ross Sea.

a Bloom Start (BStart; week of the year; week when bloom begins); b Bloom End (BEnd; week of the year; week when bloom ends); c Bloom Duration (BDuration; number of weeks; difference between the end and beginning of bloom); d Bloom Biomass (BArea; chl-a (mg m⁻³); accumulated production of the main bloom). Although the data in (a) and (b) are presented on a weekly scale, the legend displays months for reference. Different colours represent different months. The methodology used to determine these metrics is described in Section “Phytoplankton phenology metrics”.
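As an illustration of how the four metrics in this caption relate to one another, the following minimal sketch derives them from one season of weekly chl-a values. The bloom criterion used here (chl-a above the seasonal median times 1.05) is an assumption chosen for the example and is not necessarily the criterion used in this study; the function name `bloom_metrics` and the parameter `threshold_factor` are likewise hypothetical. Weeks are returned as positions within the season rather than as weeks of the year.

```python
from statistics import median


def bloom_metrics(weekly_chl, threshold_factor=1.05):
    """Phenology metrics of the main bloom in one growing season.

    weekly_chl holds one chl-a value (mg m^-3) per week. The threshold
    (seasonal median times threshold_factor) is an assumption of this
    sketch, not the study's documented criterion.
    """
    threshold = median(weekly_chl) * threshold_factor
    # The main bloom is taken to be the one containing the seasonal maximum.
    peak = max(range(len(weekly_chl)), key=lambda i: weekly_chl[i])
    if weekly_chl[peak] <= threshold:
        return None  # no week rises above the threshold
    # Walk outwards from the peak while chl-a stays above the threshold.
    start = peak
    while start > 0 and weekly_chl[start - 1] > threshold:
        start -= 1
    end = peak
    while end < len(weekly_chl) - 1 and weekly_chl[end + 1] > threshold:
        end += 1
    return {
        "BStart": start,                          # first bloom week (index in season)
        "BEnd": end,                              # last bloom week (index in season)
        "BDuration": end - start,                 # difference between end and start
        "BArea": sum(weekly_chl[start:end + 1]),  # chl-a accumulated over the bloom
    }
```

With this definition BDuration is the difference between BEnd and BStart, as stated in the caption, and BArea sums the weekly chl-a values between the two.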

The results of the seasonal cycle reproducibility (SCR) (Fig. 5) were classified into three categories: low SCR, ranging from 0 to 40% (indicating that cycles over the years were not similar to the average cycle); medium SCR, ranging from 40 to 70% (indicating that cycles over the years were somewhat similar to the average cycle); and high SCR, defined as >70% (indicating that cycles were similar to the average cycle). The regions with a high SCR included the northwesternmost oceanic waters (around 60°S, eastern), the coastal areas off Cape Adare and Cape Colbeck, the Ross Sea polynya, and the Balleny Islands. Medium SCR values were mainly observed in the Ross Sea polynya region and in the oceanic areas in the northeastern section of the Ross Sea (around 60°S, western). Low SCR values were primarily observed in the Intermediate region (around 63°S to 70°S) of the Ross Sea and above the Ross Sea polynya.
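The classification above can be sketched in a few lines. The exact SCR formulation is not reproduced in this section, so the computation below is an assumption: SCR is taken as the squared Pearson correlation, in percent, between the full multi-year series and the climatological cycle repeated once per year. Only the 40% and 70% class boundaries come from the text; the function names `pearson`, `scr_percent` and `scr_class` are hypothetical.

```python
import math


def pearson(a, b):
    """Pearson correlation coefficient; returns 0.0 if either series is flat."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0.0 or var_b == 0.0:
        return 0.0  # a flat cycle carries no seasonal signal
    return cov / math.sqrt(var_a * var_b)


def scr_percent(yearly_cycles):
    """SCR (%) for one pixel, given one equally long chl-a cycle per year."""
    n_years = len(yearly_cycles)
    # Climatological cycle: mean across years for each week of the season.
    climatology = [sum(week) / n_years for week in zip(*yearly_cycles)]
    full_series = [value for cycle in yearly_cycles for value in cycle]
    r = pearson(full_series, climatology * n_years)
    return 100.0 * r * r


def scr_class(scr):
    """Map an SCR percentage to the three classes used in the text."""
    if scr < 40.0:
        return "low"
    if scr <= 70.0:
        return "medium"
    return "high"
```

Under this assumed definition, a pixel whose yearly cycles all match the climatology scores 100%, while a pixel whose years cancel each other out scores 0%.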

Fig. 5: Reproducibility of the annual seasonal cycle of chl-a (SCR, in %) as an indicator of interannual variability between 1998 and 2022 (based on Thomalla et al., 2023).

The SCR varies between 0 and 40% (yearly cycles that are not very similar to the climatological cycle); 40–70% (yearly cycles that are somewhat similar to the climatological cycle) and 70–100% (yearly cycles that are similar to the climatological cycle). The phytoplankton growing season (September to April), during the period 1998–2022, was considered for this study.

Regional patterns of bloom phenology

Using cluster analysis, three distinct phenoregions were identified (Fig. 6): the Oceanic phenoregion located in the offshore area north of 65°S; the coastal phenoregion, covering coastal areas around 75°S and the region near the Balleny Islands; and the intermediate phenoregion, representing the transition zone between the oceanic and coastal regions, extending approximately from 65°S to 75°S.
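The text does not state which clustering algorithm was used, so the following is only a sketch of one common option: plain k-means (Lloyd iterations) applied to each pixel's climatological chl-a cycle, with k = 3 to mirror the three phenoregions. The function names `nearest` and `kmeans_cycles`, the deterministic initialisation and the choice of k-means itself are assumptions made for illustration.

```python
def nearest(cycle, centroids):
    """Index of the centroid closest to cycle (squared Euclidean distance)."""
    dists = [sum((a - b) ** 2 for a, b in zip(cycle, c)) for c in centroids]
    return dists.index(min(dists))


def kmeans_cycles(cycles, k=3, max_iter=50):
    """Group seasonal cycles into k clusters (k >= 2) with Lloyd iterations."""
    # Deterministic start: spread the initial centroids across the range of
    # mean chl-a, from the least to the most productive pixel.
    ranked = sorted(cycles, key=lambda c: sum(c) / len(c))
    centroids = [list(ranked[i * (len(ranked) - 1) // (k - 1)]) for i in range(k)]
    labels = None
    for _ in range(max_iter):
        # Assignment step: attach every pixel to its nearest centroid.
        new_labels = [nearest(c, centroids) for c in cycles]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: move each centroid to the mean cycle of its members.
        for j in range(k):
            members = [c for c, lab in zip(cycles, labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids
```

Each returned centroid is the mean seasonal cycle of its cluster, which is the kind of curve one would compare against the per-phenoregion climatological cycles.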

Fig. 6: Phenoregions identified through cluster analysis, with the respective climatological cycle.

a Graphical representation of the location of each of the phenoregions created: Oce (Oceanic, represented in blue), Int (Intermediate, represented in green), and Coa (Coastal, represented in orange). The graphs on the right show the average seasonal chlorophyll-a cycle between 1998 and 2021 for each phenoregion: b Oceanic; c Intermediate; d Coastal. In panels (b–d), the solid line represents the average concentration of chl-a (mg m⁻³), and the shaded area represents the range between the 10th and 90th percentiles of chl-a concentrations.

The Oceanic phenoregion was located north of 65°S. This region was the least productive, with a mean chl-a concentration (BMean) of 0.26 mg m⁻³ (Table 1). Phytoplankton blooms in this region presented very similar bloom dynamics throughout the years, characterised by one well-defined peak, with higher chl-a concentrations from November to January (Supplementary Fig. 1a).

Table 1 Descriptive statistical metrics and linear trend results for each phenology metric (see Table 4 for full names and description) and phenoregion

In the Intermediate phenoregion, the temporal variation of chl-a reflected the influence of adjacent phenoregions (oceanic and coastal). It had a mean chl-a concentration of 0.44 mg m⁻³ (Table 1), and the blooms exhibited high interannual variability, occurring mostly from December to March, as in the previous region (Fig. 6; Supplementary Fig. 1b).

Lastly, the Coastal phenoregion was the most productive (BMean of 1.44 mg m⁻³ and a BPeak of ~3 mg m⁻³; Table 1) and included waters off the Ross Ice Shelf (the Ross Sea polynya, Terra Nova Bay polynya, Victoria Land, and Cape Colbeck) and the Balleny Islands (Fig. 6; Table 1). Phytoplankton blooms in this region exhibited a highly variable bloom start throughout the years (Supplementary Fig. 1c). Overall, the main bloom occurred from November-December to January-February. When the bloom started later (December or later), presenting a later BPeak, the bloom productivity was typically lower.

Drivers of bloom phenology

The Random Forest (RF) models were used in the three phenological regions and served to identify the main environmental factors behind the interannual variability in phytoplankton biomass (chl-a) and the bloom phenology metrics (BStart, BEnd, BDuration, BArea), which were considered the most relevant metrics. The explanatory power (R²) between the different models created for the different regions ranged from 59% to 75% (Table 2).

Table 2 Random forest model results for each phenoregion (Oceanic, Intermediate, Coastal)

In the Oceanic region, the RF models for chl-a were not able to identify one single factor as its most important driver. The first three drivers, i.e. wind direction, seawater current direction and SST (Supplementary Fig. 2), had similar importance in this analysis. The same was true for BArea, with the three main drivers being SST, seawater current direction and wind direction (Supplementary Fig. 2). For the other metrics, it was possible to identify the most important driver, since one variable had a markedly higher importance. For BEnd, the main driver was seawater current direction, while for BStart and BDuration it was wind speed. In this region, when the wind speed was lower, the bloom started earlier. Blooms that ended earlier were more associated with south-westward seawater currents (Supplementary Figs. 2 and 5).

For the Intermediate region, RF models indicated that sea ice concentration was the main driver of the chl-a model, suggesting that when the sea ice concentration was lower, the chl-a was higher. For the phenology metrics, the main predictors were seawater surface salinity for BStart and sea ice concentration for BEnd. In this area, when the sea surface salinity was high, the bloom started later, and with higher sea ice concentration, the bloom ended later (Supplementary Figs. 3 and 5). For the BDuration and BArea metrics, the RF models were not able to identify one main driver, due to their very similar importance (Supplementary Fig. 3). The BDuration was found to be influenced by seawater current direction, sea ice concentration and wind direction, while BArea was found to be influenced by seawater current direction and sea ice concentration.

The Coastal region was also a very dynamic area. The main driver for chl-a was the seawater current velocity; when the seawater current velocity was higher, the chl-a was also higher. The RF’s most frequent predictors of the phenology metrics were wind speed (for BEnd and BDuration) and sea ice concentration (for BStart and BDuration). When the sea ice concentration was lower, the bloom started earlier, and if the wind speed was higher, the bloom tended to end later (Supplementary Figs. 4 and 5).

Discussion

Spatial variation of phytoplankton

The spatial distribution of phytoplankton biomass in the Ross Sea exhibited high variability within this vast and complex ecosystem (Fig. 2a). To better understand this variability, we have divided the Ross Sea into three distinct regions: Oceanic (northern offshore region); Intermediate (between offshore and coastal zones) and Coastal (enclosing the most southern waters).

The Oceanic region, spanning offshore areas north of 65°S, exhibited the lowest chl-a concentrations. This is consistent with previous results13, although our observed maximum chl-a concentration (0.4 mg m⁻³) was lower than their reported values. This discrepancy likely arises from differences in the chl-a products used, as the earlier study13 used chl-a derived from a 3D NPZD model, illustrating the challenge of directly comparing studies in such complex regions. Nonetheless, both studies confirm that this region consistently exhibits the lowest chl-a concentrations within the Ross Sea. Despite favourable light conditions due to minimal sea ice cover (Supplementary Fig. 6c), phytoplankton biomass remains limited by this area’s distance from coastal nutrient inputs, especially critical micronutrients such as iron11,18. Moreover, the absence of consolidated sea ice and its meltwater further reduces the input of micronutrients35,36. The primary nutrient source allowing phytoplankton growth in the Oceanic region is the ACC, which promotes nutrient resuspension through the interaction between the Ross Gyre and the ACC11,37, and transport from adjacent areas, such as the transport of some nutrients from the Balleny Islands carried out by the ocean gyre38,39,40.

The Intermediate region exhibited high spatial variability linked to sea ice dynamics. Wind and ocean currents play an important role in the formation of polynyas south of the intermediate zone, promoting sea ice movement into this area41. Increased sea ice coverage here reduces light availability and temperature at the near-surface, constraining phytoplankton growth and resulting in zones of lower concentration of chl-a (around 0.6 mg m⁻³) in the region41. Surrounded by the less productive offshore waters of the Intermediate region, the Balleny Islands stand out as a hotspot of high chl-a concentration (around 2 mg m⁻³). This higher productivity likely results from two factors: i) nutrient resuspension caused by the intensified transport between the boundaries of the Ross Gyre and the islands’ coast (Supplementary Fig. 6d), supplying essential micronutrients, such as iron from nearby sediments37, and ii) enhanced upwelling driven by the shallower bathymetry around the islands, further promoting phytoplankton biomass38,39.

The Coastal region, south of 70°S, exhibited the highest chl-a concentrations within the Ross Sea, which is consistent with previous findings22. This high productivity is driven primarily by the Terra Nova Bay polynya and the Ross Sea polynya, two key ice-free areas that, when open, enhance nutrient availability and light penetration, allowing for high concentrations of chl-a11. Nonetheless, the relationship between sea ice and chl-a can be complex and vary regionally (Supplementary Fig. 7). For instance, the Terra Nova Bay polynya showed a negative correlation between both variables due to sea ice reducing light penetration into the water column, thereby limiting phytoplankton growth. In this polynya, sea ice appears to persist longer under weaker winds or when their direction changes (Supplementary Fig. 6a). Conversely, the Ross Sea polynya exhibited a positive correlation between sea ice and chl-a concentration, as the melting sea ice enhances light penetration and water column stratification, supporting phytoplankton growth11,25. Here, wind-driven sea ice displacement further increases open water exposure and phytoplankton productivity. The relationship that each of these polynyas has with sea ice has already been described in separate studies for the Ross Sea Polynya22 and in less detail for the Terra Nova Bay polynya11.

Trend analysis suggests that chl-a concentration has been increasing in the eastern part of the oceanic region, north of the Balleny Islands, and within the polynya since 1998, potentially driven by large-scale climate variability. This aligns with the findings of other studies24, which highlighted how climate-induced changes in regional winds and sea surface temperature affect phytoplankton growth and distribution. Their observations of increasing chl-a in the Ross Sea Polynya align with the results presented herein, suggesting a broader ecological response to climatic shifts. In this case, the localized increasing trend in chl-a might reflect a long-term trend in reduced sea ice within the polynya due to enhanced wind displacement25,36, potentially transporting sea ice northward and consequently affecting chl-a concentrations in the intermediate region. Overall, our results show that the Ross Sea is a highly variable and interconnected system, where changes in one area can notably influence others, highlighting the importance of studying it as a whole.

Phytoplankton bloom phenology

In addition, phytoplankton bloom phenology in the Ross Sea exhibited considerable variability in timing, duration and intensity. The phenoregion approach used here allowed for a detailed analysis of phytoplankton bloom dynamics, across the three distinct phenoregions proposed herein.

The Oceanic phenoregion exhibited the least variability in bloom timing, typically starting in September, peaking between December and January, and declining thereafter. This stability likely reflects the lower environmental variation of this phenoregion when compared to the more dynamic Coastal and Intermediate phenoregions, which are strongly influenced by sea ice. The Oceanic phenoregion also had the least productive blooms, due to its distance from nutrient-rich coastal waters and the scarce sea ice, mirroring the low biomass already discussed in Section “Spatial variation of phytoplankton”. SCR and trend analysis confirmed these relatively stable conditions with minimal changes in bloom intensity and timing, in accordance with previous studies42.

The more complex Intermediate phenoregion exhibited high fluctuation in bloom timing, typically starting in early October, achieving its maximum value in January, and decreasing after that. The high variability of its bloom timing suggests that bloom phenology is particularly sensitive to local environmental conditions, especially sea ice presence22. Sea ice plays a crucial role in regulating light availability, as its presence limits light penetration into the water column, constraining phytoplankton growth and influencing the timing of bloom initiation43. This high variability is also clearly reflected by the low SCR observed in this area of the Ross Sea (Fig. 5), which is in accordance with the observations of previous studies42.

Lastly, the Coastal phenoregion was defined by a summer bloom that lasted approximately eight weeks, exhibiting the shortest bloom duration, yet the highest intensity (BArea) compared to the other phenoregions. In general, the bloom timing and duration observed for this phenoregion agree with other studies focusing on the Ross Sea polynya25, although with some slight differences that could be attributed to the size of the Coastal phenoregion identified here, which includes more than the Ross Sea polynya. This Coastal phenoregion benefits from a combination of bathymetry and katabatic winds, which lead to major water mixing. This process allows for the upwelling of nutrients to the surface, supporting rapid phytoplankton growth and contributing to the high productivity observed in this phenoregion11.

Bloom timing and intensity were seen to vary spatially and annually. This strong interannual variability is influenced by key abiotic factors such as sea ice extent, ocean currents, and wind patterns. For instance, the strong influence of sea ice cover is further evidenced by previous studies documenting high interannual variability in the mixed layer depth, which is closely linked to changes in sea ice cover in the Ross Sea44. Prevalent sea ice prevents mixing, thus influencing the stability and stratification of the water column. Moreover, large-scale climate oscillations, such as the El Niño Southern Oscillation (ENSO) and the Southern Annular Mode (SAM), further modulate environmental conditions across phenoregions. For example, shifts in westerly winds associated with SAM can affect sea ice retreat and upper-ocean mixing, particularly in the Oceanic and Intermediate regions, where wind and current dynamics are very important for bloom development45. Positive phases of SAM or ENSO are typically associated with reduced sea ice cover in the Ross Sea, potentially leading to earlier and more intense blooms. In contrast, negative SAM or La Niña phases tend to be linked to greater sea ice presence, resulting in delayed bloom onset11,41. These changes not only alter the physical environment but also play a key role in regulating nutrient availability, which is essential for sustaining phytoplankton growth11,45. The high spatial and temporal variability in bloom timing is further reflected by the low SCR coefficient values observed across the Ross Sea.

Bloom frequency was also strongly influenced by interannual shifts in the environmental conditions of each phenoregion, as the number of phytoplankton blooms ranged from one to three per year, in accordance with previous observations22. It is possible that a second or third bloom reflects a continuation of the first bloom, prolonged by enhanced nutrient input from sea ice melting or upwelling, whose detection may have been interrupted by the frequent gaps in satellite-acquired data in this region. Accordingly, as suggested in earlier studies22, it is recommended to interpret these additional blooms with caution, particularly in the more complex Intermediate phenoregion.

Drivers of phytoplankton change

Understanding the drivers of phytoplankton phenology is crucial, especially in large and complex areas such as the Ross Sea, where multiple factors interact to shape phytoplankton communities. The Random Forest (RF) models used in this study identified abiotic drivers of bloom phenology in each phenoregion, with varying levels of complexity, showcasing the value of RF models for exploring phytoplankton ecology. In less complex phenoregions, such as the Oceanic one, the RF models clearly distinguished one or two drivers as the main influence on the analysed phenology metrics. However, in more complex phenoregions like the Intermediate one, some of the phenological metrics appeared to be influenced by a combination of three or more key drivers, which can make it difficult to draw conclusions. For instance, BDuration in the Intermediate region was modulated by seawater current direction, sea ice concentration and wind direction, all with similar importance for the performance of the model. Especially in these cases, the RF results should be interpreted with caution given the complexity of the interplay among drivers, as well as the high interannual variability of the Ross Sea.

According to the RF models, the main driver of BStart in the Oceanic phenoregion was wind speed. The Ross Sea gyre is partly formed by the westerly winds associated with the ACC, which also contribute to the region’s ocean dynamics. The gyre presents seasonal wind variations, with periods of strong winds (before the initial bloom period, at the end of winter and beginning of spring) that lead to more intense circulation, promoting mixing40. The results indicated that years with lower average wind speed were associated with earlier blooms; this could be due to reduced ocean mixing and subsequent shallowing of the mixed layer, which keeps phytoplankton in the euphotic zone where they can access light and nutrients40. Additionally, wind speed influences the duration of the bloom (BDuration), as lower wind speeds not only trigger an earlier bloom (BStart) but also promote conditions that help retain phytoplankton near the surface, where they have access to more light and surface-trapped nutrients, hence sustaining the bloom for a longer period. The intensity of the bloom (BArea) was mainly influenced by sea surface temperature, seawater current direction and wind direction. When the sea surface temperature was higher (Supplementary Fig. 6b), the bloom tended to be more productive, which could be related to warmer waters enhancing the metabolic rates of phytoplankton, leading to faster growth and higher biomass accumulation. Additionally, higher SST can stabilise the water column, reduce mixing and keep phytoplankton in the euphotic zone, where they have better access to light and nutrients, further promoting bloom productivity40,46. Bloom productivity (BArea) was also influenced by seawater current direction and wind direction, with blooms attaining higher biomass when both maintained a similar direction, which promoted a balance between stratification and mixing, favouring nutrient retention and phytoplankton growth11. In contrast to the other two phenoregions identified in this study, sea ice concentration was not identified as a driver of any of the phenology metrics in the Oceanic phenoregion. This is likely due to the low sea ice coverage in this area, where the ice melts early in the spring and, therefore, is not a major driver throughout the summer, unlike what is observed in the other two regions (Supplementary Fig. 6c).

In the Intermediate phenoregion, the main driver of BStart was sea surface salinity. RF results highlighted that when the average sea surface salinity was lower, the phytoplankton blooms began earlier. Lower salinity could be related to a decline in sea ice, since seasonal sea ice melting can result in a gradual and persistent reduction in the ocean’s surface salinity12,47. In this case, sea surface salinity could be a proxy47 for sea ice concentration, so when the salinity is lower, the sea ice concentration is also lower, as shown in other studies13, and blooms can start earlier. Nevertheless, salinity should be used as a proxy for sea ice melting with care, since salinity can also be considerably affected by changes in water column structure and in ocean circulation. For instance, despite this relationship between sea ice and bloom start, the key drivers of both BDuration and BArea were seawater current direction, sea ice concentration and wind direction (the latter less important for BArea). Hence, it is possible that strong winds push sea ice away, decreasing sea ice concentration and creating favourable conditions for extending bloom duration and increasing bloom productivity (and accumulated biomass). This is a testament to the complex dynamics of this phenoregion, showcasing the need for additional studies focused on this intermediate sector of the Ross Sea.

In the Coastal phenoregion, the main drivers of bloom initiation were sea ice concentration and SST. As sea ice melts, solar radiation penetrates more efficiently into the water column, increasing light availability for photosynthesis22,43,48. With less sea ice coverage, phytoplankton also have more space to grow and can expand horizontally during blooms. For instance, as described in previous studies22, 2003 was characterised by very high sea ice concentration in the Ross Sea polynya, causing the bloom to start later, which is in line with the results. In this phenoregion, the decrease in sea ice is mostly related to the wind. The katabatic winds flow from land over the ocean, pushing sea ice away11 and creating polynyas. The role of wind speed in this region is essential for phytoplankton bloom phenology because the concentration of sea ice in the area remains low as long as the wind remains at a relatively high intensity11,22, allowing the phytoplankton bloom to last longer (BDuration) and end later (BEnd), which is in line with the findings. Similar to the effect of wind in maintaining lower sea ice concentrations, the velocity of ocean currents can also play an important role, as both forces work together to redistribute sea ice, maintaining lower concentrations of sea ice in specific areas12. In coastal waters, the current velocity can also help replenish nutrients at the surface.

Overall, the timing and magnitude of phytoplankton blooms can shift in response to key environmental drivers, such as a decrease in sea ice driven by rising temperatures. Similar changes were reported in the Ross Sea during the austral summer of 20144, demonstrating how environmental changes can lead to unusual distributions of functional groups and phytoplankton abundance. These shifts in phytoplankton phenology and community composition are expected to become more frequent with ongoing climate change4,42. Such changes have the potential to cascade through the food web, generating trophic mismatches that may disrupt higher trophic levels, including krill, fish, and penguin populations. Beyond trophic mismatches, changes in bloom timing and community composition may also affect the efficiency of the biological carbon pump and carbon sequestration in the ocean. Shifts in species composition or bloom phenology can reduce primary production and sinking rates, potentially decreasing the amount of organic carbon reaching the deep ocean and ultimately impacting global biogeochemical cycles4,49.

Final considerations

This study provides a valuable insight into the large-scale assessment of chl-a concentration, phytoplankton phenology, and their abiotic drivers across the entire Ross Sea over a 24-year period (1998–2021) using remote sensing data and advanced machine learning models, such as random forests. It reveals clear spatial variability, with coastal waters being the most productive due to nutrient-rich polynyas, while oceanic waters showed lower productivity due to nutrient limitations. Phytoplankton bloom dynamics also varied, with shorter but more intense blooms in coastal waters and longer, weaker blooms in oceanic regions, while the intermediate zone exhibited the greatest environmental variability. Key abiotic drivers, including wind speed, sea surface temperature, sea ice concentration, and ocean currents, significantly influenced bloom timing, duration, and productivity. The observed trends indicate increasing productivity in areas such as the Ross Sea Polynya and near the Balleny Islands, coinciding with reduced sea ice and warming waters.

Given the essential role of phytoplankton in carbon fixation and export, such changes in bloom dynamics may influence the regional carbon cycle by altering the timing, magnitude, and efficiency of carbon uptake and sequestration to deeper ocean layers. These shifts could have broader implications for nutrient cycling and biogeochemical processes in the Ross Sea and the Southern Ocean as a whole. Future research should prioritise integrating high-resolution remote sensing data with in-situ observational datasets, given recent advancements in satellite-derived algorithms and products9,13,22. Additionally, long-term studies incorporating detailed abiotic and biological data (e.g., species composition, size structure) are crucial for better understanding the complex feedback between climate change and ecosystem responses in the Southern Ocean.

Methods

Remote sensing data on chl-a concentrations

To estimate phytoplankton biomass, this study used the chl-a product provided by the Ocean Colour Climate Change Initiative (OC-CCI, v6.0; available online at https://www.oceancolour.org/). The OC-CCI chl-a dataset has a spatial and temporal resolution of 4 km and 1 day, respectively, and covers the period 1998–2022 (25 years). It is generated by bias-correcting and band-shifting specific spectral bands of the remote sensing reflectance (Rrs) measured by different satellite sensors, namely the Sea-Viewing Wide Field-of-View Sensor (SeaWiFS), the Moderate Resolution Imaging Spectroradiometer (MODIS), the Visible Infrared Imaging Radiometer Suite (VIIRS), and the Sentinel-3A and Sentinel-3B Ocean and Land Colour Instrument (OLCI), to match the Medium Resolution Imaging Spectrometer (MERIS). After the creation of the dataset, chl-a is estimated by applying a blending algorithm based on optical water type classification50. The blending process integrates multiple algorithms: the Ocean Colour Index (OCI) algorithm, originally developed by NASA, which itself is a combination of the Chlorophyll Index (CI) and Ocean Colour 4 (OC4); the Ocean Colour Index 2 (OCI2) algorithm, an updated parameterisation of OCI; the Ocean Colour 2 (OC2) algorithm; and the Ocean Colour x (OCx) algorithm. By blending these four algorithms while considering the optical properties of each pixel, the dataset achieves a more accurate and robust estimation of chl-a50. This approach was preferred over single-sensor datasets, such as MODIS or VIIRS, because these datasets tend to underestimate chl-a in high-biomass areas and overestimate it in oligotrophic waters in the Southern Ocean due to a narrower blue-green reflectance ratio31. In contrast, the OC-CCI product integrates data from multiple satellite sensors, ensuring more consistent spatial and temporal coverage. Additionally, the blending algorithm used in OC-CCI accounts for different optical water types, thereby enhancing the accuracy of chl-a estimations across optically complex environments such as the Ross Sea.

The OC-CCI chl-a product was chosen after a validation exercise that compared multiple satellite algorithms and products against a dataset of in-situ observations51 from the Ross Sea. The in-situ chl-a data used for validation were determined by High-Performance Liquid Chromatography (HPLC) and were compiled from previously published sources52,53. Spatially, most samples were obtained from the coastal regions of the Ross Sea, mainly around the Ross Sea polynya and the Terra Nova Bay polynya, with some collected from the open ocean. Sampling was conducted between 2005 and 2014 (2005, 2006, 2011, 2013, and 2014) and occurred during the austral spring and summer (September–March), coinciding with the region’s peak in primary production. A total of 195 valid matchups (i.e., coincident satellite and in-situ observations) were identified, using two key criteria51: i) the in-situ and satellite observations had to occur within 24 h (1-day interval); and, if the first criterion was met, ii) the satellite chl-a observation was averaged over a 3 × 3 pixel box centred on the coincident observation. Four algorithms/products were evaluated to identify the most suitable one for this study: i) the global product OC-CCI, v6.054; and three regional algorithms, ii) Johnson-GlobColour55; iii) Rrs667-based33; and iv) OC4-SO34. To compare the performance of the four algorithms/products, a Target Diagram analysis was conducted. This analysis provides a visual assessment of the observed and predicted values, allowing biases, random errors, and overall accuracy to be identified55. Based on this evaluation, the OC-CCI product was selected as it exhibited an overall higher accuracy and lower error (R = 0.58; Fig. 7). Although this product exhibited the best performance when compared to the in-situ dataset, it is important to acknowledge that one of the main limitations of satellite-based ocean colour products in high-latitude regions is cloud and ice contamination. These significantly affect the availability and quality of remote sensing reflectance, often resulting in missing data and substantial errors31. Nevertheless, as mentioned above, the OC-CCI product uses optical water types in its development and integrates data from multiple satellite sensors, helping reduce the impact of these factors.
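The two matchup criteria can be illustrated with a short sketch. This is a minimal example and not the validation code used in this study; the function names (`within_one_day`, `box_mean_3x3`), the nested-list grid and the encoding of missing pixels as NaN are assumptions made for the illustration.

```python
import math
from datetime import datetime, timedelta

def within_one_day(t_insitu, t_satellite):
    """Criterion (i): the two observations are no more than 24 h apart."""
    return abs(t_insitu - t_satellite) <= timedelta(hours=24)

def box_mean_3x3(grid, row, col):
    """Criterion (ii): mean of valid chl-a in the 3 x 3 box centred on (row, col)."""
    vals = []
    for r in range(row - 1, row + 2):
        for c in range(col - 1, col + 2):
            inside = 0 <= r < len(grid) and 0 <= c < len(grid[0])
            if inside and not math.isnan(grid[r][c]):
                vals.append(grid[r][c])
    # NaN when no valid pixel is available in the box
    return sum(vals) / len(vals) if vals else math.nan

# Hypothetical satellite chl-a tile (mg m-3) with one cloud-masked pixel
chl = [[0.2, 0.3, 0.4],
       [0.3, math.nan, 0.5],
       [0.4, 0.5, 0.6]]
if within_one_day(datetime(2013, 1, 10, 12), datetime(2013, 1, 10, 20)):
    print(box_mean_3x3(chl, 1, 1))
```

Averaging only the valid pixels in the box is one possible choice; a stricter variant could require a minimum number of valid pixels before accepting the matchup.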

Fig. 7: Target diagram illustrating the relationship between satellite retrievals for chl-a concentrations and in situ match-up data.

Each algorithm is represented by a coloured circle: blue for OC-CCI, orange for OC4-SO, green for GlobColour, and red for Rrs667Linear. The closer an algorithm’s position is to the centre of the target diagram and to the in-situ data point (represented by the black star), the better its performance.

Abiotic data

Eight abiotic variables were considered (Table 3): sea surface temperature (°C; SST), sea ice concentration (%), mixed layer depth (m), sea surface salinity (unitless), seawater current velocity (m s−1), seawater current direction (°), wind speed (m s−1) and wind direction (°).

Table 3 Summary of the different biotic (chl-a) and abiotic (sea ice concentration, sea surface temperature, mixed layer depth, sea surface salinity, seawater current velocity, seawater current direction, wind speed, wind direction) variables used in this study, including their spatial and temporal resolutions, units and data sources

Remote sensing SST and sea ice concentration data were obtained from the Global Ocean OSTIA Sea Surface Temperature and Sea Ice Reprocessed product (available at: https://data.marine.copernicus.eu/product/SST_GLO_SST_L4_REP_OBSERVATIONS_010_011/description). Both variables have a spatial and temporal resolution of 5 km and 1 day, respectively, for the period 1998–2021 (24 years). This product presents gap-free maps of foundation sea surface temperature and ice concentration using a combination of in-situ and satellite data56.

Data on sea surface salinity, mixed layer depth, seawater current velocity and seawater current direction at surface were obtained from the Global Ocean Physics Reanalysis (GLORYS12V1) product57, available at the Copernicus Marine Environment Monitoring Service (available at: https://doi.org/10.48670/moi-00021). This is an ocean model which incorporates both satellite and in-situ data and has a spatial and temporal resolution of 8 km and 1 day, respectively, for the period 1998–2021 (24 years). Seawater current velocity, Eq. (1), and seawater current direction, Eq. (2), were calculated using the meridional component of seawater velocity moving towards the north (v-component; m s−1) and zonal component of seawater velocity moving towards the east (u-component; m s−1).

$$\mathrm{Seawater\ Current\ Velocity}=\sqrt{(u\text{-component})^{2}+(v\text{-component})^{2}}$$
(1)
$$\mathrm{Seawater\ Current\ Direction}=\arctan\left(\frac{v\text{-component}}{u\text{-component}}\right)$$
(2)


Wind speed and wind direction were calculated from wind data (at 10 m above the surface) from 1998 to 2021 obtained from ERA558, a reanalysis product for the global climate and weather produced by the Copernicus Climate Change Service (C3S) (available at: https://doi.org/10.24381/cds.adbb2d47). This product has a spatial resolution of 25 km and provides daily averages. To determine the wind speed and wind direction, the meridional component of wind velocity (v-component of wind, m s−1) and the zonal component of wind velocity (u-component of wind, m s−1) were used, as described in Eqs. (3) and (4).

$$\mathrm{Wind\ Speed}=\sqrt{(u\text{-component})^{2}+(v\text{-component})^{2}}$$
(3)
$$\mathrm{Wind\ Direction}=\arctan\left(\frac{v\text{-component}}{u\text{-component}}\right)$$
(4)
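Eqs. (1)–(4) share the same form, so a single helper can serve for both currents and winds. The sketch below is illustrative only: it uses the two-argument arctangent (`math.atan2`), which resolves the quadrant that a plain arctan(v/u) leaves ambiguous and avoids division by zero when u = 0, and it reports the angle counter-clockwise from east wrapped to [0°, 360°). Both choices, as well as the function name, are assumptions and are not stated in the text.

```python
import math

def speed_and_direction(u, v):
    """Return (speed, direction_deg) from the zonal (u) and meridional (v) components.

    Direction is the mathematical angle of the vector, counter-clockwise
    from east, in [0, 360). This convention is an assumption.
    """
    speed = math.hypot(u, v)  # sqrt(u**2 + v**2), as in Eqs. (1) and (3)
    direction = math.degrees(math.atan2(v, u)) % 360.0  # quadrant-aware form of Eqs. (2) and (4)
    return speed, direction

# Example with hypothetical components in m s-1
print(speed_and_direction(3.0, 4.0))
```

If a meteorological convention (direction the wind blows from, clockwise from north) is needed instead, the angle has to be converted accordingly.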

Correlation between phytoplankton biomass and abiotic variables

The Pearson correlation coefficient and corresponding p-value were calculated between chl-a and sea ice concentration (Supplementary Fig. 7), using the pixel average value from September to April for every year of the dataset (n = 24). The correlation was only computed when at least five valid data points were available, ensuring statistical robustness.
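A minimal sketch of this per-pixel calculation is given below. It only covers the correlation coefficient and the five-point validity rule; the p-value is omitted, and the function name, the NaN encoding of missing years and the sample values are assumptions for illustration.

```python
import math

def pearson_r(x, y, min_valid=5):
    """Pearson r between two yearly series, or None with fewer than min_valid valid pairs."""
    pairs = [(a, b) for a, b in zip(x, y)
             if not (math.isnan(a) or math.isnan(b))]
    n = len(pairs)
    if n < min_valid:
        return None
    mx = sum(a for a, _ in pairs) / n
    my = sum(b for _, b in pairs) / n
    sxy = sum((a - mx) * (b - my) for a, b in pairs)
    sxx = sum((a - mx) ** 2 for a, _ in pairs)
    syy = sum((b - my) ** 2 for _, b in pairs)
    if sxx == 0 or syy == 0:
        return None  # a constant series has no defined correlation
    return sxy / math.sqrt(sxx * syy)

# Hypothetical September-April means for one pixel (chl-a vs. sea ice concentration)
chl = [0.31, 0.42, math.nan, 0.55, 0.60, 0.38, 0.47]
ice = [62.0, 51.0, 58.0, 40.0, 35.0, 57.0, 48.0]
print(pearson_r(chl, ice))
```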

Phytoplankton phenology metrics

To evaluate phytoplankton phenology metrics between 1998 and 2022, the average growing cycle (spanning from September to April, excluding 29 February; n = 242 days) during the full dataset period was first derived for each pixel. Weekly (8-day) averages were then computed for each pixel (n = 31) to minimise the impact of missing data25.
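The binning step can be sketched as follows: 242 daily values are grouped into consecutive 8-day bins, which yields 31 weekly values (the last bin holding only two days). The function name and the choice to average whatever valid days exist in each bin are assumptions.

```python
import math

def eight_day_means(daily, bin_days=8):
    """Average daily chl-a into consecutive 8-day bins, ignoring missing (NaN) days."""
    means = []
    for start in range(0, len(daily), bin_days):
        valid = [v for v in daily[start:start + bin_days] if not math.isnan(v)]
        # A bin with no valid day stays missing
        means.append(sum(valid) / len(valid) if valid else math.nan)
    return means

# A September-April cycle without 29 February has 242 days
weekly = eight_day_means([0.3] * 242)
print(len(weekly))
```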

The following phenology metrics were computed (Table 4): (1) week of bloom initiation (BStart); (2) week of bloom termination (BEnd); (3) duration of the bloom (BDuration; weeks); (4) biomass/productivity of the bloom (BArea; mg m−3); (5) amplitude of the bloom (BAmplitude; mg m−3); (6) yearly chl-a mean (BMean; mg m−3); (7) yearly chl-a maximum (BMax; mg m−3); (8) week when the maximum chl-a concentration was recorded during the bloom (BPeak); (9) Bloom frequency (BFrequency; blooms year−1); and (10) Total area within the growing cycle, as an indicator of the total biomass attained during the cycle (mg m−3; TArea).

Table 4 Summary and description of all phenological metrics calculated in this study (BMean, BMax, BAmplitude, BPeak, BStart, BEnd, BDuration, BFrequency, BArea, TArea), including their full names and units

To identify phytoplankton blooms, two conditions must have been met simultaneously53,59,60: (i) chl-a must have exceeded a threshold of +5% above the annual median (Fig. 8); and (ii) the first condition must have been maintained for a minimum of 15 days. While this approach excludes blooms shorter than 15 days, it also ensures that only sustained phytoplankton growth events are classified as blooms, excluding transient spikes often caused by environmental noise (very short-term or spurious fluctuations that may not represent a real increase in phytoplankton biomass). The +5% threshold allows for the identification of ecologically meaningful increases in biomass while minimising the influence of short-term variability. This approach, and similar ones, have been applied in regions with high variability50, including polar areas25. It ensures robust results with reduced noise, thus making it a suitable method for studying phytoplankton blooms in the Southern Ocean61.
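The two conditions can be expressed as a simple run-length search over the weekly series. The sketch below is illustrative and rests on several assumptions: the median is taken over the valid values of the growing cycle, a missing week ends a run, and with 8-day bins the 15-day minimum is interpreted as at least two consecutive bins (16 days). The function name and sample values are hypothetical.

```python
import math
from statistics import median

def find_blooms(weekly_chl, pct_above=5.0, min_days=15, bin_days=8):
    """Return (threshold, [(first_bin, last_bin), ...]) for sustained exceedances."""
    valid = [v for v in weekly_chl if not math.isnan(v)]
    threshold = median(valid) * (1 + pct_above / 100)
    min_bins = math.ceil(min_days / bin_days)
    blooms, start = [], None
    # A trailing NaN acts as a sentinel that closes a run reaching the last bin
    for i, v in enumerate(list(weekly_chl) + [math.nan]):
        above = (not math.isnan(v)) and v > threshold
        if above and start is None:
            start = i
        elif not above and start is not None:
            if i - start >= min_bins:
                blooms.append((start, i - 1))
            start = None
    return threshold, blooms

series = [1, 1, 1, 1, 3, 4, 5, 3, 1, 1, 1, 2.0, 1, 1]
threshold, blooms = find_blooms(series)
print(threshold, blooms)
```

In this example the four-bin event is retained, while the single-bin spike later in the cycle is discarded as too short.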

Fig. 8: Illustration of a summer phytoplankton bloom and the phenological metrics.

The phenology metrics shown were calculated in this study (Bloom Amplitude, Bloom Peak, Bloom Start, Bloom End, and Bloom Duration). The Bloom Biomass metric is represented by the area shaded in green. The horizontal orange lines indicate the maximum chlorophyll-a concentration observed during the peak of the bloom (BMax), and the annual median +5% (threshold used to identify a bloom). Adapted from Ferreira et al. (2021).

The metrics BStart, BEnd, BDuration and BArea were estimated for each pixel. BStart and BEnd were determined as the week (of the year) preceding the onset and following the end of the bloom, respectively. BDuration was determined as a difference between the BStart and BEnd, while BArea was estimated using Simpson’s rule, numerically integrating the area of the graphical representation of the chl-a biomass53,62. BAmplitude was calculated exclusively for the main bloom of the cycle and the TArea was calculated, using the same approach as the BArea but for the entire cycle53.
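For reference, composite Simpson integration over equally spaced weekly values can be sketched as below. Simpson's rule needs an even number of intervals, so this example falls back to a trapezoid on the final interval when the count is odd; that fallback, the unit spacing of one week and the function name are assumptions, and the text does not specify how such cases were handled.

```python
def simpson_area(y, dx=1.0):
    """Integrate equally spaced samples with composite Simpson's rule."""
    n = len(y)
    if n < 2:
        return 0.0
    # Last index covered by Simpson panels (each panel spans two intervals)
    last = n - 1 if (n - 1) % 2 == 0 else n - 2
    area = 0.0
    for i in range(0, last, 2):
        area += dx / 3.0 * (y[i] + 4 * y[i + 1] + y[i + 2])
    if last != n - 1:
        # Odd number of intervals: trapezoid on the remaining one
        area += 0.5 * dx * (y[-2] + y[-1])
    return area

# Hypothetical weekly chl-a values (mg m-3) between BStart and BEnd
bloom = [0.4, 0.9, 1.6, 1.2, 0.5]
print(simpson_area(bloom))
```

The same function applied to the whole growing cycle would give a TArea-like quantity.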

Trend analysis of chl-a, phenology metrics, and abiotic variables

Pixel-by-pixel linear regression analysis was performed to assess the existence of linear trends in the average September–April chl-a from 1998 to 2022. Trends were also determined for the phenology metrics BStart, BEnd, BDuration, and BArea, as well as for the abiotic variables from 1998 to 2021. Specifically, each linear regression assessed the relationship between time (years; independent variable) and the corresponding metric (dependent variable). The trend was defined as the slope of the linear regression (i.e., change per year). In this study, linear trends were used to identify the direction and extent of change over time, as the main focus was to analyse if blooms were beginning earlier or later, or whether chl-a levels were rising or falling. Linear trend analysis is a simple way of assessing such trends over a 24-year time series. To ensure statistical robustness, only pixels with at least 10 years of data were included in the phenological trend analysis.
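The trend for a single pixel can be sketched as an ordinary least-squares slope with the 10-year validity rule. The function name, the NaN encoding of missing years and the synthetic series are assumptions for illustration; significance testing is not included.

```python
import math

def linear_trend(years, values, min_years=10):
    """Least-squares slope (change per year), or None with fewer than min_years valid points."""
    pts = [(t, v) for t, v in zip(years, values) if not math.isnan(v)]
    n = len(pts)
    if n < min_years:
        return None
    mt = sum(t for t, _ in pts) / n
    mv = sum(v for _, v in pts) / n
    stt = sum((t - mt) ** 2 for t, _ in pts)
    stv = sum((t - mt) * (v - mv) for t, v in pts)
    return stv / stt

years = list(range(1998, 2022))  # 24 growing seasons
# Synthetic BStart series that advances by 0.1 week per year
bstart = [12.0 - 0.1 * (t - 1998) for t in years]
print(linear_trend(years, bstart))
```

A negative slope for BStart would indicate blooms starting earlier over time, and a positive slope for chl-a would indicate rising biomass.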

Reproducibility of the annual seasonal cycle

To further understand the spatial heterogeneity of phytoplankton phenology in the Ross Sea, an index of the reproducibility of the annual seasonal cycle of chl-a (or SCR) was calculated to analyse the similarities between each year’s growing cycles and the average climatological growing cycle42.

The SCR was calculated by computing pixel-by-pixel Pearson’s correlations between each year’s seasonal chl-a cycle and the climatological chl-a cycle (1998–2022; used as a reference). The Pearson correlation coefficients obtained for each year were then averaged to obtain a single SCR value per pixel. This metric is particularly useful for capturing key scales of variability42,63. The closer the SCR is to 100%, the less variable the chl-a seasonal cycles are from year to year, indicating regions where bloom timing and shape are stable over time, likely linked to lower variability in the environmental drivers. Conversely, low SCR values highlight areas of high interannual variability42. While SCR is a useful tool for assessing bloom seasonality, it is important to consider that it is more sensitive to the timing and shape of the seasonal cycle than to its intensity. Nevertheless, SCR remains a valuable tool to detect regions of ecological stability and variability, where phenological patterns are strongly influenced by climate-driven processes, particularly in polar regions.
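A compact sketch of the SCR for one pixel is shown below. It assumes that the climatological cycle is the week-by-week mean of all yearly cycles and that the percentage is obtained by multiplying the mean correlation by 100; neither detail is spelled out in the text, and the function names are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson r over pairs where both values are valid; NaN if undefined."""
    pairs = [(a, b) for a, b in zip(x, y)
             if not (math.isnan(a) or math.isnan(b))]
    n = len(pairs)
    if n < 2:
        return math.nan
    mx = sum(a for a, _ in pairs) / n
    my = sum(b for _, b in pairs) / n
    sxy = sum((a - mx) * (b - my) for a, b in pairs)
    sxx = sum((a - mx) ** 2 for a, _ in pairs)
    syy = sum((b - my) ** 2 for _, b in pairs)
    return sxy / math.sqrt(sxx * syy) if sxx > 0 and syy > 0 else math.nan

def scr_percent(yearly_cycles):
    """Mean correlation (in %) between each yearly cycle and the climatological cycle."""
    n_weeks = len(yearly_cycles[0])
    climatology = []
    for w in range(n_weeks):
        vals = [c[w] for c in yearly_cycles if not math.isnan(c[w])]
        climatology.append(sum(vals) / len(vals) if vals else math.nan)
    rs = [pearson(c, climatology) for c in yearly_cycles]
    rs = [r for r in rs if not math.isnan(r)]
    return 100.0 * sum(rs) / len(rs) if rs else math.nan

# Three years with the same seasonal shape and one year with the opposite shape
typical = [1.0, 2.0, 3.0, 2.0, 1.0]
atypical = [3.0, 2.0, 1.0, 2.0, 3.0]
print(scr_percent([typical, typical, typical, atypical]))
```

Because the correlation ignores amplitude, two years with identical timing but different bloom intensity contribute the same value, which is the sensitivity noted above.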

Regional patterns of bloom phenology

As the Ross Sea is a large and complex region, phenoregions, regions with similar patterns of phytoplankton bloom phenology, were first identified using a hierarchical clustering analysis (Fig. 9). Three factors were considered: the SCR, the mean climatological chl-a between 1998 and 2021 (September to April), and the mean number of days without sea ice. Prior to the clustering, the spatial resolution of all datasets was standardised to 25 km to ensure alignment across pixels. The data were then normalised, and Euclidean distance was used as the measure of similarity between pixels, while Ward’s method was applied for clustering53,64,65.
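The clustering step can be illustrated with a small, self-contained implementation of Ward's criterion on z-scored features. This is a naive sketch for a handful of pixels, not the software used in this study: at each step it merges the pair of clusters whose union gives the smallest increase in within-cluster sum of squares, Δ = (n_a n_b / (n_a + n_b)) ‖c_a − c_b‖². The z-score normalisation, the function names and the feature values are assumptions.

```python
import math

def zscore_columns(rows):
    """Standardise each feature (column) to zero mean and unit standard deviation."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]

def ward_clusters(rows, k):
    """Agglomerative clustering with Ward's minimum-variance criterion; returns labels."""
    clusters = [[i] for i in range(len(rows))]

    def centroid(members):
        return [sum(rows[i][d] for i in members) / len(members)
                for d in range(len(rows[0]))]

    while len(clusters) > k:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                ca, cb = centroid(clusters[a]), centroid(clusters[b])
                na, nb = len(clusters[a]), len(clusters[b])
                d2 = sum((p - q) ** 2 for p, q in zip(ca, cb))
                cost = na * nb / (na + nb) * d2  # increase in within-cluster variance
                if best is None or cost < best[0]:
                    best = (cost, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    labels = [0] * len(rows)
    for label, members in enumerate(clusters):
        for i in members:
            labels[i] = label
    return labels

# Hypothetical pixels: (SCR, mean chl-a in mg m-3, days without sea ice)
pixels = [(0.90, 5.0, 200), (0.85, 5.2, 210),
          (0.50, 1.0, 100), (0.55, 1.1, 110),
          (0.20, 0.3, 300), (0.25, 0.2, 290)]
print(ward_clusters(zscore_columns(pixels), k=3))
```

For full-resolution grids, an optimised library implementation of hierarchical clustering would be preferable to this cubic-time loop.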

Fig. 9: Flow diagram representing the different steps (top to bottom) involved in the partition of the Ross Sea area into three phenoregions.

The workflow included: (top) extraction of the chl-a and sea ice time series for the Ross Sea, on a pixel-by-pixel basis, and transformation of the data to the same 25 km spatial resolution; (middle) computation of the climatological chl-a, the reproducibility of the annual seasonal cycle and the number of days without sea ice; (bottom) a cluster analysis (based on the three elements calculated above), from which three phenoregions were obtained and their phenological metrics (BStart, BEnd, BDuration, BArea) computed. The metrics were then used in a phenoregion-specific random forest analysis to identify the main abiotic drivers modulating phytoplankton phenology.

To determine the chl-a climatological growing cycle (September–April) for each phenoregion, daily averages were first calculated. A 15-day moving average, requiring at least 15 valid data points within the window, was then applied to smooth the chl-a data, improving the visualisation of the seasonal cycle. To estimate phenology metrics for the different phenoregions, chl-a was spatially averaged into weekly (8-day) means, which were then used to calculate the phenology metrics for each year (n = 24).
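The smoothing step can be sketched as a moving average that only returns a value when the full window is valid. Whether the window is centred or trailing is not stated in the text; the example below assumes a centred window, and the function name is hypothetical.

```python
import math

def moving_average(values, window=15, min_valid=15):
    """Centred moving average; NaN where the window is incomplete or has too few valid points."""
    half = window // 2
    out = []
    for i in range(len(values)):
        lo, hi = i - half, i + half + 1
        if lo < 0 or hi > len(values):
            out.append(math.nan)  # window would extend past the series
            continue
        valid = [v for v in values[lo:hi] if not math.isnan(v)]
        out.append(sum(valid) / len(valid) if len(valid) >= min_valid else math.nan)
    return out

daily = [float(i) for i in range(20)]  # synthetic daily regional means
smoothed = moving_average(daily)
print(smoothed[7], smoothed[12])
```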

Random forest

Random forest (RF) models66 were used to assess the primary drivers of chl-a (September to April) and key phytoplankton bloom parameters (BEnd, BStart, BDuration, and BArea) for each phenoregion (Fig. 9). These phenological metrics were considered the most relevant for understanding bloom dynamics in this region25. RF models are a widely used machine learning tool in ecology and have been applied to study various biological groups, including phytoplankton53,65,66,67,68,69.

The foundation of the RF lies in the aggregation of several decision trees, where samples from the input dataset are used to build each regression tree. Every decision node along the tree has a random selection of predictors available, and at each node, a splitting predictor is chosen according to a criterion until a final prediction is made at the end of the tree65. RF models are highly flexible, exhibiting several advantages: (1) robust handling of missing values; (2) great adaptability and capacity to identify complex relationships between the response variable and the predictors; (3) capacity to effectively manage anomalies, predictor collinearity, and high-dimensional information; (4) stable and easily understandable results53,65,67. The RF model was selected because of its strong performance in capturing complex interactions between highly variable ecological predictors66,67,68,70. This model can generate some biases, such as volatility in variable importance rankings, especially when there is a high degree of correlation among abiotic variables, or when several variables provide redundant information to the model. To avoid multicollinearity among predictors, a Spearman correlation matrix was calculated, and variables with a correlation coefficient ≥0.7 were considered highly correlated (Supplementary Fig. 8). In such cases, one of the correlated variables was excluded from the model to retain the most informative and non-redundant predictors. After selecting the variables for each phenoregion, five RF models were created, one for each phenology metric. The number of trees for the RF was set to 50050. Variable importance was assessed by permutation: this method measured the impact of each variable on RF performance by comparing the performance of the original model with the performance after randomly permuting the values of that variable. The variables were ordered based on their average importance values, with those causing the greatest reduction in performance when permuted considered the most important. In this study, we obtained both R² and Mean Absolute Error (MAE) to provide insight into model accuracy across phenoregions.
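The permutation step can be illustrated independently of any particular RF implementation. In the sketch below, `toy_predict` is a hypothetical stand-in for a fitted model, importance is measured as the mean increase in MAE after shuffling one predictor, and the number of repeats and the seed are arbitrary choices; none of these details are taken from the study.

```python
import random

def mean_abs_error(y_true, y_pred):
    return sum(abs(a - b) for a, b in zip(y_true, y_pred)) / len(y_true)

def permutation_importance(predict, X, y, n_repeats=20, seed=0):
    """Mean increase in MAE when each predictor column is randomly permuted."""
    rng = random.Random(seed)
    baseline = mean_abs_error(y, predict(X))
    importances = []
    for j in range(len(X[0])):
        increases = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)
            # Rebuild the predictor table with only column j shuffled
            X_perm = [row[:j] + [value] + row[j + 1:]
                      for row, value in zip(X, column)]
            increases.append(mean_abs_error(y, predict(X_perm)) - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances

# Hypothetical fitted model: the response depends only on the first predictor
def toy_predict(rows):
    return [3.0 * row[0] for row in rows]

X = [[float(i), float((7 * i) % 10)] for i in range(10)]
y = toy_predict(X)
scores = permutation_importance(toy_predict, X, y)
ranking = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
print(ranking)
```

Predictors are then ranked from the largest to the smallest loss of performance; when two correlated predictors carry the same information, shuffling either one alone can understate its importance, which is why the Spearman screening precedes this step.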

After running the RF models, the main abiotic driver was identified and analysed. However, in some cases, it was not possible to distinguish a single main driver, as the first, second and, in some cases, third drivers, had very similar variable importances and contributed in a similar manner. Consequently, in these cases, the top two or three main factors were analysed.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.