Introduction

Long-term programs produce valuable ocean ecological time series that inform marine policy and resource management1,2. Scientific impact and influence on policy development increase with the duration of ecological studies3. Long time series are also essential for detecting population trends4 and for developing models to make near-term, data-driven forecasts of marine populations5. However, ship-based programs face numerous challenges that threaten consistency and longevity6. Vessel operations are expensive and constrained by logistics and competing scientific priorities. As a result, shifts in survey location and timing can impact the quality or frequency of ecological indicators.

Zooplankton are effective indicators of ocean change7 and play key food-web roles in the Southern Ocean8, where high operational costs and restrictive environmental conditions exemplify the challenges of maintaining time series. Pteropod, salp, and euphausiid populations respond to environmental variability and indicate changes in ecosystem structure and function9. The shelled pteropod Limacina rangii is a bellwether of ocean acidification and tracks El Niño Southern Oscillation variability10,11. High densities of the salp Salpa thompsoni are associated with ocean warming and alter ecosystem function due to efficient grazing and vertical export12,13. The Antarctic krill Euphausia superba (largest, longest-lived euphausiid in the Southern Ocean, targeted in commercial fishery), the ice krill Euphausia crystallorophias (high-latitude, neritic), and Thysanoessa macrura (smaller than E. superba and E. crystallorophias, cosmopolitan, most numerically abundant) influence biogeochemical cycling and are important forage species9. The continuation of long-term observations to understand zooplankton population dynamics amidst climate change and to inform model development has been identified as a scientific priority14. Consequently, there is a need to assess how spatiotemporal sampling coverage influences the detection and predictability of ecological change as ocean observing programs allocate sea time. We leveraged three decades of Southern Ocean zooplankton observations to conduct analyses that may inform time-series programs throughout the global ocean.

The Palmer Long-Term Ecological Research program (Palmer LTER) has sampled oceanographic and ecological properties west of the Antarctic Peninsula since 1993, but its spatiotemporal coverage has shifted (Fig. 1). Expansion of the sampling grid in 2009 to document effects of sea-ice loss in a Far South subregion reduced the number of samples collected within the program’s original North and South subregions. The sampling domain also expanded into a Far North subregion after 2020 to examine the influence of Weddell Sea-sourced waters. Reduced coverage within the original subregions has the potential to introduce spatial bias and decrease the precision of long-running zooplankton density indices. Also, there was effectively a two-year pandemic-induced gap from 2021–2022 due to survey cancellation followed by a severe (7 weeks early) timing shift (Fig. 1). The impacts of more moderate (up to 11-day) shifts toward later end dates and earlier start dates from 1997–1999 and 2023–2024, respectively, have not been estimated. Finally, Palmer LTER faces an impending reduction in frequency from annual to biennial coverage, which poses two challenges for analysts: (1) whether the missing years can be successfully gap-filled, and (2) whether useful forecasts for future survey years can be made despite missing observations in preceding years. These challenges are common across long-term ecosystem monitoring programs; therefore, an analysis toolbox is essential for supporting ongoing surveys and for informing analyses that combine time series at regional to global scales.

Fig. 1: Spatiotemporal coverage of zooplankton sampling by the Palmer Long-Term Ecological Research program.

Changes in (a) sampling grid extent, (b) spatial resolution (crosses indicate subset of discontinued stations), (c) cruise dates, and (d) survey frequency have logistical and scientific costs and benefits. Under the original sampling regime (1993–2008) 52 grid stations were sampled on average. Currently, an average of 26 annual stations are sampled across a larger area. Over the first 28 years, mean survey start and end dates were 8 January and 2 February (vertical gray lines on timeline). Timing shifted earlier after 2020, and survey frequency was reduced to every other year beginning in 2024. Colors indicate latitudinal subregions. On the map, symbol size indicates whether a station is sampled in the current design, and shape indicates whether a station was part of the original survey design or added more recently. Gray shading indicates seafloor depth. Map made using data from the SCAR Antarctic Digital Database106 and the International Bathymetric Chart of the Southern Ocean107 under Creative Commons Attribution 4.0 International Licenses.

We utilized the Palmer LTER’s unique record of zooplankton observations to: (1) demonstrate how reduced spatial sampling resolution can (but does not always) introduce spatial bias to annual density indices and long-term trends; (2) assess the species-specific effects of sample size on uncertainty; (3) quantify the within-year influence of sampling date on species density; and (4) simulate how sampling interval (and the resulting availability of data) affects our ability to forecast abundance and to gap-fill years without survey coverage. Declines in predictive performance when transitioning from annual to biennial survey frequency illustrate substantial information loss. Our analyses could inform decisions to modify sampling design as program managers consider tradeoffs between survey costs and monitoring change across ocean ecosystems. We expect our approaches will be useful for oceanographers and ecologists seeking to maximize the value of their observations through sound sampling design and robust statistical analysis.

Results and discussion

Effects of spatial coverage changes

Since its onset, Palmer LTER has consistently sampled three cross-continental shelf strata defined by depth (Coast, Shelf, Slope)15. From 1993–2008, stations were sampled every 20 km along transect lines in the North and South subregions (Fig. 1). Since 2009, the number of stations sampled per line has been reduced to ~3 (~1 per cross-shelf stratum). As a result, Coast stations increased from 20% to 38% of the total stations sampled on average, and Shelf stations decreased from 57% to 37%, while Slope stations remained similar (23% to 25%) (Supplementary Fig. 1). These changes were particularly prominent in the South.

We compared annual mean zooplankton densities (individuals m⁻³) with delta generalized linear model (dGLM) indices that incorporated cross-shelf stratum as a spatial covariate. Annual mean indices were generally robust to changing survey design, but failing to statistically address spatial sampling distribution aliased a long-term trend in the case of the ice krill E. crystallorophias (Fig. 2). Correlations between annual mean and dGLM indices were weaker for E. crystallorophias (Pearson’s r = 0.69–0.73; p < 0.0001) than for other species (Pearson’s r ≥ 0.86; p < 0.0001). When using annual means, E. crystallorophias increased in the South from 1993–2020 (t = 3.9; p = 0.0006), but this trend was absent with dGLM indices (t = 0.60; p = 0.55). In contrast, density increases for the euphausiid T. macrura in the North and juvenile Antarctic krill (E. superba <31 mm) in the South were consistent between index methods (t > 2.17; p < 0.04). Significant directional changes were not detected for other time series (|t| < 2.0; p > 0.05). This trend analysis was intended primarily to assess the impact of the index method, and non-significant results do not necessarily imply population stability16. That is, new directional trends may emerge in the future, and interannual to decadal variability is already apparent (Fig. 2).
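As a rough illustration of the "delta" idea behind such indices, the sketch below computes a design-based analogue: within each cross-shelf stratum, the proportion of hauls with the species present is multiplied by the mean of the positive densities, and the three strata are then averaged with equal weight. This is not the fitted dGLM described here, which estimates these components with regression models. The function name, the equal stratum weights, and the toy haul values are all assumptions made for illustration.

```python
from statistics import mean


def delta_index(hauls, strata=("Coast", "Shelf", "Slope")):
    """Equal-weight stratified delta estimate of density (individuals per m^3).

    hauls: list of (stratum, density) tuples for one year and subregion.
    Hypothetical helper; a simplified stand-in for a delta GLM index.
    """
    per_stratum = []
    for stratum in strata:
        densities = [d for s, d in hauls if s == stratum]
        if not densities:
            raise ValueError(f"no hauls in stratum {stratum}")
        positives = [d for d in densities if d > 0]
        # Occurrence component: share of hauls where the species was caught
        p_present = len(positives) / len(densities)
        # Positive component: mean density given presence
        mean_positive = mean(positives) if positives else 0.0
        per_stratum.append(p_present * mean_positive)
    # Equal stratum weights (an assumption; area weights are another option)
    return mean(per_stratum)


# Toy data (invented): a coastal species that is absent on the slope
toy_hauls = [
    ("Coast", 4.0), ("Coast", 0.0),
    ("Shelf", 1.0), ("Shelf", 0.0), ("Shelf", 0.0), ("Shelf", 0.0),
    ("Slope", 0.0),
]
toy_index = delta_index(toy_hauls)
```

Because each stratum contributes equally regardless of how many hauls it received, the estimate does not move when effort is reallocated among strata, which is the property the spatial covariate provides in the model-based index.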

Fig. 2: Comparisons of zooplankton density time series.

Annual indices calculated as the mean of log10-transformed data (black, dashed line) are compared to indices from delta generalized linear models (red, solid line). Upper row presents time series for the North subregion, and bottom row presents the South. Each column is a different species or age class. Pearson’s correlation coefficients (r) between the two time series are provided for each panel. Linear regression fits are plotted in panels if at least one time series exhibited a significant directional trend from 1993–2020. Thick lines are the mean fit and gray shading indicates 95% confidence intervals. All time series are centered on the 30-year mean in log10 space.

The dampening of trends when using model-based indices for E. crystallorophias (and juvenile E. superba to a lesser extent) (Supplementary Fig. 2) is likely attributable to spatial shifts in station coverage. Both E. crystallorophias and juvenile E. superba are most abundant in coastal waters15,17 where proportionally fewer stations were sampled prior to 2009. A long observational record enables the estimation of such spatial effects, and data requirements will depend upon the variability of a species’ distribution. For example, after the first 12 years of the Palmer LTER time series it was apparent that the cross-shelf gradients of macrozooplankton species densities were relatively consistent18, making it simpler to account for spatial effects when necessary. A similar model-based index approach with a 31-year time series reduced spatial bias when effort was restricted to nearshore stations in the central California Current during the COVID-19 pandemic19. In contrast, if a species’ distribution is more dynamic year to year, then estimating annually varying spatial effects may be more challenging when survey coverage shifts20. Our analysis was conducted retrospectively, but we recommend proactive assessments prior to spatial survey design changes that simulate various options by sampling from modeled distributions of target taxa21,22. These efforts can be valuable but are not trivial to execute. Our results suggest the annual mean generally can serve as an appropriate density index when a survey loses spatial resolution but maintains coverage across strata, but we recommend the use of model-based indices that include spatial covariates, especially for taxa with restricted spatial distribution (e.g., E. crystallorophias) that may be subject to bias from effort reallocation.
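The mechanism can be illustrated with simple arithmetic. The sketch below holds a coastal species' density constant in every stratum and changes only the share of stations per stratum, using the program-wide average shares reported above (Coast 20% to 38%, Shelf 57% to 37%, Slope 23% to 25%). The stratum densities are invented, and the function names are hypothetical; the point is only that an unweighted annual mean can rise with no change in the population.

```python
def pooled_mean(station_shares, stratum_density):
    """Unweighted mean over all stations, i.e. stratum densities
    weighted by the share of stations falling in each stratum."""
    total = sum(station_shares.values())
    return sum(station_shares[s] * stratum_density[s] for s in station_shares) / total


def stratum_balanced_mean(stratum_density):
    """Mean of stratum densities with equal weight per stratum."""
    return sum(stratum_density.values()) / len(stratum_density)


# Average percent of stations per stratum, before and after the design change
shares_early = {"Coast": 20, "Shelf": 57, "Slope": 23}
shares_late = {"Coast": 38, "Shelf": 37, "Slope": 25}

# Hypothetical, time-invariant densities for a coastal species (individuals per m^3)
density = {"Coast": 10.0, "Shelf": 1.0, "Slope": 0.1}

early = pooled_mean(shares_early, density)
late = pooled_mean(shares_late, density)
apparent_increase = late / early
balanced = stratum_balanced_mean(density)
```

With these assumed densities the pooled mean rises by roughly 60% purely because more effort falls in the Coast stratum, while the stratum-balanced mean is unaffected by the reallocation.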

A robust increase in the recruitment of E. superba over the three decades of Palmer LTER observations (Fig. 2) supports the establishment of the southwestern Antarctic Peninsula as an important regional source of Antarctic krill production. Previous analysis of data compiled from multiple sampling programs revealed high mean density of larval E. superba south of 65 °S relative to northern waters in the southwest Atlantic sector from 1996–201623. The resulting poleward shift in the distribution and enhanced recruitment of the Antarctic krill population represents a marked change compared to earlier in the twentieth century. Higher annual densities of T. macrura and E. superba in the Palmer LTER time series were linked to higher summer phytoplankton productivity15, which has increased as a result of sea-ice changes that increase upper ocean stability, especially in the South24,25. Future impacts of changing primary productivity also will depend upon shifts in phenology and phytoplankton composition26,27. The tendency toward increased T. macrura density and E. superba recruitment throughout the Palmer LTER time series is thus likely a result of complex ice-ocean processes that influence bottom-up ecosystem dynamics.

Effects of reduced sample size

Due to the survey design changes described above, the average number of stations sampled was reduced from 31 to 9 in the North and from 21 to 8 in the South after 2008, decreasing precision of annual indices (Fig. 3). The 67% decline in annual station coverage coincided with a 47% increase (in log10 space) in standard errors calculated from a Bayesian implementation of the dGLM indices. However, the degree and drivers of uncertainty varied among zooplankton. Indices were more precise for abundant zooplankton (adult E. superba, T. macrura, and the pteropod L. rangii; present in 87–98% of hauls) than for rarer organisms (juvenile E. superba, E. crystallorophias, and the salp S. thompsoni; 44–71% presence) (Fig. 3). Uncertainty varied with annual station count and density, but the importance of these two factors differed between the abundant and rarer groups. Standard errors for adult E. superba, T. macrura, and L. rangii were more stable across years and scaled directly with sample size. Uncertainty was less consistent and exhibited stronger negative relationships with annual density for juvenile E. superba, E. crystallorophias, and S. thompsoni.
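For context, a textbook benchmark is that the standard error of a mean scales with 1/sqrt(n) when samples are independent and identically distributed. The sketch below applies that idealized rule to the station counts given above. It is not the Bayesian dGLM calculation, and its output is on the raw scale, so it is not directly comparable with the 47% figure reported in log10 space; the function name is hypothetical.

```python
import math


def se_inflation(n_before, n_after):
    """Factor by which the standard error of a mean grows when sample size
    drops from n_before to n_after, assuming i.i.d. samples (an idealization)."""
    if n_before <= 0 or n_after <= 0:
        raise ValueError("sample sizes must be positive")
    return math.sqrt(n_before / n_after)


# Average stations per year before and after 2008, as stated in the text
north_factor = se_inflation(31, 9)
south_factor = se_inflation(21, 8)
```

Departures from this benchmark, such as the density-dependent uncertainty seen for patchier taxa, indicate that sample size alone does not determine index precision.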

Fig. 3: Uncertainty of annual zooplankton indices.

Log10-transformed standard errors versus annual number of grid stations. Standard errors were calculated from Bayesian delta generalized linear models. Sample size is 60 indices (30 years each for North and South subregions). Curves represent the predicted relationship between standard error and sample size (extrapolated across observed sample sizes of 6–35 stations). Line color indicates the z-scored annual mean index for each corresponding point. Inset plot displays time series of annual grid station count sampled in North (red) and South (blue) subregions, highlighting the change in station samples since 2009.

The influence of uncertainty on change detection will be species-specific and relative to temporal variability28. For example, although uncertainty was greatest for S. thompsoni, salps also exhibited the greatest interannual variability. In contrast, interannual variability and uncertainty were both relatively low for T. macrura and adult E. superba, suggesting more subtle density changes may be detectable for these euphausiids than for salps. Biological oceanographic time-series programs may benefit from identifying baseline uncertainty of their indices and formally considering impacts of reduced sample size (using the survey simulation approaches discussed above) when faced with logistical limitations that necessitate a modified survey design19,29.

Fisheries oceanography offers useful lessons on the implications of uncertainty as it relates to detecting population change. Shorter distance between sampling stations was associated with more precise density estimates in observed and simulated ichthyoplankton surveys off of Newfoundland, prompting the suggestion that survey designs be evaluated carefully to understand their capacity for change detection30. Simulated subsampling of the California Cooperative Oceanic Fisheries Investigations grid demonstrated that interannual fluctuations of larval fishes can be reproduced effectively with reduced annual sampling effort29. This latter analysis included a stratified design that mirrors Palmer LTER sampling since 2009. While a larger within-year sample size will increase sensitivity to more subtle variations, collecting representative data across the study region in as many years as possible should be the priority for detecting ecological change. Our results demonstrate that within-year sample size alone is insufficient to control uncertainty and suggest reduced spatial resolution has a minor impact on a time series compared to the introduction of temporal gaps (discussed below).

Effects of survey timing

Timing can be critical for ecosystem monitoring, and there is thus a need to understand how sampling dates impact survey goals. The annual Palmer LTER survey was scheduled in January to coincide with Adélie penguin chick rearing31,32. Densities of most zooplankton taxa changed over the 46-day sampling window from 30 December to 13 February. Mean densities of the salp S. thompsoni, the pteropod L. rangii, and juvenile Antarctic krill (E. superba < 31 mm) declined with sampling date (GLMM: z < −2.7, p < 0.007) (Fig. 4a), and the marginal effects of sampling two weeks earlier were 83%, 68%, and 49% density increases, respectively (Fig. 4b). In contrast, mean density of adult Antarctic krill (E. superba > 31 mm) increased with day of year (GLMM: z = 3.7, p = 0.0002) (Fig. 4a) and was predicted to increase 48% if sampling shifted two weeks later (Fig. 4b). Sampling date did not have a significant effect on densities of the euphausiids E. crystallorophias or T. macrura (GLMM: |z| <1.4, p > 0.16).

Fig. 4: Within-year changes in zooplankton density.

a Response curves of zooplankton density as a function of sampling date for organisms with significant day-of-year effects estimated in generalized linear mixed models. Marginal effects are presented for the Coastal stratum of the North subregion. Black lines indicate predicted marginal means and colored shading represents 95% confidence intervals. b Estimated zooplankton density change vs. shift in sampling date. Line color matches the shading in panel a and corresponds to the same organisms. Negative shifts indicate earlier sampling and positive values correspond to later sampling.

The phenological patterns we detected agree with available data collected in this strongly seasonal ecosystem. A decline in S. thompsoni later in summer is consistent with the December peak in salp density33. Age-specific changes in the density of E. superba may be due to coastward migration of adults from oceanic waters into the sampling region after spawning and juveniles migrating out of the sampling domain into coastal bays for overwintering34,35. Declines in L. rangii were also documented through summer near Palmer Station36 and are attributable to ontogenetic vertical migration37. Complex zooplankton life histories thus apparently necessitate consistent survey timing to the extent possible so that annual indices remain comparable.

Sensitivities of 48–83% over two weeks demonstrate the potential to directionally bias zooplankton time series if sampling dates shift to the point that interannual signals are aliased. The up to 11-day shifts in 1997–1999 and 2023–2024, however, likely introduced tolerable bias relative to interannual variability. For example, annual density estimates varied across three orders of magnitude for S. thompsoni and L. rangii (Fig. 2). Juvenile E. superba density also varied by three orders of magnitude in the South and ~500-fold in the North. Density of adult E. superba was most consistent among years but still varied by more than an order of magnitude (Fig. 2). We thus emphasize that the risk of phenological bias is small relative to interannual changes, which may themselves be amplified by increased Antarctic climate variability38,39.
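To put these magnitudes side by side, the sketch below converts a reported two-week sensitivity into an implied change for an 11-day shift. It assumes a constant proportional change per day (a log-linear date effect), which is a simplification of the fitted response curves, and the function names are hypothetical.

```python
import math


def daily_log_rate(percent_change, days):
    """Log-scale change per day implied by a percent change over `days`,
    assuming a constant proportional rate (log-linear date effect)."""
    return math.log(1 + percent_change / 100) / days


def percent_change_for_shift(rate, days):
    """Percent density change implied by shifting sampling by `days`."""
    return (math.exp(rate * days) - 1) * 100


# Largest reported sensitivity: 83% higher salp density when sampling 14 days earlier
rate = daily_log_rate(83, 14)
change_11_days = percent_change_for_shift(rate, 11)

# Interannual range of three orders of magnitude, expressed as a fold change
interannual_fold = 10 ** 3
shift_fold = 1 + change_11_days / 100
```

Under this assumption an 11-day shift implies a change of roughly 60% (about 1.6-fold) for the most sensitive taxon, which is small next to a 1000-fold interannual range.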

A complicating wrinkle is that phenology itself is not fixed and generally tends toward earlier life-history events in warmer years40,41. Consequently, a seasonally fixed sampling window may be sensitive to climate-driven changes in a species’ phenology, as recently made apparent in the Gulf of Alaska when warm conditions resulted in early spawning and reduced availability of walleye pollock to surveys42. Statistically accounting for shifting phenology is less important when the time window of ecological interest is fixed (e.g., Adélie penguin chick provisioning in late December to early February). An ideal approach would be to sample a grid multiple times per year to more robustly quantify within-year density changes, but this is rarely possible from a ship due to logistical constraints. Autonomous sampling (discussed below) may provide additional temporal coverage of biological and physical ocean conditions to better understand conditions before and after ship-based sampling. Ultimately, a survey’s goal determines when it should be conducted, and while our analysis generally supports maintaining consistent timing for Palmer LTER, the <2-week shifts were unlikely to meaningfully alias interannual changes.

Forecasting and gap-filling with changes in sampling interval

We explored the extent to which failing to collect annual data impacts the ability to statistically gap-fill missing years and to make useful forecasts for future survey years by fitting a suite of predictive time-series models. We used empirical dynamic modeling (EDM), which uses time lags of focal species density and relevant environmental covariates as predictors while making minimal assumptions about functional form43. In doing so, we leveraged information from intrinsic population processes and climatic patterns that may serve as useful predictors of marine populations when considered at time scales relevant to an organism’s life history44.

Using data from 1993–2020, we modeled the density of each zooplankton species (using z-scored dGLM indices) as a function of its own lags and of lagged environmental covariates (Supplementary Table 1; Supplementary Fig. 3). We built three models: (1) an annual model, (2) a biennial gap-filling model, and (3) a biennial forecasting model (Fig. 5). In the annual model, all preceding years (up to a species’ maximum life span) were available as predictors, as is the case with annual survey coverage. In the biennial gap-filling model, only density values from odd lags (e.g., 1, 3, and 5 years prior) were available as predictors. This model mirrors future observational availability for predicting years when no survey will be conducted. In the biennial forecasting model, only density values from even lags (e.g., 2, 4, and 6 years prior) were available as predictors, mirroring two-step-ahead forecasts made for the next survey year. Lags were determined relative to the target prediction year and are distinct from whether a year itself was odd-numbered (e.g., 1999) or even-numbered (e.g., 2000) (Fig. 5). All models could use the same environmental variables, as these are remotely sensed.
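The lag availability under the three designs can be sketched as follows. This covers only the bookkeeping of which focal-species lags are candidate predictors, not the EDM fitting itself; the function names and design labels are hypothetical.

```python
def available_lags(max_life_span, design):
    """Candidate focal-species lags (years before the target year).

    design: "annual" (all lags), "gap_fill" (odd lags only),
            or "forecast" (even lags only).
    """
    lags = range(1, max_life_span + 1)
    if design == "annual":
        return list(lags)
    if design == "gap_fill":
        return [k for k in lags if k % 2 == 1]
    if design == "forecast":
        return [k for k in lags if k % 2 == 0]
    raise ValueError(f"unknown design: {design}")


def lagged_rows(series, lags):
    """Pair each target value series[t] with its predictors series[t - k]."""
    if not lags:
        raise ValueError("at least one lag is required")
    start = max(lags)
    return [([series[t - k] for k in lags], series[t])
            for t in range(start, len(series))]


# Example for a species with a four-year maximum life span
annual = available_lags(4, "annual")
gap_fill = available_lags(4, "gap_fill")
forecast = available_lags(4, "forecast")
rows = lagged_rows([0, 1, 2, 3, 4, 5], gap_fill)
```

For a four-year life span this gives lags 1 to 4 for the annual model, lags 1 and 3 for gap-filling, and lags 2 and 4 for forecasting, so each biennial model has half as many focal lags to draw on.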

Fig. 5: Illustrating the three empirical dynamic model structures that were compared.

Zooplankton species density (y) in year t is the prediction target. Filled circles indicate years for which observations are included, and empty circles indicate years when observations are excluded or unavailable. Arrows are drawn from the available focal lags to the prediction target. In this case, the preceding four years of density values are considered possible predictors, as would be the case for a species with a maximum life span of four years.

Gap-filling and forecasting during biennial sampling is most promising for L. rangii pteropods. The final EDMs for L. rangii included only sea-ice duration and winter Multivariate El Niño Southern Oscillation Index (MEI) as predictors (Supplementary Table 2), allowing for identical model structure and consistent predictive skill across annual and biennial versions (R² = 0.44) (Table 1). Tight direct coupling between pteropod abundance and remotely sensed environmental conditions, especially the MEI, has been previously reported11. Annual EDMs for S. thompsoni performed worse than using the annual mean as a prediction, and biennial models were not pursued. The unpredictability of salp density may be due to their episodic nature in the region (Fig. 2), which is likely a function of occasional transport from an offshore source region in the Antarctic Circumpolar Current45.

Table 1 Species life history traits and empirical dynamic model results

Annual EDMs predict euphausiid density reasonably well, but the relative performance of biennial gap-filling and forecasting models declined by 47% on average (Table 1). The best performing annual EDM was for adult Antarctic krill (E. superba >31 mm; R² = 0.59), which also had the largest decline in predictive skill for its biennial models (R² = 0.13–0.17). Annual EDMs for T. macrura and juvenile Antarctic krill (E. superba <31 mm) exhibited intermediate fits (R² = 0.36–0.42), more similar to their biennial models (R² = 0.22–0.25). For the ice krill E. crystallorophias, the annual model fit poorly compared to those of other euphausiids (R² = 0.22), and its biennial models performed similarly to those of adult E. superba (R² = 0.13–0.17). Continued collection of annual euphausiid observations likely would be required to usefully operationalize predictions, providing the opportunity to assess and refine performance.

Predictability of Antarctic krill recruitment highlights potential management-relevant applications, because krill is targeted in the Southern Ocean’s largest fishery46. Combining EDM with dynamic programming, referred to as Empirical Dynamic Programming47, allows computation of optimal policies amid competing objectives, such as maximizing fishery yield while conserving populations that prey upon the harvested stock. For example, high localized krill catch rates may inhibit population performance of dependent seabird and marine mammal species48,49,50. The precautionary management approach could leverage annual forecasts to identify years when krill availability will be low and concentrated fishing consequently would be more likely to impact dependent predators. Applying an empirical dynamic programming approach to this challenge is a promising avenue for assessing this trade-off, given E. superba is relatively short-lived and its population dynamics appear predictable to a useful degree51. However, annual survey coverage would maximize the potential for applications with management utility.

We thus demonstrate that the Palmer LTER time series can enable predictive model development while underlining the critical reliance of ecological forecasting on continued observation. Confronting predictions with new observations is a strong test of current knowledge that advances ecological understanding and enables societally-relevant applications52. In this case, the biennial forecast model could be augmented with additional training data as the Palmer LTER time series is extended. Iterative updating is critical if ecosystem dynamics change abruptly, which is more likely to go undetected with reduced observational frequency53. The gap-filling models we developed will not be updateable in the future, because observations will not be available to compare with predictions. In addition to decreased power for detecting zooplankton population change, the loss of consistent temporal coverage also substantially inhibits predictive capability.

Future outlook

Technical approaches to ocean observation are transforming rapidly as autonomous capability for data collection increases, reducing reliance on ship operations54,55,56. However, vessels are still needed to deploy and recover autonomous platforms (especially in the Southern Ocean), and many of the approaches discussed below are nascent, rather than readily transferable. Additionally, sampling gear calibrations are complicated, and without prolonged intercomparison periods new techniques may not detect changes in species abundance or distribution relative to long-term and large-scale historical data57,58. Notably, global and regional databases of zooplankton observations indicate declines in coverage from the 1980s to 2010s59,60. Consistent time-series programs thus are collecting increasingly rare but valuable data that enable calibration as autonomous approaches fill gaps in spatiotemporal coverage at smaller scales. Critical assessment of the comparability among methods is needed to determine if a new approach can effectively continue an existing time series or whether such a switch instead represents a new, independent time series61. Here we consider the potential of new methods to match the quantitative comparability and taxonomic resolution of net-based time series.

Multifrequency echosounders are commonly used to study euphausiids (especially E. superba), and more recently salps. Their incorporation into gliders and uncrewed surface vehicles allows for spatially extensive surveys comparable to those conducted by research vessels, albeit at slower speeds62,63,64. Networks of acoustic moorings can collect high-resolution data at fixed locations65. If replacing ship-based sampling with autonomous acoustic platforms, then approaches other than net hauls must be used to determine the taxonomic identity and size of sound-scattering animals. This information is essential for estimating biomass, and diet sampling from penguins or fur seals is one alternate source of length-frequency data for E. superba62,66. A switch from zooplankton net sampling to sole reliance on active acoustics would represent a substantial loss in biodiversity information and would decrease the interpretability of acoustic data, as these approaches are complementary, not redundant.

In-situ imaging and environmental DNA provide alternative approaches for collecting taxonomically resolved data. Cameras on gliders and profiling floats allow for morphological discrimination and size measurement of zooplankton67. However, such systems most effectively sample weaker-swimming taxa (perhaps including pteropods and salps). A major challenge is that in-situ camera systems sample small volumes of water, resulting in low detection probabilities and large uncertainties in density estimates68. Collection of environmental DNA samples by autonomous underwater vehicles and uncrewed surface vehicles could also reduce reliance on ships69,70. DNA metabarcoding from filtered water samples produces relative abundance estimates (i.e., the proportion of sequence reads within a given sample) for many taxa71,72. Deriving quantitative information requires species-specific genetic analysis as well as accounting for when DNA was shed and its degradation rate73,74. Efforts should continue to better understand how zooplankton data from nets, in situ imaging, and environmental DNA complement one another (particularly regarding taxonomic resolution and their quantitative comparability). It is unlikely any single method will be directly substitutable for net collections in a manner that allows for the simple transition of a ship-based time-series program to autonomous sampling platforms.

If zooplankton time-series programs are irreplaceable for detecting and predicting change, then how can they demonstrate value to maintain support? The zooplankton time series from the Newport Hydrographic Line in the northern California Current has been ongoing since 1996 and attributes its continued support to the generation of ecosystem indicators that mechanistically influence ecosystem services75,76. The identification of such indicators is only possible by integrating multiple time series from the same ecosystem. Additionally, networking time series from many regions is valuable for understanding broader processes across ocean ecosystems28,77, and the availability of ecological data to conduct such analyses is scarce when compared to physical and chemical oceanographic data coverage78. Another strength is that time-series sites are excellent locations for novel field and modeling experiments that leverage long-term context to bolster interpretability79. The risk nonetheless remains that time series are taken for granted. Teams that run time series must be creative and tenacious, as demonstrated by the >90-year history of the Continuous Plankton Recorder Survey, which underwent a hiatus during World War II, changed institutions on multiple occasions, and nearly ended during periods of funding instability80. Steady leadership is required so that time series can persist as ecological surprises continue to unfold.

Conclusions

Survey assessment is the first step toward informed decisions when sampling effort must be reduced. We suggest survey evaluations be applied (proactively if feasible and retroactively when needed) because biological, chemical, and physical indices are sensitive to changes in survey design. Many monitoring programs faced unanticipated challenges during the COVID-19 pandemic19,81, but survey changes may also be long term and motivated by scientific priorities or funding fluctuations1,82. Importantly, such exercises to deeply understand observational data also inform model development and studies aiming to synthesize the impacts of climate variability.

Time-series programs can maintain utility despite survey design changes. Spatial bias was apparent for one coastal euphausiid species when the historical domain was sampled at lower resolution to expand coverage, but such effects can be accounted for statistically to derive accurate indices. Though such sampling shifts come at the cost of increased uncertainty, an expanded survey area can be expected to enable the detection of climate-driven distribution shifts, as demonstrated in the Bering Sea when groundfish recently shifted out of historical survey areas83. Additionally, we showed the direct scaling between standard error and sample size does not always hold for patchier taxa (e.g., salps and juvenile Antarctic krill), limiting a scientist’s control over index precision. We also found that the effects of moderate (<2-week) shifts in survey timing were generally small relative to interannual variability. Biennial survey coverage will likely maintain statistical power to detect population trends, although trend magnitude will be less accurate, and some important anomalous years may be missed16. Gap-filling and forecasting will be hindered by the loss of observations in preceding years, and we emphasize the irreplaceability of missed sampling time points. There is a research opportunity to pair the modeling tools applied here with ongoing observations such that any course adjustments will ensure continued robustness of time-series programs.

As climate variability continues to alter ocean ecosystems, shifts in seemingly remote regions such as the Southern Ocean will have global impacts84. Recognizing change and understanding the consequences will depend upon sound sampling design, continued international cooperation, and networked time series. There is a need to leverage technological innovation and ships of opportunity to alleviate costs while remaining clear-eyed about comparability among methods.

Methods

Zooplankton collection

The Palmer LTER program collected macrozooplankton using a 2 × 2 m square frame Metro net with 700 μm mesh towed obliquely to a depth of 120 m15. Five species were considered: Antarctic krill Euphausia superba, ice krill Euphausia crystallorophias, big-eye krill Thysanoessa macrura, the shelled pteropod Limacina rangii, and the salp Salpa thompsoni (oozooid and blastozooid stages combined). Euphausia superba was separated into juveniles (age-class 1) and adults (age-class 2+) based on a standard length cutoff of 31 mm85. Data are archived with the Environmental Data Initiative86,87,88. The survey conducted in November-December 2021 was excluded from all analyses, because it did not overlap with the dates of any other survey (Fig. 1). Ecosystem conditions were evidently different during 2021 sampling, and the effects of the substantial (7-week) temporal offset cannot be confidently disentangled from annual effects with only one year of data.

Day-night correction

Zooplankton hauls were conducted at all times of day, and the densities during day versus night (corrected solar elevation ≤ −0.8333°) were compared to test for effects of diel vertical migration15,89,90. Absences were excluded from this comparison. A one-sided Wilcoxon rank-sum test was used to test the hypothesis that density was greater at night than during daytime. All statistical analyses were conducted in R91. Nighttime density was significantly elevated for T. macrura (median night-to-day ratio = 1.84; p < 0.0001), L. rangii (median N:D = 1.83; p = 0.006), and S. thompsoni (median N:D = 3.20; p < 0.0001). When present at night, the density of these species was divided by their respective median night-to-day ratio prior to subsequent analysis. This step was taken to decrease night densities of vertical migrators to make them more comparable with daytime hauls. Due to the extended photoperiod during Antarctic summer, only 15% of hauls included in this study were conducted at night.
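The correction step above can be illustrated with a minimal Python sketch. It is not the authors' R code; the function name `correct_density` and the dictionary layout are hypothetical, and only the three night-to-day ratios come from the text.

```python
# Median night-to-day density ratios reported in the text for the three
# species with significantly elevated nighttime density.
NIGHT_TO_DAY = {"T. macrura": 1.84, "L. rangii": 1.83, "S. thompsoni": 3.20}


def correct_density(species, density, is_night):
    """Return a haul density made comparable with daytime hauls.

    Nighttime densities of the listed vertical migrators are divided by
    the species' median night-to-day ratio. Daytime hauls, absences
    (density == 0), and species without a significant diel difference
    are returned unchanged.
    """
    ratio = NIGHT_TO_DAY.get(species)
    if is_night and ratio is not None and density > 0:
        return density / ratio
    return density
```

For example, a nighttime S. thompsoni density of 6.4 would be scaled to 2.0, whereas the same density in a daytime haul would be left as is.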

Effects of spatial coverage changes

We assessed whether changing cross-shelf representation in the North and South subregions biased annual zooplankton density estimates by comparing annual mean indices to dGLM indices with spatial covariates. For both methods, haul densities were spatially averaged within 100 × 20 km grid cells each year. The annual mean index was then calculated from log10-transformed densities17. Grid cell absences were replaced with one half the lowest non-zero density value for a given species to allow log10 transformation92.
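As a rough illustration of the annual mean index, the Python sketch below replaces grid-cell absences with one half of the lowest non-zero density and averages log10-transformed values by year. It assumes the replacement value is taken across all years for a species, and the function name and input layout are hypothetical rather than the authors' implementation.

```python
import math


def annual_log_index(cell_densities_by_year):
    """Annual mean of log10 grid-cell densities for one species.

    cell_densities_by_year maps a survey year to a list of grid-cell
    mean densities. Zeros are replaced with half of the lowest non-zero
    density (assumed here to be taken across all years) so that the
    log10 transformation is defined.
    """
    nonzero = [d for cells in cell_densities_by_year.values()
               for d in cells if d > 0]
    fill = min(nonzero) / 2.0
    index = {}
    for year, cells in cell_densities_by_year.items():
        logged = [math.log10(d if d > 0 else fill) for d in cells]
        index[year] = sum(logged) / len(logged)
    return index
```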

The dGLM index consisted of a binomial generalized linear model (GLM) to estimate the proportion of presence and a lognormal GLM to estimate density when present93. Survey year and cross-shelf stratum were included as fixed-effect factors in both component models. Akaike’s Information Criterion (AIC) was used to compare versions of the dGLMs with and without cross-shelf effects. AIC quantifies the tradeoff between improving model fit and increasing complexity. When models with cross-shelf effects had lower AIC values, they were selected over the alternatives that did not account for cross-shelf differences. Twelve dGLMs were developed (North and South dGLMs for each of six zooplankton). The cross-shelf effects were retained based on AIC for both the binomial and lognormal models in 8 of 12 instances (Supplementary Table 3). In the remaining four instances, the cross-shelf effects were retained for either the binomial or lognormal component of the dGLM. After fitting dGLMs, annual coefficients for all year effects were back-transformed to original units. Probability of presence was calculated from annual coefficients of the binomial GLMs using the logistic function. Density estimates were calculated by exponentiating the annual coefficients of the lognormal GLMs. Annual indices were then calculated by multiplying occurrence probability by density. When a species was absent in a subregion during annual sampling (which occurred in only five years), absences were replaced with one-half of the lowest non-zero value in the given time series. This allowed for subsequent log10-transformation and comparison across index methods.
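The back-transformation and model comparison described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' R workflow: the function names are hypothetical, the coefficients are assumed to be annual year effects already extracted from fitted models, and the AIC function uses the standard definition (twice the number of parameters minus twice the log-likelihood).

```python
import math


def inverse_logit(x):
    """Logistic function mapping a binomial GLM coefficient to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def delta_index(binomial_coef, lognormal_coef):
    """Annual index: probability of presence times density when present."""
    return inverse_logit(binomial_coef) * math.exp(lognormal_coef)


def aic(log_likelihood, n_params):
    """Standard AIC; lower values indicate a better fit-complexity tradeoff."""
    return 2.0 * n_params - 2.0 * log_likelihood


def prefer_cross_shelf(aic_with, aic_without):
    """Retain cross-shelf effects only when they lower AIC."""
    return aic_with < aic_without
```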

For both index methods, time series were normalized to the long-term mean so that they could be presented in a directly comparable anomaly format. For each taxon and subregion, we calculated Pearson’s correlation coefficients between pairs of index time series to compare interannual variability. We fit linear regressions to the continuous period of each time series (1993–2020) to test whether the index approach affected density trends.
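A minimal Python sketch of this comparison follows. It assumes that normalization means subtracting the long-term mean from a log10-scale index, which is one reading of the text; the function names are hypothetical.

```python
import math


def anomalies(series):
    """Subtract the long-term mean so the series is centered on zero."""
    mean = sum(series) / len(series)
    return [x - mean for x in series]


def pearson_r(x, y):
    """Pearson's correlation coefficient between two equal-length series."""
    ax, ay = anomalies(x), anomalies(y)
    covariance = sum(a * b for a, b in zip(ax, ay))
    return covariance / math.sqrt(sum(a * a for a in ax) * sum(b * b for b in ay))
```

A coefficient near 1 between the annual mean index and the dGLM index would suggest that the two methods capture similar interannual variability.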

Effects of reduced sample size

To evaluate the effect of reduced sample size (number of grid stations) on uncertainty, we obtained annual standard errors from the dGLMs by running Bayesian versions of the index models using the rstanarm package94. We used weakly informative priors following the software’s default method. For each model, four randomly initialized Markov chains were each sampled for 2500 iterations (following a warmup of 2500 iterations). We examined the relationships among index precision, z-scored density, sample size, and the expected scaling between standard error and sample size19.
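For reference, the sketch below shows the scaling commonly expected for independent samples, in which standard error is proportional to the inverse square root of sample size. This form is an assumption for illustration; the text does not state which scaling was used, and the function name and reference values are hypothetical.

```python
import math


def expected_se(se_ref, n_ref, n):
    """Standard error expected at sample size n, given a reference
    standard error se_ref observed at sample size n_ref, assuming
    SE is proportional to 1 / sqrt(n)."""
    return se_ref * math.sqrt(n_ref / n)
```

Under this assumption, cutting the number of grid stations from 40 to 10 would double the standard error. Patchily distributed taxa may depart from this expectation.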

Effects of survey timing

To assess the effect of survey timing on observed zooplankton density, we fit generalized linear mixed models (GLMMs) with sampling date as a covariate using haul-level data (n = 1167-1292). Dates ranged from 30 December to 13 February (day of year −1 to 44, Fig. 1). GLMMs were fitted with a Tweedie distribution and log link function using the glmmTMB package95. Latitudinal subregion and cross-shelf stratum were included as additional fixed effects. Year was treated categorically and modeled as a random intercept. The marginal effects of sampling date on zooplankton density were calculated using the ggeffects package96 to estimate the impact of earlier and later survey timing. We used predicted mean marginal response curves to calculate percent change in zooplankton density (\(\Delta_i\)) as follows:

$$\Delta_i = \frac{\rho_{t_i} - \rho_{t_0}}{\rho_{t_0}} \times 100 \qquad (1)$$

where \(\rho_{t_0}\) was predicted zooplankton density on a given reference date \(t_0\), and \(\rho_{t_i}\) was zooplankton density on a date \(t_i\), which ranged from 14 days prior to 14 days after \(t_0\). Percent change for a given temporal shift is constant regardless of reference date, as in the cases of exponential growth or decay.
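The following Python sketch evaluates Eq. (1) and shows why the result does not depend on the reference date. It assumes sampling date enters linearly on the log-link scale, so predicted density is an exponential function of date; the intercept and slope used in the example are hypothetical values, not fitted estimates.

```python
import math


def predicted_density(day, intercept, slope):
    """Predicted mean density under a log link with a linear date effect."""
    return math.exp(intercept + slope * day)


def percent_change(t_i, t_0, intercept, slope):
    """Percent change in predicted density between reference date t_0
    and shifted date t_i, following Eq. (1)."""
    rho_0 = predicted_density(t_0, intercept, slope)
    rho_i = predicted_density(t_i, intercept, slope)
    return (rho_i - rho_0) / rho_0 * 100.0
```

Under this assumption the expression reduces to (exp(slope × (t_i − t_0)) − 1) × 100, so a 14-day delay yields the same percent change whether the reference date is day 10 or day 30.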

Forecasting and gap-filling with changes in sampling interval

Empirical dynamic modeling (EDM) employs lags of observed variables to account for unobserved variables and reconstruct system dynamics97. For example, lags of a given species’ density can implicitly account for changes in dispersal or the density of other species with which it interacts. Although system dynamics can be resolved from a single (sufficiently long) time series, skillful prediction may be improved by including influential driving variables, such as environmental conditions or multiple life-history stages43. EDM makes few assumptions about relationships between variables and reconstructs relationships non-parametrically from the data. Alternate variable combinations may be theoretically equivalent, and we aim to maximize predictive skill given the available information rather than identify a single best model structure. For example, information lost by removing the previous year’s observed population density may be effectively replaced by incorporating a climate index that influences population density. EDM thus provides a robust way to evaluate how sampling interval affects predictability.
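The use of lags as predictors can be illustrated with a short time-delay embedding in Python. This sketch is not the GPEDM implementation; the function name is hypothetical, and the `step` argument is an illustrative assumption for mimicking a coarser sampling interval (for example, `step=2` when only every other year is available).

```python
def lag_embed(series, n_lags, step=1):
    """Pair each observation with its preceding lagged values.

    Returns a list of (lags, target) tuples, where lags holds the values
    at t - step, t - 2*step, ..., t - n_lags*step and target is the
    value at time t.
    """
    start = n_lags * step
    rows = []
    for t in range(start, len(series)):
        lags = [series[t - k * step] for k in range(1, n_lags + 1)]
        rows.append((lags, series[t]))
    return rows
```

With annual data and two lags, each year's density is predicted from the two preceding years. With `step=2`, the most recent predictor is two years old, which reduces the number of usable rows and removes the previous year's information.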

We fit EDMs using Gaussian process regression and automatic relevance determination98 with the GPEDM package99. The North and South density indices were modeled hierarchically, allowing for similar (but non-identical) dynamics across subregions100. The hierarchical model structure includes a dynamic correlation hyperparameter, which indicates the similarity of the fitted functions between subregions. A dynamic correlation of 1 indicates the functions are identical, and a value of 0 indicates independent functions.

Model selection was conducted to maximize predictive skill (leave-one-year-out R²) and remove uninformative predictors. Inverse length scale hyperparameters for each predictor determine how smooth its relationship is with the response variable. A predictor was dropped if its inverse length scale was zero, because this indicates no functional relationship between the predictor and the response variable98. Predictor selection followed three steps: (1) fitting a model using density lags only; (2) for E. superba only, considering a 1-year lag of the alternate age class (i.e., predicting adult density with last year’s juvenile density, and vice versa); and (3) adding environmental drivers individually (Supplementary Figs. 3, 4).

Environmental variables included sea-ice duration, Southern Annular Mode (SAM), and Multivariate El Niño Southern Oscillation Index (MEI), all of which relate to zooplankton density (Supplementary Table 1)11,15,85,101. Annual sea-ice duration time series for the Shelf stratum of the North and South subregions were derived from daily GSFC Bootstrap version 3 sea-ice concentration102,103.

Lagged climate indices (up to 1–3 years prior, based on species age at maturity) were seasonally averaged, and we identified one SAM index and one MEI index to test in EDMs for each species based on Spearman correlation strength (Supplementary Figs. 5, 6). Climate indices were averaged for summer (December–February), autumn (March–May), winter (June–August), and spring (September–November). Summer MEI was excluded due to seasonal weakening of the relevant atmospheric teleconnection104,105. Annual SAM and MEI indices were calculated as the mean of seasonal values. For T. macrura, L. rangii, and S. thompsoni, we considered lags up to the preceding austral summer. For E. crystallorophias and E. superba, we included multi-year averages of seasonal and annual indices covering up to two and three years prior, respectively, based on age at maturity. All environmental and zooplankton time series were z-scored over the period 1993–2020 for EDMs.
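The seasonal averaging and standardization can be sketched in Python as follows. The function names and input layout are hypothetical; the sketch assumes monthly values have already been assigned to the correct seasonal year (December grouped with the following January and February) and uses the sample standard deviation for z-scoring, which the text does not specify.

```python
from statistics import mean, stdev

# Austral seasons as defined in the text (month numbers).
SEASONS = {
    "summer": (12, 1, 2),
    "autumn": (3, 4, 5),
    "winter": (6, 7, 8),
    "spring": (9, 10, 11),
}


def seasonal_means(monthly):
    """Average a climate index by season; monthly maps month number to value."""
    return {name: mean([monthly[m] for m in months])
            for name, months in SEASONS.items()}


def annual_index(seasonal):
    """Annual index as the mean of the supplied seasonal values."""
    return mean(list(seasonal.values()))


def z_score(series):
    """Standardize a series to zero mean and unit sample standard deviation."""
    mu, sd = mean(series), stdev(series)
    return [(x - mu) / sd for x in series]
```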

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.