Background & Summary

Harbor porpoises (Phocoena phocoena) are widespread, highly mobile animals1 and the most abundant cetacean species in the North Sea2. Despite their near extinction in the southernmost part in the 1960s3, the last aerial survey of the North Sea (Small Cetacean Abundance in the North Sea IV; SCANS-IV) yielded estimates of 340,000 to 560,000 individuals (95% C.I.) with a southward redistribution of the population2,4,5,6. The Belgian part of the North Sea (BPNS) presents a unique habitat for the species. It is a shallow (max. 40 m deep) and highly dynamic system, with a mobile subtidal sand bank system7. Human activities are intense in the BPNS, from heavily used shipping lanes to offshore wind farms (OWFs), all of which continuously produce noise pollution8. This influences the behavior and presence of harbor porpoises, which can ultimately reduce their survival9.

Previous non-invasive efforts to monitor the abundance of P. phocoena in the North Sea involved aerial surveys (e.g., SCANS; Royal Belgian Institute of Natural Sciences10) and monitoring of stranding events5. Because they depend on good weather and daylight, these surveys took place mainly during the summer (June to August), when harbor porpoises were at their lowest densities in the BPNS. Consequently, the surveys' results were not representative of the abundance and habitat use of the species over the entire year in Belgian waters, making the observations a partial snapshot of the population at that particular moment.

Acoustic signals of odontocete cetaceans can cover various distances, depending on their function. Whether used for communication (whistles or short pulses11) or for foraging and navigation (clicks of echolocating odontocetes12), these sounds provide invaluable clues about habitat use, foraging, social structure and behavior13. Passive acoustic monitoring (PAM) presents the opportunity to record the narrow-band high frequency (NBHF) click sequences of porpoises year-round without interfering with the animals’ activities, independent of the external weather conditions. However, oceanographic factors (e.g., tides, sediment transport) can be influenced by the weather in certain systems (e.g., sandbanks), and thus affect the quality of the recordings. Additionally, porpoise behavioral states influence the probability of detecting the vocalizing animal, which can lead to over- or underestimations of abundance14. Although there is a general lack of knowledge on the behavioral context of these sounds, PAM has proven efficient not only in monitoring distribution (presence/absence), but also in observing behaviors (e.g., foraging, traveling)15,16 and in estimating absolute abundances of small cetaceans over time and space17.

As top predators, harbor porpoises play a crucial role in ecosystem health and functioning18. Their coastal distribution exposes them to direct and indirect anthropogenic stressors such as habitat destruction19, bycatch20, noise pollution8, overfishing of prey species21, toxic chemicals22,23 and climate change24. The number of individuals artificially removed from the population exceeded the maximum sustainable yield (MSY) for several European populations20, prompting the implementation of special conservation measures in the European Union (EU). Harbor porpoises are protected under national and international laws (Annex II of the EU Habitats Directive; Appendix II of the Convention for the Conservation of Migratory Species of Wild Animals21) and included in the EU Marine Strategy Framework Directive (MSFD)25 and the OSPAR convention26. To understand their distribution and presence over time and space in the BPNS, a long-term continuous data series at high spatiotemporal resolution was needed.

In line with this objective, the Belgian cetacean passive acoustic network (BCPAN; Fig. 1) was established within the LifeWatch observatory27. Using Cetacean Porpoise Detectors (C-PODs; Chelonia Ltd., UK), or PAM loggers, echolocation clicks of odontocete cetaceans (except for sperm whales) can be recorded. C-PODs continuously listen for high frequency clicks (20–160 kHz), with a high-pass filter of 20 kHz, and store solely the duration and other parameters (e.g., frequency, sound pressure level (SPL), bandwidth, etc.) of these clicks, which makes them power-efficient. Until 2017, C-PODs were either moored onto a navigation buoy at 3 m below the sea surface or anchored to shipwrecks or artificial hard structures. However, due to the high number of lost devices and the noise from chains and breaking waves, it was decided that all C-PODs must be deployed on the seafloor. By summer 2018, this was achieved, and all C-PODs were deployed on a multi-use platform: an in-house developed tripod28, slightly modified to accommodate a C-POD, which can be moored on the seafloor.

Fig. 1

Overview of the 8 stations (yellow circles) present in the Belgian cetacean passive acoustic network (BCPAN) within the Belgian part of the North Sea (BPNS; red line) and offshore wind energy zones (Offshore Wind Farms, OWFs; grey polygons). Bathymetry (in meters) and the most transited shipping routes (white) are displayed for context62,63,64,65. The red polygon in the inset map shows the location of the BPNS from a larger map scale.

With long-lasting batteries, a small memory requirement and minimal manipulation, this sensor network is key for obtaining long-term data series on porpoise presence. This data series can potentially contribute to climate change research, which relies on decades of background data to obtain significant trends in ecological parameters15,29. External environmental factors (e.g., seasonality, daylight, tides and temperature) can also be included in distribution analyses30,31, potentially leading to the study of climate change effects and trends on the species once baseline parameters are established30,32. The entire pathway to obtain and maintain this data series, from data acquisition to online data accessibility, is described in this paper along with its data curation, integration and quality control (Fig. 2).

Fig. 2

Schematic overview of the steps to obtain and maintain the harbor porpoise data series, from data acquisition to harvesting of biodiversity information (solid arrows), and the data files involved in each step (broken arrows). Data read from the PAM loggers (DATA0.CHE, DATA1.CHE,…) are developed into .CP1 files, classified and manually validated as .CP3 files, and exported as (1) Detections and environment and (2) Train duration one-minute resolution text files according to quality class. The text files as well as the deployment metadata are uploaded to the European Tracking Network (ETN33) database, and can be visualized and analyzed through the LifeWatch data explorer36 and the lwdataexplorer38 package accessible in R. Datasets in minute- and hour-resolution are both published yearly with a Digital Object Identifier (DOI) on the Integrated Marine Information System (IMIS39,40,41,42,43,44,45,46,47) and Marine Data Archive (MDA48). Datasets in hour-resolution, aggregated from the minute-resolution datasets, are published in a Darwin Core Archive format (DwC-A) on IMIS49 and in several unrestrictive repositories.

Methods

Setting up C-PODs

C-PODs (Chelonia Ltd., UK) store clicks and the parameters of these clicks on a preprogrammed SD card. A maximum of 4096 clicks can be stored per minute to limit battery consumption and to prevent the SD card from filling up with noise. A battery pack of 10 D-cell alkaline batteries provides enough power to record for 4 to 6 months. The default logging setting was applied, wherein the C-POD does not log data when lying horizontally. The C-POD is therefore kept in a horizontal position until it is deployed to avoid unnecessary consumption of battery power and memory space.

Deployment, retrieval and maintenance of C-PODs

Since the summer of 2018, all C-PODs, or PAM loggers, have been deployed on a multi-use platform (Fig. 3) across the ten stations of the BCPAN. The multi-use platform is an in-house developed tripod frame28 with a floatable collar (Deepwater Buoyancy Inc.), which was primarily designed to deploy acoustic receivers with an acoustic release system for fish telemetry purposes. The platform was slightly modified to fit a C-POD in a fixed vertical position in the floatable collar next to the acoustic receiver with an acoustic release system (Vemco VR2AR or Thelma Biotel TBR800). The floatable collar is connected to the tripod with a rope, 1.5 to 2 times as long as the water column is deep at the specific station, carefully coiled inside the central canister. The acoustic release holds the floatable collar tight to the tripod and allows the retrieval of the entire platform (details of the design and deployment protocol are outlined elsewhere28). The equipped tripod is lifted by the A-frame of the RV Simon Stevin using the deployment eye, lowered through the water column to the seafloor, and disconnected using a quick release. The design of the tripod keeps it upright on the seafloor, so that the sensors can record for the full duration of the deployment.

Fig. 3

Technical drawing of the multi-use platform—a galvanized-steel tripod frame with a floatable collar, that can hold both a PAM logger and an acoustic receiver, attached with a rope carefully coiled inside the central canister. The rope is attached to the bottom of the tripod through a threaded rod. The acoustic release pin is connected to the tripod’s eye with turnbuckles. Weighted horizontal bars give additional weight to the tripod, while the anode protects the tripod from corrosion. The deployment eye is used to lift and lower the entire platform into the water column.

About every 4 months, the stations of the BCPAN (Fig. 1) are revisited for data acquisition and redeployment. The platform is retrieved through the acoustic release system by establishing a connection between the hydrophone of the deck unit at the surface and the acoustic release on the tripod. Once the acoustic release is activated, the floatable collar holding the C-POD and the acoustic receiver surfaces and is retrieved onto the rigid-hulled inflatable boat (RHIB). The rope connecting the floatable collar to the tripod is disconnected from the collar and attached to the A-frame of the main research vessel for a full recovery of the tripod. The tripod and floatable collar are cleaned, the rope is stored back in the rope canister, and the data of the C-POD and the acoustic receiver are downloaded. Batteries are then replaced and both sensors are reactivated to start a new recording cycle at sea. The deployment metadata, which include the coordinates of the station and the date and time (UTC) of C-POD activation, deployment and retrieval, are manually logged in the European Tracking Network (ETN33). C-POD activation is defined as the date and time at which the C-POD was activated; the deployment timestamp is logged when the platform is fixed on the seafloor, and the retrieval timestamp immediately before the acoustic release is activated to retrieve the platform. Any abnormality in the deployment, hardware or data is recorded in the ‘Comments’ section of ETN33.
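The three deployment timestamps described above follow a fixed order, which lends itself to a simple automated check before the metadata are logged. The following is a minimal sketch only: the `DeploymentRecord` class, its field names and the example values are hypothetical and do not reflect the actual ETN data model.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class DeploymentRecord:
    """Hypothetical container for the metadata of one C-POD deployment (times in UTC)."""
    station: str
    latitude: float
    longitude: float
    activated: datetime   # C-POD switched on
    deployed: datetime    # platform fixed on the seafloor
    retrieved: datetime   # immediately before the acoustic release is triggered
    comments: str = ""

    def problems(self) -> list:
        """Return human-readable inconsistencies; an empty list means none were found."""
        issues = []
        for name in ("activated", "deployed", "retrieved"):
            if getattr(self, name).tzinfo is None:
                issues.append(f"{name} has no time zone; UTC expected")
        if issues:
            # Naive and aware datetimes cannot be compared, so stop here.
            return issues
        if self.activated > self.deployed:
            issues.append("activation is later than deployment")
        if self.deployed >= self.retrieved:
            issues.append("deployment is not earlier than retrieval")
        return issues


# Illustrative values only (not a real BCPAN deployment).
record = DeploymentRecord(
    station="example-station",
    latitude=51.5,
    longitude=2.9,
    activated=datetime(2021, 3, 1, 7, 30, tzinfo=timezone.utc),
    deployed=datetime(2021, 3, 1, 10, 15, tzinfo=timezone.utc),
    retrieved=datetime(2021, 7, 2, 9, 0, tzinfo=timezone.utc),
)
print(record.problems())  # []
```

Any non-empty result would be a candidate for the free-text comments field that accompanies each deployment.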

Data acquisition and processing

Using the CPOD.exe software, data recorded by a C-POD is read from the SD card, stored at the internal server of VLIZ (Vlaams Instituut voor de Zee - Flanders Marine Institute), processed, classified and manually validated (Fig. 2). Data read from the SD card (DATA0.CHE, DATA1.CHE,…) are developed into .CP1 files containing all detected clicks. From the .CP1 files, the click trains originating from porpoise clicks (narrow-band high frequency clicks) are automatically detected using the built-in KERNO classifier algorithm of CPOD.exe and stored as .CP3 files. KERNO uses multiple hypothesis testing to test multiple features from the raw data to isolate trains that come from one of the so-called species classes (harbor porpoise, other cetaceans or sonar), and then classifies each train34,35. KERNO also classifies trains according to quality class (high, moderate and low) where high-quality trains have the lowest risk of false positives and vice versa.

High-, moderate- and low-quality click trains are then visualized in CPOD.exe, wherein a maximum of 100 randomly selected click trains per deployment are manually validated according to their features (see Technical Validation). Data per minute is exported as (1) Train duration total and (2) Detections and environment text files (Table 1) separately for high-, moderate- and low-quality click trains.
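The random selection of at most 100 click trains per deployment can be illustrated as follows. This is a sketch under stated assumptions: the selection itself is made within CPOD.exe, whose procedure is not described here, and `select_for_validation` and the integer train identifiers are hypothetical.

```python
import random

MAX_TRAINS_PER_DEPLOYMENT = 100  # upper limit stated in the text


def select_for_validation(train_ids, seed=None):
    """Draw up to 100 click trains, without replacement, for manual validation."""
    train_ids = list(train_ids)
    if len(train_ids) <= MAX_TRAINS_PER_DEPLOYMENT:
        return train_ids  # fewer trains than the limit: validate all of them
    rng = random.Random(seed)  # a fixed seed makes the draw reproducible
    return rng.sample(train_ids, MAX_TRAINS_PER_DEPLOYMENT)


# Hypothetical deployment with 250 classified click trains.
selected = select_for_validation(range(250), seed=1)
print(len(selected))  # 100
```

Recording the seed alongside the deployment would allow the same subset to be redrawn later, which is a design choice of this sketch rather than a documented feature of the workflow.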

Table 1 Description of the variables in the Detections and environment (A) and Train duration total (B) text files extracted from CPOD.exe after manual validation61.

Data registration and visualization

The ETN33 online web application, component Underwater Acoustics, was developed in-house to store all metadata and output files of the C-PODs. In addition to the deployment metadata (activation, deployment and retrieval date and time in UTC) of each C-POD deployment, the date and time until which the C-POD was actively recording on the seafloor are registered on the ETN33 database, because recording occasionally stops before retrieval owing to loss of battery power or memory. This end time is checked after processing the data through the CPOD.exe software and defines the end of the data collected at sea. This information is crucial as the C-POD is not equipped with an internal clock. Only data recorded between the deployment timestamp and the time until which the C-POD was actively recording on the seafloor are registered on the ETN33 database, which ensures that all data stored on ETN33 were recorded exclusively while the C-POD was fixed underwater. All metadata timestamps are important for keeping track of sensor maintenance and performance.
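Restricting the exported minutes to the period in which the C-POD was fixed and recording on the seafloor amounts to a time-window filter. The sketch below assumes that each exported minute is a dictionary with a `time` key and that both window bounds are inclusive; neither detail is specified in the text, and `within_recording_window` is a hypothetical helper.

```python
from datetime import datetime, timezone


def within_recording_window(rows, deployed, recording_until):
    """Keep rows timestamped between deployment and the last valid recording time."""
    return [row for row in rows if deployed <= row["time"] <= recording_until]


def utc(*args):
    """Shorthand for a timezone-aware UTC datetime."""
    return datetime(*args, tzinfo=timezone.utc)


# Illustrative minutes: one before deployment, one valid, one after recording stopped.
rows = [
    {"time": utc(2021, 3, 1, 9, 0), "dpm": 0},
    {"time": utc(2021, 3, 1, 12, 0), "dpm": 1},
    {"time": utc(2021, 6, 20, 0, 0), "dpm": 0},
]
kept = within_recording_window(
    rows,
    deployed=utc(2021, 3, 1, 10, 15),
    recording_until=utc(2021, 6, 15, 4, 0),
)
print(len(kept))  # 1
```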

Data Records

The Train duration total and Detections and environment text files (Table 1) uploaded and merged on the ETN33 database can be visualized on the LifeWatch data explorer36 and downloaded after registration on the platform (Fig. 2). The LifeWatch R package (lwdataexplorer; Data record 1, refs. 37,38) also provides open access to this minute-resolution dataset; i.e., observations per minute (detection positive minutes, DPM). Using the lwdataexplorer38 package, datasets of a selected period and quality can be viewed in an R data frame object with the getCpodData function.

Yearly exports of high- and moderate-quality data from the entire minute-resolution dataset are published as a .csv file (Data record 2, refs. 39,40,41,42,43,44,45,46,47): station names, species name and date and time are standardized, and minutes when the C-POD was not working are excluded. These yearly datasets in minute-resolution are published with a Digital Object Identifier (DOI) along with their metadata on the Integrated Marine Information System (IMIS39,40,41,42,43,44,45,46,47) and archived in Marine Data Archive (MDA48).

Subsequently, the high- and moderate-quality datasets in minute-resolution are aggregated into hour-resolution datasets, and the presence/absence information from these datasets is stored in a Darwin Core Archive (DwC-A), a standardized format for sharing biodiversity data (Data record 3, ref. 49). The resulting archive is published with a DOI and is accessible on IMIS49 and in several unrestrictive online repositories: the European Marine Observation and Data Network (EMODnet50), Global Biodiversity Information Facility (GBIF51), European Ocean Biodiversity Information System (EurOBIS52) and the Ocean Biodiversity Information System (OBIS53) (see Fig. 2).
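The aggregation from minute- to hour-resolution can be sketched as a grouping of detection positive minutes by station and clock hour. This is an illustration under assumptions: the tuple layout, the function name `hourly_presence` and the output keys are hypothetical, and the published archive uses Darwin Core terms that are not reproduced here.

```python
from collections import defaultdict
from datetime import datetime


def hourly_presence(minute_rows):
    """Aggregate (station, timestamp, dpm) minutes into hourly presence/absence.

    dpm is 1 for a detection positive minute and 0 otherwise.
    """
    totals = defaultdict(int)
    for station, timestamp, dpm in minute_rows:
        hour = timestamp.replace(minute=0, second=0, microsecond=0)
        totals[(station, hour)] += dpm
    return {
        key: {"dpm_per_hour": total, "present": total > 0}
        for key, total in totals.items()
    }


# Illustrative minutes for a hypothetical station "S1".
minutes = [
    ("S1", datetime(2020, 1, 1, 10, 5), 0),
    ("S1", datetime(2020, 1, 1, 10, 6), 1),
    ("S1", datetime(2020, 1, 1, 11, 0), 0),
]
for (station, hour), summary in sorted(hourly_presence(minutes).items()):
    print(station, hour.isoformat(), summary)
```

An hour is marked as present when it contains at least one detection positive minute, which is the simplest possible rule and an assumption of this sketch.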

These data records generated from the data processing and published online through several public repositories are summarized in Table 2. Details of variables of the yearly published datasets (Data record 2, refs. 39,40,41,42,43,44,45,46,47) include the coordinate variables, metrics derived from the acoustic device and the technical properties of the acoustic recorder (Table 3). Original exports from the C-POD loggers (e.g., .CP1, .CP3, Detections and environment and Train duration total text files) are stored on the internal server of VLIZ but can be made available upon request.

Table 2 Various data records of processed C-POD data published online through several public repositories such as the Integrated Marine Information System (IMIS39,40,41,42,43,44,45,46,47,49), European Marine Observation and Data Network (EMODnet50), Global Biodiversity Information Facility (GBIF51), European Ocean Biodiversity Information System (EurOBIS52) and the Ocean Biodiversity Information System (OBIS53).
Table 3 Description of the variables from the yearly published dataset (Data record 2, refs. 39,40,41,42,43,44,45,46,47) at the Integrated Marine Information System (IMIS39,40,41,42,43,44,45,46,47) of VLIZ. Variables are grouped into three classes: coordinate variables, primary data and technical variables. Only high- and moderate-quality click trains are published in the yearly exports.

The acoustic recordings from 2016 to 2022 across all stations contained a total of 19,889,836 recorded minutes, including 993,062 minutes in which porpoises were present (Fig. 4). Every minute with a number of clicks exceeding the maximum number (4096 clicks) that the SD card can store per minute is classified as a “lost minute.” Throughout this period, there were a total of 1,924,544 (9.68%) lost minutes. Lost minutes are directly related to the noise of the C-POD’s environment and can be due either to natural sources (e.g., tides and sediment transport) or to the type of mooring, which influences the stability of the PAM logger. Surface- and bottom-moored C-PODs had an average of 19.76% and 7.88% lost minutes of the recorded data, respectively (Fig. 5). After 2018, when the deployment of the multi-use platform was standardized following a pilot period, stations Middelkerke South, Oostdyck west, LST420, WK16, WK19 and Gootebank were no longer operational (Fig. 4). Stations G-88 and Nautica Ena were discontinued in 2019 and 2021, respectively, leaving 8 stations currently monitored (Figs 1 and 6). As of 2022, there had been a total of 127 C-POD deployments. The largest numbers of deployments were at the 8 stations monitored annually since 2018 and at the discontinued Nautica Ena (Fig. 7). Gaps in the data between deployments were due either to malfunctioning of the C-PODs (i.e., loss of data from malfunctioning of the SD card or battery) or to unforeseen weather circumstances preventing access to the site.
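The overall percentage of lost minutes follows directly from the totals reported above, as the short calculation below shows. The helper names are hypothetical, and the lost-minute rule treats a minute as lost once the 4096-click limit is reached, which is an assumption about how the limit appears in the stored data.

```python
CLICK_LIMIT_PER_MINUTE = 4096  # maximum number of clicks stored per minute


def is_lost_minute(n_clicks, limit=CLICK_LIMIT_PER_MINUTE):
    """Assumed rule: a minute is lost once the per-minute click limit is reached."""
    return n_clicks >= limit


def percentage(part, whole):
    """Express part as a percentage of whole."""
    return 100.0 * part / whole


# Totals reported for 2016 to 2022 across all stations.
total_minutes = 19_889_836
lost_minutes = 1_924_544
porpoise_minutes = 993_062

print(round(percentage(lost_minutes, total_minutes), 2))      # 9.68
print(round(percentage(porpoise_minutes, total_minutes), 2))  # 4.99
```

The second figure expresses porpoise-positive minutes as a share of all recorded minutes, including lost ones; a different denominator would give a different value.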

Fig. 4

Summary of total recorded minutes for each station obtained over the course of the LifeWatch Belgian cetacean passive acoustic network per year (BCPAN; 2016–2022) in the Belgian Part of the North Sea (BPNS). The size of circles corresponds to the total recorded minutes per station. Label codes for each of the stations are provided in the table. Offshore wind energy zones (Offshore Wind Farms; OWFs) in the BPNS are shown (striped polygons)62,63.

Fig. 5

Percentage of lost minutes per station over the course (2016–2022) of the LifeWatch C-POD network in the Belgian part of the North Sea (BPNS) by the type of mooring used for deployments. Data were extracted from the lwdataexplorer38 package with the getCpodData function in rstudio.lifewatch.be.

Fig. 6

Average of daily detection positive minutes (DPM; black circles) of harbor porpoises per station. Seasons were based on the astronomical dates of equinoxes and solstices in the northern hemisphere (spring equinox: ~20 March, summer solstice: ~21 June, autumn equinox: ~22 September, winter solstice: ~21 December). Offshore wind farms (OWFs) are indicated by striped, grey polygons. Data were extracted from the lwdataexplorer38 package with the getCpodData function in rstudio.lifewatch.be. Shapefiles were sourced from marineatlas.be and marineregions.org.
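Assigning a season to each day can be sketched with the approximate dates given in the caption. This is a simplification: the true equinox and solstice dates shift by a day or so between years, the fixed boundaries below are therefore an assumption, and the function name `season` is hypothetical.

```python
from datetime import date


def season(day: date) -> str:
    """Return the northern-hemisphere season using fixed approximate boundaries."""
    year = day.year
    if date(year, 3, 20) <= day < date(year, 6, 21):
        return "spring"
    if date(year, 6, 21) <= day < date(year, 9, 22):
        return "summer"
    if date(year, 9, 22) <= day < date(year, 12, 21):
        return "autumn"
    return "winter"  # 21 December up to and including 19 March


print(season(date(2020, 7, 1)))   # summer
print(season(date(2020, 1, 15)))  # winter
```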

Fig. 7

Belgian cetacean passive acoustic network (BCPAN) data availability per station from 2016 to 2022. Data were extracted from the lwdataexplorer38 package using the getCpodData function in RStudio.

Technical Validation

A first check of the data is performed in CPOD.exe, confirming the activation and deployment timestamps and the time until which the C-POD was actively recording in a fixed position underwater. High-, moderate- and low-quality click trains classified by the built-in KERNO classifier of CPOD.exe are quality controlled through visual validation54 of 100 randomly selected click trains per deployment. Since cetaceans mostly produce trains continuously11,14, a click train coming from a particular source should have similar characteristics across successive clicks (i.e., the sound pressure level profile is evenly spread), a quiet background in which random unrelated clicks before, during and after the train are few to none, and a temporal association with other trains54. Specific features of each click train are then assessed for comparability to a harbor porpoise click train (NBHF) as described by Chelonia Ltd., UK (Table 4).

Table 4 Features of click trains used to validate porpoise clicks (narrow-band high frequency, NBHF) classified by the built-in KERNO classifier of CPOD.exe54.

In addition to visual validation of click trains, the Train duration total and Detections and environment text files (Table 1) are quality checked. A positive number of milliseconds in the Train duration total text file should always correspond to a DPM of 1 in the Detections and environment text file. Coordinates logged together with the deployment date and time are verified in the data as an extra check of sensor performance. Temperature is also checked, since a sudden drop or rise in temperature values is associated with a shift in data; that is, the data on the SD card can suddenly shift to data from the previous deployment.
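Both file-level checks described above can be expressed as simple rules. The sketch below is illustrative only: the dictionary keys, the function names and the temperature threshold are hypothetical, and no threshold value is given in the text.

```python
def inconsistent_minutes(rows):
    """Return minutes with a positive train duration but a DPM other than 1."""
    return [
        row["minute"]
        for row in rows
        if row["train_duration_ms"] > 0 and row["dpm"] != 1
    ]


def temperature_jumps(temperatures, threshold):
    """Return indices where temperature changes by more than threshold between records."""
    return [
        i
        for i in range(1, len(temperatures))
        if abs(temperatures[i] - temperatures[i - 1]) > threshold
    ]


# Illustrative merged records from the two text files.
rows = [
    {"minute": 0, "train_duration_ms": 0, "dpm": 0},
    {"minute": 1, "train_duration_ms": 35, "dpm": 1},
    {"minute": 2, "train_duration_ms": 12, "dpm": 0},  # violates the rule
]
print(inconsistent_minutes(rows))  # [2]

# Illustrative temperatures with an abrupt rise; the threshold of 3 is arbitrary.
print(temperature_jumps([8, 8, 9, 15, 15], threshold=3))  # [3]
```

A flagged index marks a record that warrants manual inspection; it does not by itself show that the data have shifted to a previous deployment.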

Usage Notes

The function getCpodData from the lwdataexplorer package37,38 allows the retrieval of PAM data from the LifeWatch observatory55. To boost query performance, up-to-date PAM data can also be accessed via https://rstudio.lifewatch.be/ upon registration38. In the LifeWatch project data portal36, it is possible to browse and visualize the data and select specific parameters, such as the timeframe, the data quality (level of processing) and a sample period. The annual exports of minute- and hour-resolution PAM data can also be downloaded without login requirements via IMIS39,40,41,42,43,44,45,46,47,49.

LifeWatch observatories continuously generate various biodiversity data which may be used in conjunction with the PAM data. Ecological studies in the BPNS involving porpoises may be furthered through integration of relevant data such as abiotic parameters (temperature, salinity, nutrients), phytoplankton, zooplankton and fish telemetry data. For example, data streams from PAM and acoustic telemetry were already combined in a recent species co-occurrence study56 of the European seabass, Atlantic cod, harbor porpoise and dolphins in the BPNS using presence-absence matrices (see Code Availability).

During the LifeWatch data analysis workshop57 held at VLIZ on 5 and 6 October 2017, research ideas and questions on the use of PAM data generated by BCPAN were formulated. These include the correlation of PAM data with abiotic variables, sensor network analyses, dealing with background noise, effects of bottom versus surface moorings and comparisons with aerial observations (see Code Availability).

LifeWatch must be cited and acknowledged if any of the data are used: “Flanders Marine Institute (VLIZ), Belgium; (2023): LifeWatch observatory data: permanent Cetacean passive acoustic sensor network in the Belgian part of the North Sea. Marine Data Archive. https://doi.org/10.14284/639”.