A Dataset of Lower Band Whistler Mode Chorus and Exohiss with Instrumental Noise Thresholds

Santolík, Ondřej; Kolmašová, Ivana; Taubenschuss, Ulrich; Hanzelka, Miroslav; Hartley, David P.

doi:10.1038/s41597-025-05531-6

Data Descriptor
Open access
Published: 18 July 2025

Scientific Data volume 12, Article number: 1265 (2025)

Abstract

We describe a large database of natural electromagnetic emissions of lower band whistler mode chorus and exohiss within the Earth’s magnetosphere. It is based on more than 124 million selected survey measurements of magnetic fluctuations, recorded between 2001 and 2020 by the two NASA Van Allen Probes and four ESA Cluster spacecraft. The database provides a comprehensive view of amplitudes of these important electromagnetic emissions in the audible frequency range. We carefully condition the data to minimize the influence of instrumental artefacts. We also remove all data points which may be contaminated by instrumental noise using a newly developed method to define detection thresholds as a function of frequency, time, and instrument settings. The database can serve as a valuable resource for a broad range of scientists studying space weather, magnetospheric physics, and radiation belt dynamics.

Background & Summary

Electromagnetic emissions of chorus and exohiss are a class of whistler-mode waves that occur naturally in the low-density regions of the Earth’s magnetosphere^1,2 in the audible frequency band. Significant effects of these waves were found in the outer Van Allen radiation belt³. Knowledge of their properties therefore became essential for attempts at operational forecasting of radiation in the near-Earth environment^4,5 with important societal implications⁶. Whistler-mode waves can rapidly increase highly variable fluxes of relativistic electrons^7,8,9,10 or remove them into the atmosphere^11,12,13,14. Previously, models of wave amplitudes were constructed based on the datasets of THEMIS spacecraft^15,16, on the subset of measurements of the Cluster spacecraft and on Van Allen Probes measurements^17,18. Although the original spacecraft datasets are accessible, a database of frequency-integrated amplitudes with removed instrumental artefacts, which is necessary for derivation of the chorus and exohiss models, was not made publicly available.

Two examples of survey data acquired by the EMFISIS Waves instrument on Van Allen Probe A are shown in Fig. 1. We use the 3D measurement of the wave magnetic field to derive the trace of the magnetic power-spectral density matrix and estimate thus the power-spectral density of the squared modulus of the vector of magnetic field fluctuations as a function of frequency and time. We also use the same 3D measurement to obtain the magnetic ellipticity E_B in the polarization plane¹⁹ as a function of frequency and time. These characteristics, together with the position of the model plasmapause²⁰, allow us to systematically distinguish different types of natural electromagnetic emissions by their frequency, characteristic polarization, and region of occurrence. For comparison, we also show the local plasma density determined from the measurements of the upper hybrid frequency²¹.

Figure 1a–c show measurements acquired after the geomagnetically disturbed period on 28 September 2017, when the Kp index was between 5- and 7-, and the model plasmapause position²⁰ L_PP was between 2.8 and 3.3, roughly consistent with the measured transitions to the lower density regions along the spacecraft orbit. Intense whistler-mode waves were observed on the dayside. Figure 1d–f show measurements recorded after a period of calm geomagnetic conditions on 1 January 2015, when the Kp index was between 1 and 1+, and L_PP was between 4.5 and 4.6. No natural emissions of electromagnetic waves were observed between 14:00 and 17:00 UT on the nightside.

Plasmaspheric hiss²², shown as type 1 in Fig. 1b–c,e–f, typically occurs at characteristic frequencies between 20 Hz and 2 kHz, in the whistler mode with a right-handed polarization. It can be characterized by a significantly positive signed ellipticity¹⁹, E_B > 0.2. It is confined within the plasmasphere at L < L_PP, where the L parameter is calculated from the spacecraft position using the dipole approximation,

$${\rm{L}}={\rm{R}}{\cos }^{-2}{\lambda }_{m},$$

(1)

where R is the radial distance from the Earth’s center in Earth radii (R_E, defined as 6371.2 km), and ${\lambda }_{m}$ is the magnetic latitude related to the Earth’s main dipole axis.
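As a minimal illustration of Eq. 1, the dipole L parameter can be evaluated as sketched below. The function name `dipole_l` and the choice of degrees for the latitude are assumptions made for this example only.

```python
import math

def dipole_l(r_re, mlat_deg):
    """Eq. 1: L = R / cos^2(lambda_m).

    r_re     -- radial distance in Earth radii (R_E = 6371.2 km)
    mlat_deg -- magnetic latitude in degrees (unit chosen for this sketch)
    """
    c = math.cos(math.radians(mlat_deg))
    return r_re / (c * c)

print(dipole_l(5.0, 0.0))   # at the magnetic equator L equals R
print(dipole_l(5.0, 20.0))  # off the equator the same R maps to a larger L
```

At a fixed radial distance, L grows with magnetic latitude, which is why the position has to be converted to L before it is compared with L_PP.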

This occurrence pattern is reversed for lower band chorus and exohiss^2,23, which is the subject of this work and is shown as type 2a-2f in Fig. 1b,c,e–f. It occurs in the low-density plasmatrough at L > L_PP, propagating in the whistler mode with E_B > 0.2. Its origin is linked to electron cyclotron resonance close to the geomagnetic equator^24,25,26 and its typical frequency range between 0.1 ${f}_{{ce}0}$ and 0.5 ${f}_{{ce}0}$ is therefore related to the equatorial cyclotron frequency. Assuming propagation along the dipole magnetic field lines, we obtain it as

$${f}_{{ce}0}={f}_{{ce}}{\cos }^{6}{\lambda }_{m}/\sqrt{1+3\,{\sin }^{2}{\lambda }_{m}},$$

(2)

where ${f}_{{ce}}$ is the local electron cyclotron frequency obtained from measurement of the background magnetic field at the magnetic latitude ${\lambda }_{m}$.
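The mapping in Eq. 2 and the resulting lower band limits can be sketched as follows. The function names are hypothetical, and the latitude is again taken in degrees for convenience.

```python
import math

def equatorial_fce(fce_local_hz, mlat_deg):
    """Eq. 2: map the local electron cyclotron frequency to the equator
    along a dipole field line."""
    lam = math.radians(mlat_deg)
    return (fce_local_hz * math.cos(lam) ** 6
            / math.sqrt(1.0 + 3.0 * math.sin(lam) ** 2))

def lower_band_limits(fce_local_hz, mlat_deg):
    """Frequency interval 0.1 f_ce0 to 0.5 f_ce0 of lower band chorus/exohiss."""
    fce0 = equatorial_fce(fce_local_hz, mlat_deg)
    return 0.1 * fce0, 0.5 * fce0

# Example with an arbitrary local cyclotron frequency of 10 kHz
print(lower_band_limits(10_000.0, 0.0))
print(lower_band_limits(10_000.0, 20.0))
```

Because the dipole field strength increases away from the equator, the same local cyclotron frequency corresponds to a lower equatorial value, and hence to a lower chorus/exohiss band, at higher magnetic latitudes.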

Upper band chorus, shown as type 3 in Fig. 1b,c, occurs in the same region with similar polarization properties but at frequencies between 0.5 ${f}_{{ce}0}$ and 0.8 ${f}_{{ce}0}$. Equatorial noise^27,28 is shown as an intense emission of type 4 in Fig. 1b,c. It occurs both inside and outside the plasmapause, at frequencies below the lower hybrid frequency (between 0.001 ${f}_{{ce}0}$ and 0.02 ${f}_{{ce}0}$) but with a polarization close to linear (|E_B| < 0.2). Finally, lightning-generated whistlers are observed as impulsive emissions in the plasmasphere (type 5 in Fig. 1e,f).

These examples demonstrate that our dataset of three-dimensional measurements, together with the plasmapause model, allows us to distinguish chorus and exohiss from other types of magnetospheric wave emissions. However, it does not allow for distinguishing chorus from exohiss based on discrete time-frequency structures. This is only possible with high resolution measurements, such as the continuous burst mode intervals of the EMFISIS Waves instruments onboard the Van Allen Probes. Figure 2 shows measurements from six such intervals, marked 2a-2f in Fig. 1 and corresponding to the separate panels a-f in Fig. 2, respectively (see also the audio files described below as Data record 13). The trace of the power spectral density matrix of the three magnetic field components as a function of frequency and time was calculated from high resolution 3D magnetic field measurements. The results show that either structureless exohiss or discrete chorus emissions or their various combinations can be observed in the frequency range between 0.1 ${f}_{{ce}0}$ and 0.5 ${f}_{{ce}0}$.

The high-resolution measurements shown in Fig. 2, however, are not continuously recorded but triggered by increased wave intensity. To avoid this selection effect, our procedure is based on the regularly sampled survey data described above and shown in Fig. 1. It therefore combines chorus and exohiss emissions, without distinguishing them based on their time-frequency structure. Each of the examples in Fig. 2 contains a 0.468 s long time interval of the survey mode capture at its very beginning. The frequency-integrated lower band chorus/exohiss amplitudes from these survey mode captures enter into our database. They are, in this case, approximately the same for both chorus and exohiss, between 12 and 20 pT.

Our analysis procedure is schematically depicted in Fig. 3. We collected a large dataset of observations from the systematic Survey mode measurements of the EMFISIS Waves instruments on the two NASA Van Allen Probes^29,30,31 and from the Normal mode dataset of the STAFF-SA instruments on the four ESA Cluster spacecraft^32,33,34 in the magnetosphere. From this combined dataset, we first removed all time intervals of known instrumental inaccuracies and disturbances, caused, for example, by inaccurate onboard calibration tables, firing the attitude thrusters, calibration runs, absence of onboard despin procedure, burst captures, or active sounding intervals.

This cleaned dataset first served for characterizing the detection thresholds by a manual selection of calm intervals containing only the instrumental noise, with no natural emissions of electromagnetic waves — such as the interval shown in the example Fig. 1e,f between 14:00 and 17:00 UT. We selected these intervals to cover the variations of the frequency dependent noise background for different measurements settings and to capture slow increases of the noise throughout the lifetime of the spacecraft caused by degradation of the instruments. The resulting frequency and time dependent detection thresholds are published as the first category of reusable Data records 1–6 described by the present paper.

We further used these detection thresholds to characterize the frequency integrated amplitudes of whistler-mode chorus and exohiss from the original cleaned data set — examples are shown as “type 2a–f” emissions in Fig. 1. We reduced the cleaned data set to time intervals of observations in the plasmatrough region, and to time intervals when the frequency interval of chorus/exohiss extended over at least one frequency channel of the instrument, while not extending outside its entire frequency range. We then integrated the power-spectral density of detected right-hand polarized waves over the frequency interval of chorus/exohiss, to obtain root-mean-square amplitudes. The database of these integrated amplitudes forms the second category of reusable Data records 7–12 described by the present paper.

The Methods section shows the details of the separate steps of a newly developed procedure to define detection thresholds and their evolution during the lifetime of the six spacecraft, and to obtain chorus/exohiss root-mean-square amplitudes. The Data Records section describes the files of the detection thresholds, which can be used generally throughout the original Van Allen Probes and Cluster datasets, not being limited to any preselected wave mode. This section also describes the files of resulting chorus/exohiss amplitudes derived from the measurements of the six spacecraft. The Technical Validation section demonstrates the validity of the plasmapause model and the consistency of the resulting data records, and, finally, the Usage Notes section shows examples and summarizes the usage of the new data records.

Methods

Survey dataset of the Van Allen Probes EMFISIS Waves instruments

The orbital coverage of the two Van Allen Probes spacecraft allows us to systematically investigate the equatorial region for L < 6.7 at latitudes within 20° around the magnetic equator. The Level 2 Survey dataset of the EMFISIS Waves instrument^29,30,31 can be obtained from the NASA Space Physics Data Facility (at https://spdf.gsfc.nasa.gov/pub/data/rbsp/rbsp#/l2/emfisis/wfr/spectral-matrix/, where ‘#’ is replaced by ‘a’ or ‘b’ for Van Allen Probe A or B, respectively).

The dataset consists of multicomponent spectral matrices obtained from measurements of three orthogonal magnetic search coil antennas and three electric double-probe antennas sampled at 35 kHz. The data were analyzed onboard the spacecraft using the fast Fourier transform of 16,384 samples with the von Hann’s cos² windowing function in the time domain, based on 0.468 s long measurements repeated every 6 s. Resulting spectral matrices were arithmetically averaged into 65 frequency channels with a pseudo-logarithmic frequency spacing between 2 Hz and 12 kHz. As the original spectral estimates from the fast Fourier transform were linearly spaced in frequency, the number of averaged spectral matrices increased from n_l = 1 in the lowest 13 frequency channels (l = 1…13) up to n₆₅ = 642 in the last frequency channel in order to achieve their pseudo-logarithmic spacing^29,30.
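The timing figures quoted above follow directly from the sampling parameters. The short check below assumes a sampling frequency of exactly 35 kHz, as stated in the text, and uses variable names chosen for this sketch.

```python
SAMPLE_RATE_HZ = 35_000.0   # sampling frequency quoted in the text
N_FFT = 16_384              # samples per onboard FFT
N_AVG_LAST = 642            # spectral estimates averaged in channel 65

capture_s = N_FFT / SAMPLE_RATE_HZ   # duration of one survey capture
bin_hz = SAMPLE_RATE_HZ / N_FFT      # spacing of the linear FFT bins

print(f"capture duration: {capture_s:.3f} s")   # 0.468 s
print(f"FFT bin spacing:  {bin_hz:.3f} Hz")     # 2.136 Hz
print(f"width of channel 65: {N_AVG_LAST * bin_hz:.0f} Hz (approx.)")
```

The lowest 13 channels therefore keep the native FFT bin spacing of about 2 Hz, while the highest channel averages several hundred bins and is correspondingly much less noisy.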

Inaccurate onboard calibration tables were used by operational mistake on Probe A and Probe B before 12 and 13 February 2013, respectively, and we have removed the data acquired from the beginning of the mission up to these dates. The dataset from Van Allen Probe A therefore spans between 12 February 2013 and 13 October 2019, containing 3.48 × 10⁷ measurements with a 99.2% time coverage. The Van Allen Probe B dataset was recorded between 13 February 2013 and 16 July 2019 and contains 3.36 × 10⁷ captures with a 99.6% time coverage. All time intervals of known instrumental disturbances caused by firing the attitude thrusters have been subsequently removed from the datasets of both spacecraft, and the resulting dataset contains 3.45 × 10⁷ and 3.34 × 10⁷ measurements with 98.5% and 98.9% time coverage for Probe A and Probe B, respectively.

Lower band chorus/exohiss dataset from the Van Allen Probes EMFISIS Waves instruments

We further selected only the measurements, for which the lower band chorus/exohiss frequency interval between 0.1 ${f}_{{ce}0}$ and 0.5 ${f}_{{ce}0}$ (see Eq. 2) was fully contained in the instrument frequency range, and at least one instrument frequency channel was entirely inside this interval. We additionally restricted our dataset to measurements inside the model magnetopause³⁵, parametrized by the solar wind dynamic pressure and interplanetary magnetic field from the OMNI2 database³⁶ (https://spdf.gsfc.nasa.gov/pub/data/omni/). All these conditions were fulfilled for 2.58 × 10⁷ and 2.51 × 10⁷ measurements of Van Allen Probe A and B, respectively. During that time, both probes had on average 14 frequency channels of the instrument entirely contained within the chorus/exohiss frequency interval.

Normal mode dataset of the Cluster STAFF-SA instruments

The four spacecraft of the Cluster mission have an eccentric high inclination orbit allowing us to investigate a wide range of latitudes up to at least 60° around the equator for L > 4. The Normal mode dataset of the STAFF-SA instrument^32,33,34 can be obtained from the European Space Agency Cluster Science Archive (using different retrieval methods, for example through http://csa.esac.esa.int/csa-sl-tap/data?RETRIEVAL_TYPE=product&DATASET_ID=C#_CP_STA_SM&START_DATE=2004-07-15T00:00:00Z&END_DATE=2004-07-16T00:00:00Z&DELIVERY_FORMAT=CDF, where the START_DATE and the END_DATE can be set throughout the mission duration and where ‘#’ is replaced by 1, 2, 3, or 4, for Cluster 1, 2, 3, or 4, respectively).
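A retrieval URL of the form quoted above can be assembled programmatically, for example as in the following sketch. It only builds the string and does not contact the archive. The helper name `staff_sa_url` is hypothetical, and the sketch assumes that the blanks in the printed URL are typesetting artefacts, so that the dataset identifier reads `C#_CP_STA_SM`.

```python
from urllib.parse import urlencode

CSA_BASE = "http://csa.esac.esa.int/csa-sl-tap/data"

def staff_sa_url(spacecraft, start, end):
    """Build a Cluster Science Archive request URL for STAFF-SA Normal mode data.

    spacecraft -- 1, 2, 3, or 4
    start, end -- ISO times such as "2004-07-15T00:00:00Z"
    """
    if spacecraft not in (1, 2, 3, 4):
        raise ValueError("spacecraft must be 1, 2, 3, or 4")
    params = {
        "RETRIEVAL_TYPE": "product",
        "DATASET_ID": f"C{spacecraft}_CP_STA_SM",
        "START_DATE": start,
        "END_DATE": end,
        "DELIVERY_FORMAT": "CDF",
    }
    # Keep ':' unescaped so the result matches the form quoted in the text.
    return CSA_BASE + "?" + urlencode(params, safe=":")

print(staff_sa_url(1, "2004-07-15T00:00:00Z", "2004-07-16T00:00:00Z"))
```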

The dataset contains spectral matrices obtained by onboard analysis of measurements by three orthogonal magnetic search coil antennas and two electric double-probe antennas, with measurements of 3.84 s repeated every 4 s and analyzed by numerical filtering in 27 logarithmically spaced frequency channels between 8 Hz and 4 kHz. They are divided into three separate groups of 9 channels, where the number of averaged spectral estimates is ${n}_{l}=4$, 32, and 256, in frequency channels $l=1\ldots 9$, 10…18, and 19…27, respectively.

The dataset was obtained by the four Cluster spacecraft between 7 January 2001 and 30 April 2020, resulting in 5.6 × 10⁸ measurements with 84% time coverage. By excluding the measurements outside of the magnetosphere and in the distant tail at radial distances larger than 11 R_E we reduced the dataset down to 1.5 × 10⁸ measurements, corresponding to 22% time coverage. We subsequently removed all Burst mode intervals, intervals of active soundings of the Whisper³⁷ instrument, calibration intervals, and intervals when the onboard de-spin procedure was switched off. The resulting cleaned dataset then contains 1.14 × 10⁸ measurements from all four Cluster spacecraft (2.82 × 10⁷, 2.88 × 10⁷, 2.80 × 10⁷, and 2.91 × 10⁷, from Cluster 1–4, respectively) corresponding to 19% time coverage.

Lower band chorus/exohiss dataset from the Cluster STAFF-SA instruments

The dataset from the four Cluster spacecraft was further reduced to measurements inside the model magnetopause³⁵, while measuring with at least one instrument channel fully inside the lower band chorus/exohiss frequency interval between 0.1 ${f}_{{ce}0}$ and 0.5 ${f}_{{ce}0}$ according to Eq. 2, and, at the same time, with this interval fully contained in the instrument frequency range. This yielded 1.84 × 10⁷, 1.77 × 10⁷, 1.80 × 10⁷, and 1.89 × 10⁷ measurements from Cluster 1–4, respectively, with an average number of 7 frequency channels of the instrument entirely contained within the chorus/exohiss frequency interval.

Model of the cumulative distribution function for the noise power spectral densities of the magnetic field fluctuations for the dataset of the Cluster STAFF-SA instruments

For the determination of the detection threshold we first need to determine the shape of the cumulative probability distribution ${F}_{l}\left({S}_{l}\right)$ of the trace ${S}_{l}$ of the noise power spectral density matrix in each frequency channel $l$. Assuming normally distributed white noise with zero mean value for each of the sensors, the onboard spectral analysis results in power spectral densities obtained as the sum of squares of normally distributed real and imaginary parts of separate spectral components, which should therefore obey a scaled ${\chi }^{2}$ distribution with 2 degrees of freedom. These spectral estimates are further averaged across ${n}_{l}$ frequency bins and/or time intervals. The sum of power spectral densities from ${n}_{a}$ sensors (in our case n_a = 3) finally provides us with the trace of the ${n}_{a}$-dimensional power spectral density matrix. For independent spectral estimates we therefore theoretically obtain,

$${F}_{l}({S}_{l})={F}_{\nu }({\chi }^{2}),\nu =2{n}_{l}\,{n}_{a},\,{\chi }^{2}=\frac{{S}_{l}}{s},\,s=\frac{{\sigma }_{l}^{2}}{2\,{n}_{l}},$$

(3)

where ${F}_{\nu }$ is the cumulative probability of the χ² distribution with $\nu $ degrees of freedom, and $s$ is a scaling factor, which depends on the original root-mean-square noise level ${\sigma }_{l}^{2}$ of the sensors in the given frequency channel $l$.

We used a Monte Carlo simulation to verify this model by simulating the sensor noise with a normally distributed pseudo-random vector with a standard deviation $\sigma =$ 1.3 pT. This corresponds to the typical root-mean-square noise level of Cluster sensors integrated over their entire frequency band. We simulated the onboard spectral analysis using the Fast Fourier Transform (FFT) procedure over a pseudo-random noise sequence of 16384 samples, and we verified its normalization using the Parseval theorem by integrating the obtained power spectral densities over the entire frequency band, to retrieve the square of the original noise root-mean-square value $\sigma $. We then averaged power spectral densities in ${n}_{l}$ neighboring frequency bins. The procedure was repeated ${n}_{a}=$ 3 times and we summed the results to simulate the calculation of the trace of the magnetic power spectral density matrix. We constructed a histogram of obtained values from 2¹⁸ realizations of the pseudo-random noise sequences and normalized it to obtain estimates of the probability density function, which we compared with the first derivative of the cumulative probability distribution ${F}_{l}\left({S}_{l}\right)$ from Eq. 3. The results confirm a perfect agreement (Fig. 4), and we therefore used Eq. 3 to determine the detection thresholds for the Cluster STAFF-SA Normal mode data. With the measured noise intervals, we first determined the median value ${S}_{l,0.5}$ of the trace of the measured magnetic power spectral density matrices in each frequency channel $l$. With the appropriate number of averaged spectral estimates ${n}_{l}$ in this channel, we determined the original root-mean-square noise level ${\sigma }_{l}^{2}$ of the sensors from Eq. 3 by defining F_l (S_l,0.5) = 0.5. We then used the obtained value of ${\sigma }_{l}^{2}$ in Eq. 3 together with Eq. 6 for determining the detection thresholds for the Cluster STAFF-SA Normal mode data.
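The statistical content of Eq. 3 can be checked with a much reduced simulation. The sketch below is not the FFT-based simulation described above: it assumes that, for white noise, the real and imaginary parts of each spectral estimate are independent normal variables, and it draws them directly. It then compares the sample mean and variance of the simulated trace with the values implied by Eq. 3, namely $n_a \sigma_l^2$ and $\sigma_l^4 n_a / n_l$. All names and the number of realizations are choices made for this illustration.

```python
import random
import statistics

SIGMA2 = 1.0   # per-sensor noise power spectral density (arbitrary units)
N_L = 4        # averaged spectral estimates per channel
N_A = 3        # number of sensors contributing to the trace

def simulate_trace(sigma2, n_l, n_a, rng):
    """One realization of the trace S_l for independent spectral estimates."""
    sd = (sigma2 / 2.0) ** 0.5   # std. dev. of the real and imaginary parts
    trace = 0.0
    for _ in range(n_a):
        acc = 0.0
        for _ in range(n_l):
            re = rng.gauss(0.0, sd)
            im = rng.gauss(0.0, sd)
            acc += re * re + im * im   # one spectral estimate
        trace += acc / n_l             # average over n_l estimates
    return trace

rng = random.Random(12345)
samples = [simulate_trace(SIGMA2, N_L, N_A, rng) for _ in range(20000)]
mean = statistics.fmean(samples)
var = statistics.variance(samples)

# Moments of s * chi^2 with nu = 2 n_l n_a and s = sigma^2 / (2 n_l)
print("mean:", mean, "expected:", N_A * SIGMA2)
print("variance:", var, "expected:", SIGMA2 ** 2 * N_A / N_L)
```

The variance of the trace decreases as 1/n_l, which is why channels with more averaged estimates have narrower noise distributions.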

Model of the noise cumulative distribution function for power spectral densities of the magnetic field fluctuations for the Survey mode data products of the Van Allen Probes EMFISIS instruments

The onboard procedure of the Survey mode data products of the Van Allen Probes EMFISIS instruments included the von Hann’s cos² windowing function for the measured time series to suppress the samples toward the edges of the analyzed time intervals. This violates the assumption of the statistical independence of the averaged neighboring spectral estimates, and the resulting probability distribution therefore deviates from the shape of the scaled ${\chi }^{2}$ distribution according to Eq. 3. To verify if this deviation is significant, we performed the same Monte Carlo simulation of the onboard procedure for this dataset as described above but we included the windowing function in the time domain. We also used a correction factor of 8/3 for power spectral densities obtained from the squared gain and noise bandwidth of the von Hann’s window³⁸ to fulfill Parseval’s theorem. The results show that the distribution of the trace of the noise power spectral density matrix is significantly different from the model in Eq. 3 but that it can still be modeled by a combination of two separate scaled ${\chi }^{2}$ distributions ${F}_{{cl}}\left({S}_{l}\right)$ and ${F}_{{tl}}\left({S}_{l}\right)$ for the core and tail parts of the distribution, respectively (see Fig. 5). The core part is defined for ${S}_{l}$ such that F_cl(S_l) < 0.99,

$${F}_{{cl}}({S}_{l})={F}_{\nu }({\chi }^{2}),\nu =\frac{2\,{n}_{l}\,{n}_{a}}{{C}_{c\nu }},\,{\chi }^{2}=\frac{{S}_{l}}{s},\,s={\sigma }_{l}^{2}\frac{{C}_{{cs}}}{2\,{n}_{l}}.$$

(4)

The tail part is based on the same original root-mean-square noise level ${\sigma }_{l}^{2}$ but describes the high intensity part of the distribution occurring with a probability below 1%, that is, for ${S}_{l}$ such that ${F}_{{cl}}\left({S}_{l}\right)$ ≥ 0.99,

$${F}_{{tl}}({S}_{l})={F}_{\nu }({\chi }^{2}),\nu =\frac{2\,{n}_{l}\,{n}_{a}}{{C}_{t\nu }},\,{\chi }^{2}=\frac{{S}_{l}}{s},\,s={\sigma }_{l}^{2}\frac{{C}_{{ts}}}{2\,{n}_{l}}.$$

(5)

In Eq. 4, C_cν, C_cs are, respectively, correction factors for the number of degrees of freedom and for the scale of the core part of the distribution. These factors effectively decrease the number ${n}_{l}$ of the averaged neighboring spectral estimates from Eq. 3, because these estimates are now correlated. Another set of correction factors, ${C}_{t\nu }$, ${C}_{{ts}}$ is used for the number of degrees of freedom and for the scale of the tail part of the distribution in Eq. 5. We determined these correction factors by numerical fits, based on the Monte Carlo simulation of the onboard procedure for this dataset. A set of 64 realizations of these fits (each of them based on 2¹⁸ realizations of the pseudo-random noise sequences of 16384 samples) resulted in average values and standard deviations of the four correction factors ${C}_{c\nu }$, ${C}_{{cs}}$, ${C}_{t\nu }$, and ${C}_{{ts}}$ for different ${n}_{l}$ values (Table 1 and Fig. 6). While the standard deviations of the correction factors ${C}_{c\nu }$ and ${C}_{{cs}}$ for the core part of the distribution are small (below 0.1% in our simulation, not shown in Fig. 5), the correction factors ${C}_{t\nu }$ and ${C}_{{ts}}$ for the tail part reach larger standard deviations on the order of 2%, but their average values can still be used with reasonable accuracy.

Table 1 Average values of correction factors from Eqs. 4 and 5. Results are obtained from Monte Carlo simulations for different values of the number of averaged spectral matrices ${n}_{l}$.

With the measured noise intervals, we first determined the median value ${S}_{l,0.5}$ of the trace of the measured magnetic power spectral density matrices in each frequency channel $l$. With correction factors ${C}_{c\nu }$, ${C}_{{cs}}$ for the appropriate number of averaged spectral estimates ${n}_{l}$ in this channel, we determined the original root-mean-square noise level ${\sigma }_{l}^{2}$ of the sensors from Eq. 4 for the core model median by defining ${F}_{{cl}}\left({S}_{l,0.5}\right)=$ 0.5. We then used the obtained value of ${\sigma }_{l}^{2}$ in Eq. 5 for the tail model and we used it together with Eq. 6 for determining the detection thresholds for the Survey mode data products of the Van Allen Probes EMFISIS instruments.
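The power correction factor of 8/3 quoted in this subsection for the von Hann window can be reproduced numerically from the squared coherent gain and the equivalent noise bandwidth of the window. The sketch below assumes the periodic form w[n] = sin²(πn/N) of the cos² window; the exact window definition used onboard is not given here, so this is an illustration only.

```python
import math

N = 16_384
# Periodic von Hann (cos^2-type) window, assumed form for this sketch
w = [math.sin(math.pi * n / N) ** 2 for n in range(N)]

coherent_gain = sum(w) / N                            # mean of the window
enbw_bins = N * sum(x * x for x in w) / sum(w) ** 2   # noise bandwidth in bins
correction = 1.0 / (coherent_gain ** 2 * enbw_bins)   # power correction factor

print(coherent_gain, enbw_bins, correction)   # approx. 0.5, 1.5, 2.667
```

The product of the squared gain (1/4) and the noise bandwidth (3/2 bins) is 3/8, the mean squared window value, and its inverse gives the factor 8/3 needed to satisfy Parseval's theorem.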

Determination of detection thresholds

We have characterized the instrumental noise for each spacecraft separately by manual selection of a series of measurement intervals, in which the signal of natural wave emissions was absent (as was the case for 14:00–17:00 UT on 1 January 2015, shown in Fig. 1d–f). We assumed models of the probability distribution of the trace of the power spectral density matrix for both the Survey dataset of the Van Allen Probes EMFISIS Waves instruments (Eqs. 4 and 5), and the Normal mode dataset of the Cluster STAFF-SA instruments (Eq. 3). The models are based on scaled ${\chi }^{2}$ distributions with the number of degrees of freedom defined by known properties of the onboard analysis in each frequency channel, and with a scaling factor, which depends on the noise level of the sensors, as detailed below. This is the only free parameter of the model, which we determined from a robust estimate of the median value of the trace of the power spectral density matrix in each of the analyzed noise intervals.

The agreement of this model with experimental data is shown by examples in Fig. 7 using Van Allen Probes EMFISIS Waves measurements during the noise interval on 1 January 2015 (Fig. 1d–f) and Cluster STAFF-SA measurements from a noise interval on 8 March 2002. The experimentally determined percentiles of the noise distribution between 1% and 99% also correspond well to the model across the entire frequency range of these instruments (see Figs. 8 and 9). We can therefore use this model to calculate the detection threshold ${S}_{0l}$ for the trace of the power spectral density matrix in a frequency channel $l$ such that the noise level can reach above ${S}_{0l}$ with an arbitrarily chosen low probability:

$${P}_{0}=1-{F}_{l}\left({S}_{0l}\right),$$

(6)

where ${F}_{l}$ is the modeled cumulative probability distribution (according to Eqs. 3–5).
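For the uncorrected model of Eq. 3, the step from a measured noise median to a detection threshold via Eq. 6 can be sketched with the Python standard library alone. Because ν = 2 n_l n_a is an even integer, the χ² survival function has a closed form (a finite Poisson sum), which the sketch evaluates in logarithmic form and inverts by bisection. The function names are hypothetical, and the sketch does not cover Eqs. 4 and 5, whose corrected degrees of freedom are generally not integers and require the factors from Table 1.

```python
import math

def chi2_sf_even(x, k):
    """1 - F_nu(x) for a chi-square distribution with nu = 2k (k integer)."""
    if x <= 0.0:
        return 1.0
    h = x / 2.0
    log_h = math.log(h)
    # Finite Poisson sum, each term evaluated in log form to avoid overflow
    total = sum(math.exp(j * log_h - h - math.lgamma(j + 1)) for j in range(k))
    return min(1.0, total)

def chi2_quantile_even(p_exceed, k):
    """Value x exceeded with probability p_exceed, found by bisection."""
    lo, hi = 0.0, 2.0 * k
    while chi2_sf_even(hi, k) > p_exceed:
        hi *= 2.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if chi2_sf_even(mid, k) > p_exceed:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def threshold_from_median(s_median, n_l, n_a=3, p0=1e-7):
    """Return (sigma_l^2, S_0l) from the measured median of the noise trace."""
    k = n_l * n_a                                    # nu = 2 k in Eq. 3
    scale = s_median / chi2_quantile_even(0.5, k)    # s, from F_l(median) = 0.5
    sigma2 = 2.0 * n_l * scale                       # s = sigma_l^2 / (2 n_l)
    s0 = scale * chi2_quantile_even(p0, k)           # Eq. 6: 1 - F_l(S_0l) = P_0
    return sigma2, s0

# Threshold in units of the median, for the three Cluster channel groups
for n_l in (4, 32, 256):
    print(n_l, threshold_from_median(1.0, n_l)[1])
```

In this model the ratio of the threshold to the median depends only on the number of degrees of freedom, and it approaches unity as more spectral estimates are averaged.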

The average probability that the instrumental noise randomly exceeds the power-spectral density threshold in at least one frequency channel of the instrument can be estimated as

$$P=1-{\sum }_{s}{N}_{s}{(1-{P}_{0})}^{{K}_{s}}/{\sum }_{s}{N}_{s},$$

(7)

where ${P}_{0}$ is the predefined probability for the determination of the detection threshold for each instrumental frequency channel, ${N}_{s}$ is the number of observations by a spacecraft $s$, and ${K}_{s}$ is the number of instrument channels inside the lower band chorus frequency interval averaged over the ${N}_{s}$ observations in our dataset of lower-band chorus/exohiss. Here, the total number of observations is N₁ = 5.1 × 10⁷ for both Van Allen Probes, with K₁ = 14, and is N₂ = 7.3 × 10⁷ for all Cluster spacecraft, with K₂ = 7 (see above in the dataset description).

If we limit ourselves to the highest experimentally determined percentiles of the noise distribution, we still have a large fraction of cases where instrumental noise causes random false positive results, considering that they can occur anywhere in the analyzed frequency band. Using a percentile of 99% (P₀ = 0.01), we obtain a quite large probability $P$ of 9.4% of false positive detections according to Eq. 7, which would then substantially bias the determined occurrence rates. For practical purposes, we therefore must use the model distribution to extend the measurements to high cumulative probabilities (that is, to low ${P}_{0}$). To have a sufficiently low broadband probability of false detections, we define the cumulative probability threshold for each frequency channel to be P₀ = 10⁻⁷. Following Eq. 7, this leads to the resulting probability P = 10⁻⁶, giving on average only 124 false positive detections out of the entire dataset of 1.24 × 10⁸ suitable measurements in the magnetospheric region and frequency range of chorus or exohiss.
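The two probabilities quoted above can be reproduced directly from Eq. 7 with the values of N_s and K_s given in the text. The function name and the data layout in the following sketch are illustrative choices.

```python
def broadband_false_positive(p0, groups):
    """Eq. 7: probability that noise exceeds the threshold in at least one
    channel, averaged over spacecraft groups given as (N_s, K_s) pairs."""
    total = sum(n for n, _ in groups)
    return 1.0 - sum(n * (1.0 - p0) ** k for n, k in groups) / total

# (number of observations, average number of channels in the band)
GROUPS = [(5.1e7, 14),   # both Van Allen Probes
          (7.3e7, 7)]    # all four Cluster spacecraft

p_99 = broadband_false_positive(1e-2, GROUPS)   # about 0.094, i.e. 9.4%
p_7 = broadband_false_positive(1e-7, GROUPS)    # close to 1e-6

print(p_99, p_7, p_7 * 1.24e8)
```

The last printed value is the expected number of false positive detections in the whole dataset; it comes out slightly below the 124 quoted in the text, which follows from rounding P to 10⁻⁶.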

Using P₀ = 10⁻⁷ may seem too restrictive, but in fact, it does not substantially increase the noise threshold values compared to lower percentiles, especially at higher frequency channels where more spectral estimates are averaged onboard the spacecraft (Fig. 8). This is primarily where chorus/exohiss observations occur. Analysis of similar cases of noise intervals collected during the Van Allen Probes operational mission shows that, as the preamplifier electronics of the search coil sensors degraded during the mission, the noise level increased. We took this evolution into account in our analysis, and we also considered the setting of the signal attenuator which substantially influenced the detection threshold at higher frequencies. The results³⁹ are stored as Data records 1 and 2.

We used the same time-dependent method to find the detection threshold of power spectral densities also for the measurements of the four Cluster spacecraft (see Fig. 9 and Data records 3–6)³⁹.

Root-mean-square amplitudes of lower band chorus/exohiss

We used these newly derived data records of detection thresholds together with the results of the selection procedure described in the first four paragraphs of the Methods section to define the database of root-mean-square amplitudes of the lower band chorus/exohiss. The data selection procedure yielded measurements from the Normal mode dataset of the Cluster STAFF-SA instruments and from the Survey mode dataset of the Van Allen Probes EMFISIS Waves instruments, for which the lower band chorus/exohiss frequency interval was fully contained in the instrument frequency range, and, at the same time, at least one instrument frequency channel was entirely contained inside this interval.

Each selected measurement was obtained in the form of multidimensional power-spectral matrices in one or more frequency channels of each instrument. From a 3D magnetic power-spectral matrix in a frequency channel $l$ we calculated the signed magnetic ellipticity ${E}_{{Bl}}$ in the polarization plane¹⁹. In the same frequency channel, we also calculated the trace ${S}_{{Bl}}$ of the magnetic power-spectral density matrix to estimate the power-spectral density of the squared modulus of the vector of magnetic field fluctuations. We then calculated the root-mean-square amplitude:

$${{\rm{B}}}_{{\rm{w}}}=\sqrt{{\sum }_{\{l:{E}_{{Bl}} > 0.2,{S}_{{Bl}} > {S}_{0l}\}}\,{S}_{{Bl}}\,{\varDelta }_{l}},$$

(8)

where ${\varDelta }_{l}$ is the frequency bandwidth of channel $l$, and where the sum is calculated over all frequency channels that were entirely contained in the lower band chorus/exohiss frequency interval, whose signed magnetic ellipticity exceeded a threshold of 0.2, and whose trace of the magnetic power-spectral density matrix exceeded the detection threshold ${S}_{0l}$. The resulting root-mean-square amplitudes are given as a function of time and position of each measurement in Data records 7–8 for Van Allen Probes A and B, respectively, and in Data records 9–12 for Cluster 1–4, respectively³⁹.
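The selection and summation in Eq. 8 can be sketched for a single measurement as follows. The argument names and the made-up channel values are assumptions for this illustration; with power spectral densities in nT²/Hz and bandwidths in Hz, the result is in nT.

```python
import math

def rms_amplitude(psd_trace, ellipticity, threshold, bandwidth, in_band):
    """Eq. 8: root-mean-square amplitude B_w from per-channel quantities.

    psd_trace   -- trace S_Bl of the magnetic spectral matrix (nT^2/Hz)
    ellipticity -- signed magnetic ellipticity E_Bl
    threshold   -- detection threshold S_0l (nT^2/Hz)
    bandwidth   -- channel bandwidth Delta_l (Hz)
    in_band     -- True if the channel lies entirely within 0.1-0.5 f_ce0
    """
    power = 0.0
    for s, e, s0, df, ok in zip(psd_trace, ellipticity, threshold,
                                bandwidth, in_band):
        if ok and e > 0.2 and s > s0:
            power += s * df
    return math.sqrt(power)

# Three made-up channels: the second fails the ellipticity condition
b_w = rms_amplitude(
    psd_trace=[1e-6, 4e-6, 9e-6],
    ellipticity=[0.9, 0.1, 0.8],
    threshold=[1e-7, 1e-7, 1e-7],
    bandwidth=[100.0, 100.0, 100.0],
    in_band=[True, True, True],
)
print(b_w)   # sqrt(1e-4 + 9e-4) nT, about 0.032 nT
```

A measurement with no channel passing all three conditions yields an amplitude of zero in this sketch.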

Data Records

Data record 1: Detection thresholds for Survey data of the EMFISIS instrument on Van Allen Probe A.

https://doi.org/10.6084/m9.figshare.27606945

Filename: RBSPA_SURV_BSUM_NOISE.txt

The ASCII text file contains

Self-explanatory comments introduced by a “%” sign, starting by a header of the file on its first line;
human and machine readable numerical data.

To capture the time evolution of the noise floors, with a possibility of sudden steps linked to the changes in settings of the instrument, a sufficiently general format of the numerical data is used. First a list of time nodes between which the noise floors are interpolated is given, followed by the noise floors themselves. The data are therefore listed in two sections

a) table of time nodes,
b) table of noise floors.

a) The table of time nodes is introduced on the second line of the file, which gives the number of lines in the table. After a self-explanatory line of comments, the table itself starts on the fourth line of the file. Each of its lines contains the following three data components separated by spaces:

Universal time in the “YYYY-MM-DDThh:mm:ss.msc” format, where “YYYY” stands for a 4-digit calendar year, “MM” for a 2-digit calendar month, “DD” for a 2-digit calendar day, “hh” for a 2-digit hour, “mm” for a 2-digit minute, “ss” for a 2-digit second, and “msc” for a 3-digit millisecond;
noise floor number from the table of noise floors (starting by 0, see below) to which the noise floor should be time-interpolated between the previous time node and the current time node;
noise floor number from the table of noise floors (starting by 0, see below) from which the noise floor should be time-interpolated between the current time node and the following time node.

Note that these two noise floor numbers may be identical to account for slow changes linked to degradation of the sensors, but also different, to account for the sudden steps linked to the changes of instrument settings.

b) The table of noise floors immediately follows the table of time nodes. It starts with a line defining the number of noise floors and the number of frequency channels contained in each of them. Each noise floor is then described by two lines:

a comment line defining the noise floor number (starting from 0), the spacecraft and time interval, over which the noise floor analysis was done, the number of noise spectra included in the analysis, and the probability threshold ${P}_{0}$;
a sequence of numerical values for the noise floor in nT²/Hz. Each of these values is equal to the detection threshold ${S}_{0l}$ for the $l$ th frequency channel according to Eq. 6 with P₀ = 10⁻⁷.

An example of using this data record for obtaining a noise floor at a given time is described in the Usage Notes section. A simple custom code to handle the tables of detection thresholds is available on https://doi.org/10.6084/m9.figshare.29433593.
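As a rough illustration of the file layout described above, the following sketch parses a small made-up text with the same structure and interpolates the noise floor to a given time. It is not the custom code referenced above. It assumes that all comment lines start with “%”, that the remaining lines appear in the order described, and that the interpolation between time nodes is linear in time and in power spectral density. All function names and sample values are hypothetical.

```python
from bisect import bisect_right
from datetime import datetime

TIME_FORMAT = "%Y-%m-%dT%H:%M:%S.%f"

def parse_thresholds(text):
    """Return (nodes, floors) from the text of a detection-threshold file."""
    # Assumption: comment lines start with '%' and carry no numerical data.
    rows = [line.split() for line in text.splitlines()
            if line.strip() and not line.lstrip().startswith("%")]
    n_nodes = int(rows[0][0])
    # Each node: (time, floor interpolated TO, floor interpolated FROM)
    nodes = [(datetime.strptime(r[0], TIME_FORMAT), int(r[1]), int(r[2]))
             for r in rows[1:1 + n_nodes]]
    n_floors, n_channels = int(rows[1 + n_nodes][0]), int(rows[1 + n_nodes][1])
    first = 2 + n_nodes
    floors = [[float(v) for v in r[:n_channels]]
              for r in rows[first:first + n_floors]]
    return nodes, floors

def threshold_at(when, nodes, floors):
    """Detection thresholds per channel (nT^2/Hz) at the time `when`."""
    times = [t for t, _, _ in nodes]
    i = bisect_right(times, when) - 1
    if i < 0 or i + 1 >= len(nodes):
        raise ValueError("time outside the tabulated time nodes")
    start = floors[nodes[i][2]]       # floor to start from at node i
    stop = floors[nodes[i + 1][1]]    # floor to arrive at by node i + 1
    w = (when - times[i]) / (times[i + 1] - times[i])
    return [a + w * (b - a) for a, b in zip(start, stop)]   # linear (assumed)

# Made-up example following the described layout (values are not real data)
SAMPLE = """% hypothetical header line
2
% time, noise floor number (to), noise floor number (from)
2014-01-01T00:00:00.000 0 0
2014-01-03T00:00:00.000 1 1
2 2
% noise floor 0 (made-up values)
1.0e-9 2.0e-9
% noise floor 1 (made-up values)
3.0e-9 4.0e-9
"""

nodes, floors = parse_thresholds(SAMPLE)
print(threshold_at(datetime(2014, 1, 2), nodes, floors))
```

Storing separate “to” and “from” noise floor numbers at each time node lets the same table describe both gradual drifts and sudden steps, since a step simply uses two different floors at one node.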

Data record 2: Detection thresholds for Survey data of the EMFISIS instrument on Van Allen Probe B.

https://doi.org/10.6084/m9.figshare.27606945

Filename: RBSPB_SURV_BSUM_NOISE.txt