Introduction

The question of dark matter’s nature remains unanswered almost a century after its initial proposal1,2. Among the myriad candidates proposed to constitute dark matter (DM), Weakly Interacting Massive Particles (WIMPs) have attracted significant attention due to both their compelling theoretical motivation and potential detectability3.

WIMPs, hypothesized to interact weakly with ordinary matter, would have been abundantly produced in the early universe, offering a natural explanation for the observed abundance of DM4. Furthermore, theories beyond the Standard Model of particle physics, such as Supersymmetry or extra dimensions, present potential candidates for WIMPs. Efforts to detect WIMPs have followed different strategies, including searching for their annihilation products as an excess in the fluxes of cosmic messengers reaching the Earth5, or identifying new physics phenomena in colliders, such as signals of new mediators or events with missing energy resulting from dark matter production6. Additionally, attempts to detect WIMPs directly have been performed by searching for their interaction with sensitive detectors on Earth7, which predominantly involves elastic scattering off atomic nuclei. While significant portions of the parameter space for benchmark (generic) particle candidates have been excluded using these methods8, our limited knowledge of the underlying models makes these results strongly model-dependent. Furthermore, distinguishing backgrounds from the signal is challenging. Hence, it is crucial to identify a characteristic signature of dark matter. Among the few proposed ones, annual modulation stands out as particularly compelling9,10. The flux of WIMP particles on Earth depends on the relative velocity between Earth and the DM halo. As Earth, along with the Solar System, moves towards the Cygnus constellation during its orbit around the galactic center, the Earth’s revolution around the Sun introduces a minor correction to its velocity relative to the halo. The differential scattering rate R as a function of the nuclear-recoil energy ENR and the time is11

$$\frac{dR({E}_{{{{\rm{NR}}}}},t)}{d{E}_{{{{\rm{NR}}}}}}=\frac{{N}_{T}{\rho }_{0}}{{m}_{\chi }}\int_{{v}_{\min }}^{{v}_{\max }}vf(\overrightarrow{v},t)\frac{d\sigma ({E}_{{{{\rm{NR}}}}},v)}{d{E}_{{{{\rm{NR}}}}}}{d}^{3}\overrightarrow{v},$$
(1)

where NT is the number of target nuclei, ρ0 is the local DM density, mχ is the mass of the DM particle, \(\overrightarrow{v}\) is the DM velocity in the detector’s rest frame, v is its modulus, f is the velocity distribution of DM particles in the detector’s rest frame, \({v}_{\max }\) is the maximum velocity of DM particles in the detector’s rest frame corresponding to the escape speed of the Milky Way, \({v}_{\min }=\sqrt{{m}_{{{{\rm{N}}}}}{E}_{{{{\rm{NR}}}}}/2{\mu }_{{{{\rm{N}}}}\chi }^{2}}\) is the minimum velocity of DM particles that can produce a nuclear recoil of energy ENR off a nucleus with mass mN, where μNχ is the reduced mass of the WIMP-nucleus system, and σ is the WIMP-nucleus scattering cross-section. The maximum recoil energy \({E}_{{{{\rm{NR}}}}}^{\max }=2{\mu }_{{{{\rm{N}}}}\chi }^{2}{v}^{2}/{m}_{{{{\rm{N}}}}}\) for typical WIMP velocities O(200 km s−1) ranges from approximately 10 to 100 keV, depending on mN.
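As a rough numerical check of the kinematics above, the following sketch evaluates \({E}_{{{{\rm{NR}}}}}^{\max }\) and \({v}_{\min }\) for sodium and iodine targets. The WIMP mass of 100 GeV, the use of mass numbers times the atomic mass unit as nuclear masses, and all function names are illustrative assumptions, not values taken from the analysis.

```python
import math

C_KM_S = 299_792.458   # speed of light in km/s
AMU_GEV = 0.931494     # atomic mass unit in GeV/c^2

# Assumption: nuclear masses approximated as mass number times the atomic mass unit.
M_NA = 23 * AMU_GEV
M_I = 127 * AMU_GEV
M_CHI = 100.0          # hypothetical WIMP mass in GeV/c^2, for illustration only


def reduced_mass(m_n, m_chi):
    """Reduced mass of the WIMP-nucleus system (GeV/c^2)."""
    return m_n * m_chi / (m_n + m_chi)


def e_nr_max_kev(m_n, m_chi, v_km_s):
    """Maximum recoil energy 2 mu^2 v^2 / m_N, returned in keV."""
    mu = reduced_mass(m_n, m_chi)
    beta = v_km_s / C_KM_S
    return 2.0 * mu**2 * beta**2 / m_n * 1e6  # GeV -> keV


def v_min_km_s(m_n, m_chi, e_nr_kev):
    """Minimum WIMP speed sqrt(m_N E_NR / (2 mu^2)) for a recoil of e_nr_kev."""
    mu = reduced_mass(m_n, m_chi)
    return math.sqrt(m_n * e_nr_kev * 1e-6 / (2.0 * mu**2)) * C_KM_S


for name, m_n in (("Na", M_NA), ("I", M_I)):
    e_max = e_nr_max_kev(m_n, M_CHI, 200.0)
    print(f"{name}: E_NR^max at 200 km/s = {e_max:.1f} keV")
```

For these assumed inputs both nuclei give maximum recoil energies of a few tens of keV, inside the 10 to 100 keV range quoted above.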

The velocity distribution function f can be calculated from the velocity distribution function in the Galactic reference system fgal through a Galilean transformation \(f(\overrightarrow{v},t)={f}_{{{{\rm{gal}}}}}(\overrightarrow{v}+{\overrightarrow{v}}_{{{{\rm{E}}}}}(t))\), where \({\overrightarrow{v}}_{{{{\rm{E}}}}}\) is the Earth’s velocity in the Galactic rest frame. fgal is truncated at the Milky Way escape speed12, vesc = 544 km s−1. \({\overrightarrow{v}}_{{{{\rm{E}}}}}\) comprises three primary components12: (1) the motion of the local standard of rest, which in galactic coordinates is given by (0, v0, 0), with v0 = 238 km s−1, (2) the Sun’s peculiar motion, which is (11.1, 12.2, 7.3) km s−1 and (3) the orbital motion around the Sun, which can be well approximated by a circular orbit tilted θ ≈ 60° with respect to the Galactic plane, at an orbital speed of vorb = 29.8 km s−1. A reasonably accurate approximation for the Earth’s speed is given by

$${v}_{{{{\rm{E}}}}}={v}_{\odot }+{v}_{{{{\rm{orb}}}}}\cos \theta \cos (\omega (t-{t}_{0})),$$
(2)

where \({v}_{\odot }\) = v0 + 12.2 km s−1, \(\omega =\frac{2\pi }{365}\) d−1, and the phase t0 depends on the specific halo model considered, but in most virialized models is about June 2, when the combined velocities reach their maximum.
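Equation (2) can be evaluated directly with the numbers quoted above. The sketch below is only illustrative: the day-of-year convention (January 1 is day 1, non-leap year, so June 2 is day 153) and the function name are assumptions.

```python
import math

V0 = 238.0                      # local standard of rest speed, km/s
V_SUN = V0 + 12.2               # v_sun as defined in the text, km/s
V_ORB = 29.8                    # Earth's orbital speed, km/s
THETA = math.radians(60.0)      # tilt of the orbit w.r.t. the Galactic plane
OMEGA = 2.0 * math.pi / 365.0   # angular frequency, rad/day
T0_DAY = 153.0                  # June 2 as day of year (assumed convention)


def earth_speed(day_of_year):
    """Approximate Earth speed in the Galactic frame, Eq. (2), in km/s."""
    return V_SUN + V_ORB * math.cos(THETA) * math.cos(OMEGA * (day_of_year - T0_DAY))


v_max = earth_speed(T0_DAY)              # early June
v_min = earth_speed(T0_DAY + 365.0 / 2)  # early December
amplitude_fraction = (v_max - v_min) / 2.0 / V_SUN
print(f"max {v_max:.1f} km/s, min {v_min:.1f} km/s, "
      f"amplitude / mean = {100 * amplitude_fraction:.1f}%")
```

With these inputs the modulation amplitude of the speed is about 6% of its mean value.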

Hence, there are slightly more WIMPs with high speeds in the detector’s rest frame during the summer, and conversely, more WIMPs with low speeds during the winter. This leads to a modulation in the differential rate, with the highest rate occurring in summer for larger nuclear-recoil energies and in winter for smaller ones. Considering that the variation in Earth’s speed between summer and winter amounts to roughly 6% of the average velocity, the differential rate can be approximated using a Taylor series

$$\frac{dR}{d{E}_{{{{\rm{NR}}}}}}\approx {\left(\frac{dR}{d{E}_{{{{\rm{NR}}}}}}\right)}_{{v}_{{{{\rm{E}}}}} = {v}_{\odot }}+\Delta ({E}_{{{{\rm{NR}}}}})\cos (\omega (t-{t}_{0})),$$
(3)

with

$$\Delta ({E}_{{{{\rm{NR}}}}})={\left(\frac{{d}^{2}R}{d{E}_{{{{\rm{NR}}}}}d{v}_{{{{\rm{E}}}}}}\right)}_{{v}_{{{{\rm{E}}}}} = {v}_{\odot }}{v}_{{{{\rm{orb}}}}}\cos \theta .$$
(4)
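
For reference, the intermediate step behind Eqs. (3) and (4) can be written out explicitly. It assumes only a first-order Taylor expansion of the differential rate in vE around \({v}_{\odot }\), combined with Eq. (2):

$$\frac{dR}{d{E}_{{{{\rm{NR}}}}}}({v}_{{{{\rm{E}}}}})\approx {\left(\frac{dR}{d{E}_{{{{\rm{NR}}}}}}\right)}_{{v}_{{{{\rm{E}}}}} = {v}_{\odot }}+{\left(\frac{{d}^{2}R}{d{E}_{{{{\rm{NR}}}}}d{v}_{{{{\rm{E}}}}}}\right)}_{{v}_{{{{\rm{E}}}}} = {v}_{\odot }}({v}_{{{{\rm{E}}}}}-{v}_{\odot }),\qquad {v}_{{{{\rm{E}}}}}-{v}_{\odot }={v}_{{{{\rm{orb}}}}}\cos \theta \cos (\omega (t-{t}_{0})).$$

Substituting the second relation into the first gives Eq. (3), with the coefficient of the cosine identified as Δ(ENR) in Eq. (4).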

Therefore, the expected signal of dark matter integrated over a certain energy window (here denoted by k) can be expressed as the sum of a constant term plus a term modulated with an annual period:

$${R}_{k}(t)\approx {R}_{0,k}+{S}_{m,k}\cos (\omega (t-{t}_{0})).$$
(5)

If the experimental threshold is low enough, the sign change in Sm,k should be observed at a characteristic energy that depends on the masses of the target nucleus and the WIMP. The observation of this phase shift would allow us to determine the WIMP mass. The annually modulated signal is faint, corresponding to only a small percentage of the total signal. However, the requirements it must meet to be interpreted as produced by WIMPs in the galactic halo are very restrictive: it must have the correct amplitude, phase, and period, and occur only in the low-energy region.

For over 20 years, the DAMA/LIBRA experiment has observed a modulation in its data that satisfies these criteria, thus representing a strong indication of dark matter detection13. The DAMA/NaI experiment began in 1995 at the Laboratori Nazionali del Gran Sasso, Italy, with 100 kg of NaI(Tl) scintillators and an energy threshold set at 2 keV14. After 7 years, the experiment upgraded to the LIBRA setup, scaling up the detector mass to 250 kg (DAMA/LIBRA-phase1)15. Subsequently, after 7 additional years of data collection, all photomultiplier tubes were replaced with units of higher quantum efficiency, thus reducing the energy threshold to 1 keV (DAMA/LIBRA-phase2)16. The experiment is still ongoing, with an exposure that has already reached 2.86 ton × yr over 22 independent annual cycles17. The modulation signal observed by DAMA is relatively large in amplitude, \({S}_{m}^{{{{\rm{DAMA}}}}}\) = 10.5 ± 1.1 (10.2 ± 0.8) counts keV−1 ton−1 d−1 for the [1–6] ([2–6]) keV energy region13. That corresponds to nucleon cross sections of the order of 10−40–10−41 cm2 when interpreted as a WIMP with spin- and isospin-independent coupling. Such a signal should have already been observed by other direct detection experiments, which, however, do not observe events above their estimated backgrounds and can exclude the DAMA/LIBRA signal with a very high confidence level18,19,20,21,22,23,24,25,26. Nevertheless, the comparison between experiments strongly depends on the model employed for the WIMP and its velocity distribution in the galactic halo. Additionally, the lack to date of an alternative explanation for the DAMA/LIBRA signal makes it imperative to seek independent confirmation using the same target material.

This is the goal of several experiments, either completed (DM-ICE27, COSINE–10028), in data taking (ANAIS–11229), under construction (COSINE–20030, COSINUS31), or in the R&D phase (SABRE32, PICOLON33, ASTAROTH34, ANAIS+35). To accurately verify the DAMA signal, an experiment must be able to replicate it with high statistical significance (which requires ultra-low background levels with threshold energy at or below 1 keV, large exposure, and operational stability), as well as a thorough understanding of the detector’s response function36. This implies addressing various factors including the non-linear energy response, energy resolution, and the efficiencies for triggering and event acceptance. Moreover, because the conversion into light of the energy released by highly ionizing particles is quenched compared to electrons, it is particularly important to consider the conversion factor between the energy deposited by a nuclear recoil (ENa,I) and the energy estimated through a calibration performed with beta/gamma sources, or electron-equivalent energy (Eee). As of today, the scintillation quenching factors QNa,I = Eee/ENa,I cannot be calculated and must be measured; they have been shown to depend on energy. The available measurements14,37,38,39,40,41,42,43,44,45,46 span the range 0.10–0.30 for QNa and 0.05–0.09 for QI for nuclear-recoil energies below 100 keV.

Under the assumption that the DAMA signal originates from WIMPs interacting with nuclei, comparing the signals from different experiments requires either assuming that the quenching factors QNa,I are the same for different NaI(Tl) detectors or measuring the quenching factor in each case and correcting the energy scale. In this paper, all the energies in relation to NaI(Tl) detectors are given as electron-equivalent energies. As will be explained in the last section of this article, differences in quenching factors are the only systematic effect that could compromise a direct comparison of experiments using the same target. In addition to improving the understanding of the scintillation quenching factors in NaI(Tl), this systematic effect could be handled by reducing the energy threshold well below that of DAMA/LIBRA. ANAIS+ is one of the R&D projects pursuing this goal, by replacing the PMTs with SiPMs and operating the detector at 100 K.

The ANAIS–112 experiment is composed of 112.5 kg of NaI(Tl) distributed among nine scintillator units of 12.5 kg each, constructed by Alpha Spectra, Inc., Colorado, US. Each crystal is coupled to two photomultiplier tubes (PMTs) and each module is triggered by the coincidence between the two PMT signals within a 200 ns window. ANAIS–112 has been collecting data at the Canfranc Underground Laboratory, Spain, since August 3, 2017. The detectors are shielded by 10 cm of archeological lead, 20 cm of low activity lead, an anti-radon box (continuously flushed with radon-free nitrogen gas), a muon veto made up of 16 plastic scintillators, and 40 cm of polyethylene bricks and water tanks acting as a neutron moderator.

Previous ANAIS–112 publications have provided detailed descriptions of the experimental setup, data acquisition system (DAQ) and detector performance after the first year29, the background model47, sensitivity prospects48, preliminary annual modulation results for 1.5 and 2 years49,50, results for 3 years of exposure51, and the development of a machine-learning-based analysis protocol for filtering non-bulk scintillation events52.

In this work, we present a reanalysis of the data collected over the first 3 years, totaling an exposure of 312.53 kg × yr. For the event selection, we use the analytical tools outlined in ref. 52. Additionally, we have implemented several enhancements to the ANAIS–112 analysis pipeline, including improvements in energy calibration, quality cuts, and efficiency calculation, which are thoroughly described in the Methods section.

To search for a modulation we perform a chi-squared minimization on the event counts observed by the 9 ANAIS–112 detectors over time, in the two energy regions studied by DAMA ([1–6] and [2–6] keV). The results are compatible with the null hypothesis and incompatible with the DAMA/LIBRA signal at 3.7σ and 2.6σ confidence levels (C.L.) for the [1–6] and [2–6] keV energy regions, respectively. Furthermore, the sensitivity of the experiment has improved as anticipated in ref. 52, confirming our expectations of achieving 5σ sensitivity by 2025. We also investigate how the comparison between ANAIS–112 and DAMA/LIBRA is influenced by the hypothesis that the quenching factors for sodium and iodine recoils are different in both detectors, as recent dedicated measurements suggest37. Considering energy-independent quenching factors, the results of ANAIS are incompatible with the DAMA signal at 3σ C.L.

Results and discussion

ANAIS–112 experimental performance and exposure

The ANAIS–112 detection rate remained fairly stable at about 5 Hz, dominated by fast Cherenkov light produced in the photomultipliers and random coincidences of dark current events in the two PMTs of each module within the coincidence window. The light collection stayed consistently high and homogeneous throughout the nine modules, averaging 14.6 photoelectrons (phe) keV−1 during the first year (with a standard deviation of 0.8 phe keV−1) and 14.4 phe keV−1 (0.7 phe keV−1 standard deviation) during the third year. The gain of the PMTs was stable at the 3% level for all modules, except for D4 and D5, for which we changed the high voltage bias after the first year of operation. These drifts were monitored and corrected with the periodic 109Cd calibrations.

The trigger efficiency is 100% down to 1.5 keV and remains above 95% at 1 keV29. However, the analysis threshold is set to 1 keV because of the decrease in the acceptance efficiency for the selection of bulk scintillation events down to 20%–30%, depending on the detector (see Fig. 1 and the “Methods” section).

Fig. 1: Total detection efficiency in all the ANAIS–112 modules as a function of energy.

It has been obtained by combining the trigger efficiency and the BDT cut efficiency (see the “Methods” section).

Table 1 summarizes the accumulated exposure for the three years of data analyzed in this work, calculated as the product of the total mass times the live time. It also details the dead time (measured using latched counters during the data taking), downtime (primarily due to bi-weekly 109Cd calibrations), percentage of periods rejected in the analysis, and the corresponding effective exposure after subtracting them. In addition to the criteria outlined in the “Methods” section for the rejection of high-rate periods, we remove events arriving within 1 s from a muon interaction in the veto. Scintillation time constants as long as 300 ms have been observed for high-energy μ events in NaI(Tl)53, therefore this criterion helps to prevent the DAQ from triggering numerous low-energy false events after a muon’s passage through the scintillator. Additionally, it also rejects potential secondary particles generated by a cosmic muon in the detector or its shielding.

Table 1 Summary of the accumulated exposure for the three years of data analyzed in this work

Energy spectrum at low-energy and background modeling

In our DM analysis, we focus on events where the energy deposition occurs in only one of the nine detectors, which are referred to in the following as single-hit events. Those falling within the region of interest (ROI), [1–6] keV, are blinded during the tuning of the analysis procedure. Only ~10% of the data is unblinded to evaluate the background and assess the experimental performance.

The resulting low-energy spectrum for each detector is presented in Fig. 2, based on the  ~10% of data unblinded. It has been corrected by the total detection efficiency, calculated as the product of the trigger and event selection efficiencies.

Fig. 2: ANAIS–112 detectors’ low-energy background spectra and Monte Carlo background model.

Each panel corresponds to one of the nine detectors of ANAIS–112 (see the “Introduction” section), labeled D0 through D8. Blue points: single-hit energy spectrum measured in the ROI for each detector after event selection and efficiency correction. Data correspond to the  ~10% data unblinded for the first three years of operation. Green line: Monte Carlo background model following ref. 47.

In the figure, we also show our Monte Carlo (MC) background model (a comprehensive review can be found in ref. 47). The model takes as input independent estimates of the different radioactive contaminations. The background in the ROI is dominated by radioactive contamination of the NaI(Tl) crystal, particularly by 210Pb out of equilibrium. This isotope is found in varying quantities in the different crystals, both in the bulk and in the surface, being higher for the earliest detectors constructed, D0 and D1, at a level of 3.15 mBq kg−1. After introducing improvements in the growth and purification process, the 210Pb level in detectors D2 to D8 decreased to values in the range of 0.7–1.8 mBq kg−1. Another common contaminant of NaI(Tl) is 40K, due to its chemical affinity. In the ANAIS crystals, it is present at levels of around 1 mBq kg−1 and is responsible for the peak at 3.2 keV visible in the spectra of Fig. 2. This energy is released by the de-excitation of the atomic K-shell following an electron capture (EC) when the high-energy γ ray escapes. Sometimes, this γ can hit another crystal, producing a coincident event that allows us to tag the low-energy deposition and use it to both estimate the amount of 40K in the crystal and select these low-energy events for calibration and efficiency calculation. The cosmogenic isotope 22Na is also present in the crystals and produces a background in the ROI of similar origin, but in this case, the K-shell relaxation energy is at 0.87 keV and is only marginally present in the region between 1 and 2 keV. 3H is another cosmogenic isotope that contributes significantly to the ROI. It is a pure beta emitter with an end-point at 18.591 keV. Other cosmogenic isotopes of tellurium and iodine contribute to the region of interest through L- and K-shell EC emissions54. 
Their half-lives are shorter (between  ~10 and  ~150 days) and are not relevant to the total background of the experiment, but they are important for explaining the evolution of the background, especially in the latest detectors arriving at Canfranc (D6, D7, and D8). The agreement between the data and the model is very good down to 3 keV. Below this energy, there appears to be a component that is not well explained by the model. It could be optical noise that escapes the event selection or some radioactive background contribution missed in the model. Present studies with a larger digitization window (8 microseconds instead of 1.2) point to the first explanation as the main cause: this energy region seems to be dominated by a population of events with a time scale not compatible with NaI(Tl) scintillation and asymmetric in the light sharing between both PMTs. Light emissions at the PMTs (scintillation, corona discharges, etc.) could be responsible for these events. For the analysis presented in this paper, it acts as a background component whose evolution remains constant in time.

Annual modulation results

To perform an independent test of the DAMA/LIBRA signal, we look for an annual modulation in the ANAIS–112 data, but using a slightly different method from that of the DAMA collaboration. DAMA calculates the residual rate of anticoincidence events vs time by subtracting the annual average from the total rate. These residuals are then fitted to a function of the form \(A\cos (\omega (t-{t}_{0}))\). While it has been noted that this approach may introduce a bias in the fit for slowly varying backgrounds55,56,57, this explanation seems unlikely for the DAMA signal: the phase obtained by DAMA would correspond to a slightly increasing background, which is challenging to explain, and no bias is observed above the energy region where the DM signal is anticipated13.

To avoid any potential systematic effects, we adopt a different approach, directly looking for the modulation in the overall event count over time through a least squares fit. We define the χ2 function as follows:

$${\chi }^{2}={\sum}_{i,d}\frac{{({n}_{i,d}-{\mu }_{i,d})}^{2}}{{\sigma }_{i,d}^{2}},$$
(6)

where ni,d represents the number of events in the ROI in the time bin ti for detector d, obtained by correcting the measured event count using the live time for that specific temporal bin and detector, along with the corresponding acceptance efficiency; σi,d is the Poisson uncertainty associated with the event count, also corrected by the corresponding live time and efficiency; and μi,d denotes the expected number of events in that particular time bin and detector, including a hypothetical DM component.

Given the presence of radioactive isotopes with half-lives of the order of a few years in the detectors, primarily 210Pb (T1/2 = 22.3 yr), 3H (T1/2 = 12.3 yr), and 22Na (T1/2 = 2.6 yr), μi,d diminishes over time. For detectors D6, D7, and D8, also Te and I cosmogenic isotope contributions are relevant. Accurately modeling this background rate decrease is crucial to avoid biasing the fit. We employ the following expression to describe it:

$${\mu }_{i,d}=\left[{R}_{0,d}(1+{f}_{d}{\phi }_{bkg,d}^{MC}({t}_{i}))+{S}_{m}\cos (\omega ({t}_{i}-{t}_{0}))\right]{M}_{d}\Delta E\Delta t,$$
(7)

where \({\phi }_{bkg,d}^{MC}\) is a probability distribution function sampled from the MC model, describing the background evolution at time bin ti for detector d, Md is the mass of module d, and ΔE and Δt represent energy and time intervals, respectively. R0,d and fd are free parameters for each detector, while Sm represents the DM annual modulation amplitude. Sm is set to 0 to test the null hypothesis and allowed to vary freely for the modulation hypothesis. It is worth noting that the time-invariant component R0,d includes both the background produced by isotopes with long half-lives and components of noise not explained by the model, as well as the constant component of a hypothetical contribution from DM.
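The structure of the fit defined by Eqs. (6) and (7) can be illustrated with a minimal sketch on synthetic data. It is not the ANAIS–112 analysis code. It assumes that counts have already been converted to rates (divided by MdΔEΔt), reparametrizes R0,d(1 + fdϕ) as ad + bdϕ so that the model is linear in its parameters, and replaces the MC background evolution by hypothetical exponentials. All names and numbers in the example are illustrative.

```python
import numpy as np

OMEGA = 2.0 * np.pi / 365.0   # rad/day, period fixed at one year
T0 = 153.0                    # June 2 as day of year (assumed convention)


def fit_modulation(t, rates, sigmas, phi):
    """Weighted linear least squares for
    rate_d(t) = a_d + b_d * phi_d(t) + Sm * cos(omega * (t - t0)),
    with a_d = R0_d and b_d = R0_d * f_d, and Sm shared by all detectors.

    t: (n_bins,) bin centres in days; rates, sigmas, phi: (n_det, n_bins).
    Returns (Sm, sigma_Sm, chi2).
    """
    n_det, n_bins = rates.shape
    design = np.zeros((n_det * n_bins, 2 * n_det + 1))
    for d in range(n_det):
        rows = slice(d * n_bins, (d + 1) * n_bins)
        design[rows, 2 * d] = 1.0           # constant term a_d
        design[rows, 2 * d + 1] = phi[d]    # decaying background term b_d
        design[rows, -1] = np.cos(OMEGA * (t - T0))  # shared modulation
    w = 1.0 / sigmas.ravel()
    aw = design * w[:, None]
    yw = rates.ravel() * w
    cov = np.linalg.inv(aw.T @ aw)          # parameter covariance matrix
    params = cov @ (aw.T @ yw)
    chi2 = float(np.sum((yw - aw @ params) ** 2))
    return float(params[-1]), float(np.sqrt(cov[-1, -1])), chi2


# Synthetic example: two detectors, 10-day bins over three years, no modulation.
rng = np.random.default_rng(0)
t = np.arange(5.0, 1095.0, 10.0)
phi = np.stack([np.exp(-t / 800.0), np.exp(-t / 500.0)])  # stand-in for the MC pdf
true_rate = np.stack([3.0 + 0.5 * phi[0], 3.5 + 0.8 * phi[1]])
sigmas = np.full_like(true_rate, 0.1)
rates = true_rate + rng.normal(0.0, sigmas)

sm, sm_err, chi2 = fit_modulation(t, rates, sigmas, phi)
print(f"Sm = {sm:.3f} +/- {sm_err:.3f}, chi2 = {chi2:.1f}")
```

Fixing Sm = 0 (dropping the last column of the design matrix) gives the null-hypothesis fit. Because the model is linear in ad, bd, and Sm, this sketch solves the minimization in closed form, whereas a general minimizer would be needed if the phase or period were left free.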

In our fit, the period is fixed at one year and the phase is set to June 2. In this way, we can directly compare our results with those of the DAMA/LIBRA experiment as they appear in ref. 13. We perform two independent fits, one in the [2–6] keV region, which can be compared with the results from the total accumulated exposure of DAMA/NaI and DAMA/LIBRA, and another in the [1–6] keV region, allowing us to study the results of DAMA/LIBRA-phase2.

The results of the χ2 minimization for the null and modulation hypothesis are displayed in Figs. 3 and 4 for data in the [1–6] and [2–6] keV energy regions, respectively, grouped in 10-day bins. The possible presence of a bias in the fit was studied in ref. 51 using a large set of Monte Carlo pseudoexperiments, sampled from the background models with no modulation or with the modulation amplitude observed by DAMA/LIBRA. In all cases, the bias was found to be compatible with zero or negligible. In the present analysis, only the experimental data have changed with respect to ref. 51, so the bias study remains valid in this case. The results do not exhibit dependence on the time bin size for values between 5 and 30 days.

Fig. 3: Fit results for data from the nine ANAIS–112 modules in the [1–6] keV energy range, under both the modulation and null hypotheses.

Each panel corresponds to one of the nine detectors of ANAIS–112. The error bars on the data points represent the standard deviation of the observed rate of events combined with the efficiency uncertainty. The blue line shows the result of the modulation hypothesis fit, while the red line represents the result of the null hypothesis, although it is generally masked by the blue line and not visible. Each panel also displays the χ2 divided by the degrees of freedom (NDF) of the fit for each detector, along with the corresponding p-value. The global results of the fit are: for the null hypothesis, χ2/NDF = 982.20/972, corresponding to a p-value = 0.403, and for the modulation hypothesis, χ2/NDF = 982.07/971, corresponding to a p-value = 0.395. The best-fit modulation amplitude in the latter case is Sm = (−1.3 ± 3.7) (counts keV−1 ton−1 day−1).

Fig. 4: Fit results for data from the nine ANAIS–112 modules in the [2–6] keV energy range, under both the modulation and null hypotheses.

Each panel corresponds to one of the nine detectors of ANAIS–112. The error bars on the data points represent the standard deviation of the observed rate of events combined with the efficiency uncertainty. The blue line shows the result of the modulation hypothesis fit, while the red line represents the result of the null hypothesis, although it is generally masked by the blue line and not visible. Each panel also displays the χ2 divided by the degrees of freedom (NDF) of the fit for each detector, along with the corresponding p-value. The global results of the fit are: for the null hypothesis, χ2/NDF = 955.25/972, corresponding to a p-value = 0.643, and for the modulation hypothesis, χ2/NDF = 954.56/971, corresponding to a p-value = 0.641. The best-fit modulation amplitude in the latter case is Sm = (3.1 ± 3.7) (counts keV−1 ton−1 day−1).

ANAIS–112 results are consistent with the null hypothesis, with p-values of 0.40 and 0.64 for [1–6] and [2–6] keV energy regions, respectively. Best fits for the modulation hypothesis are consistent with the absence of modulation within one standard deviation in both regions, with modulation amplitudes Sm = –1.3  ± 3.7 counts keV−1 ton−1 d−1 and 3.1 ± 3.7 counts keV−1 ton−1 d−1, respectively. The χ2 divided by the number of degrees of freedom (NDF) and corresponding p-values are also calculated separately for the data of every module and displayed in the legend of each panel. The p-values are greater than 0.05 in all cases, except for D5 in the [1–6] keV energy region. Notably, these values have improved compared to the previous analysis51 for the energy region [1–6] keV, presumably due to improved filtering of noise events below 2 keV. For illustrative purposes, Fig. 5 shows the fit results after subtracting the background term from Eq. (7) in both the fitting functions and the data for the energy region [1–6] keV for the combined data of the 9 detectors. The modulation observed by DAMA/LIBRA is shown in green.
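As a cross-check of the goodness-of-fit figures, the p-values can be approximated from χ2 and NDF using only the standard library. The sketch below uses the Wilson–Hilferty normal approximation to the χ2 distribution, which is an assumption made here for convenience and not necessarily how the quoted p-values were computed. The function name is illustrative.

```python
import math


def chi2_p_value(chi2, ndf):
    """Upper-tail p-value of a chi-squared statistic, using the
    Wilson-Hilferty cube-root normal approximation (accurate for large ndf)."""
    k = float(ndf)
    var = 2.0 / (9.0 * k)
    z = ((chi2 / k) ** (1.0 / 3.0) - (1.0 - var)) / math.sqrt(var)
    return 0.5 * math.erfc(z / math.sqrt(2.0))


# Global null-hypothesis fits quoted in the captions of Figs. 3 and 4.
p_1_6 = chi2_p_value(982.20, 972)
p_2_6 = chi2_p_value(955.25, 972)

# For one extra free parameter, sqrt(delta chi2) should be close to |Sm| / sigma(Sm).
delta_chi2_1_6 = 982.20 - 982.07
print(p_1_6, p_2_6, math.sqrt(delta_chi2_1_6), 1.3 / 3.7)
```

The approximated p-values should land close to the quoted 0.40 and 0.64, and the small decrease in χ2 when Sm is freed is consistent with a best-fit amplitude well within one standard deviation of zero.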

Fig. 5: Fit results for the combined data of the 9 detectors in the [1–6] keV energy region after subtracting the background term.

The data points represent the combined data of the 9 detectors for the energy region [1-6] keV after subtracting the background term from Eq. (7). Error bars have been calculated by combining the standard deviation of the data from each detector. Blue and red lines are the result of the modulation and null hypothesis, respectively, after subtracting the background term from Eq. (7). The modulation observed by DAMA/LIBRA is shown in green.

Experimental sensitivity and prospects

We assess our sensitivity to the DAMA/LIBRA signal as the ratio \({S}_{m}^{{{{\rm{DAMA}}}}}/\sigma ({S}_{m})\), which directly gives in σ units the C.L. at which we can test the DAMA/LIBRA result. The standard deviation of the modulation amplitude obtained in the best fit is σ(Sm) = 3.7 counts keV−1 ton−1 d−1 for both [1–6] keV and [2–6] keV, corresponding to a sensitivity of (2.8 ± 0.3)σ in [1–6] keV and (2.8 ± 0.2)σ in [2–6] keV, where the uncertainty corresponds to the 68% C.L. DAMA/LIBRA uncertainty.
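The sensitivity figures follow from simple arithmetic on the amplitudes quoted in this article, as the short sketch below shows. The function name is illustrative.

```python
def sensitivity(sm_dama, sm_dama_err, sigma_sm):
    """Sensitivity S_m^DAMA / sigma(S_m) in sigma units, with the
    uncertainty propagated from the DAMA/LIBRA amplitude error only."""
    return sm_dama / sigma_sm, sm_dama_err / sigma_sm


SIGMA_SM = 3.7  # counts keV^-1 ton^-1 d^-1, both energy regions

s_1_6 = sensitivity(10.5, 1.1, SIGMA_SM)  # [1-6] keV
s_2_6 = sensitivity(10.2, 0.8, SIGMA_SM)  # [2-6] keV

print(f"[1-6] keV: {s_1_6[0]:.1f} +/- {s_1_6[1]:.1f} sigma")
print(f"[2-6] keV: {s_2_6[0]:.1f} +/- {s_2_6[1]:.1f} sigma")
```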

Figure 6 displays in dark blue lines the ANAIS–112 sensitivity projections following ref. 48, updated to the effective exposure and detection efficiency presented in this work. Cyan bands take into account the 68% uncertainty in \({S}_{m}^{DAMA}\). The black dot is the sensitivity derived from the results presented here, in good agreement with our estimates. These results support our expectation of achieving a 5σ sensitivity by 2025.

Fig. 6: Evolution of the ANAIS–112 sensitivity to the DAMA/LIBRA signal over time and sensitivity obtained in this work.

Dark blue line: the ANAIS–112 sensitivity to the DAMA/LIBRA signal is represented in σ confidence level (C.L.) units as a function of real time in the [1-6] keV (a) and [2-6] keV (b) energy regions. The black dot is the sensitivity measured experimentally in this work. The cyan bands represent the 68% C.L. DAMA/LIBRA uncertainty. Red dashed and red solid lines correspond to reference values of 3σ and 5σ, respectively.

Investigating the impact of the hypothesis of different quenching factors among detectors

The importance of quenching factors when comparing data from scintillator experiments that search for WIMPs through elastic scattering off atomic nuclei has already been emphasized. Because of this, model-independent testing of the DAMA/LIBRA result requires using the same target material. Additionally, it is necessary to calibrate the detectors in nuclear-recoil energies, since the conventional electron-equivalent energy calibration cannot guarantee a fair comparison of the same energy regions if the quenching factors vary significantly among NaI(Tl) detectors, for example, due to different concentrations of Tl or the presence of impurities or defects.

In recent decades, the community working with NaI(Tl) has made significant efforts to shed light on this issue. ANAIS and COSINE have conducted a joint study on quenching factors in crystals produced by Alpha Spectra, used by both experiments37. Small crystals from the same supplier but with different powder quality were measured using a monochromatic neutron source at TUNL, North Carolina (US). The results were consistent for all measured crystals. The study also highlighted the importance of properly considering the well-known non-linearity in the NaI(Tl) response, as it can distort the results at very low energies. Depending on the calibration method, this work yields either a constant QNa of 0.210 ± 0.003 or a QNa that decreases slightly with decreasing energy, down to ~0.15 for recoil energies ENa = 10 keV. Despite the differences observed in the various measurements of QNa, there is a general consensus towards QNa values on the order of 0.2, which decreases as energy decreases below ENa = 30 keV to values around 0.10–0.15. In this regard, it is also interesting to mention the preliminary results obtained in ref. 58 with 5 crystals with Tl dopant levels of 0.1, 0.3, 0.5, 0.7, and 0.9%, which point to values of QNa in the range of 0.2 for all of them. Few measurements deviate from this trend, such as ref. 46 and notably ref. 14, obtained on crystals from DAMA/LIBRA through a 252Cf neutron calibration. Assuming energy-independent quenching factors, the results of ref. 14 were consistent with \({Q}_{{{{\rm{Na}}}}}^{{{{\rm{DAMA}}}}}\) = 0.3.

Concerning QI, the situation is similar, although due to its low value, measurements are more challenging, and experimental results are scarce. The ANAIS and COSINE joint work37 has obtained a value of 0.060 ± 0.022. Available data39,41,42 also point to QI on the order of 0.06. Similarly, in this case, the value obtained by DAMA for the constant quenching factor hypothesis14 is higher (\({Q}_{{{{\rm{I}}}}}^{{{{\rm{DAMA}}}}}\) = 0.09).

In conclusion, the possibility that the quenching factors of the DAMA crystals differ from those observed in recent measurements remains open. Taking this scenario into consideration for the comparison between ANAIS–112 and DAMA/LIBRA, if we aim to compare the same energy region in terms of nuclear recoils, the DAMA/LIBRA region from [2–6] keV would correspond to the ANAIS–112 region from [1.3–4] keV for constant QNa = 0.2 and QI = 0.06 (ref. 37). We have carried out that analysis, the results of which are depicted in Fig. 7. Once again, a high p-value is obtained for the null hypothesis, while the best fit provides a modulation amplitude of Sm = 3.3 ± 5.0 counts keV−1 ton−1 d−1. In this case, the sensitivity is 3σ because, although the statistics are reduced as a consequence of the reduction of the integration window, the signal-to-background ratio increases correspondingly.
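The mapping between energy windows can be reproduced with the constant quenching factors quoted in this section. The sketch below converts the DAMA/LIBRA electron-equivalent window to nuclear-recoil energy and back to the ANAIS–112 electron-equivalent scale. The function and constant names are illustrative.

```python
Q_NA_DAMA, Q_I_DAMA = 0.3, 0.09     # constant quenching factors from ref. 14
Q_NA_ANAIS, Q_I_ANAIS = 0.2, 0.06   # constant quenching factors from ref. 37


def map_window(window_kevee, q_from, q_to):
    """Map an electron-equivalent window between detectors with different
    constant quenching factors: E_NR = E_ee / q_from, then E_ee' = E_NR * q_to."""
    return tuple(e / q_from * q_to for e in window_kevee)


dama_window = (2.0, 6.0)  # keV electron-equivalent

na_window = map_window(dama_window, Q_NA_DAMA, Q_NA_ANAIS)
i_window = map_window(dama_window, Q_I_DAMA, Q_I_ANAIS)

print(f"Na recoils: [{na_window[0]:.1f}-{na_window[1]:.1f}] keV")
print(f"I recoils:  [{i_window[0]:.1f}-{i_window[1]:.1f}] keV")
```

Because the ratio of quenching factors is 2/3 for both sodium and iodine, the same [1.3–4] keV window applies to recoils off either nucleus.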

Fig. 7: Fit results for data from the nine ANAIS–112 modules in the [1.3–4] keV energy range, under both the modulation and null hypotheses.

Each panel corresponds to one of the nine detectors of ANAIS–112. The error bars on the data points represent the standard deviation of the observed rate of events combined with the efficiency uncertainty. The blue line shows the result of the modulation hypothesis fit, while the red line represents the result of the null hypothesis, although it is generally masked by the blue line and not visible. Each panel also displays the χ2 divided by the degrees of freedom (NDF) of the fit for each detector, along with the corresponding p-value. The global results of the fit are: for the null hypothesis, χ2/NDF = 969.61/972, corresponding to a p-value = 0.516, and for the modulation hypothesis, χ2/NDF = 969.18/971, corresponding to a p-value = 0.510. The best-fit modulation amplitude in the latter case is Sm = (3.3 ± 5.0) (counts keV−1 ton−1 day−1).
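As a cross-check of the global p-values quoted in this caption, the upper-tail χ2 probability can be estimated from χ2 and NDF alone. The sketch below uses the Wilson-Hilferty normal approximation, chosen here only to keep the example free of external dependencies; it is not necessarily the method used in the analysis, and an exact routine such as `scipy.stats.chi2.sf` could be substituted.

```python
import math


def chi2_p_value(chi2, ndf):
    """Approximate upper-tail chi-square probability (Wilson-Hilferty).

    (chi2 / ndf) ** (1/3) is treated as Gaussian with mean 1 - 2/(9 ndf)
    and variance 2/(9 ndf); the approximation improves as ndf grows.
    """
    var = 2.0 / (9.0 * ndf)
    z = ((chi2 / ndf) ** (1.0 / 3.0) - (1.0 - var)) / math.sqrt(var)
    return 0.5 * math.erfc(z / math.sqrt(2.0))


# Global fit results quoted in the caption.
p_null = chi2_p_value(969.61, 972)
p_mod = chi2_p_value(969.18, 971)
print(f"null hypothesis:       p = {p_null:.3f}")
print(f"modulation hypothesis: p = {p_mod:.3f}")
```

Both estimates agree with the quoted p-values at the level expected from the approximation.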

Methods

Energy calibration

Energy calibration is carried out in two different ranges: high energy (HE) and low energy (LE). For both regimes, we have updated our calibration procedure with respect to ref. 29. For HE, background measurements are used, while for LE, periodic calibrations are performed with a 109Cd source, which allows correction for possible gain drifts. Finally, the ROI is recalibrated using two lines from the background corresponding to 22Na (0.87 keV) and 40K (3.20 keV).

High-energy calibration

The digitization scale is optimized for the study of low-energy events, so events above ~500 keV fall outside the digitizer dynamic range and the pulse area energy estimator, Ssum, saturates because the (negative) digitized pulses are truncated at the minimum voltage (−1 V). For this reason, the ANAIS–112 DAQ system29 incorporates a second signal line, suitably attenuated, which retains information on the energy released by every high-energy event through the use of charge-to-digital converters (QDC).

As an example, Fig. 8 shows the pulse area versus the corresponding QDC value for detector D3 during the first two weeks of data taking. The pulse area parameter is clearly saturated above QDC ~ 700. In order to estimate the energy of events above ~100 keV, the linearization of the Ssum response was previously performed using a modified logistic function29, but the deviation of high-energy events from the fit reached up to 4%. Therefore, we have updated the high-energy linearization function (green line) by combining a first-degree polynomial (QDC < 250), a 12th-order Chebyshev polynomial (approximately up to 90% of the QDC saturation value), and a second-degree polynomial, successfully reducing the high-energy residuals below 2% (as can be seen in the top panel).

Fig. 8: An example of the procedure for obtaining the high-energy estimator.

Dots in (a) are the total pulse area for detector D3 during the first two weeks of data taking as a function of the QDC readout. The α population (red dots) is clearly separated from the β/γ one (black dots). The green line is the result of a fit to a 12th-order Chebyshev polynomial. Panel (b) shows the residuals of the β/γ population with respect to the green line.

This double readout system also makes it possible to discriminate α events (shown in red in Fig. 8) from β/γ events (depicted in black). For high-energy events, the digitized pulses are saturated and, as α events are faster than β/γ events, the integral of the pulse in the microsecond window is smaller for the same QDC value.

Since no external high-energy sources are available for calibrating the ANAIS–112 high-energy regime, calibration of events above ~100 keV is conducted independently for each background run using several easily identifiable peaks present in the background data. For every detector and run, which lasts two weeks on average, six or seven peaks are used for calibration, depending on their presence in the background spectrum. Among them are: 238.6 keV (212Pb), 295.2 keV (214Pb), 351.9 keV (214Pb), 609.3 keV (214Bi), 1120.3 keV (214Bi), 1460.8 keV (40K), and 1764.5 keV (214Bi). Each peak is fitted to a Gaussian lineshape. Finally, the calibration is performed via a least-squares fit of a second-degree polynomial relating the Gaussian means to the nominal energies of the peaks.
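The last step of this procedure can be illustrated with a minimal sketch of a second-degree polynomial fit. The nominal energies are those listed above, whereas the Gaussian means are hypothetical values generated from an invented, mildly non-linear response, expressed in arbitrary units of order unity to keep the normal equations well conditioned. The helper names are illustrative and do not correspond to the ANAIS–112 analysis code.

```python
# Nominal energies (keV) of the background peaks listed in the text.
NOMINAL_KEV = [238.6, 295.2, 351.9, 609.3, 1120.3, 1460.8, 1764.5]

# HYPOTHETICAL Gaussian means (arbitrary estimator units) generated from an
# invented, mildly non-linear detector response.
means = [(e - 1.0e-5 * e * e) / 1000.0 for e in NOMINAL_KEV]


def solve_3x3(a, b):
    """Solve a 3x3 linear system by Gaussian elimination with pivoting."""
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, 3):
            factor = m[r][col] / m[col][col]
            for c in range(col, 4):
                m[r][c] -= factor * m[col][c]
    x = [0.0, 0.0, 0.0]
    for r in (2, 1, 0):
        acc = m[r][3] - sum(m[r][c] * x[c] for c in range(r + 1, 3))
        x[r] = acc / m[r][r]
    return x


def fit_quadratic(xs, ys):
    """Least-squares coefficients (c0, c1, c2) of y = c0 + c1*x + c2*x**2."""
    s = [sum(x ** k for x in xs) for k in range(5)]            # sums of x^k
    t = [sum(y * x ** k for x, y in zip(xs, ys)) for k in range(3)]
    normal = [[s[i + j] for j in range(3)] for i in range(3)]  # normal matrix
    return solve_3x3(normal, t)


c0, c1, c2 = fit_quadratic(means, NOMINAL_KEV)

# Relative residuals, defined as (fit - nominal) / nominal.
residuals = [(c0 + c1 * x + c2 * x * x - e) / e
             for x, e in zip(means, NOMINAL_KEV)]
for e, r in zip(NOMINAL_KEV, residuals):
    print(f"{e:7.1f} keV  residual = {100 * r:+.4f} %")
```

With real data, the means would come from the Gaussian fits of each run, and the resulting coefficients would be applied to every event above ~100 keV in that run.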

Figure 9 shows the high-energy calibrated background spectrum for single-hit events, summed over all the ANAIS–112 detectors, for the first three years of operation. Panel b shows the residuals ((fit – nominal)/nominal) for the positions of the main peaks identified in the background, all of them below 1%.

Fig. 9: Total high-energy anticoincidence spectrum measured along the first three years of ANAIS–112 operation.

Panel a) shows the spectrum and panel b) shows the residuals for the positions of the main peaks identified in the background.

Low-energy calibration

The ANAIS–112 modules feature a Mylar window in the lateral face, allowing for the use of external gamma sources to perform low-energy calibration. Every two weeks, 109Cd sources are introduced from outside the shielding via multiple flexible wires, enabling simultaneous calibration of all nine modules. 109Cd decays by electron capture (EC), emitting an 88.0 keV γ. Kα and Kβ Ag X-rays are also emitted, with average energies of 22.1 and 25.1 keV, respectively. In addition, the plastic housing of the source contains a certain amount of bromine, which under 109Cd irradiation produces an additional calibration line corresponding to the K-shell Br X-rays. For the Br line, we take as nominal energy the average of the Kα and Kβ X-rays, resulting in 12.1 keV. The 12.1 and 88.0 keV lines are each fitted to a single Gaussian lineshape added to a first-degree polynomial, while the 22.1 + 25.1 keV lines are fitted to two Gaussian lineshapes with the same standard deviation added to a first-degree polynomial. Then, for every detector, a linear function is fitted by regression of the expected energies against the peak positions, and the low-energy events (below ~100 keV) are recalibrated accordingly.
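The final regression step can be sketched as follows. The four nominal energies are those given above, while the peak positions are hypothetical numbers standing in for the Gaussian means of one detector in one calibration run. The function names are illustrative only.

```python
# Nominal energies (keV) of the 109Cd calibration lines quoted in the text.
NOMINAL_KEV = [12.1, 22.1, 25.1, 88.0]

# HYPOTHETICAL fitted peak positions (arbitrary pulse-area units).
positions = [1250.0, 2240.0, 2555.0, 8840.0]


def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept of y = slope*x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


slope, intercept = fit_line(positions, NOMINAL_KEV)


def to_kev(position):
    """Convert a pulse-area value to energy with the fitted calibration."""
    return slope * position + intercept


for pos, nominal in zip(positions, NOMINAL_KEV):
    print(f"position {pos:7.1f} -> {to_kev(pos):6.2f} keV "
          f"(nominal {nominal:5.1f} keV)")
```

Repeating the fit for each periodic calibration provides a per-run slope and intercept, which is how gain drifts between calibrations would be tracked in such a scheme.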

In order to increase the reliability of the energy calibration in the ROI and to reduce possible non-linearity effects, we also exploit two known lines present in the background, which lie either in the ROI or very close to the threshold. These lines correspond to an internal contamination of 40K in the bulk and to the presence of 22Na as a result of cosmogenic activation. These isotopes may decay via EC, with the emission of a γ from the daughter nucleus de-excitation. The atomic de-excitation energy (0.87 keV for 22Na and 3.2 keV for 40K for K-shell EC, which has the largest probability) is fully contained in the crystal where the decay occurs, while the high-energy γ (1274.5 and 1460.8 keV, respectively) can escape and hit another detector, thus producing a coincidence event.

The 22Na/40K low-energy peaks are excellent for low-energy calibration, but their low rate and the low efficiency for detecting the coincidence needed to select them accurately require the accumulation of background data over long periods to observe them properly. It is also worth noting that the 22Na peak is below the analysis threshold and that, despite efficient triggering at the photoelectron level of each PMT, the requirement of coincidence between the two PMTs within the 200 ns window results in a non-negligible decrease in trigger efficiency below 1 keV. This efficiency was estimated through a Monte Carlo simulation in ref. 29 and has been used to correct the nominal energy of the 22Na peak from 0.87 keV to 0.90 keV. Thus, we have accumulated low-energy coincident events of 22Na and 40K over the first five years of measurement for each detector, and each peak has been fitted to a Gaussian lineshape added to a first-degree polynomial to estimate its mean value. Finally, the ROI calibration has been conducted via linear regression between the Gaussian means and the nominal energies (0.90 and 3.2 keV, respectively) using a linear function.

Figure 10 shows the evolution of the mean energy of the fitted 0.90 and 3.2 keV peaks from 22Na (in orange) and 40K (in green), respectively, along the first five years of data taking for the nine ANAIS–112 modules using this calibration strategy. It can be observed that the energy scale over time is stable within the ROI in all detectors.

Fig. 10: Stability over time of the calibration peaks in the region of interest for the nine ANAIS–112 detectors.

Upper panels, in orange: Evolution of the mean energy of the fitted 0.90 keV peak from 22Na along the first five years of data taking for the nine ANAIS–112 modules. Lower panels, in green: the same, but for the 3.2 keV peak from 40K. Error bars represent the standard deviation of the mean value. The dashed lines are the mean values of the peak positions in each detector, and the shaded regions represent the standard deviations of the peak positions. The mean value and the standard deviation for each module are also shown in the panels.

Event selection

The trigger rate in the ROI is dominated by non-bulk scintillation events. For this reason, the development of robust protocols for selecting events corresponding to bulk scintillation in sodium iodide is mandatory. Initially, the selection criteria applied in ANAIS–112 were based on standard cuts on a few parameters29, and even though they proved effective above 2 keV, they showed weaknesses in the region from 1 to 2 keV. In order to improve the rejection of noise events between 1 and 2 keV, a machine-learning technique based on a Boosted Decision Tree (BDT) has been implemented. A detailed description of the BDT performance in ANAIS–112 with the old low-energy calibration can be found in ref. 52. Because of the implementation of the new low-energy calibration, the BDT filtering method requires a new training procedure using the updated populations. The training populations for the BDT are the following: as signal events, scintillation events in the [1–2] keV range inside the crystal bulk produced by neutron interactions in dedicated neutron calibrations with a 252Cf source located outside the ANAIS–112 shielding, which are predominantly associated with elastic nuclear recoils in that energy region59; and as noise events, those coming from a blank module similar to the ANAIS–112 modules, but without a NaI(Tl) crystal. This choice of training populations is robust, as it entirely excludes background events and uses bulk events as signal. The fact that we do not have pure scintillation populations can slightly bias the training, although the main effect is an underestimation of the cut efficiency. The training process results in a newly constructed parameter named BDT, which, by combining the information of 15 discriminating parameters, maximizes the separation between the signal and noise populations used in the training. For the event selection, we define an energy-dependent BDT cut, retaining only those events that exceed it.

This selection criterion has been fine-tuned for each detector and energy bin (see ref. 52 for details). The corresponding efficiency is estimated individually for each detector by using 252Cf neutron calibration events. The ratio of events that pass the signal selection to the total events determines the acceptance efficiency which, when multiplied by the trigger efficiency, constitutes the total detection efficiency. The total efficiency for event detection in all the ANAIS–112 modules as a function of energy is shown in Fig. 1. The acceptance efficiency derived from the BDT cut notably exceeds that of the previous ANAIS–112 filtering procedure (by around 30% in [1–2] keV). Moreover, the BDT method significantly diminishes the background level below 2 keV for all detectors compared to that obtained using the previous ANAIS–112 protocols. In particular, the integrated rate from 1 to 2 keV is 5.39 ± 0.04 and 4.40 ± 0.03 counts keV−1 kg−1 d−1 for the ANAIS–112 filtering procedure and the BDT method, respectively, representing an 18% reduction in background.
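The efficiency arithmetic described above, together with the quoted background reduction, can be made explicit in a few lines. In the sketch below the event counts and the trigger efficiency are hypothetical placeholders, and the binomial uncertainty is an assumption introduced for illustration rather than the error treatment of the analysis. Only the two integrated rates are taken from the text.

```python
import math


def acceptance_efficiency(n_pass, n_total):
    """Fraction of calibration events passing the cut.

    ASSUMPTION: a simple binomial uncertainty is attached to the ratio.
    """
    eff = n_pass / n_total
    err = math.sqrt(eff * (1.0 - eff) / n_total)
    return eff, err


# HYPOTHETICAL 252Cf event counts in one energy bin of one detector.
acceptance, acceptance_err = acceptance_efficiency(n_pass=420, n_total=600)

# HYPOTHETICAL trigger efficiency for the same energy bin.
trigger_eff = 0.95
total_eff = acceptance * trigger_eff

# Integrated [1-2] keV rates quoted in the text (counts keV^-1 kg^-1 d^-1).
rate_previous, rate_bdt = 5.39, 4.40
reduction = (rate_previous - rate_bdt) / rate_previous

print(f"acceptance = {acceptance:.3f} +/- {acceptance_err:.3f}")
print(f"total efficiency = {total_eff:.3f}")
print(f"background reduction = {100 * reduction:.0f} %")
```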

Trigger rate cut

Radioactive backgrounds and dark matter are expected to produce a constant rate of events in the detector when considering short time intervals. High-trigger-rate periods may be caused by, for example, electrical or mechanical disturbances. Such periods, being statistically inconsistent with the average detection rate of the experiment, can be safely discarded. To do so, we evaluated the daily rate of low-energy single-hit events (below 3 keV) passing the BDT cut for each detector (blue line in Fig. 11). All bins exceeding three standard deviations from the detector’s annually averaged rate are removed. The red line in Fig. 11 shows the event rate surviving the cut. The fraction of rejected live time after applying this filtering varies for each detector but, in any case, it is less than 1%.
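A minimal sketch of such a rate cut is given below for one detector and one year of daily bins. It assumes that only upward fluctuations are rejected and that the standard deviation is the spread of the daily rates over the year, computed once with all bins included; the text does not specify these details. The daily rates are hypothetical, with two artificially noisy days.

```python
import statistics


def high_rate_cut(daily_rates, n_sigma=3.0):
    """Return the indices of the daily bins that survive the cut.

    ASSUMPTION: only bins above mean + n_sigma * sigma are rejected, and
    mean and sigma are computed once from all bins of the period.
    """
    mean = statistics.fmean(daily_rates)
    sigma = statistics.pstdev(daily_rates)
    threshold = mean + n_sigma * sigma
    return [day for day, rate in enumerate(daily_rates) if rate <= threshold]


# HYPOTHETICAL year of daily rates (arbitrary units) with two noisy days.
daily = [8.0 + (day % 5) for day in range(365)]
daily[100] = daily[200] = 40.0

kept = high_rate_cut(daily)
rejected_fraction = 1.0 - len(kept) / len(daily)
print(f"kept {len(kept)} of {len(daily)} days, "
      f"rejected {100 * rejected_fraction:.2f} % of the live time")
```

Because the outliers enter the mean and standard deviation, a single pass is slightly conservative. An iterative variant would be a natural refinement, although nothing in the text indicates that one is used.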

Fig. 11: Evolution over time of the trigger rate for each ANAIS–112 detector.

In blue: trigger rate (calculated in 1-day time bins) for events filtered with the Boosted Decision Tree (BDT) algorithm below 3 keV during the first three years of ANAIS–112 data taking. In red: the same, but after applying the cut in trigger rate described in the text. Note that usually the blue line is masked by the red one.