Polynomial modelling of high-quality yet incomplete rare earth element data sets and a holistic assessment of REE anomalies

Ernst, David M.; Vogt, Joachim; Bau, Michael; Mues, Malte

doi:10.1038/s41598-025-89227-2

Published: 13 February 2025

Scientific Reports volume 15, Article number: 5360 (2025)

Abstract

Rare earth elements (REEs) are powerful proxies used in many (bio-)geochemical studies. Interpretation of REE data relies on normalised REE patterns and anomaly quantification, and requires complete data. Therefore, older, high-quality REE data determined by neutron activation or isotope dilution methods are often ignored, as they did not provide complete data. Similarly, modern analytical data can lack certain REEs due to quantification limits, interferences or usage of REE spikes. However, such data may be the only information available since sample material was consumed, sample locations became inaccessible, or samples represent past states of a dynamic natural system. Therefore, the ability to impute such high-quality data is of value for many geoscientific sub-disciplines. We use a polynomial modelling approach to impute missing REE data, verify the method’s applicability with a large data set (>13,000 samples; PetDB), and complement three originally incomplete REE data sets. Good fitting results (SD <6%) are supported by Monte Carlo simulations for assessing the model uncertainties (± 12%). Additionally, we provide a procedure to quantify REE anomalies, including uncertainties, which were usually not determined in the past but are essential for scientific comparison of REE anomaly data between different data sets. All Python scripts are provided.

Introduction

The rare earth elements (REEs) are a group of elements with similar physical properties and coherent geochemical behaviour. Therefore, REEs are widely used as proxies for numerous geochemical processes (e.g.^1,2,3,4), physico-chemical environments or element sources^5,6,7,8. Individual REE anomalies are of particular scientific interest since they indicate deviations from “normal” conditions. For example, Eu anomalies can be indicative of reducing and hot (>250 °C) formation conditions (e.g.^9,10,11). Research on REEs intensified in the 1960s with the emergence of neutron activation analysis (NAA; e.g.^12,13,14) and, subsequently, isotope dilution (ID; e.g.^15,16). However, often only incomplete REE data sets could be determined due to the specific limitations of the analytical method used. Isotope dilution techniques, for example, do not allow the determination of the monoisotopic REEs praseodymium (Pr), terbium (Tb), holmium (Ho), and thulium (Tm). Neutron activation data sets usually lack praseodymium, dysprosium (Dy), holmium, and erbium (Er). As the interpretation of REE data is predominantly based on shale- or chondrite-normalised REE (REE_N) distribution patterns, missing REEs can alter the appearance of REE patterns and hence affect the interpretation or, as in the case of the lanthanide tetrad effect, even prevent it. Introduced in the 1970s, inductively coupled plasma atomic emission spectroscopy (ICP-AES) and later inductively coupled plasma mass spectrometry (ICP-MS) allowed the measurement of all 14 naturally occurring REEs (e.g.¹⁷). Although ID and NAA data sets are incomplete, the comparison of reference material (RM) data derived from ID and NAA measurements with ICP-MS measurements shows that ID and NAA are accurate and precise analytical methods. Cobb¹⁸ and Higuchi et al.¹⁹, for example, determined REE contents by NAA in the W-1 diabase RM that are very similar (within common uncertainties) to the results of more recent studies that used ICP-MS (e.g.²⁰). In the case of isotope dilution analysis, Hooker et al.¹⁶ and Langmuir et al.²¹ independently determined REE contents for the BCR-1 basalt RM that were later confirmed by ICP-MS measurements (e.g.²²).

Despite their high analytical quality, ID and NAA data sets are usually ignored in current studies. Only complete REE data provide maximum information and allow for REE anomaly quantification. Rare earth element anomalies are among the strongest proxies in REE research. Their quantification is based on the respective REE neighbours, which is why incomplete REE data sets are usually not suitable for this purpose. For example, missing gadolinium (Gd) poses a problem for traditional Eu anomaly quantification. Below, we solve this specific example of missing Gd. Not only ID and NAA but also modern ICP-MS data can lack certain REEs. This might be due to measurements close to quantification limits or the usage of REE spikes like Tm in REE preconcentration protocols (e.g.^23,24,25). In addition, certain REE isotopes can be affected by isobaric interferences that may not be resolved even by high-resolution ICP-MS instruments, especially in high-concentration samples (e.g.^20,26). Nevertheless, all these incomplete REE data still bear great information potential, especially since samples from such publications might no longer be available for re-measurement. Moreover, in other cases, sample locations might not be accessible anymore, or the samples represent past states of the sampled ecosystem (e.g., soil or water samples). Therefore, the published data sets are often the only information available, and imputing the missing REEs and extracting as much information as possible from these data sets is of great scientific value and in the interest of many geoscientific sub-disciplines. Retrieving maximum information from historical measurements aligns with the current general efforts in science to make long-tail data more usable and accessible. Therefore, the purpose of this study is to demonstrate how to make use of incomplete REE data by advanced imputation methods. The methods presented here open up new possibilities for using existing high-quality REE data for new research. Furthermore, we provide a ready-to-use software tool that conducts the modelling.

This study follows O’Neill²⁷ and employs polynomial modelling of logarithmic REE data to impute missing REE measurements. The polynomials relate atom size (ionic radius) to the measured normalised contents. In the following, we refer to this approach as λ polynomial modelling (λPM) and show that it is suitable for imputing missing REE data. In addition to λPM, we apply standard polynomial modelling (SPM) with monomials as basis functions. The modelled REE data can then be used to accurately determine anomalies and interpret complete REE patterns. This initial study focuses on mafic and ultramafic rock samples. However, with rock-type adaptations and after careful testing, the method presented here should also apply to other sample types.

We conducted Monte Carlo simulations to assess the uncertainties of modelled REE data and REE anomaly quantifications. Currently, Monte Carlo simulations are not very common in geochemical data assessment. However, our results show that they have great potential to improve the interoperability and reusability of geochemical data, following the FAIR Data Principles²⁸.

Methods and data

All data and the Python scripts used for this work can be found on Zenodo (https://doi.org/10.5281/zenodo.11084980).

Modern ICP-MS measurements allow the accurate and precise determination of contents for all naturally occurring REE with generally low relative standard deviations (RSD) between 3% and 15% (e.g.^29,30,31,32). The modelled results in this study will be evaluated based on this RSD range.

λ polynomial modelling of REE data

The basic concepts were developed by O’Neill²⁷ who investigated the quantification of chondrite-normalised (sub-script CN; REE_CN) REE patterns by polynomial modelling and the application of “lambda shape coefficients” (λ-parameters) for the interpretation of igneous processes. More specifically, the basis functions chosen by O’Neill²⁷ are mutually orthogonal polynomials if all REEs are present in a data set, and the expansion coefficients are termed λ shape coefficients or λ-parameters. These λ-parameters are determined by least-squares fitting of orthogonal polynomials to match the REE pattern²⁷. Although up to 14 polynomials, and hence λ-parameters, could be determined for a full REE pattern, using more than five becomes statistically insignificant²⁷. In the case of basalts, O’Neill²⁷ showed that for most samples, three polynomials, i.e., λ0, λ1, and λ2, are sufficient to precisely quantify their REE_N pattern. These first three λ-parameters also have immediate implications for the REE_N pattern: λ0 gives the normalised average REE content of the sample. λ1 gives the REE_N pattern’s slope indicating enrichment or depletion of the light vs. heavy REEs, and λ2 refers to the quadratic curvature indicating enrichment or depletion of the medium REEs.
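The role of the first λ-parameters can be illustrated with a minimal sketch. It is not the implementation of O’Neill²⁷ or of pyrolite: the ionic radii are approximate eight-fold coordination values assumed for illustration, the function names `orthogonal_basis` and `fit_shape_coefficients` are hypothetical, and the orthogonal polynomials are simply constructed by a QR decomposition over the radii of a complete REE pattern.

```python
import numpy as np

# Approximate ionic radii in angstrom (trivalent, eight-fold coordination).
# Assumed values for illustration; Pm is omitted because it is never measured.
RADII = {
    "La": 1.160, "Ce": 1.143, "Pr": 1.126, "Nd": 1.109, "Sm": 1.079,
    "Eu": 1.066, "Gd": 1.053, "Tb": 1.040, "Dy": 1.027, "Ho": 1.015,
    "Er": 1.004, "Tm": 0.994, "Yb": 0.985, "Lu": 0.977,
}


def orthogonal_basis(radii, degree):
    """Monic polynomials of degree 0..degree evaluated at `radii`,
    mutually orthogonal over exactly these radii (one column each)."""
    vander = np.vander(radii, degree + 1, increasing=True)
    q, upper = np.linalg.qr(vander)
    # Scaling column j by R[j, j] makes the polynomial monic;
    # column 0 then consists of ones.
    return q * np.diag(upper)


def fit_shape_coefficients(radii, ln_ree_n, degree=2):
    """Least-squares expansion coefficients (lambda-like parameters)."""
    basis = orthogonal_basis(radii, degree)
    coeffs, *_ = np.linalg.lstsq(basis, ln_ree_n, rcond=None)
    return coeffs


r = np.array(list(RADII.values()))
# Synthetic log-normalised pattern: constant level plus a linear slope.
ln_pattern = 1.5 + 4.0 * (r - r.mean())
lam = fit_shape_coefficients(r, ln_pattern, degree=2)
print("level, slope, curvature:", lam)
```

Because every higher-order column is orthogonal to the constant column, the first coefficient equals the mean of the log-normalised pattern, which is why λ0 can be read as the average normalised content; the second coefficient recovers the slope and the third stays close to zero for this purely linear pattern.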

Already in his study, O’Neill²⁷ discussed that determining λ-parameters does not require complete REE patterns. The study presented here uses this principle and examines it in detail.

The λ-parameter fitting method developed by O’Neill²⁷ was expanded by Anenburg and Williams³³, who also implemented the calculations in the Python package pyrolite (³⁴; https://pyrolite.readthedocs.io/). For in-depth explanations of the nature and determination of these λ-parameters, the reader is referred to the two publications mentioned above as well as Anenburg³⁵ and the references therein. Computations were run via Python (version 3.11.5) utilising the pyrolite package (version 0.3.5) of Anenburg and Williams³³.

Some data in the original publications were provided in ppb or ppt. Therefore, if needed, all input data was converted into ppm (mg/kg) for easier handling.

Afterwards, λ-parameters were determined for each sample following the approach of Anenburg and Williams³³. These λ-parameters constitute the coefficients in the polynomial expansion used for reconstructing the REE composition, i.e., each λ-parameter is multiplied with the corresponding orthogonal polynomial. With this λ polynomial modelling (λPM), missing REE data (e.g., in ID and NAA data sets) can be imputed and used for further interpretations. Anomalous Ce and Eu contents were excluded from the modelling. The anomaly detection and exclusion are included in the provided Python scripts: if, in a first modelling loop, the modelled Ce or Eu content deviates from the measured content by more than a set threshold (10% by default), the element is marked as anomalous and excluded in the following modelling loop.
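The two-loop exclusion of anomalous Ce and Eu can be sketched as follows. This is an illustration under stated assumptions, not the provided scripts: a plain NumPy polynomial fit in logarithmic space stands in for the λ-parameter fit, the radii are approximate assumed values, and `fit_model` and `model_with_anomaly_check` are hypothetical names.

```python
import numpy as np
from numpy.polynomial import Polynomial

# Approximate ionic radii in angstrom (assumed values for illustration).
RADII = {
    "La": 1.160, "Ce": 1.143, "Pr": 1.126, "Nd": 1.109, "Sm": 1.079,
    "Eu": 1.066, "Gd": 1.053, "Tb": 1.040, "Dy": 1.027, "Ho": 1.015,
    "Er": 1.004, "Tm": 0.994, "Yb": 0.985, "Lu": 0.977,
}


def fit_model(ree_n, used, degree):
    """Fit ln(normalised content) against radius and model all 14 REEs."""
    x = np.array([RADII[el] for el in used])
    y = np.log([ree_n[el] for el in used])
    poly = Polynomial.fit(x, y, degree)  # stand-in for the lambda fit
    return {el: float(np.exp(poly(RADII[el]))) for el in RADII}


def model_with_anomaly_check(ree_n, threshold=0.10, degree=3):
    """First loop: fit with all measured REEs and flag Ce/Eu that deviate
    by more than `threshold`; second loop: refit without flagged elements."""
    used = [el for el in RADII if el in ree_n]
    modelled = fit_model(ree_n, used, degree)
    flagged = [
        el for el in ("Ce", "Eu")
        if el in ree_n and abs(modelled[el] / ree_n[el] - 1.0) > threshold
    ]
    if flagged:
        used = [el for el in used if el not in flagged]
        modelled = fit_model(ree_n, used, degree)
    return modelled, flagged


# Synthetic ID-style sample: smooth pattern, no Pr/Tb/Ho/Tm, doubled Eu.
smooth = {el: float(np.exp(2.0 + 3.0 * (r - 1.07))) for el, r in RADII.items()}
sample = {el: v for el, v in smooth.items() if el not in ("Pr", "Tb", "Ho", "Tm")}
sample["Eu"] *= 2.0
modelled, flagged = model_with_anomaly_check(sample)
print("excluded:", flagged)
print("Eu measured / modelled:", sample["Eu"] / modelled["Eu"])
```

After the second loop, the modelled Eu value represents the non-anomalous background, so the ratio of measured to modelled Eu in this synthetic case returns the factor of two that was imposed.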

Standard polynomial modelling of REE data

Additionally, modelling was conducted via polynomial fitting utilising the NumPy polynomial package (numpy.polynomial). The polynomial modelling was conducted similarly to the pyrolite modelling: a polynomial of a given degree was fitted to the distribution of normalised REE content (y-axis) against the respective ionic radii (x-axis). In contrast to λPM, the polynomials are not orthogonal and not pre-determined. They are computed individually for each sample. Anomalous REEs were excluded from the fitting. Subsequently, the resulting polynomial coefficients were used to re-model the REE composition. A maximum polynomial degree of three was found to be best for computation (see below). The results of λPM and standard polynomial modelling (SPM) are compared and discussed below.
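A minimal sketch of such a standard polynomial fit is given here. It assumes that the fit is carried out on the natural logarithm of the normalised contents and uses approximate ionic radii; the function name `spm_impute` is hypothetical and the code is not taken from the provided scripts.

```python
import numpy as np
from numpy.polynomial import Polynomial

# Approximate ionic radii in angstrom (assumed values for illustration).
RADII = {
    "La": 1.160, "Ce": 1.143, "Pr": 1.126, "Nd": 1.109, "Sm": 1.079,
    "Eu": 1.066, "Gd": 1.053, "Tb": 1.040, "Dy": 1.027, "Ho": 1.015,
    "Er": 1.004, "Tm": 0.994, "Yb": 0.985, "Lu": 0.977,
}


def spm_impute(ree_n, degree=3):
    """Keep measured values and fill missing REEs from a polynomial of
    the given degree fitted to ln(content) against ionic radius."""
    measured = [el for el in RADII if el in ree_n]
    x = np.array([RADII[el] for el in measured])
    y = np.log([ree_n[el] for el in measured])
    poly = Polynomial.fit(x, y, degree)
    return {
        el: ree_n[el] if el in ree_n else float(np.exp(poly(RADII[el])))
        for el in RADII
    }


def cubic_pattern(radius):
    """Synthetic, exactly cubic log-pattern used only for this demo."""
    u = (radius - 1.07) / 0.1
    return float(np.exp(2.0 + 1.0 * u - 0.5 * u**2 + 0.3 * u**3))


truth = {el: cubic_pattern(r) for el, r in RADII.items()}
id_like = {el: v for el, v in truth.items() if el not in ("Pr", "Tb", "Ho", "Tm")}
completed = spm_impute(id_like, degree=3)
for el in ("Pr", "Tb", "Ho", "Tm"):
    print(el, completed[el] / truth[el])
```

`Polynomial.fit` rescales the radii to a standard interval internally, which keeps the fit well conditioned even though the radii span only a narrow range.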

Monte Carlo simulation to assess modelling precision

We conducted Monte Carlo simulations (MCSs) on λPM and SPM data from the ID-modified method verification data set to assess the uncertainties of modelled REE data. Monte Carlo simulation is a stochastic method to assess prediction uncertainties based on the input variables and a suitable probability distribution³⁶. As it is unknown whether the measured data are correct or contain biases, the measured values are altered by a simulated bias drawn from a probability distribution. For each of these artificially constructed samples, modelled REE data are computed. The result of the MCS is a mean value for each REE together with the spread of the simulated values around this mean. By the law of large numbers, this mean converges towards a statistically robust estimate as the number of repetitions increases. Due to its robustness, MCS is widely used in industry and science (e.g.³⁷). In an MCS, the probable outcomes of the input model are tested repetitively, yielding a probability distribution under the given constraints. Under the assumption that the REE data errors follow a lognormal distribution, the MCSs reproduce relative uncertainties for these REE data.

As a setup for the MCSs, a normal distribution with a standard deviation of 0.1 was chosen to match the average analytical uncertainties of modern ICP-MS systems during routine operations. For each sample, 100 repetitions were computed. A comparative run with 10,000 repetitions yielded only slightly different results. As the computations of 10,000 repetitions take disproportionately longer, 100 repetitions were considered sufficient for a robust assessment of the model predictions and parameter uncertainties. The results of this extensive MCS run are not included here, but the reader is referred to the Python scripts provided to re-run the MCS with individual parameters.
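The setup described above can be sketched roughly as follows. Several points are assumptions made only for this illustration: the simulated bias is applied as multiplicative noise exp(N(0, 0.1)), a cubic NumPy fit in logarithmic space stands in for the actual modelling step, the radii are approximate, and `monte_carlo_model` is a hypothetical name.

```python
import numpy as np
from numpy.polynomial import Polynomial

# Approximate ionic radii in angstrom (assumed values for illustration).
RADII = {
    "La": 1.160, "Ce": 1.143, "Pr": 1.126, "Nd": 1.109, "Sm": 1.079,
    "Eu": 1.066, "Gd": 1.053, "Tb": 1.040, "Dy": 1.027, "Ho": 1.015,
    "Er": 1.004, "Tm": 0.994, "Yb": 0.985, "Lu": 0.977,
}


def monte_carlo_model(ree_n, n_rep=100, sigma=0.1, degree=3, seed=0):
    """Perturb the measured values n_rep times, refit each time and return
    the mean modelled content plus the 5th/95th percentile of the relative
    deviation from that mean (in %), per element."""
    rng = np.random.default_rng(seed)
    used = [el for el in RADII if el in ree_n]
    x = np.array([RADII[el] for el in used])
    x_all = np.array(list(RADII.values()))
    measured = np.array([ree_n[el] for el in used])
    runs = np.empty((n_rep, len(RADII)))
    for i in range(n_rep):
        # Assumed bias model: multiplicative noise, normal in log space.
        noisy = measured * np.exp(rng.normal(0.0, sigma, size=measured.size))
        poly = Polynomial.fit(x, np.log(noisy), degree)
        runs[i] = np.exp(poly(x_all))
    mean = runs.mean(axis=0)
    rel = (runs / mean - 1.0) * 100.0
    p05, p95 = np.percentile(rel, [5, 95], axis=0)
    names = list(RADII)
    return dict(zip(names, mean)), dict(zip(names, p05)), dict(zip(names, p95))


smooth = {el: float(np.exp(2.0 + 3.0 * (r - 1.07))) for el, r in RADII.items()}
id_like = {el: v for el, v in smooth.items() if el not in ("Pr", "Tb", "Ho", "Tm")}
mc_mean, mc_p05, mc_p95 = monte_carlo_model(id_like)
for el in ("Pr", "Tb", "Ho", "Tm"):
    print(f"{el}: mean={mc_mean[el]:.3f}  p05={mc_p05[el]:.1f}%  p95={mc_p95[el]:.1f}%")
```

Fixing the seed makes a run reproducible; raising `n_rep` trades computing time for smoother percentile estimates.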

Input data and data preparation

Two types of data were used: (i) complete REE data sets for method verification and (ii) isotope dilution (ID) and neutron activation analysis (NAA) data sets, both incomplete by nature, to demonstrate the application. Method verification REE data were selected to match the sample types of the incomplete REE data. In this first study, we focused on REE data for ultramafic and mafic rocks, which are readily available.

The modelling process was verified by applying the new modelling approach to complete REE data sets from which certain REEs had been intentionally removed, allowing comparison between the modelled results and the values actually measured. The quality of the modelling method is assessed by statistical evaluation of the deviations. In this study, method verification was achieved with a data set of 14,321 mafic and ultramafic rock samples from the PetDB database (Supplementary Tab. S1;³⁸; downloaded on 22nd January 2024, using the following parameters: rock type = mafic, ultramafic and full REE data available). The references of each publication included in this bulk data set can be found in the Supplementary Tab. S5. The downloaded PetDB data set was screened for outlier and irregular data, excluding 1,055 samples and leaving 13,266 samples to be used for the computations. Outliers are unexpected REE anomalies that, unlike the Ce and Eu anomalies, cannot be explained by natural processes. Samples with zigzag REE patterns are considered irregular data and are excluded as they usually indicate analytical issues (e.g.³⁰). Since this study focuses on the development and validation of the polynomial modelling method, screened data were used.

The PetDB data set was modified in two ways: (i) praseodymium (Pr), terbium (Tb), holmium (Ho) and thulium (Tm) were removed to imitate ID data, and (ii) praseodymium, dysprosium (Dy), holmium, and erbium (Er) were removed to imitate NAA data. In the case of ID-modified data, sequential removal of Pr, Tb, Ho and Tm was also conducted to investigate the behaviour of each REE upon stepwise removal of individual REEs. The modified data sets were processed as described above. Subsequently, the modelled REE data was compared to the measured REE data, with emphasis on the removed REE.

Terminology

In the following, measured data are compared with modelled data. To avoid confusion, the following terminology is used: For simplicity, we use the term REE “content” or “data” instead of “concentration” (content in liquid samples) or “mass fraction” (content in solid samples). The originally measured REE content is called “measured”, while the re-modelled REE content is referred to as “modelled”. The REEs comprise 15 elements, and one measurement for one individual REE is named “data point”. For example, a complete REE pattern comprises 14 data points because Pm is always missing. “Deviation” describes the relative difference between the modelled and the measured data according to Eq. (1). In all presented REE pattern plots, the REEs are normalised to C1 Chondrite (C1) according to³⁹.

$$\text{Deviation}\,[\%] = \left(\frac{\text{modelled REE}}{\text{measured REE}} - 1\right) \times 100 \qquad (1)$$
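As a small worked example of Eq. (1), the following sketch computes the deviation for two made-up value pairs; the function name `deviation_percent` and the numbers are purely illustrative.

```python
def deviation_percent(modelled, measured):
    """Relative difference between modelled and measured content, Eq. (1)."""
    return (modelled / measured - 1.0) * 100.0


# Hypothetical contents: the model overestimates the first value by about
# 3 % and underestimates the second by about 3 %.
over = deviation_percent(10.3, 10.0)
under = deviation_percent(9.7, 10.0)
print(f"{over:.1f} %, {under:.1f} %")
```

Positive deviations therefore mean that the model overestimates the measured content, negative deviations that it underestimates it.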

Results and discussion

Results of method verification – λ polynomial modelling

Verification of the method developed here is best ensured by testing with complete REE data sets of the same sample type as the target data sets. Therefore, data of mafic and ultramafic rocks were used for method verification. Besides the intentionally removed REEs (mimicking ID- and NAA-determined data sets), cerium (Ce) and europium (Eu) were excluded for samples showing Ce or Eu anomalies. As an example, Fig. 1 shows the PETDB-1284-COCOS 36 basalt sample from⁴⁰. In this case, Pr, Tb, Ho and Tm were removed prior to modelling to simulate an ID data set. The results shown in Fig. 1 demonstrate that all modelled REE data coincide with the measured data. The average deviation of the modelled REE from the measured REE is less than 2% (“average deviation: 0.0105” in Fig. 1). The largest individual deviations of modelled REE in PETDB-1284-COCOS 36 are −3.0% for Er and 4.1% for Dy (Supplementary Tab. S2), which is within analytical uncertainty. The modelled data and the deviation for each REE in each sample can be found in the Supplementary Tab. S2 (ID-modified data) and Supplementary Tab. S3 (NAA-modified data).

A summary of all deviations between modelled REE and measured REE in the method verification data set (ID modified) is shown in the density histogram of Fig. 2A. The x-axis shows the deviation between modelled and measured REEs, with values close to zero indicating a high accuracy of the modelling process. The method verification data set contains 184,432 individual data points (13,266 samples with 14 REEs each, minus the 1,292 cases in which Ce or Eu were excluded). The deviations are narrowly distributed around a mean of 0.3%. Almost all data points show deviations of less than 10% (absolute value), with 90% of the data points ranging between −5.4% and 7.2% (5th and 95th percentiles; SD = 4.0). The kernel density estimation (KDE) curve is nearly symmetric, with a skewness of 0.63. A perfectly symmetric distribution, like the normal distribution, has a skewness of zero, and skewness values between −1 and 1 indicate only minor asymmetry. The near-symmetry of the KDE and the mean close to zero indicate that the applied method does not introduce a systematic bias to the modelled REE contents. The KDE’s kurtosis (normalised fourth central moment) of 6.99 exceeds the value of 3 of a normal distribution. Compared with a normal distribution of the same standard deviation, the deviations are therefore more strongly concentrated in a sharp peak around the mean, while the few larger deviations form comparatively long tails. The concentration of most data points close to zero corroborates the high precision of the applied method.
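The summary statistics quoted for the deviation distributions can be computed as in the sketch below. It uses only the Python standard library, population moments, and randomly generated stand-in data, so the helper `summarise` and the two synthetic samples are assumptions for illustration and do not reproduce the published numbers.

```python
import random
import statistics


def summarise(deviations):
    """Mean, SD, skewness, kurtosis (normal distribution = 3) and the
    5th/95th percentiles of a list of deviations."""
    n = len(deviations)
    mean = statistics.fmean(deviations)
    sd = statistics.pstdev(deviations)
    skew = sum((d - mean) ** 3 for d in deviations) / (n * sd**3)
    kurt = sum((d - mean) ** 4 for d in deviations) / (n * sd**4)
    cuts = statistics.quantiles(deviations, n=20)  # cut points in 5 % steps
    return {"mean": mean, "sd": sd, "skewness": skew,
            "kurtosis": kurt, "p05": cuts[0], "p95": cuts[-1]}


random.seed(1)
# Stand-in data: a normal sample and a double-exponential (Laplace) sample.
normal_sample = [random.gauss(0.0, 4.0) for _ in range(20000)]
laplace_sample = [random.choice((-1, 1)) * random.expovariate(0.25)
                  for _ in range(20000)]

normal_stats = summarise(normal_sample)
laplace_stats = summarise(laplace_sample)
print("normal :", {k: round(v, 2) for k, v in normal_stats.items()})
print("laplace:", {k: round(v, 2) for k, v in laplace_stats.items()})
```

The normal sample returns a kurtosis close to 3 and a skewness close to zero, whereas the sharply peaked, longer-tailed Laplace sample returns a clearly higher kurtosis, which is the reference frame for reading the values reported here.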

Figure 2B shows the deviations between modelled and measured data for Pr, Tb, Ho, and Tm, i.e., for those REEs that had been intentionally removed. They show slightly larger deviations than the remaining REEs, but most are still <10% and fall in a narrow range between −6.9% and 9.7% (5th and 95th percentiles; SD = 5.4).

Figure S1 (A–C; Supplementary) shows how the deviation between modelled data and measured data of each REE changes upon the sequential removal of Pr, Tb, Ho and Tm. The p0.05, p0.95, mean, standard deviation, skewness and kurtosis for each KDE curve can be found in the Supplementary Tab. S6. The solid black KDE curve refers to REE modelling with the complete verification data set. These baseline KDEs cover narrow ranges for each REE and are predominantly symmetrical around zero, with some minor exceptions. The blue, orange, green and red KDE curves refer to the modelled data after the sequential removal of Pr, Tb, Ho and Tm. For all REEs that are not removed from the input data set (La, Ce, Nd, Sm, Eu, Gd, Dy, Er, Yb, and Lu), the different KDE curves are almost identical, indicating that the removal of other REEs does not affect their modelling. The only exceptions are Er and Yb, which show slightly narrower KDE curves after Tm removal. Neither case indicates a loss of modelling quality. The removal of Pr, Tb, Ho and Tm from the input data set affects their respective modelling. After removal, the slightly broader KDE curves of each element indicate larger deviations between modelled and measured data. However, most modelled REE data still have deviations of less than 10% (absolute value; Supplementary Tab. S6). Furthermore, the KDE curves for Pr, Tb, Ho and Tm only change after the respective REE is removed from the input data set and are not affected by the removal of further REEs: the Tb KDE after Pr removal is identical to the Tb baseline KDE; the Ho KDE curves after Pr removal and after Pr and Tb removal are identical to the Ho baseline KDE; the Tm KDE curves after Pr, Tb and Ho removal are similar to the Tm baseline KDE. This behaviour shows that the λPM is mostly independent of neighbouring REEs. Mathematically, this result is expected as the REE patterns are smooth and the remaining fitting points are sufficient for computing the regression. Removing La or Lu might have more impact than removing in-between values. The application of λPM is not limited to ID and NAA data sets, as presented here. Although there surely is a limit to how many REEs can be removed from a data set before the modelling yields erroneous results, this limit was not reached for the ID- or NAA-modified data sets presented above. We did not conduct extensive testing on when the modelling fails. Our experience is that, when distributed equally, at least half of the REEs can be removed. If neighbouring REEs are removed, the breaking point might be around four or five, depending on the sample and what is considered an erroneous result. However, the modelling is most sensitive to removing REEs at the edges. For example, removing just La and Ce might already lead to erroneous results for the remaining light REEs.

Like the above-described computations on the ID-modified data set, method verification was also conducted on an NAA-modified version of the PetDB data set. Modelling of the NAA-modified data yields results almost identical to the ID-modified modelling. Figure S2a shows the KDE plot for the deviations of modelled REE and measured REE in the NAA-modified data set. The deviations cluster in a narrow range around a mean of −0.6% with 5% and 95% percentiles of −7.3% and 6.0%, respectively (SD = 4.3, kurtosis = 5.84). Figure S2b shows the KDE plot for the deviations of Pr, Dy, Ho and Er (NAA-modification). This KDE is slightly shifted to a mean of −2.2, indicating a minor underestimation of the removed REE by modelling. A closer look at the individual deviations (Fig. S3) shows that while the deviation KDE for Pr is centred at almost zero (mean = −0.5), the KDEs for Dy, Ho and Er are shifted towards the left side with decreasing means between −2.4 and −3.3. Mathematically, the explanation for this behaviour is that Dy, Ho and Er are direct neighbours, creating a rather large gap of three missing values between Tb and Tm. In comparison, for the ID-modified data, each removed REE was surrounded by two fixed REE data points that supported the fitting. Nevertheless, the modelled data for Dy, Ho and Er in the NAA-modified data set predominantly show deviations <10% and are, therefore, in good agreement with typical uncertainties of modern routine analytical methods for REEs.

Overall, the low deviations between modelled and measured data and the high reproducibility, even after removing multiple REEs, demonstrate the accuracy and precision of the applied modelling approach. The large number of data points in the method verification data set also demonstrates its applicability to a wide range of mafic and ultramafic lithologies with a range of different REE compositions. The deviations between modelled and measured data fall well within typical analytical uncertainties for REE measurements by ICP-MS. Therefore, the λPM method is suitable for imputing incomplete REE data sets of mafic and ultramafic rocks.

Optimal polynomial degree for standard polynomial modelling

Fitting a polynomial to a data set as a regression model for estimating missing values always bears the risk of overfitting. O’Neill²⁷ showed that for the λ-parameter fitting, the maximum polynomial degree should be four, while for most samples, even degree three is sufficient. To determine the optimal maximum polynomial degree for the standard polynomial modelling (SPM), the entire PetDB data set (no REEs removed) was computed for polynomials of degrees one to eight, yielding eight sets of modelled REE data. For comparison, the root mean square (RMS) deviation of model prediction from measured values for each REE was computed in each of the eight model data sets. The RMS deviation was used to ensure positive values, and the results are shown in Fig. 3. Cerium and Eu are not shown due to their anomalously high or low content in many samples, resulting from their specific redox sensitivity. The RMS deviations of the REEs show asymptotic behaviour, approaching a lower limit of around 5% at a polynomial degree of three. This behaviour shows that increasing the polynomial degree from one to three also increases the prediction quality. Almost constant RMS deviations above degree three indicate that polynomials of degree four and higher no longer improve the prediction and only increase the risk of overfitting. Only the RMS deviations of La and Lu continue to decrease slightly above degree three. However, this is due to the position of La and Lu at the very beginning and end, respectively, of the REE patterns. Due to the nature of polynomial modelling, data at the edges are more strongly affected by the order of the polynomial. Therefore, we suggest that a maximum polynomial degree of three be chosen for SPM to model REE data. This is consistent with the observation of O’Neill²⁷, even though he chose a maximum degree of four. The RMS deviations of the REEs at a polynomial degree of three range from approximately 3% to 7%, with a mean close to 5%.
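The degree scan can be sketched with synthetic data as follows. The patterns, noise level, radii and the name `rms_deviation_by_degree` are assumptions for this illustration; unlike the analysis described above, the sketch pools all REEs into one RMS value per degree instead of reporting one value per REE.

```python
import numpy as np
from numpy.polynomial import Polynomial

# Approximate ionic radii in angstrom, La to Lu without Pm (assumed values).
RADII = np.array([1.160, 1.143, 1.126, 1.109, 1.079, 1.066, 1.053,
                  1.040, 1.027, 1.015, 1.004, 0.994, 0.985, 0.977])


def rms_deviation_by_degree(ln_patterns, degrees=range(1, 9)):
    """RMS of the relative deviation (in %) between fitted and input
    contents, pooled over all samples and REEs, for each degree."""
    result = {}
    for degree in degrees:
        deviations = []
        for ln_y in ln_patterns:
            poly = Polynomial.fit(RADII, ln_y, degree)
            # modelled / measured - 1, evaluated in linear space
            deviations.append(np.exp(poly(RADII) - ln_y) - 1.0)
        result[degree] = float(np.sqrt(np.mean(np.square(deviations))) * 100.0)
    return result


rng = np.random.default_rng(42)
u = (RADII - 1.07) / 0.1
# Synthetic cubic log-pattern plus 5 % noise, repeated for 200 samples.
smooth = 2.0 + 1.0 * u - 1.0 * u**2 + 0.4 * u**3
patterns = [smooth + rng.normal(0.0, 0.05, size=u.size) for _ in range(200)]
rms = rms_deviation_by_degree(patterns)
for degree, value in rms.items():
    print(degree, round(value, 2))
```

For this synthetic cubic pattern, the RMS deviation drops sharply up to degree three and changes only slowly afterwards, because higher-degree terms then mainly fit the noise.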

Results for standard polynomial modelling

The standard polynomial modelling was applied to the ID-modified PetDB data set. As described above, a maximum polynomial degree of three was selected for the modelling. The results can be found in the Supplementary Tab. S4 and are summarised in two histogram plots (Fig. S4) that show the deviation between the modelled and measured data for all REEs (Fig. S4a) and exclusively for the removed Pr, Tb, Ho and Tm (Fig. S4b). These two histograms are the counterparts to Fig. 2. The SPM yields good overall modelling results, with most data points showing deviations of less than 10%. In Figure S4a, 90% of the data range between −6.2% and 8.4% deviation (SD = 4.9). In Figure S4b, which only shows the deviations for the removed REEs, 90% of the data points range between −6.7% and 11.1% (SD = 6.4). Thus, the SPM yields slightly higher deviations compared to the λPM. However, it should be noted that the SPM was run with a polynomial degree of three, while the λPM was run with a maximum polynomial degree of four. The results of SPM are similar to those of λPM since both methods share the same principle of fitting polynomials to the REE distribution pattern.

Results for Monte Carlo simulation of modelled REE data from ID-modified PetDB data set

A simple Monte Carlo simulation was run on the ID-modified PetDB data set for both modelling methods, λPM and SPM. The results are visualised in Fig. 4 by means of histograms showing the deviations from the mean of each REE in each sample throughout 100 repetitions.

The Monte Carlo simulation for SPM yields slightly smaller uncertainties of approx. ±10.3% (5%–95% percentile; SD = 6.3) than the Monte Carlo simulation of λPM with approx. ±11.8% (5%–95% percentile; SD = 7.2). However, these marginal differences may be neglected as both modelling methods yield uncertainties within the typical analytical uncertainties for REEs.

These computations show that Monte Carlo simulation is a vital tool for assessing uncertainties of modelled REE data. Especially for REE anomaly quantification, the Monte Carlo simulation is a promising approach to assess the uncertainties. Classically, REE anomalies are quantified by calculating a theoretical non-anomalous REE* content (e.g.^41,42). Since this REE* value is not measured but only computed, uncertainties have usually not been fully assessed in the past. However, Monte Carlo simulation can now be applied to estimate REE* uncertainties robustly. An example is given further below. Holistic and accurate determinations of REE anomaly uncertainties are crucial to derive conclusions from REE anomaly quantification.

The Monte Carlo simulation might also be fine-tuned in future applications using individual probability distributions for each REE depending on their respective analytical uncertainty within a measurement. For example, one could use a sample-specific probability distribution based on large data sets of the same sample type.

Final comparison of λ polynomial and standard polynomial modelling

The modelling results obtained with the λPM and SPM are very similar. While the Monte Carlo simulation yields slightly lower uncertainties for SPM than for λPM, the fitting accuracy is higher for the latter. An example is given in Supplementary Figure S5, showing the measured REE in comparison to the λPM (Fig. S5a) and SPM (Fig. S5b) data. By simple visual inspection, it is obvious that the λPM yields the more accurate and realistic REE model. In the following, we will use the λPM for re-modelling REE data from originally incomplete REE data sets.

Results for imputing incomplete REE data

All results for the modelled data of originally incomplete REE data sets can be found in Supplementary Table 7. Figure 5 shows an example from the NAA data set of Potts and Condie⁴³ (91_ultramafite is an ultramafic rock sample). Data for Pr, Dy, Ho, and Er are missing in the data set⁴³ because NAA was used. Additionally, Eu was excluded during the modelling process as a slight negative Eu anomaly was detected. The modelled data coincide with the originally measured data for the remaining REEs with an average deviation of less than 2%. Gadolinium (−3.0%) and Tb (3.7%) show the largest deviations, which probably originate from the fact that Gd and Tb are surrounded by Eu (left side) and Dy, Ho, Er (right side), which were all either excluded during modelling or missing in the input data set. Nevertheless, deviations of <4% (absolute value) are remarkably good and much lower than typical relative standard deviations for the analysis of REEs. Further modelled REE data and deviations for each sample from Potts and Condie⁴³, Zindler et al.⁴⁴, and Stosch et al.⁴⁵ can be found in Supplementary Table 7.

A summary of the deviations between modelled and measured REEs for the REE data available in the input data sets of^43,44,45 can be found in Fig. 6a–c, respectively. All three data sets show means close to zero (all 0.1%) and small skewnesses (−0.28 to 0.19). The majority of modelled data show deviations of less than 10%, with 5th and 95th percentiles of −7.1% and 6.6% (SD = 4.6;⁴³), −7.8% and 7.1% (SD = 5.1;⁴⁴), and −7.1% and 9.8% (SD = 4.9;⁴⁵), respectively. Detailed deviations for each REE can be found in Supplementary Table 7, which also contains data from three olivine⁴³, two dunite⁴⁶, two ferrohedenbergite, and two fayalite samples from⁴⁷.

Uncertainties of modelled REE data assessed by Monte Carlo simulation

Monte Carlo simulations for λPM data were conducted for each of the three incomplete REE data sets, as described above. The only difference is that 1,000 repetitions were conducted during the MCS for the three originally incomplete data sets. Summaries of the Monte Carlo simulations are given in Fig. 6d–f, showing the histograms for the deviations from the data mean. All three data sets show similar distributions, with standard deviations of 8.1 (⁴³), 8.7 (⁴⁴) and 8.3 (⁴⁵). These uncertainties are somewhat larger than those determined via MCS for the ID-modified PetDB data set (Fig. 4a) and also larger than the plain deviations shown in Fig. 2. Therefore, the 5% and 95% percentiles given in Fig. 6d–f can be considered rather conservative uncertainties. Nevertheless, these uncertainties lie within the range of common analytical uncertainties for REEs.

Regardless of the sample type, modelled REE data show remarkably good agreement with the measured REE data for available REEs. Based on these low deviations and the above results for the method verification and MCS, it can be assumed that the modelled contents of the missing REEs are accurate within the above-given uncertainties.

Outcome and application of imputed REE data

The combination of measured and modelled REE data yields full REE distribution patterns that can be evaluated like other complete data sets. Figure 7b combines the measured (Fig. 7a) and modelled data of the kyanite eclogite sample S468 from Stosch et al.⁴⁵. Although already prominent in the measured data set, the Eu anomaly can now be evaluated and properly quantified. Anomaly quantification can be conducted using traditional arithmetic or geometric mean determination methods. Alternatively, anomalies can be computed using the modelled REE data²⁷. For basalts, computation via λPM (shape coefficient modelling) improves the Eu anomaly quantification by 18%²⁷. Thus, we highly recommend using the λPM method, as it considers all REEs and not just two neighbouring REEs. In the case of S468, the C1-normalised (after³⁹) measured value is [Eu]_C1 = 29.58 and the modelled value is [Eu]_C1* = 2.25, giving an Eu anomaly ([Eu]_C1 / [Eu]_C1*) of 13.14. For comparison, Eu anomaly quantification via measured Sm and Tb (Eq. (2); after⁴⁸) yields a value of 12.03 ([Eu]_C1* = 2.46), which is 8% lower.

$$\mathrm{Eu}_{C1}^{*} = 0.67 \times \mathrm{Sm}_{C1} + 0.33 \times \mathrm{Tb}_{C1}$$

(2)
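As a worked illustration of the two ways of obtaining the Eu anomaly of S468, the short sketch below evaluates Eq. (2) and the ratio of measured to expected Eu. The function names are hypothetical and the Sm and Tb inputs in the last lines are invented; only the three Eu values are taken from the text. Because the quoted inputs are rounded to two decimals, the ratios may differ from the published values in the last digit.

```python
def eu_star_from_sm_tb(sm_c1, tb_c1):
    """Expected Eu (Eu*) from C1-normalised Sm and Tb, following Eq. (2)."""
    return 0.67 * sm_c1 + 0.33 * tb_c1


def anomaly(measured_c1, expected_c1):
    """Anomaly expressed as the ratio of measured to expected normalised value."""
    return measured_c1 / expected_c1


eu_c1 = 29.58           # measured, C1-normalised Eu of S468 (from the text)
eu_star_model = 2.25    # Eu* from the polynomial model (from the text)
eu_star_eq2 = 2.46      # Eu* from Eq. (2) (from the text)

anomaly_model = anomaly(eu_c1, eu_star_model)
anomaly_eq2 = anomaly(eu_c1, eu_star_eq2)
print(round(anomaly_model, 1))  # 13.1
print(round(anomaly_eq2, 1))    # 12.0

# Relative difference between the two approaches, in percent
print(round((1 - anomaly_eq2 / anomaly_model) * 100))  # 9 with these rounded inputs (8% in the text)

# Eq. (2) with invented Sm and Tb values, for illustration only
print(eu_star_from_sm_tb(3.0, 1.5))
```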

Another example is given in Fig. 8 for the picrite basalt sample RE 78⁴⁴ showing a strong positive Ce anomaly. Therefore, Ce was excluded from the modelling (“model 1” in Fig. 8), yielding good agreement with the remaining measured REE. However, model 1 has a comparably low value for Ce, causing a slight depression between measured La and Nd. This may look odd at first glance, suggesting a potential La anomaly. Therefore, for model 2, Ce was first determined by linear interpolation between La and Nd (Eq. 3), and the model was then computed with the linearly interpolated Ce. However, Fig. 8 clearly shows that model 2 has larger deviations from the measured REE, especially for La (5.6%) and Nd (17.0%), than model 1 (La: −0.4%; Nd: 3.1%). Also, there is a significant difference between the linearly interpolated Ce (0.88 ppm), which was included in the input file for model 2, and the modelled Ce value of model 2 (0.77 ppm). Considering the overall smoothness and fit, model 1 is preferred, strongly arguing for REE anomaly quantification using λPM instead of interpolation between two neighbouring REEs, as already discussed²⁷. Furthermore, this example shows that λPM might even help to uncover other hidden anomalies, like La in the RE78 sample. This has to be examined in more detail in the future.

$$\log_{10}\left(\mathrm{Ce}_{C1}^{*}\right) = 0.67 \times \log_{10}\left(\mathrm{La}_{C1}\right) + 0.33 \times \log_{10}\left(\mathrm{Nd}_{C1}\right)$$

(3)
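Eq. (3) interpolates linearly between the logarithms of the normalised La and Nd values, which is equivalent to a weighted geometric mean of the two. The sketch below shows this equivalence; the function name and the input values are hypothetical and do not correspond to sample RE 78.

```python
import math


def ce_star_from_la_nd(la_c1, nd_c1):
    """Expected Ce (Ce*) from C1-normalised La and Nd, following Eq. (3)."""
    log_ce_star = 0.67 * math.log10(la_c1) + 0.33 * math.log10(nd_c1)
    return 10.0 ** log_ce_star


# Invented C1-normalised values, for illustration only
la_c1, nd_c1 = 10.0, 5.0
ce_star = ce_star_from_la_nd(la_c1, nd_c1)

# The same result written as a weighted geometric mean
ce_star_geometric = la_c1 ** 0.67 * nd_c1 ** 0.33

print(ce_star, ce_star_geometric)
```

Because the weights sum to one, the interpolated value always lies between the La and Nd inputs, which is why such an interpolation cannot reproduce curvature in the light REE part of a pattern.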

To take the REE anomaly determination one step further, the precision of the REE anomaly quantification can be estimated using MCS. The kyanite eclogite sample S468 (Fig. 7) serves as an example. Stosch et al.⁴⁵ give a 2σ standard deviation of 0.04 mg/kg for Eu in S468 (Eu_S468 = 1.74 mg/kg), which corresponds to a relative standard deviation of 2.3% at 2σ, or 1.15% at 1σ (1σ SD = 0.02 mg/kg). The standard deviation of the modelled Eu_S468* in the MCS conducted here is 0.01 mg/kg, yielding a relative standard deviation of 7.7% (Eu_S468* = 0.13 mg/kg). Since the REE anomaly is a quotient, the relative uncertainties combine in quadrature, giving 8.0% (Eq. 4) in the case of S468. Reporting anomalies together with their uncertainties makes REE anomaly calculations more comparable and offers an improved way to compare REE data of different samples. The approach is not limited to the REE but can also be applied to quantifying Y anomalies in REY distributions. Generally, MCS can be applied to assess uncertainties of anomalies in spider diagrams (e.g.⁴⁹).

$$\text{relative uncertainty} = \sqrt{(2.3\,\%)^{2} + (7.7\,\%)^{2}} = 8.0\,\%$$

(4)
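The quadrature sum in Eq. (4) can be checked numerically, and a small Monte Carlo run gives a similar result. This is a minimal sketch, not the authors' MCS procedure: it assumes independent, normally distributed errors on the measured and modelled Eu values, and the function name, seed and sample count are arbitrary choices.

```python
import math
import random
import statistics


def combined_relative_uncertainty(*relative_percent):
    """Quadrature sum of independent relative uncertainties (in percent), as in Eq. (4)."""
    return math.sqrt(sum(r ** 2 for r in relative_percent))


analytic = combined_relative_uncertainty(2.3, 7.7)
print(round(analytic, 1))  # 8.0

# Monte Carlo cross-check under the stated assumptions
rng = random.Random(42)            # fixed seed for reproducibility
eu_measured, eu_modelled = 1.74, 0.13   # mg/kg, values quoted in the text
ratios = [
    rng.gauss(eu_measured, eu_measured * 0.023) / rng.gauss(eu_modelled, eu_modelled * 0.077)
    for _ in range(20000)
]
mc_relative = statistics.stdev(ratios) / statistics.fmean(ratios) * 100.0
print(round(mc_relative, 1))
```

The simulated value is expected to be slightly larger than the analytic one, because the first-order quadrature formula neglects the skew that division by an uncertain denominator introduces.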

Another application of the presented λPM is the treatment of REEs affected by interferences that cannot be resolved analytically. Especially in high-concentration samples, like ores, many REE isotopes can be affected by isobaric interferences (e.g.²⁰,²⁶). Even the light REE can interfere with their heavy neighbours. For example, Anenburg et al.⁵⁰ demonstrated one way of correcting Er, Gd, Tb and Yb. The λPM presented here provides another robust and accurate method to impute interference-affected REE data. The provided scripts make the method easy to apply for every researcher.

Remarks

Restoring missing REEs remains primarily a mathematical imputation procedure that involves minimal geochemical modelling since the polynomials used are based on the atomic radii of the REEs. Therefore, this method is a data approximation and should be applied cautiously, especially regarding potentially anomalous elements. A thorough inspection of the modelled data is necessary. Nevertheless, the method presented here is much more elaborate than simple linear or geometric interpolation, as it involves the entire REE data set. If applied correctly, it offers a great opportunity to reassess incomplete REE data and quantify REE distributions and anomalies. This is not limited to ID and NAA data but can also be applied to ICP-MS measurements in which not all REEs were determined. However, especially in the case of REE measurements close to limits of quantification, neither the λPM nor the SPM replaces a re-measurement if the sample is still available.

Conclusions

In this study, we conducted λ polynomial and standard polynomial modelling to impute incomplete REE data sets. Both methods are capable of quantifying REE distribution patterns and modelling missing REE data well within the uncertainties of modern ICP-MS analytical methods (<10%). The λPM is based on the initial work of O’Neill²⁷ and its continuation and pyrolite implementation by Anenburg and Williams³³. Method verification was achieved by processing REE data of >13,000 mafic and ultramafic rock samples from the PetDB database³⁸. Since this study focused on imputing isotope dilution (Pr, Tb, Ho and Tm missing) and neutron activation analysis (Pr, Dy, Ho and Er missing) data, the method verification data sets were modified accordingly. Nevertheless, λPM and SPM are also applicable to data sets in which other REEs are missing. Additionally, Monte Carlo simulations were conducted to assess the precision of the model predictions, which were also found to fall well within the analytical uncertainties of modern REE analysis, confirming the reliability of the method presented here.

The data processing methods we presented offer a new opportunity to utilise incomplete REE data. This allows the reassessment not only of highly valuable REE data sets, such as those measured with ID and NAA, but also of REE data sets where certain REEs were used as spikes or are affected by interferences in the analysis. Furthermore, the combination of λPM and MCS is capable of precise REE anomaly quantification and determination of REE anomaly uncertainties, an important prerequisite for quantitative REE anomaly comparison. We thus recommend implementing these methods in future REE investigations to ensure objectivity and an enhanced inter-comparability of different REE studies. The methods presented here aim to make geochemical data more FAIR²⁸.

Data availability

Data and Python scripts are available through Zenodo at https://doi.org/10.5281/zenodo.11084980. If there are any problems accessing the data or running the scripts, please contact David Ernst (ORCiD: https://orcid.org/0000-0003-4316-135X) or Malte Mues (ORCiD: https://orcid.org/0000-0002-6291-9886).

References

1. Bau, M. Controls on the fractionation of isovalent trace elements in magmatic and aqueous systems: evidence from Y/Ho, Zr/Hf, and lanthanide tetrad effect. Contrib. Miner. Petrol. 123, 323–333 (1996).
2. Viehmann, S. et al. The reliability of ∼2.9 Ga old witwatersrand banded iron formations (South Africa) as archives for mesoarchean seawater: evidence from REE and nd isotope systematics. J. Afr. Earth Sc. 111, 322–334 (2015).
3. Klimpel, F., Bau, M. & Graupner, T. Potential of garnet sand as an unconventional resource of the critical high-technology metals scandium and rare earth elements. Sci. Rep. 10 https://doi.org/10.1038/s41598-021-84614-x (2021).
4. Zhang, K., Zocher, A. L. & Bau, M. Rare earth elements and yttrium in shells of invasive mussel species from temperate rivers in Central Europe: comparison between C. Fluminea, D. Bugensis, and D. Polymorpha. Chem. Geol. 648, 121878 (2024).
5. Benaouda, R., Holzheid, A., Schenk, V., Badra, L. & Ennaciri, A. Magmatic evolution of the Jbel Boho alkaline complex in the Bou Azzer inlier (Anti-Atlas/Morocco) and its relation to REE mineralization. J. Afr. Earth Sci. 129, 202–223 (2017).
6. Kreitsmann, T. et al. Oxygenated conditions in the aftermath of the Lomagundi-Jatuli event: the carbon isotope and rare earth element signatures of the paleoproterozoic zaonega formation, Russia. Precambrian Res. 347, 105855 (2020).
7. Ernst, D. M. & Bau, M. Banded iron formation from Antarctica: the 2.5 Ga old Mt. Ruker BIF and the antiquity of lanthanide tetrad effect and super-chondritic Y/Ho ratio in seawater. Gondwana Res. 91, 97–111 (2021).
8. Krohn, L. M., Klimpel, F., Béziat, P. & Bau, M. Impacts of COVID-19 and climate change on wastewater-derived substances in urban drinking water: evidence from gadolinium-based contrast agents in tap water from Berlin, Germany. Water Res. 259, 121847 (2024).
9. Kraemer, D., Frei, R., Ernst, D. M., Bau, M. & Melchiorre, E. Serpentinization in the archean and early phanerozoic–insights from chromium isotope and REY systematics of the mg cr hydroxycarbonate stichtite and associated host serpentinites. Chem. Geol. 565, 120055 (2021).
10. Benaouda, R., Kraemer, D., Bejtullahu, S., Mouttaqi, A. & Bau, M. Occurrence of high-grade LREE allanite-pegmatites and calcite carbonatite dykes in the Ediacaran complex of Aghracha, Oulad Dlim massif (South Morocco). J. Afr. Earth Sc. 196, 104727 (2022).
11. Klose, L. et al. Hydrothermal activity and associated subsurface processes at Niuatahi rear-arc volcano, North East Lau Basin, SW Pacific: implications from trace elements and stable isotope systematics in vent fluids. Geochim. Cosmochim. Acta. 332, 103–123 (2022).
12. Chase, J. W., Winchester, J. W. & Coryell, C. D. Lanthanum, europium, and dysprosium distributions in igneous rocks and minerals. J. Geophys. Res. 68, 567–575 (1963).
13. Desai, H. B., Krishnamoorthy Iyer, R. & Sankar Das, M. Determination of scandium, yttrium, samarium and lanthanum in standard silicate rocks, G-1 and W-1, by neutron-activation analysis. Talanta 11, 1249–1255 (1964).
14. Brunfelt, A. O. & Steinnes, E. Instrumental neutron-activation analysis of ‘standard rocks’. Geochim. Cosmochim. Acta 30, 921–928 (1966).
15. Schnetzler, C. C., Thomas, H. H. & Philpotts, J. A. Determination of rare earth elements in rocks and minerals by mass spectrometric, stable isotope dilution technique. Anal. Chem. 39, 1888–1890 (1967).
16. Hooker, P. J., O’Nions, R. K. & Pankhurst, R. J. Determination of rare-earth elements in USGS standard rocks by mixed-solvent ion exchange and mass-spectrometric isotope dilution. Chem. Geol. 16, 189–196 (1975).
17. Broekaert, J. A. C., Leis, F. & Laqua, K. Application of an inductively coupled plasma to the emission spectroscopic determination of rare earths in mineralogical samples. Spectrochimica Acta Part. B: at. Spectrosc. 34, 73–84 (1979).
18. Cobb, J. C. Determination of lanthanide distribution in rocks by neutron activation and direct gamma counting. Anal. Chem. 39, 127–131 (1967).
19. Higuchi, H., Tomura, K., Onuma, N. & Hamaguchi, H. Rare earth abundances in several geochemical standard rocks. Geochem. J. 3, 171–180 (1969).
20. Dulski, P. Reference materials for geochemical studies: new analytical data by ICP-MS and critical discussion of reference values. Geostand. Geoanal. Res. 25, 87–125 (2001).
21. Langmuir, C. H., Bender, J. F., Bence, A. E., Hanson, G. N. & Taylor, S. R. Petrogenesis of basalts from the FAMOUS area: Mid-atlantic ridge. Earth Planet. Sci. Lett. 36, 133–156 (1977).
22. Haase, K. M., Beier, C., Regelous, M., Rapprich, V. & Renno, A. Spatial variability of source composition and petrogenesis in rift and rift flank alkaline lavas from the eger rift, central Europe. Chem. Geol. 455, 304–314 (2017).
23. Schmidt, K., Bau, M., Merschel, G. & Tepe, N. Anthropogenic gadolinium in tap water and in tap water-based beverages from fast-food franchises in six major cities in Germany. Sci. Total Environ. 687, 1401–1408 (2019).
24. Zocher, A. L., Klimpel, F., Kraemer, D. & Bau, M. Naturally grown duckweeds as quasi-hyperaccumulators of rare earth elements and yttrium in aquatic systems and the biounavailability of gadolinium-based MRI contrast agents. Sci. Total Environ. 838, 155909 (2022).
25. Zocher, A. L., Kraemer, D., Merschel, G. & Bau, M. Distribution of major and trace elements in the bolete mushroom suillus luteus and the bioavailability of rare earth elements. Chem. Geol. 483, 491–500 (2018).
26. Lomax-Vogt, M. C., Liu, F. & Olesik, J. W. A searchable/filterable database of elemental, doubly charged, and polyatomic ions that can cause spectral overlaps in inductively coupled plasma-mass spectrometry. Spectrochim. Acta Part. B Spectrosc. 179, 106098 (2021).
27. O’Neill, H. St. C. The smoothness and shapes of chondrite-normalized rare earth element patterns in basalts. J. Petrol. 57, 1463–1508 (2016).
28. Wilkinson, M. D. et al. The FAIR guiding principles for scientific data management and stewardship. Sci. Data 3, 160018 (2016).
29. Verma, S. P., Santoyo, E. & Velasco-Tapia, F. Statistical evaluation of analytical methods for the determination of rare-earth elements in geological materials and implications for detection limits. Int. Geol. Rev. 44, 287–335 (2002).
30. Zocher, A. L., Klimpel, F., Kraemer, D. & Bau, M. Assessing the bioavailability of dissolved rare earths and other trace elements: digestion experiments with aquatic plant species Lemna minor (duckweed reference standard BCR-670). Appl. Geochem. 134, 105025 (2021).
31. Zocher, A. L., Ciesielski, M., Piarulli, T., Farkas, S., Bau, M. & J. & Rare earth elements and yttrium (REY) in fjord waters: comparison between seawater in the Trondheimfjord (Norway), its local riverine REY sources and the North Atlantic. Geochim. Cosmochim. Acta 379 https://doi.org/10.1016/j.gca.2024.06.014 (2024).
32. Klimpel, F. & Bau, M. Decoupling of scandium and rare earth elements in organic (nano) particle-rich boreal rivers draining the fennoscandian shield. Sci. Rep. 13, 10357 (2023).
33. Anenburg, M. & Williams, M. J. Quantifying the tetrad effect, shape components, and Ce–Eu–Gd anomalies in rare earth element patterns. Math. Geosci. https://doi.org/10.1007/s11004-021-09959-5 (2021).
34. Williams, M. et al. Pyrolite: Python for geochemistry. JOSS 5, 2314 (2020).
35. Anenburg, M. Rare earth mineral diversity controlled by REE pattern shapes. MinMag 84, 629–639 (2020).
36. Fishman, G. S. Monte Carlo - Concepts, Algorithms, and Applications (Springer New York, 1996).
37. Vogt, J. et al. Daedalus ionospheric profile continuation (DIPCont): Monte Carlo studies assessing the quality of in situ measurement extrapolation. Geosci. Instrum. Method Data Syst. 12, 239–257 (2023).
38. Lehnert, K., Su, Y., Langmuir, C. H., Sarbas, B. & Nohl, U. A global geochemical database structure for rocks. Geochem. Geophys. Geosyst. 1, (2000).
39. Palme, H. & O’Neill, H. St. C. Cosmochemical estimates of mantle composition. In Treatise on Geochemistry 1–39 https://doi.org/10.1016/B978-0-08-095975-7.00201-1 (Elsevier, 2014).
40. Harpp, K. S., Wanless, V. D., Otto, R. H., Hoernle, K. & Werner, R. The cocos and carnegie aseismic ridges: a trace element record of long-term plume–spreading center interaction. J. Petrol. 46, 109–133 (2005).
41. Bolhar, R., Kamber, B. S., Moorbath, S., Fedo, C. M. & Whitehouse, M. J. Characterisation of early archaean chemical sediments by trace element signatures. Earth Planet. Sci. Lett. 222, 43–60 (2004).
42. Barrat, J. A., Bayon, G. & Lalonde, S. Calculation of cerium and lanthanum anomalies in geological and environmental samples. Chem. Geol. 615, 121202 (2023).
43. Potts, M. J. & Condie, K. C. Rare earth element distributions in a proto-stratiform ultramafic intrusion. Contr Mineral. Petrol. 33, 245–258 (1971).
44. Zindler, A., Hart, S. R., Frey, F. A. & Jakobsson, S. P. Nd and Sr isotope ratios and rare earth element abundances in Reykjanes Peninsula basalts evidence for mantle heterogeneity beneath Iceland. Earth Planet. Sci. Lett. 45, 249–262 (1979).
45. Stosch, H. G., Herpers, U. & Kötz, J. Neutron activation analysis of the rare earth elements in rocks from the earth’s upper mantle and deep crust. J. Radioanal. Nuclear Chem. Articles 112, 545–553 (1987).
46. Brunfelt, A. O., Roelandts, I. & Steinnes, E. Determination of rubidium, caesium, barium and eight rare earth elements in ultramafic rocks by neutron-activation analysis. Analyst 99, 277 (1974).
47. Michael, P. J. Partition coefficients for rare earth elements in mafic minerals of high silica rhyolites: the importance of accessory mineral inclusions. Geochim. Cosmochim. Acta 52, 275–282 (1988).
48. Bau, M. & Alexander, B. W. Distribution of high field strength elements (Y, zr, REE, Hf, Ta, Th, U) in adjacent magnetite and chert bands and in reference standards FeR-3 and FeR-4 from the Temagami iron-formation, Canada, and the redox level of the Neoarchean ocean. Precambrian Res. 174, 337–346 (2009).
49. Rock, N. M. S. The need for standardization of normalized multi-element diagrams in geochemistry: a comment. Geochem. J. 21, 75–84 (1987).
50. Anenburg, M., Mavrogenes, J. A. & Bennett, V. C. The fluorapatite P–REE–Th vein deposit at nolans bore: Genesis by carbonatite metasomatism. J. Petrol. 61, egaa003 (2020).


Acknowledgements

This research was funded by the German Federal Ministry of Education and Research (BMBF). The funding was granted to Falk Howar (TU Dortmund; 16DKWN124B) and Michael Bau (Constructor University; 16DKWN124A). We sincerely thank them for their support. We would also like to thank Michael Anenburg for his insightful and thorough comments, which enhanced the manuscript.

Funding

Open Access funding enabled and organized by Projekt DEAL.

Author information

Authors and Affiliations

Critical Metals for Enabling Technologies – CritMET, School of Science, Constructor University, Campus Ring 1, 28759, Bremen, Germany
David M. Ernst & Michael Bau
School of Science, Constructor University, Campus Ring 1, 28759, Bremen, Germany
Joachim Vogt
Chair for Software Engineering, TU Dortmund University, Otto-Hahn-Straße 12, 44227, Dortmund, Germany
Malte Mues

Authors

David M. Ernst
Joachim Vogt
Michael Bau
Malte Mues

Contributions

DE conducted the conceptualization, method and software development, data visualization and writing of the main manuscript text. JV conducted method and software development and data validation, and reviewed and edited the manuscript. MB participated in the conceptualization, reviewed and edited the manuscript and supervised the work. MM developed methods and software and participated in main text writing and data visualization.

Corresponding author

Correspondence to David M. Ernst.

Ethics declarations

Competing interests

The authors declare no competing interests.

Generative AI and AI-assisted technologies in the writing process

During the preparation of this work the authors used Grammarly in order to spell check. After using this tool, the authors reviewed and edited the content as needed and take full responsibility for the content of the publication.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary Material 1

Supplementary Material 2

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.


About this article

Cite this article

Ernst, D.M., Vogt, J., Bau, M. et al. Polynomial modelling of high-quality yet incomplete rare earth element data sets and a holistic assessment of REE anomalies. Sci Rep 15, 5360 (2025). https://doi.org/10.1038/s41598-025-89227-2


Received: 05 July 2024
Accepted: 04 February 2025
Published: 13 February 2025
DOI: https://doi.org/10.1038/s41598-025-89227-2