Introduction

Mycotoxins are toxic fungal secondary metabolites that contaminate a wide spectrum of essential foods worldwide, including staple crops consumed by the most vulnerable populations in low- and middle-income countries1. Although many filamentous moulds are toxigenic, the most important mycotoxin-producing genera are Aspergillus, Fusarium and Penicillium2. One fungal species may produce numerous different mycotoxins, and the same mycotoxin may also be produced by different species. Eskola et al. (2020) estimated that between 60% and 80% of the world’s food crops are contaminated with mycotoxins3. Therefore, human mycotoxin exposure through dietary consumption is universal3,4,5.

Mycotoxins have the potential to contribute to a diversity of adverse human health outcomes6. The foremost toxic effects are carcinogenicity, genotoxicity, nephrotoxicity, hepatotoxicity, oestrogenicity, allergenic manifestations, reproductive disorders, immunosuppression, and dermal effects. Nevertheless, mycotoxins as a group cannot be classified according to their mode of action, toxicology, or metabolism, as these have a great diversity in physicochemical properties. The most common and pathologically-significant mycotoxins are aflatoxins (AF) (e.g., aflatoxin B1, AFB1), fumonisins (e.g., fumonisin B1, FB1), ochratoxins (e.g., ochratoxin A, OTA), trichothecenes (e.g., deoxynivalenol, DON), patulin (PAT) and zearalenone (ZEN)7,8,9. Certain mycotoxins - such as AFB1 - are predicted to become an even more important food safety issue globally, especially in a +2°C global average temperature scenario, which is the most probable of the climate change scenarios expected for the upcoming years10,11. Notably, a recent Advisory Group of scientists recommended priorities for the International Agency for Research on Cancer (IARC) Monographs program during 2020-2024 and 2025-2029, and highlighted FB1, AF, ZEN, and PAT as agents for evaluation, respectively12,13.

Accurate assessment of mycotoxin exposure at the individual level is required to fully understand potential health consequences in humans. Specific, well-characterized biomarkers in biological fluids (urine or blood) have been shown to predict relevant clinical outcomes across a variety of treatments and populations14. A challenge in biomarker-driven research is to identify biomarkers, prove their relevance, and validate them14. To date, research on mycotoxin biomarkers of exposure and effect remains largely unexplored territory. Besides AFB1-lysine (serum albumin adduct) and FB115,16,17, the only validated mycotoxin biomarkers are DON-3-glucuronide and DON-15-glucuronide (DON-3-GlcA and DON-15-GlcA), which were recently validated by our group for DON through a human toxicokinetic (TK) intervention study using mass spectrometry (MS)18,19.

In the past decade, MS has played an important role in xenobiotic metabolite identification2,20. The use of high-resolution mass spectrometry (HRMS) for metabolomics approaches is still an underutilized tool in mycotoxin biomarker research, although a great impact is expected from this technology. HRMS will enable the unravelling of unknown metabolites and metabolic pathways of mycotoxins, which will support researchers and public health officials in gaining insights on how to assess the associated health risks of mycotoxin exposure.

In recent years, the scientific community has aimed to evaluate all variables (predictors, features) simultaneously (i.e., in a multivariate way) when interpreting collected data21,22,23,24,25,26. This approach allows for extracting the most relevant information from datasets, mainly when large amounts of noisy or redundant data occur. Multivariate data analysis techniques (also known as chemometrics in the research fields of chemistry or, more generally, machine learning, ML) are divided into two main categories: (i) pattern recognition techniques and (ii) regression methods.

Pattern recognition models are subdivided into (i) unsupervised models (i.e., the information about the a priori knowledge of the instances under exam is not considered by the algorithms) and (ii) supervised/classification models (the a priori information/category labelling the instances under exam is an essential input of the algorithms). The most used unsupervised methodologies in ML are Principal Components Analysis (PCA)27 and cluster analysis (CA)26. Supervised/classification modelling techniques are grouped into discrimination and class modelling techniques. The latter models, like Soft Independent Modelling of Class Analogies (SIMCA)28 are considered ensemble models and assign the different instances to no classes, one class, or more than one class. In mycotoxin biomarker analysis, discrimination models such as Linear Discriminant Analysis (LDA)26 and Partial Least Squares – Discriminant Analysis (PLS-DA)29 are employed to calculate specific boundaries in the multidimensional space that allow separating the different individuals within their corresponding classes. In parallel, several further distance-based and experience-based ML classification methodologies should also be evaluated to recognize control individuals from mycotoxin-treated volunteers30,31. Furthermore, the main goal of supervised regression/calibration modelling is to assess the relationships and the correlation occurring between a matrix X of predictors/variables and one or more responses contained in a matrix Y. Examples of supervised regression techniques include Multivariate Linear Regression (MLR)32, Principal Component Regression (PCR), and Partial Least Squares Regression (PLS-R)33. The quality of the regression models can be expressed by metrics such as the Root Mean Square Error in Calibration (RMSEC). This (absolute) value quickly indicates the magnitude of the error. 
If cross-validation is performed, the Root Mean Square Error in Cross-Validation (RMSECV) can also be evaluated. Furthermore, if new independent objects are placed in an external evaluation set to test the performance of the developed models, the Root Mean Square Error in Prediction (RMSEP) can be calculated as well.
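The difference between the three error metrics above can be illustrated with a minimal Python sketch. It assumes a single-predictor least-squares model and synthetic data; the function names (`fit_line`, `rmse_cross_validation`) and all numbers are hypothetical and are not taken from the study, whose analyses were run in R.

```python
import math
import random


def fit_line(x, y):
    """Ordinary least squares fit of y = a + b * x; returns (a, b)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def predict(model, x):
    intercept, slope = model
    return [intercept + slope * xi for xi in x]


def rmse(observed, predicted):
    """Root mean square error between observed and predicted responses."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def rmse_cross_validation(x, y, k=5):
    """RMSECV: every sample is predicted by a model fitted without its fold."""
    predictions = [0.0] * len(x)
    for fold in range(k):
        test = [i for i in range(len(x)) if i % k == fold]
        train = [i for i in range(len(x)) if i % k != fold]
        model = fit_line([x[i] for i in train], [y[i] for i in train])
        for i, p in zip(test, predict(model, [x[i] for i in test])):
            predictions[i] = p
    return rmse(y, predictions)


# Synthetic data (hypothetical): true relation y = 2 + 3x plus Gaussian noise.
rng = random.Random(0)
x_cal = [0.5 * i for i in range(20)]
y_cal = [2 + 3 * xi + rng.gauss(0, 0.5) for xi in x_cal]
x_ext = [0.25 + i for i in range(8)]  # external evaluation set
y_ext = [2 + 3 * xi + rng.gauss(0, 0.5) for xi in x_ext]

model = fit_line(x_cal, y_cal)
rmsec = rmse(y_cal, predict(model, x_cal))         # calibration error
rmsecv = rmse_cross_validation(x_cal, y_cal, k=5)  # cross-validation error
rmsep = rmse(y_ext, predict(model, x_ext))         # external prediction error
print(f"RMSEC={rmsec:.3f}  RMSECV={rmsecv:.3f}  RMSEP={rmsep:.3f}")
```

For a least-squares model, RMSECV cannot be smaller than RMSEC, because each held-out sample is predicted by a model that never saw it. A large gap between the two is a common sign of overfitting.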

The metabolic profile reflects the functionality of the kidneys, liver, and gastrointestinal tract, which are the major physiological compartments involved in mycotoxin absorption, transformation, and excretion (ADME principle)34,35,36. Once biomarkers have been identified, the resulting data are integrated through TK models to derive estimates of key ADME parameters, such as rates of gastrointestinal (GI) uptake, clearance by metabolism, and urinary excretion. For xenobiotics, hierarchical population-based Bayesian models have emerged as a state-of-the-art approach to account for inter-individual variation in ADME. Surprisingly, however, this approach has yet to be applied in the context of mycotoxins37,38,39,40.

This study presents an integrated framework for biomarker discovery and toxicokinetic characterization of mycotoxins in humans. Using both newly generated and previously validated results, it combines a standardized human intervention trial with several data analysis approaches, including ML-based classification and regression and hierarchical Bayesian TK modelling. The approach is designed for mycotoxins classified in IARC Group 3, ensuring ethical feasibility while maintaining mechanistic relevance.

Special focus is laid on implementing non- and minimally-invasive sampling strategies. The generated data support the identification of novel biomarkers of exposure and the construction of robust TK models based on the metabolic profile. The harmonization of the experimental design improves both the repeatability and reproducibility of the experimental results obtained, whilst the coverage of different bio-matrices improves the model’s predictivity and highlights the superiority of multi-matrix approaches in biomarker detection, pathway elucidation, and future clinical purposes41.

Material and methods

Human intervention trial

Recruitment aimed to enrol enough research volunteers to obtain robust and trustworthy results for the specific research question(s). The recruitment procedures, the inclusion and exclusion criteria, and a volunteer manual were detailed in the information brochure (IB), which was distributed with the invitation by direct e-mail or delivered in person. Specifically for mycotoxin intervention trials, people suffering from kidney, liver, or bile diseases were excluded, as were people taking medicinal drugs that affect the same organs (such as cytochrome P450 (CYP) inhibitors), since the functionality of the compartments involved in mycotoxin metabolism might have been compromised. Furthermore, as mycotoxins have been demonstrated to pass through the placental barrier to the foetus or through breastmilk to the newborn, pregnant and breastfeeding women were also excluded42,43,44,45,46. All required template documents complied with the laws in force in the country where the trial was performed. Specifically, this trial was developed in compliance with Belgian and European legislation and the Declaration of Helsinki. The presented study design was approved on September 9th, 2020, by the Ethics Committee of Ghent University Hospital (UZGent, Ghent, Belgium) as an amendment to the original dossier B670201630414, approved on February 2nd, 2017. All the documents and the human intervention trial were approved by the Ethical Commission of the Organization of the Research Centre. Research volunteers interested in taking part in the trial were invited by the principal investigator (PI), starting from February 2017 and again from June 2021, to the Research Centre for an overview of the study (Supplementary Information - Document S1); no data were collected at this point. 
Before the volunteers entered the study, the implications of participation and the required procedures were explained (Supplementary Information - Documents S3 and S4). After signing the Informed Consent Form (ICF) (Supplementary Information - Document S2), the eligible research volunteers were invited to an introductory meeting with the supervising Medical Doctor (MD). During this meeting, more details were provided regarding the study. Participants were requested to fill in a sociodemographic questionnaire (Supplementary Information - Document S5) before receiving the sample collection material and the aqueous solution with or without mycotoxin, depending on the group to which they were assigned (control or intervention). Participants were assigned to one of the two groups randomly using a randomization function. Given the typically small sample sizes in intervention studies, the random allocation was subsequently double-checked to ensure that the two groups were balanced in terms of sex, BMI, age, and number of participants, among other characteristics. Participants were not informed about their specific group allocation.
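The random allocation and the subsequent balance check described above could look like the following Python sketch. The participant records, the seed, the equal group sizes, and the use of a standardized mean difference as the balance measure are all assumptions made for illustration; the text does not specify which randomization function or balance criterion was used.

```python
import math
import random
from statistics import mean, stdev


def randomize(participant_ids, n_control, seed):
    """Randomly pick n_control participants for the control group."""
    rng = random.Random(seed)  # fixed seed keeps the allocation reproducible
    control = set(rng.sample(participant_ids, n_control))
    return {pid: "control" if pid in control else "intervention"
            for pid in participant_ids}


def standardized_mean_difference(a, b):
    """Difference in means expressed in pooled standard deviations."""
    pooled_sd = math.sqrt((stdev(a) ** 2 + stdev(b) ** 2) / 2)
    return 0.0 if pooled_sd == 0 else (mean(a) - mean(b)) / pooled_sd


def balance_report(participants, allocation):
    """Summarize group sizes, sex distribution, and age/BMI balance."""
    groups = {"control": [], "intervention": []}
    for person in participants:
        groups[allocation[person["id"]]].append(person)
    ctrl, intv = groups["control"], groups["intervention"]
    return {
        "n": (len(ctrl), len(intv)),
        "female_fraction": tuple(
            sum(p["sex"] == "F" for p in g) / len(g) for g in (ctrl, intv)),
        "age_smd": standardized_mean_difference(
            [p["age"] for p in ctrl], [p["age"] for p in intv]),
        "bmi_smd": standardized_mean_difference(
            [p["bmi"] for p in ctrl], [p["bmi"] for p in intv]),
    }


# Hypothetical volunteers, for illustration only.
participants = [
    {"id": f"V{i:02d}", "sex": "F" if i % 2 else "M",
     "age": 22 + (i * 7) % 30, "bmi": 19.0 + (i * 3) % 9}
    for i in range(1, 21)
]
ids = [p["id"] for p in participants]
allocation = randomize(ids, n_control=10, seed=1)
report = balance_report(participants, allocation)
print(report)
```

If the report shows a clear imbalance (for example, a large absolute standardized difference in age), the allocation can be reviewed before the trial starts. What counts as "large" is a study-specific decision.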

The intervention trial was designed to be minimally invasive and patient-friendly. All procedures were carried out at home over five consecutive days, as depicted in Figure 1. On days 1 and 2, the research volunteers were requested to avoid or limit the consumption of any food and drinks containing the mycotoxin under investigation; this information was included in the volunteer manual. A list of foods with their relative levels of mycotoxin contamination was provided by the European Food Safety Authority (EFSA) for multiple mycotoxins. However, an additional review of publications was performed to ascertain a mycotoxin-free diet. To monitor compliance, the volunteers kept a food diary (Supplementary Information - Document S6) during the 5-day trial to trace their consumption pattern, as explained in the manual (Supplementary Information - Document S3). On day 3, the volunteers collected blank samples after awakening and then consumed the aqueous bolus solution containing an amount of mycotoxin (in ng or µg) calculated from the declared body weight (bw, kg). The mycotoxin amount in the bolus did not exceed any health-based guidance value or safe exposure threshold established by a competent authority, such as the tolerable daily intake (TDI, µg/kg bw/day) or threshold of toxicological concern (TTC, µg/kg bw/day)47. As most mycotoxin standard solutions are methanol- or acetonitrile-based, the solution was first dried down under N2 to remove any organic solvent. The dry residue was then redissolved in 50 µL of pure ethanol and diluted in 5 mL of water. After ingestion, the volunteers collected urine, capillary blood, and/or faeces for the following 48 hours. Each urine void was collected in a graduated polypropylene measuring cup to record its volume, and an aliquot was transferred to a plastic sample container with a screw cap. 
Faecal material was collected using a Fe-Col® paper ring (Alpha Laboratories, UK) and an aliquot was transferred to a plastic sample container with a screw cap. Capillary blood samples were obtained from the fingertip using a BD Microtainer® lancet pen (Novolab, Belgium) and collected with 20 μL volumetric absorptive microsamplers (VAMS) (Mitra®, Neoteryx, USA) according to a pre-defined sampling schedule (Supplementary Information - Document S7). Volunteers delivered the biological samples, food diary, and sample collection schedule to the research team within 24 hours of the last sample collection for storage at −80°C until analysis. While the capillary blood collected via VAMS and the urine were stored directly in the biobank, the faeces were freeze-dried at −55°C for 24 hours before storage to quench bacterial activity, limit oxidative reactions, and halt (non-)enzymatic reactions48. The weight of the samples was recorded before and after the freeze-drying process. Upon arrival at the research facility, all samples and materials were pseudo-anonymized by assigning unique codes to each volunteer and corresponding sample. This procedure ensured a double-blind study design.
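The body-weight-based bolus calculation mentioned in this section can be sketched in a few lines of Python. The dose level of 1 µg/kg bw matches the DON TDI quoted later in the text, but the stock concentration and the function names are hypothetical and serve only to illustrate the arithmetic.

```python
def bolus_dose_ug(body_weight_kg, dose_ug_per_kg_bw, guidance_value_ug_per_kg_bw):
    """Mycotoxin amount (µg) for one volunteer, refusing doses above the
    health-based guidance value (e.g., a TDI or TTC)."""
    if body_weight_kg <= 0:
        raise ValueError("body weight must be positive")
    if dose_ug_per_kg_bw > guidance_value_ug_per_kg_bw:
        raise ValueError("dose exceeds the health-based guidance value")
    return body_weight_kg * dose_ug_per_kg_bw


def stock_volume_ul(dose_ug, stock_concentration_ug_per_ml):
    """Volume of standard solution (µL) that contains the required dose."""
    return dose_ug / stock_concentration_ug_per_ml * 1000.0


# Example: 70 kg volunteer, 1 µg/kg bw dose, hypothetical 100 µg/mL stock.
dose = bolus_dose_ug(70.0, dose_ug_per_kg_bw=1.0, guidance_value_ug_per_kg_bw=1.0)
volume = stock_volume_ul(dose, stock_concentration_ug_per_ml=100.0)
print(f"dose = {dose:.1f} µg, stock volume = {volume:.0f} µL")
```

Keeping the guidance-value check inside the dose function means an unsafe dose cannot be computed by accident when the script is reused for another mycotoxin.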

Fig. 1

Design of the 5-day human mycotoxin intervention trial. Created with BioRender.com and adapted from Visintin et al. (2023)49.

Outline and timing

Stage I: recruitment of the volunteers (timing ≈ 15–30 days)

An invitation (Supplementary Information - Document S1) was sent to potential participants through direct e-mail along with the IB and Volunteer manual part I (Supplementary Information - Document S3).

Interested volunteers who met the inclusion criteria specified in the IB were invited to the Research Centre for an overview of the study and its implications (Supplementary Information - Documents S3 and S4). The volunteers then signed the ICF (Supplementary Information - Document S2); participation was not allowed without a signed ICF.

An intake meeting was scheduled together with the supervising MD nominated for the trial. During this meeting more details were provided regarding the study and participants were required to fill in a sociodemographic questionnaire before receiving the aqueous solution with or without mycotoxin and the required sample collection material.

Stage II: preparation of the human TK trial (timing ≈ 48 hours)

On days 1 and 2, the volunteers were required to avoid or limit the consumption of any food and drinks commonly contaminated with the mycotoxin administered during the trial; this information was included in the instructions provided (Supplementary Information - Documents S3 and S4). To monitor compliance, a food diary was kept during the 5-day trial to trace the consumption pattern (Supplementary Information - Document S6).

Stage III: self-execution of the human TK trial (timing ≈ 48 hours)

On day 3, the volunteers collected blank samples after awakening: blank blood, blank urine, and blank stool (if possible). Next, the volunteers consumed an aqueous bolus solution with a dedicated amount of mycotoxin (ng) calculated based on the declared body weight (bw, kg).

After the mycotoxin intake, the volunteers were requested to collect urine, capillary blood, and faecal material for exactly 48 hours and to log all collection times in the collection schedule (Supplementary Information - Document S7). For blood collection via VAMS, the volunteers followed the pre-defined sampling schedule.

The volunteers were required to continue to avoid or limit the consumption of any food and drinks commonly contaminated with the mycotoxin administered during the trial.

Stage IV: finalizing the human intervention trial (timing ≈ 24 hours)

Within 24 hours of the last sample collection, volunteers visited the Research Centre one final time to hand in the collected samples and the written documents.

Sample analysis and data acquisition

Multiple methods were proposed for the extraction of mycotoxins from urine, blood, and faeces samples depending on the class of mycotoxins under investigation and the analytical technique used for the detection. In Supplementary Information - Document S8, a brief review discussing the most common extraction procedures is provided.

The most common analytical approaches for mycotoxin detection in human biomonitoring studies derive from the field of metabolomics, a powerful tool for gaining insight into complex molecular mechanisms in biomedical research50. Targeted approaches based on LC-MS/MS, mostly using tandem quadrupole systems, are commonly used to quantify mycotoxins in the context of biomonitoring51. These approaches focus on known metabolites whose absolute or relative quantity must be determined to answer the specific research question. The initial assumption is that the researcher knows a priori whether the mycotoxin is metabolised and in which matrix the mycotoxin and its metabolites are excreted. Once the molecules of interest have been identified, the method is applied using external and internal standards (IS)50. MS/MS systems for targeted mycotoxin analysis are typically operated in multiple reaction monitoring (MRM) mode52. This acquisition mode provides a significant gain in both sensitivity and selectivity and allows quantification of the compounds. The development and application of targeted and untargeted strategies for mycotoxin detection have been described extensively in the reviews of Arroyo-Manzanares et al. (2021)51 and Malachová et al. (2018)52. However, not all mycotoxins are commercially available as reference standards for human biomonitoring and, as a result, interest in HRMS is increasing. More specifically, Orbitrap and time-of-flight (TOF) mass analysers, in some cases coupled with ion mobility spectrometry (IMS), have emerged as the most convenient techniques for mycotoxin detection51. Untargeted metabolomics allows the detection of thousands of different molecules in a single chromatographic run, yet the acquisition of a full metabolic profile is achieved at the expense of sensitivity53. 
Consequently, the compound identification and quantification in untargeted analysis remain challenging54.

Outline and timing

Stage I: storage of the samples (timing ≈ 24–48 hours)

As soon as the biological samples were delivered to the Research Centre, they were stored in the registered biobank at −80°C until the day of analysis. The capillary blood collected via VAMS and the urine were stored directly in the biobank. Where needed, the Mitra® tips were re-organized in Mitra® 96-autoracks or Eppendorf® tubes (2,500 µL). The faecal material underwent a lyophilization step for 24 hours before storage at −80°C. Each sample, its location, and the corresponding ICF were registered in a dedicated and secure system to comply with the General Data Protection Regulation (GDPR). The stability of the compounds was investigated under storage conditions compatible with the trial to ensure accurate quantification.

Stage II: optimization of the extraction and analytical method for multi-mycotoxin detection (timing ≈ 2–4 weeks per method)

Stage III: extraction and analysis of the samples (timing ≈ 24 hours)

The quantification of the analytes was performed using an optimised and validated method employing matrix-matched calibration and an internal standard (13C-labelled or deuterated). Recovery and sensitivity of the method were crucial since the concentration of the biomarkers in biological samples is typically in the order of 0.05–100.05 ng/mL depending on the mycotoxin and its dose.

Data elaboration

Data elaboration depends greatly on the specific research question considered. Generally, it consists of several steps: the creation of one or more final datasets integrating the metadata and the results obtained, the development of a coding pipeline if one does not already exist, and the data elaboration itself, including the creation of graphical and numerical output. The timing can vary greatly due to the interplay of these variables, with a minimum of 24–48 hours.

Biomarker identification

To facilitate the discovery of metabolites and biomarkers of exposure, samples collected during the TK trial are preferably analysed using an untargeted metabolomics approach. This approach was applied to the urine samples collected during a human intervention trial investigating tenuazonic acid (TeA) by Visintin et al. (2023)49. After consumption of a bolus containing either the mycotoxin or water, individual urine samples were collected for 48 hours from both the intervention and the control group. A simple dilute-and-shoot approach was employed for sample preparation49. The urinary polar metabolome was acquired using ultra-high-performance liquid chromatography coupled with high-resolution mass spectrometry (UHPLC-HRMS), where possible in both positive and negative ionization modes.

The raw data from the analysis of samples, analytical standards, and blanks, obtained as .raw files, were used to build a dataset by importing the raw profile data of the real samples into alignment software such as Progenesis QI55 (version 3.0, Waters®, Manchester, UK). The dataset was created by exporting, as a .csv file, the chromatographic areas of each compound (identified by accurate m/z and retention time, rt) in each sample. In metabolomic experiments involving multiple samples, corresponding analytes in different samples must be mapped to one another before conducting quantitative, comparative, or statistical analyses. The runs were aligned by matching the chromatograms feature by feature through correspondence. The correspondence approach used by Progenesis QI version 3.0 (Waters®, Manchester, UK) constructs a vector in the retention-time direction that warps each run so that it matches a given reference run as closely as possible. Data elaboration was performed using R56 (version 4.4.1, 2024-06-14) in the RStudio environment57. Orthogonal Partial Least Squares Discriminant Analysis (OPLS-DA) was used as a chemometric filtering tool on the urinary polar metabolome dataset. The OPLS-DA algorithm was applied using the ropls package58 in the R environment59,60 to select the variables most effective in classifying control and intervention samples and to calculate the classification efficiency of the OPLS-processed and reduced datasets. A volcano plot was built for each dataset after calculating the log2 fold-change of each feature and correcting its p-value for the false discovery rate61. The top features were selected and used to build and cross-validate the final OPLS-DA model. The classification performance of the model was evaluated by computing the cumulative R2X, R2Y, Q2, and RMSEE62,63. All R code is available in the GitHub repository https://github.com/liavisintin/OPLSDA_TeA. 
Potential phase I and II metabolites were predicted in silico using GLORY64 for cytochrome P450 phase I metabolism, GLORYx65 for phase II metabolism, and Compound Discoverer™66 (version 3.3, Thermo Fisher Scientific®, Waltham, MA, USA) for the combination of phase I and II. When the detected features were compared with the possible metabolites, their retention times were also compared with that of the parent mycotoxin to exclude cases of in-source fragmentation, isotopes, and/or adducts.
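The volcano-plot filtering used in this workflow (log2 fold-change combined with FDR-corrected p-values) can be sketched in Python as follows. The sketch assumes that raw p-values have already been computed elsewhere, that the Benjamini-Hochberg procedure is the FDR correction, and that the thresholds are |log2 FC| ≥ 1 and q < 0.05; none of these choices is stated in the text, and the feature areas are invented. The authors' actual R implementation is in the repository cited above.

```python
import math
from statistics import mean


def log2_fold_change(intervention, control):
    """log2 of the ratio of mean peak areas (intervention / control)."""
    return math.log2(mean(intervention) / mean(control))


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values (q-values), in input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def select_features(features, fc_threshold=1.0, q_threshold=0.05):
    """Keep features passing both the fold-change and the FDR threshold."""
    q_values = benjamini_hochberg([f["p"] for f in features])
    selected = []
    for feature, q in zip(features, q_values):
        lfc = log2_fold_change(feature["intervention"], feature["control"])
        if abs(lfc) >= fc_threshold and q < q_threshold:
            selected.append((feature["id"], lfc, q))
    return selected


# Hypothetical chromatographic areas and raw p-values for four features.
features = [
    {"id": "F1", "intervention": [400, 420, 380], "control": [100, 95, 105], "p": 0.001},
    {"id": "F2", "intervention": [100, 110, 90], "control": [100, 105, 95], "p": 0.60},
    {"id": "F3", "intervention": [50, 55, 45], "control": [200, 210, 190], "p": 0.004},
    {"id": "F4", "intervention": [130, 120, 125], "control": [100, 100, 100], "p": 0.03},
]
selected = select_features(features)
for feature_id, lfc, q in selected:
    print(feature_id, round(lfc, 2), round(q, 4))
```

Features that pass both thresholds are the candidates that would be carried forward to build the final discriminant model.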

Biomarker selection and validation

Once biomarkers of exposure are known and reference standards are available, the samples collected during the human intervention trial can be analysed to quantify the biomarkers. Here, such data were used to determine and validate the most effective biomarker by testing multiple discrimination and multivariate regression models. The dataset was obtained from the intervention trial conducted by Vidal et al. (2018)18 and further extended by Mengelers et al. (2019)19 to investigate DON metabolism. As previously mentioned, through this trial Vidal et al. (2018)18 were able to validate one of the few existing mycotoxin biomarkers18,19. The study concluded that DON-15-GlcA is the major urinary metabolite of DON and confirmed its use as a validated biomarker by comparing the excreted amounts of the known metabolites.

In the current work, as an alternative to the method used by Vidal et al. (2018)18, ML and Multivariate Data Analysis (MDA) models were applied to the same data to investigate the role of the different metabolites and define the best biomarker of exposure to DON. All R scripts can be found in the associated GitHub repository (https://github.com/eugenioalladio/human-mycotoxin-UGent).

The recruited volunteers were divided into two groups: a DON group (n=16), which received a bolus of DON at the TDI of 1 µg/kg bw, and a control group (n=4), which received pure water. Overall, between 6 and 17 urine samples were collected from each volunteer within 24 hours of the intake. The urinary concentrations of DON and its major urinary metabolites DON-3-GlcA and DON-15-GlcA were quantified via UHPLC-MS/MS in all collected samples. The concentration values were then converted to mass excreted (ng) using the urinary void volume recorded for each individual sample. For the computation of the ML and MDA models, a final dataset (4 features × 139 samples) was prepared, consisting of 139 urine samples described by 4 features, i.e., the DON dose and the amounts of DON, DON-3-GlcA, and DON-15-GlcA recovered in urine. Supervised discrimination models, LDA and PLS-DA, were employed to discriminate volunteers in the DON group from those in the control group. In parallel, ML models such as k-Nearest Neighbors (k-NN)30, Naïve Bayes (NB)67, Support Vector Machine (SVM)68, Decision Trees (DT)69, Random Forests (RF)70, and eXtreme Gradient Boosting (XGBoost)71 were tested to distinguish control from DON volunteers. A benchmark comparison of all the computed models was made by evaluating the following classification metrics: sensitivity (DON volunteers correctly identified), specificity (control volunteers correctly identified), accuracy (volunteers correctly classified), and mean misclassification error (volunteers wrongly classified)72. Benchmarking was used to compare the learning algorithms and to identify the models that perform best for a particular classification or regression task. Furthermore, multivariate regression models were computed to study the correlation of the collected variables, namely the mass excreted of DON, DON-3-GlcA, and DON-15-GlcA, with the DON dose taken by the study volunteers. The DON dose for the control group was set to zero. 
Multivariate linear regression (MLR)73, principal component regression (PCR)74, partial least squares regression (PLS-R)75, Artificial Neural Network (ANN)76, RF, k-NN, SVM, and XGBoost77 models were computed, and a benchmark comparison was made by evaluating how well each model predicted the DON dose in terms of the coefficient of determination (R2) and the root mean square error (RMSE). All the ML models were rigorously cross-validated to obtain unbiased performance estimates and avoid misinterpretation of the results. A 5-times repeated 10-fold cross-validation strategy was used for all the evaluated classification and regression models. Autoscaling was used as a preprocessing strategy before computing the ML models. The benchmark was performed using R56 (version 4.4.1, 2024-06-14) in the RStudio environment57 and the mlr package78.
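Several of the building blocks named in this section (conversion of concentration to mass excreted, autoscaling, the four classification metrics, and the 5-times repeated 10-fold cross-validation splits) are simple enough to sketch in plain Python. The sketch below is illustrative only: the function names are hypothetical, the labels "DON" and "control" are chosen for readability, and the actual benchmark was run in R with the mlr package.

```python
import random
from statistics import mean, stdev


def mass_excreted_ng(concentration_ng_per_ml, void_volume_ml):
    """Convert a urinary concentration into the mass excreted in one void."""
    return concentration_ng_per_ml * void_volume_ml


def autoscale(column):
    """Mean-centre a feature and scale it to unit standard deviation."""
    mu, sd = mean(column), stdev(column)
    return [(value - mu) / sd for value in column]


def classification_metrics(y_true, y_pred, positive="DON"):
    """Sensitivity, specificity, accuracy, and mean misclassification error."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    accuracy = (tp + tn) / len(pairs)
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": accuracy,
        "mmce": 1.0 - accuracy,
    }


def repeated_kfold(n_samples, k=10, repeats=5, seed=0):
    """Yield (train, test) index lists for repeated k-fold cross-validation."""
    rng = random.Random(seed)
    for _ in range(repeats):
        order = list(range(n_samples))
        rng.shuffle(order)  # a fresh random partition for every repeat
        for fold in range(k):
            test = order[fold::k]
            held_out = set(test)
            train = [i for i in order if i not in held_out]
            yield train, test


metrics = classification_metrics(
    ["DON", "DON", "DON", "control"],
    ["DON", "DON", "control", "control"],
)
splits = list(repeated_kfold(139, k=10, repeats=5, seed=0))
print(metrics, len(splits))
```

In a full pipeline, the autoscaling parameters are usually estimated on the training indices of each split and then applied to the held-out samples, so that no information leaks from the test fold into the model.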

Toxicokinetic modelling

Once likely biomarkers have been identified, they can be used to estimate ADME parameters through the use of TK modelling. The overall components of the approach applied by Lu et al. (2023)79 are illustrated in Figure 2 and consist of a compartmental TK model (1); a Bayesian statistical model accounting for population variability, statistical errors, and prior parameter values (2); and a Markov chain Monte Carlo simulation to make posterior predictions after fitting the model to the data (3).

Fig. 2

Components of a toxicokinetic modelling approach for biomarkers in blood, urine, and faeces. A. Compartmental TK model. B. Bayesian statistical model accounting for population variability, statistical errors, and prior parameter values. C. Markov chain Monte Carlo simulation to make posterior predictions after fitting the model to the data. Adapted from Lu et al. (2023)79.

Because each mycotoxin has a unique metabolic profile, for illustration, we consider a “generic” mycotoxin model with a single metabolite, and samples collected in blood, urine, and faeces as described above.

The TK model consists of three internal compartments (i.e., the GI tract, a compartment for the mycotoxin, and a compartment for the metabolite) and three external compartments (i.e., mycotoxin in faeces, mycotoxin in urine, and metabolite in urine). From the GI tract, there are two elimination pathways: excretion to faeces with rate kgutelim, and absorption to systemic circulation with rate kgutabs. From the mycotoxin body compartment there are also two elimination pathways: excretion to urine with rate ku and metabolism to the metabolite with rate kmet; the metabolite is subsequently excreted to urine with rate kumet. The ratio between the amount in the body and the concentration in blood is given by the volumes of distribution Vd and Vdmet for the mycotoxin and metabolite, respectively. All elimination rates have units of h−1 and volumes of distribution have units of L/kg. In the example mycotoxin code provided, the following parameter transformations were used to facilitate calculation of key TK quantities (see below) and improve convergence:

  • Fgutabs = kgutabs/(kgutabs + kgutelim)

  • ktot = kmet + ku

  • kufrac = ku/(kmet + ku)

Moreover, all parameters were natural log-transformed for fitting.
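To make the compartmental structure and the parameter transformations above concrete, the following Python sketch integrates the six-compartment model with a fixed-step fourth-order Runge-Kutta scheme. It is a simplified illustration, not the GNU MCSim model used by the authors: all rate constants and the dose are hypothetical, amounts are tracked in parent-compound equivalents, and the volumes of distribution (needed only to convert amounts to blood concentrations) are omitted.

```python
import math

COMPARTMENTS = ["gut", "body", "metabolite", "faeces", "urine", "urine_metabolite"]


def derivatives(state, p):
    """First-order transfer between the six compartments (amounts per hour)."""
    gut, body, met, _faeces, _urine, _urine_met = state
    return [
        -(p["kgutabs"] + p["kgutelim"]) * gut,           # GI tract
        p["kgutabs"] * gut - (p["ku"] + p["kmet"]) * body,  # mycotoxin in body
        p["kmet"] * body - p["kumet"] * met,             # metabolite in body
        p["kgutelim"] * gut,                             # mycotoxin in faeces
        p["ku"] * body,                                  # mycotoxin in urine
        p["kumet"] * met,                                # metabolite in urine
    ]


def simulate(dose, p, hours=48.0, dt=0.01):
    """Integrate the model after a single oral bolus; returns final amounts."""
    state = [dose, 0.0, 0.0, 0.0, 0.0, 0.0]
    for _ in range(int(round(hours / dt))):
        k1 = derivatives(state, p)
        k2 = derivatives([s + 0.5 * dt * k for s, k in zip(state, k1)], p)
        k3 = derivatives([s + 0.5 * dt * k for s, k in zip(state, k2)], p)
        k4 = derivatives([s + dt * k for s, k in zip(state, k3)], p)
        state = [s + dt / 6.0 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4)]
    return dict(zip(COMPARTMENTS, state))


def transform(p):
    """Reparameterization listed in the text, plus the natural-log values."""
    derived = {
        "Fgutabs": p["kgutabs"] / (p["kgutabs"] + p["kgutelim"]),
        "ktot": p["kmet"] + p["ku"],
        "kufrac": p["ku"] / (p["kmet"] + p["ku"]),
    }
    return derived, {name: math.log(value) for name, value in derived.items()}


# Hypothetical rate constants (h^-1) and dose (µg), for illustration only.
params = {"kgutabs": 1.0, "kgutelim": 0.25, "ku": 0.05, "kmet": 0.15, "kumet": 0.5}
dose = 70.0
final = simulate(dose, params)
derived, log_derived = transform(params)
print(final)
print(derived)
```

Because every transfer is first order and nothing leaves the system, the six amounts always sum to the dose, which gives a quick sanity check on any implementation of this structure.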

The Bayesian statistical model is made up of three parts80. The first is a population variability model in which each of the TK model parameters is assumed to exhibit lognormal inter-individual variability, with a corresponding population geometric mean (GM) and geometric standard deviation (GSD). Second, a statistical model is specified for measurement and model errors in terms of the data likelihood function, which is also assumed to be lognormally distributed. Third, each parameter is given a prior distribution that is informed by a review of the literature for each mycotoxin. For instance, in the case of DON, prior distributions on estimates were based on in vitro data and inter-species scaling81.
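The first two parts of the statistical model (lognormal inter-individual variability and a lognormal error model) can be written down compactly. The Python sketch below is a minimal illustration under assumed values: the population GM and GSD, the error GSD, and the function names are hypothetical, and prior distributions are not included.

```python
import math
import random


def sample_individual_parameter(gm, gsd, rng):
    """Draw one individual's parameter: ln(theta_i) ~ Normal(ln GM, (ln GSD)^2)."""
    return math.exp(rng.gauss(math.log(gm), math.log(gsd)))


def lognormal_log_likelihood(observed, predicted, gsd_error):
    """Log-likelihood of observations that scatter lognormally around the
    model predictions with geometric standard deviation gsd_error."""
    sigma = math.log(gsd_error)
    total = 0.0
    for y, y_hat in zip(observed, predicted):
        z = (math.log(y) - math.log(y_hat)) / sigma
        total += -math.log(y * sigma * math.sqrt(2 * math.pi)) - 0.5 * z * z
    return total


# Hypothetical population: ktot with GM = 0.2 h^-1 and GSD = 1.5.
rng = random.Random(42)
ktot_population = [sample_individual_parameter(0.2, 1.5, rng) for _ in range(5000)]
geometric_mean = math.exp(
    sum(math.log(k) for k in ktot_population) / len(ktot_population))

good_fit = lognormal_log_likelihood([1.0, 2.0], [1.0, 2.0], gsd_error=1.2)
poor_fit = lognormal_log_likelihood([1.0, 2.0], [2.0, 4.0], gsd_error=1.2)
print(round(geometric_mean, 3), good_fit > poor_fit)
```

Working on the log scale keeps every sampled rate constant positive and turns the GM and GSD into the mean and standard deviation of a normal distribution, which is one reason log-transformed parameters are convenient for fitting.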

The third component of the approach is the actual model fitting using Markov chain Monte Carlo (MCMC) simulation. The MCMC approach draws a sample along a Markov chain, which is constructed to have the posterior distribution as its long-run stationary distribution82,83,84. In this case, a hierarchical Metropolis-Hastings algorithm within a Gibbs sampler was run using GNU MCSim software (version 6.1.0)85,86,87 integrated in an R56 (version 4.4.1, 2024-06-14) pipeline developed in the RStudio environment57. Each iteration of the Markov chain generates a vector of parameters sampled from the parameter distributions and the value of the posterior likelihood associated with that vector. The MCMC simulation thereby generates posterior (updated) parameter values at the population level as well as at the individual level. Convergence of the Markov chains to the posterior distribution is monitored using analysis of variance as described by Gelman et al. (1996)80. The Gelman and Rubin (1992) shrink factor (potential scale reduction factor, R), the ratio of an upper bound to a lower bound of the variance in the target distribution, is used to assess whether the independent MCMC chains have converged to a common distribution88. As multiple independent chains move closer together toward the same distribution, the ratio declines to unity. A convergence diagnostic R of 1.2 or less has been proposed as a criterion for acceptable convergence89. Our previous model for DON required 4 chains with 50,000 samples in each chain to reach acceptable convergence, which took around 5 hours on a modern personal computer. Key TK quantities of interest are derived from the model parameters, including: the fraction absorbed (Fgutabs) = kgutabs/(kgutabs + kgutelim), the mycotoxin elimination half-life (T½) = ln(2)/(kmet + ku), the mycotoxin clearance (CL) = Vd × kmet, and the area under the concentration curve (AUC) = Dose × Fgutabs/CL. 
Additionally, with these parameters, dose can be reconstructed using excretion data consisting of total faecal excretion of the parent compound (Efaeces), total urine collection of the parent compound, (Utot), and total urine collection of the metabolite (Utotmet). Specifically, by mass balance, these totals must equal the total dose (D): D = Efaeces + Utotmet + Utot. Additionally the fraction absorbed can also be estimated: Fgutabs = (Utotmet + Utot)/D. Finally, the inter-individual variability in each of these TK quantities can be estimated using the posterior distributions for the GSD for each of the underlying parameters.
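The derived quantities and the mass-balance dose reconstruction above translate directly into code. The following Python sketch simply restates the formulas as written; the function names and numerical inputs are hypothetical, and all excreted amounts are assumed to be expressed in the same (molar-equivalent) units as the dose.

```python
import math


def derived_tk_quantities(kgutabs, kgutelim, kmet, ku, vd, dose):
    """Compute the key TK quantities from the model parameters."""
    fgutabs = kgutabs / (kgutabs + kgutelim)  # fraction absorbed
    half_life = math.log(2) / (kmet + ku)     # elimination half-life
    clearance = vd * kmet                     # clearance, as defined in the text
    auc = dose * fgutabs / clearance          # area under the concentration curve
    return {"Fgutabs": fgutabs, "T_half": half_life, "CL": clearance, "AUC": auc}


def reconstruct_dose(e_faeces, u_tot, u_tot_met):
    """Reconstruct dose and fraction absorbed from total excretion by mass balance."""
    dose = e_faeces + u_tot_met + u_tot
    fgutabs = (u_tot_met + u_tot) / dose
    return dose, fgutabs


# Hypothetical parameter values, for illustration only
print(derived_tk_quantities(kgutabs=1.0, kgutelim=1.0, kmet=0.5, ku=0.5, vd=10.0, dose=100.0))
print(reconstruct_dose(e_faeces=20.0, u_tot=10.0, u_tot_met=70.0))
```

In a Bayesian setting the same functions would be applied to every posterior sample, so that each derived quantity inherits a full posterior distribution rather than a single point estimate.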

Results & discussion

Biomarker identification

The implementation of this approach using urinary samples from a human intervention trial investigating TeA has been published before49. In brief, following a single-dose intake, the average cumulative urinary excretion of TeA was 0.39 ± 0.22 over 13 hours. Following untargeted and suspect screening analyses, established OPLS-DA models effectively distinguished between intervention and control groups (accuracy: 0.88 for HESI+, 0.85 for HESI-), indicating TeA exposure significantly impacted metabolic pathways. A total of 122 potential biomarkers were identified, with in silico predictions suggesting 9 possible TeA metabolites including oxidized, N-alkylated, desaturated, glucuronidated, and sulfonated forms, confirming both phase I and phase II metabolism49.

Biomarker selection and validation

In the current work, ML and MDA models were applied to the data obtained by Vidal et al. (2018)18 to define the best biomarker of exposure to DON. Several classification and regression ML models were benchmarked to compare different learning algorithms. The classification task involved discriminating control from DON volunteers on the basis of the urinary concentration levels of DON, DON-3-GlcA, and DON-15-GlcA monitored over multiple time points. The regression task, in turn, focused on evaluating the correlation of DON, DON-3-GlcA, and DON-15-GlcA with the DON (or placebo) dose given to the different volunteers. The classification benchmark provided satisfactory results in terms of accuracy, mean misclassification error, sensitivity, and specificity for several ML models. The metrics for the classification benchmark are shown graphically in Figure 3.

Fig. 3

Bar plots showing the accuracy, mean misclassification error, sensitivity, and specificity values of the classification benchmark comparing deoxynivalenol (DON) volunteers versus control volunteers. The models tested were k-Nearest Neighbors (k-NN), Linear Discriminant Analysis (LDA) and Partial Least Squares – Discriminant Analysis (PLS-DA), Naïve Bayes (NB), Support Vector Machine (SVM), Decision Trees (DT), Random Forests (RF), and eXtreme Gradient Boosting (XGBoost).

As can be observed, several ML classification models provided metric values close to 100% in terms of accuracy (k-NN, DT, RF, and XGBoost), sensitivity (NB and k-NN), and specificity (all except NB), while the lowest mean misclassification error values were shown by k-NN, DT, RF, and XGBoost. Consequently, the excreted mass of DON, DON-3-GlcA, and DON-15-GlcA can be used to discriminate the volunteers of the DON group from those of the control group. In contrast, LDA, PLS-DA, and SVM had no sensitivity, and NB had low specificity (45%) and accuracy (53%). Weighing the different performance metrics against each other, k-NN, DT, XGBoost, and RF performed best, since their accuracy, specificity, and sensitivity values were the highest and their mean misclassification error the lowest. Therefore, these models were investigated further by fine-tuning their specific hyperparameters using a training and a test set.
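For reference, the four benchmark metrics can be computed from a confusion matrix as in the Python sketch below. It is illustrative only and not taken from the authors' R pipeline; the function name, the choice of the positive class, and the example labels are assumptions.

```python
def classification_metrics(y_true, y_pred, positive):
    """Accuracy, mean misclassification error, sensitivity and specificity."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    n = len(pairs)
    return {
        "accuracy": (tp + tn) / n,
        "mmce": (fp + fn) / n,  # mean misclassification error = 1 - accuracy
        "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
        "specificity": tn / (tn + fp) if (tn + fp) else float("nan"),
    }


# Hypothetical example: an unbalanced set (16 vs 4) and a model that always
# predicts the majority class. "control" is arbitrarily taken as positive.
y_true = ["DON"] * 16 + ["control"] * 4
y_pred = ["DON"] * 20
print(classification_metrics(y_true, y_pred, positive="control"))
```

The example shows why accuracy alone can mislead on unbalanced groups: a model that never predicts the minority class can still score well on accuracy while having zero sensitivity for that class, which is why the trade-off across all four metrics is considered.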

The k-NN, DT, XGBoost, and RF models were tuned by a grid search over several hyperparameters. For the RF model, ntree (i.e., the number of trees in the forest, from 10 to 500), mtry (i.e., the number of variables randomly sampled at each node, from 1 to 4), nodesize (i.e., the minimum number of instances allowed in a leaf, from 1 to 5), and maxnodes (i.e., the maximum number of leaves permitted, from 5 to 20) were evaluated. The optimal k (i.e., the number of nearest neighbors to each instance, from 1 to 15) was determined for the k-NN model. For DT, the minsplit (i.e., the minimum number of observations a node must have before attempting to split, from 2 to 5) and minbucket (i.e., the minimum number of observations that each terminal node should have after a split, from 1 to 100) hyperparameters were evaluated. Finally, optimal values for min_child_weight (i.e., the minimum sum of instance weights needed in a child node to split, from 1 to 500), gamma (i.e., the minimum loss reduction required for making a split, from 0 to 1), eta (i.e., the scale factor for the contribution of each tree in the boosting process, from 1 to 500), and max_depth (i.e., the maximum depth of each tree, from 1 to 500) were determined for the XGBoost model.

The optimal hyperparameters were selected based on the highest accuracy. To optimize them, the data were divided into a training set (67% of the available samples) and a test set (33% of the available samples). The performance of the models built on the training set was assessed by predicting the classification outcomes for the test set. Finally, the Receiver Operating Characteristic (ROC) curves and Area Under the Curve (AUC) values were computed. The tuned models (ntree = 10, mtry = 1, nodesize = 1, and maxnodes = 5 for RF; k = 6 for k-NN; minsplit = 4 and minbucket = 12 for DT; min_child_weight = 1, gamma = 0, eta = 0.44, and max_depth = 500 for XGBoost) performed well, with AUC values between 0.8988 and 0.9688, as can be seen from the ROC curves in Figure 4A. The RF, DT, and XGBoost models also provide the importance of each variable, as reported in Figure 4B for RF. For all three models, the highest importance was assigned to DON-15-GlcA, followed by DON-3-GlcA and DON. Therefore, the ML approach confirmed the role of DON-15-GlcA, the major validated human urinary biomarker18,19, in discriminating the DON volunteer group from the control individuals.
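The tuning procedure for k-NN (a 67%/33% split followed by a grid search over k from 1 to 15, selecting on accuracy) can be sketched in plain Python as follows. This is a self-contained illustration on synthetic data, not the authors' R code; the helper names, the seeds, and the two well-separated synthetic clusters are all hypothetical.

```python
import math
import random
from collections import Counter


def knn_predict(train_x, train_y, query, k):
    """Majority vote among the k training points closest to the query."""
    nearest = sorted(zip(train_x, train_y),
                     key=lambda pair: math.dist(pair[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def split_train_test(x, y, train_fraction=0.67, seed=0):
    """Shuffle the instances and split them into training and test sets."""
    idx = list(range(len(x)))
    random.Random(seed).shuffle(idx)
    cut = round(train_fraction * len(idx))
    train, test = idx[:cut], idx[cut:]
    return ([x[i] for i in train], [y[i] for i in train],
            [x[i] for i in test], [y[i] for i in test])


def tune_k(train_x, train_y, test_x, test_y, k_grid=range(1, 16)):
    """Return the k with the highest test-set accuracy (first one on ties)."""
    best_k, best_acc = None, -1.0
    for k in k_grid:
        preds = [knn_predict(train_x, train_y, q, k) for q in test_x]
        acc = sum(p == t for p, t in zip(preds, test_y)) / len(test_y)
        if acc > best_acc:
            best_k, best_acc = k, acc
    return best_k, best_acc


def make_demo_data(n_per_class=30, seed=1):
    """Synthetic three-feature data: two clearly separated groups (hypothetical)."""
    rng = random.Random(seed)
    x, y = [], []
    for label, centre in (("control", 0.0), ("DON", 10.0)):
        for _ in range(n_per_class):
            x.append([rng.gauss(centre, 1.0) for _ in range(3)])
            y.append(label)
    return x, y


x, y = make_demo_data()
train_x, train_y, test_x, test_y = split_train_test(x, y)
best_k, best_accuracy = tune_k(train_x, train_y, test_x, test_y)
print(best_k, best_accuracy)
```

Because the hyperparameter is chosen on the same test set that is then used to report performance, the resulting accuracy is an optimistic estimate; with more instances, a nested or repeated cross-validation scheme would be preferable.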

Fig. 4

(A) ROC curve for Decision Trees (DT), k-Nearest Neighbors (k-NN), RF (Random Forest), and XGBoost models. (B) Variable importance plot of the RF model.

The regression plots for the DON dose obtained from the regression benchmark are reported in Figure 5. When comparing predicted vs. measured results, the regression models showed a correlation between the DON, DON-3-GlcA, and DON-15-GlcA values and the DON dose. However, the only model with an acceptable R2 was the RF regression model, which achieved R2 = 0.72. Additionally, the aggregated RMSE obtained over the 10 iterations was 60.63, owing to the high variability of the collected data, especially from the volunteers of the control group. Finally, no further tuning of the RF regression model was deemed applicable at this stage, as the number of monitored individuals remains relatively low.
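The two regression metrics quoted above can be computed as in the following Python sketch. It assumes, in line with the caption of Figure 5, that R2 is the determination coefficient of the ordinary least squares line of predicted versus effective dose, which for a simple linear fit equals the squared Pearson correlation. The function names and example values are hypothetical.

```python
import math


def rmse(measured, predicted):
    """Root mean squared error between measured and predicted doses."""
    n = len(measured)
    return math.sqrt(sum((m - p) ** 2 for m, p in zip(measured, predicted)) / n)


def r_squared_ols(measured, predicted):
    """R2 of the simple OLS line predicted ~ measured (squared Pearson r)."""
    n = len(measured)
    mean_m = sum(measured) / n
    mean_p = sum(predicted) / n
    sxy = sum((m - mean_m) * (p - mean_p) for m, p in zip(measured, predicted))
    sxx = sum((m - mean_m) ** 2 for m in measured)
    syy = sum((p - mean_p) ** 2 for p in predicted)
    return (sxy * sxy) / (sxx * syy)


# Hypothetical doses, for illustration only
measured = [0.0, 50.0, 100.0, 150.0]
predicted = [10.0, 40.0, 120.0, 140.0]
print(rmse(measured, predicted), r_squared_ols(measured, predicted))
```

Note that a high R2 of the fitted line does not by itself imply a low RMSE: predictions can be strongly correlated with the true dose and still be systematically offset from it.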

Fig. 5

Regression plots showing the effective versus predicted dose for several ML regression models. The blue line represents the ordinary least squares (OLS) regression line, whose equation is reported in the upper part of each plot together with the R2 determination coefficient. The models tested were Support Vector Machine (SVM), Random Forests (RF), eXtreme Gradient Boosting (XGBoost), k-Nearest Neighbors (k-NN), Partial Least Squares-Regression (PLS-R), Multivariate Linear Regression (MLR), Principal Components Regression (PCR), and Artificial Neural Network (ANN).

In conclusion, the tuned ML classification and regression models provided consistent results and confirmed the findings of Vidal et al. (2018)18. The classification models, specifically RF and k-NN, proved effective in differentiating DON from control volunteers. In contrast, the regression models predicted the administered DON dose poorly from the urinary biomarker levels. Additionally, the models were characterized by high RMSE values due to the high variability of the data and the low sample size. Indeed, a higher number of volunteers should be included in future studies to validate these findings, since the number of instances (n = 20 volunteers) was low for regression studies and unbalanced for classification purposes (4 control volunteers versus 16 DON volunteers).

Toxicokinetic modelling

The DON dataset was used by Lu et al. (2023)79 to verify the application of a Bayesian TK modelling approach, with DON, DON-3-GlcA, and DON-15-GlcA having been identified as important biomarkers90. The model of Mengelers et al. (2019)19 was modified as shown in Figure 6A, and a hierarchical Bayesian approach was employed to fit the model parameters as well as their inter-individual variability (details reported separately)91. Figure 6B shows the model fits to the urinary excretion data for all three biomarkers, along with 95% credible intervals. The model fits the available data very well and accounts for inter-individual variability in absorption, metabolism, and excretion. The resulting TK parameter estimates are useful for a variety of applications, including dose reconstruction from human biomonitoring data, derivation of more accurate tolerable daily intakes, and overall risk characterization92,93,94.

Fig. 6

A: DON toxicokinetic model based on biomarker identification. The oral dose first enters the GI tract, where it is either eliminated to faeces (at a rate kgutelim) or absorbed systemically (at a rate kgutabs) to the central compartment. The blood concentration is determined by the volume of distribution (Vd) of the central compartment. Elimination is either to unchanged DON in urine (with rate kuD), metabolism to DON-3-GlcA that is excreted in urine (at a rate km_d3g) or metabolism to DON-15-GlcA that is excreted in urine (at a rate km_d15g). Potential biomarkers include the concentration of DON in blood and the amount of DON, DON-3-GlcA, and/or DON-15-GlcA in urine. B: Comparison of model predictions with data after Bayesian population calibration, based on biomarkers of DON, DON-3-GlcA, and/or DON-15-GlcA in urine. In each sub-panel, the symbols represent the data for one of 16 individuals from Vidal et al. (2018), and the lines represent posterior predictions (solid=median prediction; dashed=95% credible interval). Almost all data are within the 95% credible interval, showing adequate model fit. Additionally, substantial inter-individual variability is evident for each biomarker, showing the necessity of a population-based model incorporating random effects for each parameter shown in A. Adapted from Lu et al. (2023)79.

Advantages and limitations

Human intervention trial

The trial was developed to provide TK data on mycotoxins and investigate the presence of unknown biomarkers in humans. By examining the metabolic behaviour of mycotoxins directly in humans, uncertainties linked to the extrapolation of in vitro models, as well as interspecies differences related to animal models can be avoided. Although in vitro and in silico models provide a simplified approach to study the ADME properties of mycotoxins, they cannot encompass the total complexity of an organism95,96,97. Indeed, in vivo experiments on animals are a more comprehensive way to study the behaviour of a xenobiotic, but it is well known that different species have different major metabolic pathways that may lead to a wide variety of TK parameters and metabolites97.

Understanding the potential health consequences in humans necessitates the capability of conducting accurate mycotoxin exposure assessments at the individual level and determining population toxicokinetics. In response to this growing need for accurate mycotoxin exposure assessment, HBM has been evolving, with a shift towards methodologies that are less invasive and more patient-friendly. This transition is pivotal for enhancing our capacity to monitor exposure to mycotoxins in a manner that is both efficient and acceptable to the public. Capillary blood microsampling, particularly using VAMS, allows home and remote sampling by providing a minimally invasive alternative to traditional blood withdrawal98,99. Microsampling significantly enhances the feasibility of large-scale HBM studies100 and facilitates more comprehensive pharmaco-/toxicokinetic trials101 by increasing patient participation and enabling more frequent sample collection. Moreover, thanks to improvements in the analytical performance of analyte detection and quantification, the analysis of capillary blood on VAMS, whole blood, serum, and plasma provides comparable results in terms of sensitivity101,102. These factors make whole blood microsampling a valid alternative to traditional blood matrices, in line with current trends in analytical methodology that prioritize patient comfort and study feasibility while maintaining analytical rigor103. On the other hand, whole blood is a complex matrix and VAMS can collect only up to 30 µL, which undermines the sensitivity of the quantification method. Therefore, collection of plasma samples, possibly with the aid of a cannula104 to avoid multiple peripheral venipuncture procedures, should be considered in cases where the performance of the analytical method may be an important limitation.
The sampling strategy for urine and faeces is standard and ubiquitously accepted by the scientific community for the biomonitoring of mycotoxins18,19,105,106,107,108,109,110, although the long sample collection timeframe (48 hours) may be considered a burden in some contexts. Nevertheless, a limitation for this design is connected to the faeces sampling. In fact, for this matrix, no weight is recorded at collection making it impossible to calculate the mass excreted via the GI tract. For mycotoxins excreted predominantly via the GI tract, this limitation must be overcome by recording the mass to obtain a comprehensive description of the excretion profile.

Chemometric filtering and suspect screening

The main limitation of the chemometric filtering approach is the low sample size, which limits the generalizability of the findings and the statistical power of the analyses. TK trials in human subjects are inherently restricted to small sample sizes by ethical, logistical, and practical constraints. In this study example, the set was kept as general as possible, including healthy subjects of both genders aged from 20 to 65 years. In other comparative studies investigating biomarkers of exposure111, the general nutrition of all participants is regulated during food consumption studies to minimize the differences that might be attributed to different nutritional habits. However, because samples had to be collected for up to 48 hours, a less invasive and less time-consuming setup that could be performed at home was prioritized, which is a considerable advantage. Studies recommend a minimum sample size of 200 cases112. PLS-DA, on the other hand, is applicable to very small sample sizes. In fact, a Monte Carlo simulation conducted by Chin et al. (1999)113 showed that PLS-DA can be performed with a sample size as small as 50, and Wold et al. (1975)114 even analysed 27 variables using two latent constructs with a data set consisting of ten cases. Metabolomics datasets are usually characterized by large volume, a large number of features, noise, and, in some cases, missing data, due to the nature of the metabolomics spectral signals. With highly dimensional data, PLS-DA struggles to isolate the variation correlated with the clinical outcome and tends to build overly complex models115,116. Therefore, in this study, OPLS-DA was chosen over classic PLS-DA to disentangle group-predictive and group-unrelated variations in the measured data117. OPLS-DA was applied through the ropls package58, since it was specifically developed for metabolic data characterized by multi-collinearity among variables.
Another aspect to consider is that in-source fragments, adducts formed during the ionization of the analyte of interest, and isotopes can mimic the presence of metabolites, because they are detected at the same mass-to-charge ratio during suspect screening. For this reason, it is crucial to compare the retention time of any candidate metabolite to that of the original analyte. If the retention times coincide, the candidate probably represents a different ionization state of the same analyte.
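This retention-time check is straightforward to automate. The Python sketch below flags candidates that co-elute with the parent compound; the function name, the candidate names, and in particular the tolerance of 0.05 min are hypothetical choices that would need to be adapted to the chromatographic method.

```python
def split_by_retention_time(candidates, parent_rt_min, tolerance_min=0.05):
    """Separate co-eluting candidates (likely ionization artefacts) from the rest.

    candidates    : list of (name, retention_time_in_minutes) tuples
    parent_rt_min : retention time of the parent analyte, in minutes
    tolerance_min : maximum difference still treated as co-elution (assumption)
    """
    artefacts, plausible = [], []
    for name, rt in candidates:
        if abs(rt - parent_rt_min) <= tolerance_min:
            artefacts.append(name)  # same retention time as the parent
        else:
            plausible.append(name)  # elutes separately, kept as a candidate
    return artefacts, plausible


# Hypothetical candidates, for illustration only
candidates = [("candidate_A", 5.02), ("candidate_B", 3.40)]
print(split_by_retention_time(candidates, parent_rt_min=5.00))
```

Candidates that pass this filter are not thereby confirmed as metabolites; the check only removes features that are most easily explained as another ion species of the parent.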

Data classification and regression

In terms of ML modelling, a main issue related to human intervention trials is the insufficient number of participants. A limited number of instances allows neither proper validation of the developed models nor detailed tuning and feature selection. For example, a limited number of subjects may not allow the use of robust validation strategies such as Repeated Double Cross-Validation, i.e., a strategy to optimize the complexity of supervised models that provides reliable estimates of the prediction error when the model is applied to new cases. As a rule of thumb, the minimum number of samples of each class for a classification model should be three times the number of variables, e.g., for 5 variables and 2 classes, the number of samples should be at least 30 (5 variables x 3 x 2 classes). Moreover, small sample sizes increase the risk of overfitting. Computing ML models involves, briefly, two steps: first, a particular model form and cost function must be selected, and second, the model must be fitted to external data. The first of these steps is often done by splitting the data into training and test/evaluation data: a set of candidate models is trained on a subset of the available data, and the models’ performance is then validated on the hold-out test set. The model that performs best on the latter is then, traditionally, selected for further studies/implementation. However, the optimal model depends sensitively on the size of the training set. This observation suggests that the paradigm of splitting training and testing should sometimes be applied with caution: since only a subsample is used for training in the first selection step, the model identified as optimal may not be the best one when trained with the entire dataset.
For this reason, hyperparameter tuning and feature selection strategies should be performed only when the sample size is sufficiently large, so that robust validation and cross-validation strategies can be implemented to avoid overfitting.
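The rule of thumb mentioned above amounts to a one-line calculation, sketched here in Python. The function names and the class counts in the check are illustrative, and the rule itself is only a heuristic, not a formal power calculation.

```python
def min_samples_rule_of_thumb(n_variables, n_classes, per_variable=3):
    """Minimum samples per class and in total under the 3-per-variable rule."""
    per_class = per_variable * n_variables
    return per_class, per_class * n_classes


def classes_below_minimum(class_counts, n_variables, per_variable=3):
    """Names of the classes whose sample count falls below the rule of thumb."""
    per_class, _ = min_samples_rule_of_thumb(n_variables, len(class_counts), per_variable)
    return [name for name, count in class_counts.items() if count < per_class]


# The example from the text: 5 variables and 2 classes
print(min_samples_rule_of_thumb(n_variables=5, n_classes=2))
# Hypothetical check of an unbalanced design with 3 variables
print(classes_below_minimum({"control": 4, "DON": 16}, n_variables=3))
```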

Toxicokinetic modelling

There are two main advantages of using Bayesian TK modelling. First, the Bayesian approach allows the incorporation of “prior” information based on previous studies and literature. For instance, if there are experimental animal or in vitro data on metabolism, then extrapolation approaches can be used to estimate the rate of metabolism kmet, as was done by Fæste et al. (2018)81 for DON. Those estimates can be used as “central” estimates for the prior probability distribution in the model (Figure 2, panel B, bottom box). Second, Bayesian population approaches incorporate random effects across individuals that statistically separate inter-individual variability from measurement error and intra-individual variability118. The advent of MCMC algorithms, the wide availability of powerful computing platforms, and the presence of open-source/free software platforms for conducting these analyses make the routine application of these approaches possible. The main limitation of the approach is that convergence can sometimes be difficult to achieve, requiring re-parameterization or the implementation of alternative MCMC algorithms. Additionally, unlike commercial TK software packages such as NONMEM, there is no “user-friendly” graphical interface, and thus some familiarity with command line interfaces is needed to implement the approaches. However, a GitHub repository containing templates for such an implementation is shared with this work, which will hopefully enable more widespread application.
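To make the convergence criterion concrete, the potential scale reduction factor for a single parameter can be computed from several independent chains as in the Python sketch below. It uses the commonly quoted simplified form of the Gelman and Rubin diagnostic (square root of the pooled variance estimate over the mean within-chain variance) and is not necessarily identical to the calculation performed by GNU MCSim; the function name and the synthetic chains are hypothetical.

```python
import math
import random
import statistics


def potential_scale_reduction(chains):
    """Simplified Gelman-Rubin R for one parameter.

    chains : list of equally long lists of posterior samples, one per chain
    """
    n = len(chains[0])
    chain_means = [statistics.fmean(c) for c in chains]
    within = statistics.fmean([statistics.variance(c) for c in chains])  # W
    between_over_n = statistics.variance(chain_means)                   # B / n
    pooled = (n - 1) / n * within + between_over_n                      # upper bound
    return math.sqrt(pooled / within)


# Synthetic chains, for illustration only
rng = random.Random(42)
mixed = [[rng.gauss(0.0, 1.0) for _ in range(1000)] for _ in range(4)]
stuck = [[rng.gauss(shift, 1.0) for _ in range(1000)] for shift in (0.0, 5.0, 10.0, 15.0)]
print(potential_scale_reduction(mixed), potential_scale_reduction(stuck))
```

Chains sampling the same distribution give a value close to 1, whereas chains that remain in different regions of parameter space give a value well above the 1.2 threshold, signalling that longer runs or a re-parameterization are needed.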

Conclusions

The design of a human mycotoxin intervention trial was proposed for the investigation of the TK and metabolism of IARC-group III mycotoxins. Samples gathered during such a trial and analysed with different analytical techniques provide experimental data rich in information79,119,120,121. Different pipelines can be applied to the experimental data, which can be processed for three different objectives:

  1) Identification of metabolites and biomarkers of exposure

  2) Determination and validation of the best biomarker for human biomonitoring

  3) TK modelling and determination of population TK parameters

The detailed data processing procedure for each objective is implemented in R, and all code and files required are provided in separate GitHub repositories, one per objective:

  1) https://github.com/liavisintin/OPLSDA_TeA;

  2) https://github.com/eugenioalladio/human-mycotoxin-UGent;

  3) https://github.com/wachiuphd/Mycotoxin-PK-model.

The standardization of this innovative and holistic framework supports the repeatability and reproducibility of the experimental results obtained. The gain for the scientific community and risk assessors is high, as the TK properties and newly discovered biomarkers will allow public health authorities to perform a comprehensive risk assessment based on biomonitoring rather than on food consumption and occurrence data. Ultimately, this offers a unique opportunity to provide specific legislation or recommendations regarding the safety of foods for human consumption.