Introduction

Many crops and cereals are prone to fungal attack in the field as well as during storage. These fungi can produce various groups of toxic secondary metabolites known as mycotoxins1. Various factors that favor fungal growth and invasion influence the levels of mycotoxins in food and food materials2. Most mycotoxins were identified only after they had caused subacute health problems in humans and livestock, negatively affecting various systems and organs. On this basis, most mycotoxins have been reported to be potent carcinogens in humans and animals. Despite climatic and geographical differences in the occurrence and production of mycotoxins, exposure to these substances is global, with most of the world’s food supply being contaminated to some extent3. Therefore, monitoring for the presence of mycotoxins is of paramount importance.

Different analytical tools can be used to detect these substances and to quantify them in order to design protocols for their prevention and control4. Chromatographic techniques such as gas chromatography (GC)5, thin layer chromatography (TLC)6, high performance thin layer chromatography (HPTLC), ultra-high performance liquid chromatography (UPLC) and high performance liquid chromatography (HPLC) have been used for the detection of mycotoxins with different detection approaches, such as mass spectrometry (MS) and diode array detection (DAD)7,8,9,10,11,12.

For example, Zhang et al. reported the simultaneous determination of aflatoxins G1, G2, B1 and B2, T-2 toxin, zearalanone and ochratoxin A using HPLC coupled to tandem mass spectrometry with a locally manufactured column. The suggested approach met the requirements considered for rapid sample preparation and increased sensitivity in multi-analyte detection13,14. The examination of several beer-brewing samples for aflatoxin and ochratoxin has also been demonstrated. The samples analyzed for ochratoxin and aflatoxin included fungi, opaque beer, sorghum malt, and maize, and the analysis was performed using the HPLC technique. The outcome demonstrated the necessity of employing a chromatographic approach to assess the mycotoxins present in beer samples. Similarly, Pakshir et al. used an HPLC method with a fluorescence detector (FD) to assess different classes of aflatoxin and ochratoxin in tea samples. The authors suggested that, given the significant level of mycotoxin contamination in the tea samples, frequent and routine evaluation during tea processing could be used to enhance the quality of the tea15.

Although various studies in the literature have applied chromatography to the determination and elucidation of mycotoxins, the major drawbacks include the use of large amounts of toxic solvents such as chloroform, the high cost of analysis and long analysis times16. These drawbacks, together with the need for manual parameter tuning, make it difficult to develop a green chromatographic method that covers all the essential variables and gives results within a short analysis time, and they make conventional methods inefficient for large-scale screening programs17. Recently, chromatographers have been applying chemometrics in various fields of chromatography to simulate the behavior of chromatographic systems. This approach aims to obtain reasonable results with short analysis times, minimal use of toxic solvents and low cost and, more importantly, to support green methods18. However, the traditional chromatographic approach remains the primary source of the data as well as the foundation for this kind of hyphenation19,20,21,22. Artificial intelligence (AI) techniques have proved to be a reliable and promising chemometrics-based approach for the determination of mycotoxins using the HPLC technique. Among AI-driven techniques, support vector regression (SVR) is particularly effective in capturing nonlinear relationships in complex datasets. However, its performance depends on optimal hyperparameter selection, which can be challenging using conventional methods. Metaheuristic algorithms, such as Harris Hawks Optimization (HHO) and Particle Swarm Optimization (PSO), enhance SVR by automating hyperparameter tuning, improving predictive accuracy and robustness. The integration of AI-driven approaches in food safety has gained significant attention, with advancements in predictive modeling contributing to more efficient contaminant detection, quality control, and regulatory enforcement.
By leveraging SVR-HHO for chromatographic modeling, this study aligns with global food safety standards, such as those established by the Codex Alimentarius and the European Food Safety Authority (EFSA), offering a scalable, high-precision tool for mycotoxin screening in food industries and regulatory agencies23. Numerous studies have emphasized the drawbacks of conventional chromatographic methods, such as their dependence on hazardous solvents, protracted processing periods, and manual parameter adjustment24,25,26,27,28.

Recently, chromatographers have increasingly turned to chemometrics to simulate chromatographic systems’ behavior, aiming to obtain reliable results with reduced analysis time, minimized use of toxic solvents, cost-effectiveness, and environmentally friendly methods29. SVR has emerged as a powerful machine learning technique for these purposes due to its robustness in handling nonlinear relationships and its ability to provide high prediction accuracy. SVR works by finding the hyperplane that best fits the data, minimizing the error within a specified margin. Despite its advantages, SVR has some limitations, such as sensitivity to the choice of hyperparameters and potential overfitting, particularly with small or noisy datasets30.

To address these limitations, optimizing SVR’s hyperparameters is crucial. Traditional methods like grid search can be computationally expensive and may not always find the global optimum. This is where metaheuristic optimization methods, such as Harris Hawks Optimization (HHO) and Particle Swarm Optimization (PSO), come into play. These algorithms can efficiently navigate complex, multidimensional search spaces, enhancing SVR’s performance by finding optimal hyperparameters that traditional methods might miss. Recent developments in metaheuristic optimization have improved prediction modeling in food safety and environmental science, and several recently proposed algorithms have been reported to perform well on complex problems. The Fata Morgana Algorithm (FATA)31 uses refraction mechanisms inspired by mirages for improved convergence, while the Educational Competition Optimizer (ECO)32 mimics student competition for optimal learning. The Rime Optimization Algorithm (RIME)33 simulates frost formation for efficient global search, while the Polar Lights Optimization (PLO)34 simulates auroras to strike a balance between exploration and exploitation. In optimization challenges, these new algorithms have demonstrated great promise. By incorporating such methods, predictive models can be further enhanced, especially in the areas of environmental monitoring and food safety, making them useful resources for researchers studying nature-inspired computing and data-driven decision-making. Apart from conventional nature-inspired techniques, a number of other algorithms have also emerged. To improve convergence time, the gradient-based optimizer (GBO)35 combines gradient information with metaheuristic techniques. Inspired by the foraging strategies of slime molds, the Slime Mold Algorithm (SMA)36 strikes a balance between exploration and exploitation.
To improve the search for solutions, the Heap-Based Optimizer (HBO)37 simulates the hierarchical behavior of heap structures in data organization. In order to steer clear of local optima, the Escape Algorithm (ESC)38 mimics escape processes in prey-predator interactions. In addition, the JASMA-SVM model proposed by Shi et al.39 achieved 92.998% accuracy in predicting recurrent spontaneous abortion. PM-SMEKLM was presented by Fei et al.40 as a transfer learning method for diagnosing brain diseases. MaOPEO, an optimization algorithm created by Chen et al.41, outperforms current techniques in many-objective problems by increasing convergence and variety. More information on machine learning, optimization, deep learning and data balancing techniques can be found in41,42,43,44,45. Recent studies have explored various optimization algorithms to enhance machine learning models in complex problem domains. For instance, Hassan et al.46 have demonstrated superior convergence speed and accuracy in solving high-dimensional optimization problems. Integrating such techniques into SVR parameter tuning can potentially improve prediction accuracy and model stability. Similarly, Ashraf et al.47 introduce an enhanced metaheuristic approach for function optimization, offering an alternative strategy for fine-tuning machine learning models.

In recent years, significant findings from previous studies in AI-based methods have laid the groundwork for the development and adoption of metaheuristic methods in various fields, including chromatography and the prediction of mycotoxins. For instance, early AI models like SVM and neural networks demonstrated the ability to accurately predict complex chemical behaviors, including mycotoxin levels in food samples30. These models set a high standard for prediction accuracy but also revealed limitations in parameter optimization and overfitting30. Also, initial methods for optimizing model parameters, such as grid search and manual tuning, were time-consuming and often failed to find the global optimum, leading to suboptimal performance48.

Although various articles have reported the determination of mycotoxins using chromatographic methods coupled with AI-based models49,50,51,52,53,54, only a few have focused on the use of diverse learning methods and on the optimization of the network architecture55 with algorithms such as HHO47. Such optimization enhances the accuracy of predictions by fine-tuning the parameters of the AI models, leading to more reliable and precise identification of mycotoxins. It also reduces the need for extensive data pre-processing and large datasets, making the method more practical and cost-effective. More importantly, by improving the efficiency and effectiveness of the predictive models, optimization contributes to reducing the use of toxic solvents and the overall environmental impact, in line with green chemistry principles. More information on other techniques that can improve single-model performance can be found in22,56,57,58,59,60,61,62,63,64,65,66,67.

However, to the best of our knowledge, no published paper has implemented the recently developed Harris Hawks Optimization (HHO) algorithm for the prediction of mycotoxins using chromatographic techniques. The primary goal of the current study is to optimize support vector regression (SVR) using nature-inspired algorithms to improve the predictive accuracy and efficiency of mycotoxin contamination modeling in food. By combining these optimization strategies, we aim to enhance SVR’s ability to analyze virtual water samples, which will ultimately lead to more accurate environmental risk assessments and food safety monitoring. Specifically, this paper (1) uses hybrid SVR with HHO and PSO (i.e., SVR-HHO and SVR-PSO) to model the qualitative properties of different classes of mycotoxins from food samples using the HPLC technique, and (2) compares the feasibility of SVR-PSO and SVR-HHO for modelling the retention behaviour of mycotoxins using different input combinations. The environmental prediction of mycotoxins in food-virtual water samples is a critical area of research, given the significant health risks associated with mycotoxin contamination. Accurate prediction models are essential for effective monitoring and mitigation strategies. The manuscript is structured into four sections: section one introduces the study; section two presents the methodology and discusses in detail the algorithms used; section three presents and discusses the results of the modelling process; and section four presents the conclusion of the study.

Materials and method

Proposed computational algorithms

Support vector regression (SVR)

Support vector machine (SVM) was introduced by Vapnik et al.68. SVM is a machine learning approach used for pattern recognition, regression analysis, prediction, and classification. SVR is the form of SVM designed for regression tasks, and it can be either linear or non-linear69. SVR can be viewed as a layered model whose output is a weighted sum of kernel functions evaluated on the input parameters. It aims to find a function that approximates the relationship between input features and continuous target values while minimizing prediction errors. Additionally, it offers an approximate bound on the test error rate70. SVR can be expressed as

$$f\left( x \right) = w \times \upphi \left( {\text{x}} \right) + b$$
(1)

Here w is the weight vector, ϕ is the transfer (feature-mapping) function, and b is the bias. To determine the SVR function f(x), the regression problem can be formulated as the following optimization problem:

$${\text{Minimise}}\quad \frac{1}{2}\|w \|^{2} + C\mathop \sum \limits_{i = 1}^{N} \left( {\xi_{i} + \xi_{i}^{*} } \right)$$
(2)
$${\text{Subject to the condition:}}\; \left\{ {\begin{array}{*{20}l} {y_{i} - f\left( {x_{i} } \right) \le \upvarepsilon + \xi_{i} } \hfill \\ {f\left( {x_{i} } \right) - y_{i} \le \upvarepsilon + \xi_{i}^{*} } \hfill \\ {\xi_{i} ,\xi_{i}^{*} \ge 0, \;\;i = 1, 2, 3, \ldots ,N } \hfill \\ \end{array} } \right.$$
(3)

where ε is the width of the insensitive tube within which errors are not penalized, C is the regularization constant that trades off the flatness of the function against deviations larger than ε, and ξi and ξi* are the slack variables. Applying Lagrangian functions yields the solution of the non-linear regression function:

$$f\left( x \right) = \mathop \sum \limits_{i = 1}^{N} \left( {\alpha_{i} - \alpha_{i}^{*} } \right) K\left( {x,x_{i} } \right) + b$$
(4)

In this case, C > 0, αi and αi* are the non-negative dual variables (Lagrange multipliers), and K(x, xi) is the kernel function.
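As an illustration of Eq. (4), the following minimal Python sketch evaluates an SVR prediction from given dual coefficients and computes the ε-insensitive loss implied by Eqs. (2) and (3). The Gaussian (RBF) kernel and the names `rbf_kernel`, `svr_predict` and `eps_insensitive_loss` are assumptions made for this example; the study itself used MATLAB, and its kernel is not stated in this section.

```python
import math

def rbf_kernel(x, z, gamma):
    """Gaussian (RBF) kernel K(x, z) = exp(-gamma * ||x - z||^2) (assumed kernel)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)

def svr_predict(x, support_vectors, dual_coefs, b, gamma):
    """Eq. (4): f(x) = sum_i (alpha_i - alpha_i*) K(x, x_i) + b.

    dual_coefs[i] holds the difference alpha_i - alpha_i* for support vector i.
    """
    total = sum(c * rbf_kernel(x, sv, gamma)
                for c, sv in zip(dual_coefs, support_vectors))
    return total + b

def eps_insensitive_loss(y_true, y_pred, eps):
    """Residuals inside the epsilon tube cost nothing; outside they cost linearly."""
    return max(0.0, abs(y_true - y_pred) - eps)
```

A residual of 0.05 with ε = 0.1 therefore contributes no loss, which is why only points on or outside the tube end up as support vectors.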

Harris Hawks optimization (HHO)

The Harris Hawks Optimization (HHO) algorithm is a relatively new metaheuristic that imitates the hunting technique of Harris hawks. In recent years, the method has been applied effectively to a variety of intricate problems in science and engineering71,72. Most hawks hunt and pursue prey on their own, whereas Harris hawks hunt cooperatively. The HHO method therefore mimics the cooperative mechanism and hunting style of Harris hawks in nature. Tracing, surrounding, approaching, and attacking are the mechanisms used in HHO hunting, which consists of three primary steps: exploration, a transition from exploration to exploitation, and exploitation (Fig. 1).

Fig. 1

Phases of the Harris Hawks Optimization process73. Here r is a random number between 0 and 1 that introduces randomness into the position updates of the hawks; it diversifies the search and is used in both the exploration and exploitation phases. q is another random number between 0 and 1 that determines which strategy the hawks adopt. E is the escape energy of the prey, whose magnitude decreases over the iterations. The value of E determines whether the algorithm is in the exploration or the exploitation phase: for E ≥ 0.5 the hawks are in the exploration phase, searching broadly across the solution space, while for E < 0.5 they are in the exploitation phase, fine-tuning the search around the best solutions found so far.

Exploration is the phase in which the algorithm searches the solution space broadly to identify promising regions. This phase of HHO replicates the activity of Harris hawks assessing their surroundings and scouting for prey. During exploration, the algorithm prioritizes diversifying the search to avoid premature convergence to a local optimum. The hawks select how aggressively to explore based on the escape energy of the prey (rabbit). The escape energy (E) is a factor that diminishes with each iteration, balancing exploration with exploitation. Various tactics, such as Levy flights or random walks, are used to ensure that the search space is thoroughly covered.

The exploration phase is the initial stage and is illustrated as follows:

$$X\left( {t + 1} \right) = \left\{ {\begin{array}{*{20}l} {X_{rand} \left( t \right) - r_{1} \left| {X_{rand} \left( t \right) - 2r_{2} X\left( t \right)} \right|} \hfill & {\quad if\;\;q \ge 0.5} \hfill \\ {X_{rabbit} \left( t \right) - X_{a} \left( t \right) - r_{3} \left( {LB + r_{4} \left( {UB - LB} \right)} \right)} \hfill & {\quad if\;\;q < 0.5} \hfill \\ \end{array} } \right.$$
(5)
$$X_{a} \left( t \right) = \frac{1}{N}\mathop \sum \limits_{1}^{N} X_{i} \left( t \right)$$
(6)

Here N is the number of Harris hawks, Xa(t) is the average position of the hawks, and X(t + 1) is the position of a hawk in the next iteration t + 1. Xi(t) is the current position of the i-th hawk at iteration t, Xrand(t) is a randomly selected hawk, and Xrabbit(t) is the position of the rabbit (the best solution found so far). LB and UB are the lower and upper bounds of the search space, and q, r1, r2, r3 and r4 are random numbers between 0 and 1.
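The exploration update of Eqs. (5) and (6) can be sketched in Python as follows. This is a minimal illustration only: the function names are hypothetical, positions are plain lists of floats, and the final clipping of the new position to [LB, UB] is an added assumption that does not appear in Eq. (5).

```python
import random

def mean_position(hawks):
    """Eq. (6): component-wise average position X_a(t) of all hawks."""
    n = len(hawks)
    return [sum(h[d] for h in hawks) / n for d in range(len(hawks[0]))]

def explore(hawk, hawks, rabbit, lb, ub, rng=random):
    """Eq. (5): one exploration move for a single hawk."""
    q, r1, r2, r3, r4 = (rng.random() for _ in range(5))
    if q >= 0.5:
        # Perch relative to a randomly selected hawk.
        x_rand = rng.choice(hawks)
        new = [xr - r1 * abs(xr - 2 * r2 * x) for xr, x in zip(x_rand, hawk)]
    else:
        # Perch relative to the rabbit and the average hawk position.
        x_a = mean_position(hawks)
        new = [(xb - xa) - r3 * (l + r4 * (u - l))
               for xb, xa, l, u in zip(rabbit, x_a, lb, ub)]
    # Assumption: keep the new position inside the search bounds.
    return [min(max(v, l), u) for v, l, u in zip(new, lb, ub)]
```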

The second phase is the transition from exploration to exploitation, which is governed by the energy the prey loses while escaping during the hunt. The escape energy is modelled as follows:

$$E = 2E_{0} \left( {1 - \frac{t}{T}} \right)$$
(7)

Here T is the maximum number of iterations, t is the current iteration, and E0 ∈ (− 1, 1) is the initial energy, which is drawn randomly at each iteration.
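A short Python sketch of Eq. (7) is given below. The function name `escape_energy` and the optional `e0` argument are illustrative assumptions; when `e0` is not supplied, it is drawn uniformly from (−1, 1) as described above.

```python
import random

def escape_energy(t, max_iter, e0=None, rng=random):
    """Eq. (7): E = 2 * E0 * (1 - t / T)."""
    if e0 is None:
        # Initial energy drawn anew from (-1, 1).
        e0 = rng.uniform(-1.0, 1.0)
    return 2.0 * e0 * (1.0 - t / max_iter)
```

Because the factor (1 − t/T) shrinks linearly to zero, the magnitude of E is bounded by 2(1 − t/T), so large escape energies can only occur early in the run.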

The third phase, exploitation, refines the solutions found so far into better ones. Mirroring the way prey escapes and hawks hunt, the hawks rapidly attack the prey detected in the previous phase. The type of besiege used to catch the rabbit is chosen according to the values of E and r: a soft besiege is selected when |E| ≥ 0.5 and a hard besiege when |E| < 0.5.

Exploitation is thus the phase in which the algorithm intensifies its search around the best solutions discovered during exploration. This phase in HHO is similar to how Harris hawks encircle and dive onto their prey. The goal here is to fine-tune the solutions to find the global optimum. Hawks descend on the prey (the optimal solution) using a combination of soft and hard besiege methods. In HHO, the balance between exploration and exploitation is regulated dynamically, depending on the prey’s escape energy (E). High escape energy encourages hawks to roam more broadly, while low escape energy encourages exploitation by bringing hawks together and fine-tuning the search for promising solutions.

The HHO algorithm can use four strategies to simulate the attack phase: soft besiege, hard besiege, soft besiege with progressive rapid dives, and hard besiege with progressive rapid dives (Fig. 1).

Particle Swarm Optimization (PSO) algorithm

Kennedy and Eberhart74 first developed this optimization method. It is a population-based search algorithm inspired by the social movement and behavior dynamics of some animals74. The original idea behind PSO was to provide a graphic representation of the social behavior of animals such as birds, in order to identify the processes that allow birds to fly in unison and to alter course quickly by optimally regrouping75. This idea developed into an effective and straightforward optimization technique. PSO refers to the population as a “swarm” and to the individuals as “particles”. Each particle moves through the search space at a specific velocity and keeps track of its position, and the best position found by one member of the swarm is communicated to the others76. The position of a particle in the search space is updated based on its current velocity and position77. The velocity is influenced by the particle’s own experience and the experience of its neighbors. Generally, there are three steps in PSO: (1) initialization, which evaluates the fitness of each particle and initializes the personal best and global best positions; (2) iteration, which updates the velocities and positions and evaluates the new fitness of each particle; and (3) termination, in which the iteration step is repeated until a stopping criterion is met (e.g., a maximum number of iterations or a satisfactory fitness level)78.
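The three PSO steps can be sketched in Python as below. This is a generic global-best PSO for minimization, not the configuration used in this study: the inertia weight `w`, the acceleration coefficients `c1` and `c2`, the swarm size and the clipping of positions to the bounds are all assumed values chosen for illustration.

```python
import random

def pso(objective, lb, ub, n_particles=20, n_iter=100,
        w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimise `objective` over the box [lb, ub] with a basic global-best PSO."""
    rng = random.Random(seed)
    dim = len(lb)
    # (1) Initialization: random positions, zero velocities, personal and global bests.
    pos = [[rng.uniform(lb[d], ub[d]) for d in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]
    # (2) Iteration: update velocity and position, then re-evaluate fitness.
    for _ in range(n_iter):  # (3) Termination: fixed iteration budget.
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lb[d]), ub[d])
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val
```

For example, `pso(lambda p: sum(v * v for v in p), [-5.0, -5.0], [5.0, 5.0])` searches for the minimum of a two-dimensional sphere function.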

Data normalization

Equation (8) is used to normalize the data used in the current study79. One of the primary purposes of data normalization before AI modelling is to reduce numerical errors and data redundancy80.

$${\varvec{y}} = 0.05 + \left( {0.95 \times \left( {\frac{{x - x_{min} }}{{x_{max} - x_{min} }}} \right)} \right)$$
(8)
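Equation (8) maps each variable onto the range 0.05 to 1.00. A minimal Python sketch is shown below; the function name `normalize` and the guard against a constant column are additions made for this example.

```python
def normalize(values, low=0.05, span=0.95):
    """Eq. (8): y = 0.05 + 0.95 * (x - x_min) / (x_max - x_min)."""
    x_min, x_max = min(values), max(values)
    if x_max == x_min:
        # Assumption: a constant column cannot be scaled with Eq. (8).
        raise ValueError("all values are identical; Eq. (8) is undefined")
    return [low + span * (x - x_min) / (x_max - x_min) for x in values]
```

The smallest value in a column maps to 0.05 and the largest to 1.00, so no normalized input is exactly zero.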

Model evaluation criteria

Any data-driven strategy compares the experimental outcomes with the predicted results to verify performance using various indices81,82,83,84,85,86,87,88,89. According to Legates and McCabe90 and Elkiran et al.91, the performance assessment of any data intelligence model should comprise at least one goodness-of-fit measure (e.g., NSE) and at least one absolute error measure (e.g., RMSE). Several performance criteria were used in this study because multi-criteria indicators for measuring model performance are commonly used in contemporary studies. Another key reason for employing several criteria is that data properties such as normality, size, and linearity influence the performance accuracy of any model evaluated using these criteria90,91.

Three metrics were used to determine the performance of the AI-based models established during the modelling phases: the correlation coefficient (CC), the mean square error (MSE), and the Nash–Sutcliffe efficiency (NSE).

Mean square error (MSE) measures the average of the squares of the errors, which are the differences between observed and predicted values. It is a common measure of the accuracy of a predictive model.

$${\text{MSE}} = \frac{1}{N} \mathop \sum \limits_{i = 1}^{N} \left( {Y_{obs,i} - Y_{com,i} } \right)^{2}$$
(10)

NSE measures how well the model predictions match the observed data compared to the mean of the observed data, with values closer to 1 indicating better performance.

$$NSE = 1 - \frac{{\mathop \sum \nolimits_{i = 1}^{N} \left( {Y_{obs,i} - Y_{com,i} } \right)^{2} }}{{\mathop \sum \nolimits_{i = 1}^{N} \left( {Y_{obs,i} - \overline{Y}_{obs} } \right)^{2} }}$$
(11)

The correlation coefficient (CC) measures the strength and direction of the linear relationship between two variables, here the observed values Yobs,i and the computed values Ycom,i.

$$CC = \frac{{\mathop \sum \nolimits_{i = 1}^{N} \left( {Y_{obs,i} - \overline{Y}_{obs} } \right)\left( {Y_{com,i} - \overline{Y}_{com} } \right)}}{{\sqrt {\mathop \sum \nolimits_{i = 1}^{N} \left( {Y_{obs,i} - \overline{Y}_{obs} } \right)^{2} \mathop \sum \nolimits_{i = 1}^{N} \left( {Y_{com,i} - \overline{Y}_{com} } \right)^{2} } }}$$
(12)
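The three evaluation criteria can be computed with the short Python sketch below. It uses the standard definitions: MSE as the mean squared residual, NSE as one minus the ratio of the residual sum of squares to the variance of the observations about their mean, and CC as the Pearson correlation coefficient. The function names are illustrative.

```python
import math

def mse(obs, com):
    """Mean square error between observed and computed values."""
    return sum((o - c) ** 2 for o, c in zip(obs, com)) / len(obs)

def nse(obs, com):
    """Nash-Sutcliffe efficiency: 1 - SS_residual / SS_about_observed_mean."""
    mean_obs = sum(obs) / len(obs)
    ss_res = sum((o - c) ** 2 for o, c in zip(obs, com))
    ss_tot = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - ss_res / ss_tot

def cc(obs, com):
    """Pearson correlation coefficient between observed and computed values."""
    mean_obs = sum(obs) / len(obs)
    mean_com = sum(com) / len(com)
    cov = sum((o - mean_obs) * (c - mean_com) for o, c in zip(obs, com))
    var_obs = sum((o - mean_obs) ** 2 for o in obs)
    var_com = sum((c - mean_com) ** 2 for c in com)
    return cov / math.sqrt(var_obs * var_com)
```

A prediction that is offset from the observations by a constant illustrates why more than one criterion is useful: CC stays at 1 because the linear relationship is perfect, whereas NSE and MSE both penalize the bias.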

Model validation

Generally, in deep learning and machine learning, one of the primary objectives is to ensure that the models fit well according to the chosen indicators, in order to obtain a solid and trustworthy simulation of the test dataset29,89,92,93,94,95,96,97,98,99,100. However, problems such as local minima and overfitting necessitate validation101,102,103,104,105. Because of this, performance on the training data alone might not be sufficient, particularly when a small dataset is used for the analysis. A variety of validation approaches, including holdout, leave-one-out, and k-fold cross-validation, can be used. The k-fold approach was used in this work to prevent issues like overfitting. The data used in this study were divided into two parts: training (70%) and testing (30%). More information about model validation can be found in49,85,98,106,107.
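As a sketch of how k-fold cross-validation partitions a dataset, the following Python generator yields train and test index lists for each fold. The number of folds (k = 5), the shuffling seed and the function name are assumptions for illustration; the value of k used in the study is not stated here.

```python
import random

def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_indices, test_indices) for each of k folds."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    # Deal the shuffled indices into k folds of near-equal size.
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [j for f_i, fold in enumerate(folds) if f_i != i for j in fold]
        yield train, test
```

Each sample appears in exactly one test fold, so every observation is used for both fitting and validation across the k rounds.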

Proposed modelling techniques

It is important to define the model’s parameters precisely in order to improve SVR’s performance accuracy, since that accuracy depends on selecting suitable parameter values. Finding the global optimum is therefore crucial for achieving the best results. For this reason, the SVR-PSO and SVR-HHO hybrid methods were created by combining the SVR approach with nature-inspired algorithms (PSO and HHO). The three main SVR model parameters (C, γ, and ε) were determined using these nature-inspired techniques. Fig. 3 shows the flowchart of the hybrid SVR-PSO and SVR-HHO models used in this work.
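One way to connect a metaheuristic to SVR is to let each candidate position encode the three hyperparameters and to use a validation error as its fitness. The Python sketch below illustrates this idea under explicit assumptions: the log10 encoding of the search space, the names `decode` and `make_fitness`, and the `train_and_score` callback (which would fit an SVR and return, for example, the validation MSE) are all hypothetical and are not taken from the study.

```python
def decode(position):
    """Map a 3-D search position (log10 scale, an assumption) to C, gamma and epsilon."""
    log_c, log_gamma, log_eps = position
    return {
        "C": 10.0 ** log_c,
        "gamma": 10.0 ** log_gamma,
        "epsilon": 10.0 ** log_eps,
    }

def make_fitness(train_and_score):
    """Wrap a user-supplied scorer as a fitness function for PSO or HHO.

    `train_and_score(C, gamma, epsilon)` is a hypothetical callback that trains
    an SVR with the given hyperparameters and returns an error to minimise.
    """
    def fitness(position):
        return train_and_score(**decode(position))
    return fitness
```

Searching on a logarithmic scale is a common design choice because useful values of C, γ and ε often span several orders of magnitude.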

Furthermore, the complexity of SVM algorithms depends on several factors, including the type of SVM (linear or kernelized), the optimization algorithm used to train the SVM, the number of training samples (N), training complexity, space complexity and the number of features (D) in the dataset.

The data used in this study were derived from research conducted by Nielsen and Smedsgaard108. Prior to the modelling step, various pre-processing analyses were conducted, including handling of missing data, data cleaning, feature engineering, and outlier detection and handling. Novel AI-based methods (both hybrid and simple models) are becoming increasingly popular in the field of mycotoxins, in both classification and regression, for determining chemical structures and for grouping mycotoxins into classes based on their features and properties109. These models are equally gaining popularity in chromatography for determining the qualitative and quantitative properties of chemical compounds from their retention properties and peak areas, as well as in other fields of spectroscopy and spectrometry, such as mass spectrometry, for predicting and classifying compounds based on their peak areas and mass-to-charge ratios110. The retention time was predicted as the dependent variable, using the retention index, monoisotopic mass, relative sensitivity factor, and peak symmetry of each mycotoxin as the independent variables. To simulate the retention time of the mycotoxins, 150 data points from seven different mycotoxin groups were used.

The overall methodology flowchart is shown in Fig. 2.

Fig. 2

Overall methodology flowchart.

Various researchers, such as Rajaee et al.111, report that a 70–30% split usually gives better performance accuracy. Nevertheless, the current study also checked other partitions, namely 60–40%, 65–35%, 80–20% and 90–10%. The 70–30% split not only showed higher performance but also mitigated major modelling difficulties such as overfitting and underfitting. Hence, 70% of the data was used for training and 30% for testing. The performance of the models was checked using the testing data set to provide an unbiased determination of the models’ accuracy. In addition, different parts of the dataset were employed for training and validation of the chromatographic behaviour of the mycotoxins.
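A 70–30% holdout partition can be sketched in Python as follows. The random shuffling, the seed and the function name are assumptions; the text does not state how the samples were assigned to the two sets.

```python
import random

def holdout_split(n_samples, train_fraction=0.7, seed=0):
    """Return (train_indices, test_indices) for a shuffled holdout split."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    n_train = round(train_fraction * n_samples)
    return idx[:n_train], idx[n_train:]
```

With the 150 data points described above, this gives 105 training samples and 45 testing samples; changing `train_fraction` reproduces the other partitions that were compared.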

Results and discussions

The main objective of this study is to develop an AI-based model (SVR) and hybridize it with two different metaheuristic algorithms, namely HHO and PSO, in order to find the optimum values of the SVR parameters, which are used to model accurately the chromatographic behaviour of various classes of mycotoxins in food samples. By fine-tuning these parameters, the AI-based model can provide more precise simulations, leading to better predictions and analyses when using the HPLC technique, as illustrated in Fig. 3. In this section, the outcomes are presented in both visual and quantitative form. Table 1 presents a pre-analysis of the raw data for different classes of mycotoxins using both statistical and correlation analysis. Such preliminary analyses are carried out on the data before predictions are made.

Fig. 3

Hybrid support vector regression models flowchart.

Table 1 Correlation analysis of the experimental variables.

Based on Table 1, the retention index has the highest correlation with the dependent variable (retention time, tR), with an R-value of 0.8826, followed by peak symmetry with an R-value of 0.7210. On the other hand, the monoisotopic mass has the lowest correlation, with an R-value of − 0.0924. The retention index, as the dominant input variable, has both the highest skewness value and, as seen in Table 1, the highest correlation. Furthermore, the skewness values of all the other input variables were greater and better. As indicated in Table 1, this has an impact on the correlation analysis of the input and output parameters, which in turn influences the modeling.

The relationships between the variables, which support the formation of three distinct model types (models M1, M2 and M3), were further evaluated using correlation analysis. Table 1 presents the correlation coefficients determined in order to improve the models’ methodology. The signs (+ or −) indicate the direction of the relationship between the variables. Correlation analysis identifies the most and least strongly related variables. Furthermore, by showing which input parameter has the highest correlation with the output variable, it aids in understanding the science and mechanism of the data before the modeling step. This is also beneficial for experimental analysis, particularly for the optimization process. As a result, different models were created based on the correlation analysis in order to see how the input selection affected the simulation process. Table 2 displays these models.

Results of metaheuristic algorithms and machine learning

MATLAB 9.3 (R2020a) was used for the modelling, as mentioned in the literature section. Metaheuristic methods (PSO and HHO) were used to optimize the structure of the SVR model. NSE and CC were used to assess the agreement between the actual and predicted values, and MSE was used to quantify the error of the models in both the training and testing phases. Based on the model combinations in Table 2, the simulated outcomes are presented quantitatively in Table 3, which reports the performance of the models for the various input variable selections in both the training and testing stages. For modelling mycotoxins from food samples using the SVR technique, the predictive approaches showed different efficiencies according to the evaluation metrics. In general, for the single AI-based model (in the form of SVR), SVR-M3 with four input variables showed the highest performance based on NSE, CC and MSE. Although it is difficult to rank the models on exact accuracy, SVR-M3 also showed the best prediction accuracy for modelling the retention behaviour of the mycotoxins, with more than 90% performance efficiency in both the calibration and verification stages.

Table 2 Input variable selection.
Table 3 Results of evaluation criteria for SVR, SVR-HHO and SVR-PSO model.

Despite the nonlinear relationship between the predictors and their corresponding targets, the overall accuracy of the SVR is unsatisfactory, particularly in M1 and M2. This can be improved by developing a hybrid technique through the application of optimization algorithms (HHO and PSO). It should be noted that the promising estimates occurred during the calibration phase, which is primarily used to calibrate the models based on known input variables and targets. The testing step, however, is vital in assessing a model’s performance, as it examines the model’s accuracy on unseen target values, which the training set cannot do. A reliable model should therefore perform stably and evenly in both the calibration and verification stages. Compared with the single model, the optimization algorithms generally showed promising ability, which is not surprising given much of the literature71,73,112,113,114,115. As with the SVR model, the performance of the metaheuristic techniques was evaluated using NSE, CC and MSE. Based on Table 3, SVR-HHO-M3 showed the highest NSE (0.990096) and the lowest MSE (0.000137) in the verification stage. Furthermore, the results demonstrated the advantage of SVR-HHO over the single SVR model as well as the SVR-PSO metaheuristic approach for the simulation of mycotoxins in food samples using the chromatographic technique. The predictive advantage of HHO over other methods has been reported in recent literature, for example71,72,115.
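
To make the idea of metaheuristic tuning concrete, the sketch below implements a basic particle swarm optimizer that minimizes an objective over box bounds. It is not the implementation used in the study. The objective `placeholder_validation_error` is a hypothetical stand-in: in a real SVR-PSO setup it would train an SVR with the candidate hyperparameters and return its validation MSE. The two search dimensions (here read as log10 of a penalty parameter and of a kernel width), the bounds and the swarm settings are all assumptions.

```python
import random


def pso_minimize(objective, bounds, n_particles=20, n_iters=100,
                 inertia=0.7, c1=1.5, c2=1.5, seed=0):
    """Basic global-best PSO; returns (best_position, best_value)."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]                 # personal bests
    best_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: best_val[i])
    gbest_pos, gbest_val = best_pos[g][:], best_val[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity: inertia + pull to personal best + pull to global best.
                vel[i][d] = (inertia * vel[i][d]
                             + c1 * r1 * (best_pos[i][d] - pos[i][d])
                             + c2 * r2 * (gbest_pos[d] - pos[i][d]))
                lo, hi = bounds[d]
                # Move the particle and clamp it to the search box.
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            val = objective(pos[i])
            if val < best_val[i]:
                best_pos[i], best_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest_pos, gbest_val = pos[i][:], val
    return gbest_pos, gbest_val


def placeholder_validation_error(params):
    """Hypothetical stand-in for 'train SVR, return validation MSE'."""
    log_c, log_gamma = params
    return (log_c - 1.0) ** 2 + (log_gamma + 2.0) ** 2


bounds = [(-3.0, 3.0), (-5.0, 1.0)]  # assumed search ranges
best_params, best_error = pso_minimize(placeholder_validation_error, bounds)
print(best_params, best_error)
```

HHO follows the same outer pattern (a population of candidate hyperparameter sets scored by the validation error) but replaces the velocity update with its own exploration and exploitation rules.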

Moreover, the capacity of HHO to optimize the SVR technique in the current study is in line with the ability of the GSK algorithm to optimize the structure of various techniques to provide higher accuracy. For instance, Muhammad et al.82 proposed applying GSK to binary optimization problems through a novel binary gaining-sharing knowledge-based optimization algorithm (NBGSK), which explores and exploits the binary search space effectively and efficiently. Additionally, to keep the solutions from becoming trapped in local optima, a population size reduction variant (PR-NBGSK) is implemented, which progressively reduces the population size via a linear function. When applied to a set of knapsack examples of both small and large dimensions, NBGSK and PR-NBGSK demonstrate superior efficacy and efficiency in terms of convergence, robustness, and accuracy. In general, GSK solved six problems in 10D, four in 30D, three in 50D, and one in 100D at least once. In 51 runs across 10 and 30 dimensions, GSK consistently finds the global optimal solution, demonstrating exceptional performance on the unimodal problems (f1–f3). The optimum in 50 dimensions is found in 1 case; the standard deviation and mean error vary from 1.24E+03 to 1.51E+03, and from 1.09E+03 to 3.85E+03. The standard deviation in 100 dimensions is 4.63E+03 to 2.15E+04, whereas the mean error is 5.80E+03 to 1.15E+05. As a result, the error that their data show is generally smaller than the error that HHO produced in our investigation. Hence, even though it was applied to different datasets, GSK demonstrated good potential relative to the HHO model.

Furthermore, Zhong et al.116 presented the first hybridization of DE with HHO and GSK (DEGH), which was benchmarked against 32 other techniques; the experimental results showed that the proposed DEGH algorithm is significantly superior to the compared algorithms. The goodness-of-fit results show that FS-BGSK and FS-pBGSK both achieved a performance accuracy of 1.00 at M6. This indicates that GSK has the ability to improve the performance of HHO beyond the 0.98 and 0.99 goodness-of-fit obtained in the training and testing steps, respectively. More information regarding GSK can be found in117.

Moreover, Khosrokhavar et al. used data from liquid chromatography with UV and MS detectors to predict the retention behaviour of various mycotoxins using SVM and MLR approaches. The correlation and predictability, measured by R2 and q2, were 0.931 and 0.932, respectively, for SVM and 0.923 and 0.915, respectively, for MLR30. The SVR-M3 (0.97 and 0.94) and SVR-HHO-M3 (0.98 and 0.99) results in the training and testing phases of the current study therefore indicate higher performance than that reported in30.

Furthermore, Gilandeh et al. demonstrated the use of image processing to identify and distinguish wheat grains infected with the mycotoxin-producing fungus F. graminearum, using SVM and DA algorithms118. The obtained results showed that classification using the SVM method was better than non-linear SVM, with 100% performance accuracy. Their accuracy is therefore higher than that obtained in the current study, although the nature of the data and analysis differs: their analysis uses image processing with images as the dataset, whereas our study applies regression techniques to a numerical dataset.

Additionally, Ge et al. reported a comparative performance analysis between MLR and ANN models for cheese quality prediction. Using principal component regression (PCR), partial least squares (PLS), SVM, and principal component analysis combined with SVM (PCA-SVM), linear and nonlinear regression models were built to relate the absorption spectra to the concentrations of 160 samples. The techniques yielded the following RMSE values in the training and testing phases, respectively: PLS (0.753, 0.691), PCR (0.587, 0.643), SVM (1.365, 1.674) and PCA-SVM (1.864, 1.953). By comparison, our study yields MSE values ranging from 0.001 to 0.005 and from 0.000137 to 0.001 in the training and testing stages, respectively119. This indicates that our models outperformed those of the previous study.

Nevertheless, the comparative predictive performance of the best model combinations (i.e., SVR-M3, SVR-HHO-M3 and SVR-PSO-M3) can be visualized and compared using scatter plots (see Fig. 4). A scatter plot indicates the level of agreement between the actual and simulated values; the overall goodness-of-fit is expressed by the coefficient of determination (R2), where a higher R2 indicates better model performance and vice versa. In addition, the regression equation y = mx + c, where m is the slope, c is the intercept, and x and y are the variables, can be used to determine a line of best fit. The line is drawn on the scatter plot and can be used to predict one variable from the other in a given data set. Although there are several methods for determining a regression line, the least-squares regression line is typically employed since it yields a uniform line120. It is evident from the plot that SVR-HHO-M3 showed higher prediction efficiency than the SVR-PSO-M3 and SVR-M3 models.
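
For reference, the least-squares line y = mx + c and the corresponding R2 can be computed as in the minimal sketch below. The observed and predicted values are placeholders chosen for illustration and are not taken from Fig. 4.

```python
def fit_line(x, y):
    """Least-squares slope m and intercept c for y = m*x + c."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    m = sxy / sxx
    c = mean_y - m * mean_x
    return m, c


def r_squared(x, y, m, c):
    """Coefficient of determination of the fitted line."""
    mean_y = sum(y) / len(y)
    ss_res = sum((b - (m * a + c)) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - mean_y) ** 2 for b in y)
    return 1.0 - ss_res / ss_tot


# Placeholder values for illustration only, NOT data from the study.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]

m, c = fit_line(observed, predicted)
r2 = r_squared(observed, predicted, m, c)
print(f"y = {m:.3f}x + {c:.3f}, R2 = {r2:.4f}")
```

A slope close to 1 and an intercept close to 0 indicate that the predicted values track the observed ones without systematic bias, which complements the R2 value shown on each scatter plot.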

Fig. 4 Graphical representation for comparative analysis of the observed and computed values of the best model input combinations.

The relative performance of the models can be better visualized using the time series plot embedded with a contour plot (see Fig. 5). The time series plot is a robust graphical depiction of data that gives an overview and a numerical summary of a data set.

Fig. 5 Time series of the actual and computed values for mycotoxin determination using the chromatographic technique.

Furthermore, the overall comparative performance of the best AI-based model and the hybrid techniques (SVR-M3, SVR-PSO-M3 and SVR-HHO-M3) is shown through a recently adopted two-dimensional visualization known as the Taylor diagram, as illustrated in Fig. 6. This diagram is generally employed to summarize various statistical indices such as the correlation coefficient (CC), standard deviation (std.), root mean square error (RMSE), determination coefficient (DC) and mean absolute error (MAE) between the experimental and the simulated/predicted values. The Taylor diagram has been used in areas such as engineering, water analysis and rainfall forecasting owing to its statistical usefulness. To the best of our knowledge, this research is the first to use this diagram in modelling mycotoxins from food samples. The Taylor diagram is discussed in further detail in121. Based on Fig. 6, SVR-HHO-M3 showed higher goodness-of-fit for the prediction of mycotoxins from food samples using the chromatographic method in both the training and testing stages. The obtained results lead to the conclusion that SVR-HHO-M3, SVR-PSO-M3, and SVR-M3 can capture the complex non-linear behaviour of the data involved in this study.
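
The quantities that position a model on a Taylor diagram can be sketched as follows, assuming the usual formulation in which the diagram combines the standard deviations of the reference and the model, their correlation coefficient, and the centered root mean square difference. These are linked by E'^2 = s_r^2 + s_m^2 - 2 s_r s_m R, which is what allows all three to be shown on one two-dimensional plot. The series below are placeholders, not data from Fig. 6.

```python
import math


def taylor_statistics(reference, model):
    """Statistics used to place one model on a Taylor diagram."""
    n = len(reference)
    mean_r = sum(reference) / n
    mean_m = sum(model) / n
    dev_r = [v - mean_r for v in reference]
    dev_m = [v - mean_m for v in model]
    # Population standard deviations (divide by n) so the identity is exact.
    std_r = math.sqrt(sum(d * d for d in dev_r) / n)
    std_m = math.sqrt(sum(d * d for d in dev_m) / n)
    corr = sum(a * b for a, b in zip(dev_r, dev_m)) / (n * std_r * std_m)
    # Centered RMS difference: RMSE after removing each series' mean.
    crmsd = math.sqrt(sum((b - a) ** 2 for a, b in zip(dev_r, dev_m)) / n)
    return {"std_ref": std_r, "std_model": std_m, "cc": corr, "crmsd": crmsd}


# Placeholder series for illustration only, NOT data from the study.
reference = [0.10, 0.35, 0.50, 0.80, 0.95]
model = [0.15, 0.30, 0.55, 0.75, 0.90]

stats = taylor_statistics(reference, model)
for key, value in stats.items():
    print(f"{key}: {value:.4f}")
```

A model plots closer to the reference point when its standard deviation matches the reference, its correlation approaches 1 and its centered RMS difference approaches 0.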

Fig. 6 Two-dimensional representation showing the models’ calibration and verification in terms of the Taylor diagram.

Furthermore, the predictive performance of the metaheuristic algorithms can also be presented using a bump chart coupled with a bar plot (see Fig. 7). A bump chart visualizes changes in rank and is particularly useful for showing how the ranking of different categories or items moves across multiple periods. Accordingly, the bump chart in Fig. 7A ranks the three models (SVR-HHO-M3, SVR-M3, and SVR-PSO-M3) on three metrics (NSE, MSE, and CC) in both the training and testing stages. The chart shows the advantage of SVR-HHO over SVR-PSO and the conventional SVR technique in modelling the mycotoxins. Additionally, Fig. 7B presents the performance metrics (NSE, CC and MSE) of the models as a bar plot. Overall, SVR-HHO presents the highest predictive performance.
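
The ranks plotted in a bump chart can be derived from metric values as in the short sketch below. The model names `model_a` to `model_c` and their scores are hypothetical placeholders and deliberately do not reproduce Table 3; the point is only that NSE and CC are ranked with higher values first, whereas MSE is ranked with lower values first.

```python
def rank_models(scores, higher_is_better=True):
    """Map each model name to its rank (1 = best) for one metric."""
    ordered = sorted(scores, key=scores.get, reverse=higher_is_better)
    return {name: position for position, name in enumerate(ordered, start=1)}


# Hypothetical scores for illustration only, NOT values from Table 3.
nse_scores = {"model_a": 0.80, "model_b": 0.95, "model_c": 0.90}
mse_scores = {"model_a": 0.020, "model_b": 0.005, "model_c": 0.010}

nse_ranks = rank_models(nse_scores)                          # higher NSE is better
mse_ranks = rank_models(mse_scores, higher_is_better=False)  # lower MSE is better

print("NSE ranks:", nse_ranks)
print("MSE ranks:", mse_ranks)
```

Repeating this per metric and per stage (training, testing) gives the rank positions that the bump chart connects with lines.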

Fig. 7 (A) Bump chart and (B) bar plot for the performance metrics.

In general, the findings show that the SVR-M3 model, when hybridized with optimization algorithms (HHO and PSO), achieves over 90% performance efficiency in both training and testing stages. This high level of accuracy is meaningful as it indicates that the model can reliably predict the retention behavior of mycotoxins, which is crucial for ensuring food safety.

The suggested SVR-HHO framework is a good choice for chromatographic retention time modeling since it strikes a balance between computational cost and prediction accuracy. However, memory and processor cost must be taken into account for large-scale deployment in analytical chemistry labs. Although SVR is comparatively light compared with deep learning models, which require a lot of processing power, the addition of HHO for hyperparameter tuning introduces an optimization layer that raises the computational cost. For moderately small datasets this is still feasible, but to preserve efficiency when scaling to high-dimensional chromatographic data with thousands of samples, parallel processing or GPU acceleration may be needed. One interesting approach to real-time chromatographic analysis is the deployment of SVR-HHO on cloud-based or edge computing platforms. Cloud computing makes large-scale data processing and on-demand model retraining possible, providing flexibility in response to changing analytical circumstances. Edge computing, on the other hand, minimizes latency and makes it possible to estimate retention times almost instantly by enabling localized inference at chromatographic workstations. Optimizing model compression methods and utilizing hardware accelerators to shorten inference times without sacrificing accuracy would be necessary for such implementations to be feasible. High-throughput sample analysis in industrial and regulatory labs may also be made easier by combining the SVR-HHO model with automated liquid handling devices. Labs might improve operational efficiency, decrease manual involvement, and optimize chromatographic procedures by automating retention time predictions. Future studies might concentrate on creating an end-to-end pipeline for chromatographic quality control systems that combines automated decision-making, AI-driven predictions, and real-time data collection.

The SVR-HHO model can be applied to food safety regulatory systems, such as FDA guidelines or hazard analysis and critical control points (HACCP), by improving the predictive capabilities for contamination risks and supporting decision-making through supply chain risk management, risk-based monitoring, automated alerts, optimized mitigation strategies and improved surveillance programs. Also, by using domain-specific datasets and altering the input features, the SVR-HHO model can be extended to detect pesticides, heavy metals, and other pollutants. Furthermore, by training the model on a variety of contamination datasets, it can generalize across different food matrices, including grains, dairy, and drinks. Custom feature selection and matrix-specific contamination data are necessary for the model to adapt to these food matrices. Using Harris Hawks Optimization (HHO) to optimize the hyperparameters, SVR can improve performance across these categories. This integration supports effective, scalable, and internationally harmonized regulatory compliance by providing a next-generation AI-driven solution to food safety monitoring.

Conclusion

A multi-model data-driven method using an AI-based model (SVR) with two metaheuristic methods, PSO and HHO (SVR-PSO and SVR-HHO), was used to predict the chromatographic behavior of various classes of mycotoxins in food samples. This approach aims to qualitatively elucidate these fungal metabolites and to support the development of a mechanism for their prevention. The M3 model combination gave the best predictions for all of SVR, SVR-HHO and SVR-PSO, with SVR-HHO having the highest prediction ability. The results also support the main objective of the study, the implementation of SVR-HHO (NSE = 0.98, CC = 0.99 and MSE = 0.001 in the training phase and NSE = 0.99, CC = 0.995 and MSE = 0.000137 in the testing phase), which can enhance the performance of previous methods by 4–7% in the training and testing phases, respectively, in predicting the qualitative characteristics of mycotoxins in food samples. The integration of SVR with metaheuristic optimization enhances predictive accuracy and efficiency, making it suitable for deployment in real-world chromatographic laboratories. By automating hyperparameter tuning, the proposed framework can be integrated into regulatory and commercial food safety programs, improving contaminant detection and compliance monitoring.

The study has some limitations. Its limited scope means that other potentially effective models and hybrid combinations, which could have yielded different results, were not explored; nor were model generalization across different food matrices, computational demands, and integration with existing chromatographic software. Moreover, developing and running hybrid AI models, especially with optimization algorithms, requires significant computational resources. This limitation might affect the feasibility of applying these models in environments with limited computational capabilities.

The findings also imply that additional hybrid models, including ensemble machine learning strategies and other metaheuristic algorithms, might be used to enhance the capacity to predict the chromatographic characteristics of the mycotoxins. Future work should investigate other AI and machine learning models such as random forest, gradient boosting, neural networks, AutoML and other ensemble methods. Comparing a wider range of models can help identify the most effective approaches for predicting mycotoxins’ retention behaviour.