Prediction of speed of sound of deep eutectic solvents using artificial neural network coupled with group contribution approach

Adhab, Ayat Hussein; Mahdi, Morug Salih; Doshi, Hardik; Yadav, Anupam; Manjunatha, R.; Kumar, Sushil; Shit, Debasish; Sangwan, Gargi; Mansoor, Aseel Salah; Radi, Usama Kadem; Abd, Nasr Saadoun

doi:10.1038/s41598-025-14094-w

Article
Open access
Published: 10 August 2025

Ayat Hussein Adhab¹,
Morug Salih Mahdi²,
Hardik Doshi³,
Anupam Yadav⁴,
R. Manjunatha⁵,
Sushil Kumar⁶,
Debasish Shit⁷,
Gargi Sangwan⁸,
Aseel Salah Mansoor⁹,
Usama Kadem Radi¹⁰ &
…
Nasr Saadoun Abd¹¹

Scientific Reports volume 15, Article number: 29238 (2025)

Subjects

Chemical engineering

Abstract

Predicting the physicochemical properties of deep eutectic solvents (DESs) is crucial for designing new solvents. Heat capacity and speed of sound are important thermodynamic properties in chemical processes. However, experimental data on the speed of sound in DESs are limited. Consequently, a thermodynamic model is needed to estimate the speed of sound in DESs over a wide range of pressures and temperatures. A key challenge in these models is accurately estimating the ideal gas heat capacity. Since the ideal gas heat capacity of DESs is often unavailable, a machine learning (ML) approach, using artificial neural networks (ANNs) coupled with a group contribution (GC) method, is a promising technique. The GC approach is used to estimate the critical temperature, critical volume, and acentric factor of DESs, which are then fed into the ANN model to predict the speed of sound. The results show that combining a GC method with ANNs or CatBoost ML provides a highly accurate prediction of the speed of sound in DESs. The inputs to the ANN + GC model are temperature, acentric factor, molecular weight, and critical volume. The average relative deviation (ARD%) and R² of the correlated speed of sound for the ANN + GC model are 0.032% and 0.998, respectively. The ARD% for both the ANN + GC and ML + GC approaches was substantially lower than that of the correlation-based models. Furthermore, cumulative frequency diagrams and the leverage approach were used to validate the quality and reliability of the proposed model. The leverage analysis confirmed the accuracy of the data used and the high reliability of the ANN + GC model for estimating the speed of sound in DESs. This analysis indicates that the ANN + GC and ML + GC methods can effectively estimate the speed of sound in DESs based on molecular structure. Therefore, these approaches offer a promising tool for predicting the speed of sound of newly designed DESs when experimental data are unavailable.

Introduction

Deep eutectic solvents (DESs) are mixtures of a hydrogen bond donor (HBD) and a hydrogen bond acceptor (HBA), which combine to form a solvent with a melting point lower than that of either individual component. Their advantageous properties, including low toxicity and low vapor pressure, have led to significant interest in DESs across a variety of fields¹. These solvents are considered an alternative to traditional organic solvents. As with ionic liquids (ILs), the physicochemical properties of DESs can be tuned through a suitable combination of HBD and HBA compounds. ILs and DESs nevertheless differ in their starting materials and in how they are formed: ILs consist of cations and anions and must be synthesized with reagents and solvents, whereas DESs consist of HBAs and HBDs and can be prepared from their components through a simple heat treatment². DESs are widely used in pharmacology³, gas-capturing processes (especially CO₂ capture)⁴, extraction operations⁵,⁶,⁷, and water treatment⁸. Their almost negligible vapor pressure is one of their main properties⁹. Industry requires greener, more environmentally friendly solvent alternatives for manufacturing its products¹⁰. Bowen et al. demonstrated the utility of DESs in protein extraction and purification¹⁰, highlighting their potential as a promising alternative solvent for protein extraction from diverse raw biomass sources. Furthermore, recent research¹¹ has focused on designing green, highly efficient, and recyclable DESs for separating EVA films from end-of-life (EOL) photovoltaic (PV) modules. The inherent high-temperature stability and acidic nature of DESs can effectively facilitate the separation of EVA film layers within these modules¹¹. Jahanbakhsh-Bonab et al. 
used molecular dynamics (MD) simulations to examine the physicochemical and structural characteristics of novel DESs¹². Research indicates that methyl-β-cyclodextrin (MBCD)-based DESs can provide predictive insights for their applications in extraction processes. Furthermore, studies have explored the structural and physicochemical properties of chiral DESs, composed of racemic mixtures of menthol with acetic acid, menthol with lauric acid, and menthol with pyruvic acid, specifically for enantioselective extraction processes¹³. Jahanbakhsh-Bonab et al. utilized the MD simulations to examine the structural and dynamical properties of DESs-based boron nitride nanotube (BNNT) nanofluid¹⁴. The impact of nanotube diameter on the physicochemical parameters of DES-based systems has been investigated. Results indicate that adding Boron Nitride Nanotubes (BNNTs) to DESs increases viscosity due to a reduction in the diffusion coefficient of the DES components. Understanding the structure of DES-based nanofluids is crucial for comprehending the properties of these novel solvents in chemical processes. Moreover, MD simulations have been employed to examine the effects of external electric fields (EEFs) on the structural and transport properties of DESs composed of a 2:1 molar ratio of glycerol (Gly) and choline chloride (ChCl)¹⁵. They calculated several key physicochemical properties of Gly/ChCl DESs in the absence of external electric fields (EEFs), including viscosity, self-diffusion coefficient, isothermal compressibility, and density. They also employed the radial distribution function (RDF), coordination number, and number of hydrogen bonds to analyze the arrangement of the DES components. Their findings indicated that the correlation of movement between glycerol (Gly) and chloride (Cl) ions decreases as the strength of the EEF increases. 
Furthermore, in recent years, MD simulations have been increasingly used to investigate the performance of DESs in separating acid gases from natural gas¹⁶. The performance of DESs was compared with that of a methyl diethanolamine (MDEA) system. The results show that the diffusion coefficients of H₂S and CO₂ follow the trends Nano-DES < DES-MDEA < DES < MDEA < aqueous MDEA and DES < DES-MDEA < MDEA < Nano-DES < aqueous MDEA, respectively, in the liquid phase. The performance of an amine-based DES composed of a 6:1 molar ratio of MDEA and choline chloride (ChCl) for the natural gas (NG) sweetening process was also investigated using MD simulations¹⁷, and the effect of pressure on the performance of amine-based DESs was examined. The results suggest that MDEA-based DESs can compete with conventional amine solvents in natural gas sweetening processes. P. Jahanbakhsh-Bonab et al. simulated the absorption capability of MDEA-based DESs using MD simulations¹⁸ and studied the effect of temperature on it. The thermophysical properties of monoethanolamine (MEA)-based DESs for H₂S extraction from biogas were investigated¹⁹. The impact of pressure on the performance of amine-based DESs for the extraction of H₂S and CO₂ was also studied. Esfahani et al. used choline chloride-based DESs for the extraction of 1-butanol or 2-butanol from azeotropic n-heptane + butanol mixtures, which illustrates a different application and type of DES⁷. In summary, the prediction of thermodynamic properties of DESs plays an important role in chemical processes. First-order derivative thermodynamic properties such as density, and second-order derivative properties such as heat capacity and speed of sound, must be estimated in the pre-design of a new chemical process. 
Predictive models are more cost-effective than experimentally measuring the properties of a large number of candidate DESs for a chemical process. M. Taherzadeh et al. proposed a correlation-based model to correlate and predict the heat capacity of 28 DESs over a wide temperature range from 278 to 363 K²⁰. The correlation was developed based on molecular weight, temperature, acentric factor, and critical pressure, and its absolute average relative deviation (AARD%) over all of the studied data points was 4.7%. Leron and Li created a model specifically for estimating the heat capacity of DESs²¹. Naser et al. developed a model for calculating the specific heat capacity of 15 DESs²². Zhang et al. proposed an empirical equation for estimating the heat capacity of two DESs²³. H. Peyrovedin et al. proposed a correlation-based model for estimating the speed of sound in 39 different DESs across a broad temperature range²⁴, a property that is essential for understanding the thermophysical behavior of these solvents and their potential applications. Their results indicated that the AARD% of the model is about 5.4%. Lapeña et al. correlated the speed of sound, heat capacity, density, isentropic compressibility, and viscosity of two DESs using a linear correlation²⁵. Peng and Minceva used the perturbed chain polar statistical associating fluid theory (PCP-SAFT) model to predict the density and viscosity of DESs²⁶. Lashkarbolooki et al. used an ANN to predict the heat capacity of binary ionic liquid mixtures²⁷. They collected 1571 binary heat capacity data points for ILs, and a neural network with one hidden layer containing 16 neurons successfully predicted the IL binary heat capacities²⁷. 
Thermodynamic models are a primary approach for predicting phase equilibrium and calculating second-order derivative thermodynamic properties

Theory and methodology

Data collection

As mentioned in the introduction section, the experimental data on the speed of sound are scarce.

In this study, 415 experimental data points for 38 DESs over a wide range of temperatures were collected from the literature. The data points were randomly divided into a training set of 300 points and a testing set of 115 points (roughly 72:28). Train-test splits of 70:30 and 80:20 are the most common in the literature because they leave enough data for both training and testing⁴⁴. The training points were used to develop the model, and the testing points were used to check its performance.
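The random split described above can be sketched as follows. This is a minimal illustration only: the function name `split_train_test`, the fixed seed, and the use of integer indices in place of the actual experimental records are assumptions, not details taken from the paper.

```python
import random


def split_train_test(points, n_train, seed=0):
    """Randomly split `points` into (train, test) with exactly n_train training items."""
    if not 0 <= n_train <= len(points):
        raise ValueError("n_train must be between 0 and len(points)")
    shuffled = list(points)                 # copy so the caller's data is untouched
    random.Random(seed).shuffle(shuffled)   # seeded for reproducibility (assumption)
    return shuffled[:n_train], shuffled[n_train:]


# 415 placeholder indices standing in for the experimental data points
indices = list(range(415))
train, test = split_train_test(indices, n_train=300)
print(len(train), len(test))
```

Note that this splits by data point, as the text describes, so points from the same DES can appear in both subsets.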

The choice of input features is critical in machine learning models. Predicting the speed of sound in DESs requires appropriate input features, typically drawn from thermophysical properties. For newly introduced DESs, however, these properties are often unknown, which makes it difficult to train accurate models for novel solvents. Group contribution methods can be used to estimate thermophysical properties of DESs such as the critical temperature, critical pressure, critical volume, and acentric factor. In this work, the modified Lydersen and Joback–Reid GC method⁴⁵,⁴⁶ is used to estimate these properties. Table 1 lists the group parameters of the method.

Table 1 Groups considered in the Joback–Reid method.

Valderrama et al. estimated the critical points, normal boiling temperature, and acentric factor of ILs using the Joback method⁴⁵. As shown in Table 1, the ion parameters have been considered based on the Valderrama et al. method. The normal boiling temperature (T_b), critical temperature (T_c), critical pressure (P_c), critical volume (V_c), and acentric factor (ω) are estimated using Eqs. (1)–(5) as follows:

$$T_{b} \left( K \right) = 198 + \mathop \sum \limits_{k} N_{k} tb_{k}$$

(1)

$$T_{c} \left( K \right) = T_{b} \left[ {0.584 + 0.965\left\{ {\mathop \sum \limits_{k} N_{k} tc_{k} } \right\} - \left\{ {\mathop \sum \limits_{k} N_{k} tc_{k} } \right\}^{2} } \right]^{ - 1}$$

(2)

$$P_{c} \left( {bar} \right) = \left[ {0.113 + 0.0032N_{atoms} - \mathop \sum \limits_{k} N_{k} pc_{k} } \right]^{ - 2}$$

(3)

$$V_{c} \left( {\frac{{cm^{3} }}{mol}} \right) = 17.5 + \mathop \sum \limits_{k} N_{k} vc_{k}$$

(4)

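where N_k is the number of occurrences of group k in the molecule, tb_k, tc_k, pc_k, and vc_k are the corresponding group contributions, and N_atoms is the total number of atoms in the molecule. The acentric factor is estimated with the relation used for ILs⁴⁶, in which P_b is the pressure at the normal boiling point (atmospheric pressure, 1.01325 bar):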

$$\omega = \frac{{\left( {T_{b} - 43} \right)\left( {T_{c} - 43} \right)}}{{\left( {T_{c} - T_{b} } \right)\left( {0.7T_{c} - 43} \right)}}\log \left( {\frac{{P_{c} }}{{P_{b} }}} \right) - \left( {\frac{{T_{c} - 43}}{{T_{c} - T_{b} }}} \right)\log \left( {\frac{{P_{c} }}{{P_{b} }}} \right) + \log \left( {\frac{{P_{c} }}{{P_{b} }}} \right) - 1$$

(5)
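As a rough illustration of how Eqs. (1)–(5) turn group counts into model inputs, the sketch below evaluates them for a small test molecule. Several things here are assumptions rather than details from the paper: the function names, the contribution values (placeholders in the style of Joback–Reid tables, not the entries of Table 1), the base-10 logarithm in Eq. (5), and P_b taken as 1.01325 bar. The T_c expression follows the standard Joback–Reid form, in which both sums run over the tc_k contributions.

```python
import math

P_B_BAR = 1.01325  # assumption: pressure at the normal boiling point, 1 atm in bar


def estimate_critical_properties(group_counts, contributions, n_atoms):
    """Apply Eqs. (1)-(4).

    group_counts:  {group name: N_k}
    contributions: {group name: (tb_k, tc_k, pc_k, vc_k)}
    Returns (T_b [K], T_c [K], P_c [bar], V_c [cm^3/mol]).
    """
    sums = [0.0, 0.0, 0.0, 0.0]
    for group, n_k in group_counts.items():
        for idx, value in enumerate(contributions[group]):
            sums[idx] += n_k * value
    sum_tb, sum_tc, sum_pc, sum_vc = sums
    t_b = 198.0 + sum_tb                                   # Eq. (1)
    t_c = t_b / (0.584 + 0.965 * sum_tc - sum_tc ** 2)     # Eq. (2)
    p_c = (0.113 + 0.0032 * n_atoms - sum_pc) ** -2        # Eq. (3)
    v_c = 17.5 + sum_vc                                    # Eq. (4)
    return t_b, t_c, p_c, v_c


def acentric_factor(t_b, t_c, p_c, p_b=P_B_BAR):
    """Apply Eq. (5); log is assumed to be base 10."""
    log_ratio = math.log10(p_c / p_b)
    term1 = (t_b - 43.0) * (t_c - 43.0) / ((t_c - t_b) * (0.7 * t_c - 43.0))
    term2 = (t_c - 43.0) / (t_c - t_b)
    return term1 * log_ratio - term2 * log_ratio + log_ratio - 1.0


# Placeholder contributions (tb_k, tc_k, pc_k, vc_k); NOT the values of Table 1
contributions = {
    "CH3": (23.58, 0.0141, -0.0012, 65.0),
    "CH2": (22.88, 0.0189, 0.0, 56.0),
    "OH": (92.88, 0.0741, 0.0112, 28.0),
}
groups = {"CH3": 1, "CH2": 1, "OH": 1}  # an ethanol-like test molecule, 9 atoms

t_b, t_c, p_c, v_c = estimate_critical_properties(groups, contributions, n_atoms=9)
omega = acentric_factor(t_b, t_c, p_c)
print(round(t_b, 2), round(t_c, 1), round(p_c, 2), v_c, round(omega, 3))
```

For a DES, the same functions would be applied with the group counts and contribution values appropriate to the HBA and HBD; how the component values are combined for the mixture is not covered by this sketch.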

Table 2 reports the estimated critical properties and acentric factors of the DESs.

Table 2 The list of studied DESs and their thermo-physical properties estimated by the modified Lydersen and Joback–Reid method⁴⁵,⁴⁶, with references for the experimental speed of sound data.

The GC method serves as an effective approach to estimate the thermophysical properties of DESs based on their molecular structures. This is particularly useful when experimental data is lacking, allowing researchers to generate necessary property estimates that can inform the modeling processes. In this work, the GC approach has been integrated with ANNs to predict the speed of sound in DESs.

Thoroughly analyzing and preparing the input data is vital for building robust machine learning models, because the quality of the input data determines how well the model learns and generalizes from the training data. A statistical description of the input data is given in Table 3⁵⁸.

Table 3 Statistical description of the data set used for modeling.

The next section presents the ANN methodology.

ANN methodology

The ANN can be likened to a black box with multiple inputs and outputs. The number of neurons can vary widely, ranging from fewer than ten to tens of thousands. These neurons can be organized into one, two, three, or more layers, with the two-layer network being the most commonly used ANN design for chemical applications. In the ANN methodology, inputs are fed into the input layer and then transmitted to the hidden layer using a specific transfer function. The transformed inputs are subsequently relayed to the output layer to estimate the desired properties. The estimated values are then compared to experimental data to analyze the error using the objective function (OF). Finally, the results are fed back into the system, and this process is repeated using a trial-and-error approach to minimize the objective function (OF) error. The number of neurons in the hidden layer is adjusted during these trials to achieve the best possible outputs with the lowest OF. This adjustment is guided by error analysis. Initially, one neuron is used to estimate the error with the training subset. Next, two neurons are evaluated, and this process continues, incrementing the number of neurons until the optimal configuration is found based on the minimum error value obtained from the testing subsets. Consequently, if the desired error is not achieved, the number of neurons in the hidden layer is increased. It must be noted that all input data are divided into two subsets namely training and testing. In Fig. 1 the schematic diagram of the neural network has been shown.

The weights connecting the neurons are the parameters adjusted to develop the network. In each hidden neuron, the outputs of the previous layer are multiplied by their weights, summed, and added to a bias. The hyperbolic tangent sigmoid transfer function is then applied in the hidden layer as follows:

$$n_{j} = \frac{2}{{1 + {\text{exp}}\left( { - 2Z} \right)}} - 1$$

(6)

$$Z = \mathop \sum \limits_{i = 1}^{r} w_{ij} p_{i} + b_{j}$$

(7)

where $n_{j}$ is the output of the j^th neuron, $w_{ij}$ is the weight connecting the i^th neuron of the previous layer to the j^th neuron, $b_{j}$ is the bias, $p_{i}$ is the output of the i^th neuron in the previous layer, and r is the number of neurons in that layer. The weights and biases of the hidden and output layers are reported in Table 4. The speed of sound of newly designed DESs can be predicted using these weights and biases together with input values obtained from the GC approach. The results of the ANN + GC model are described in the section “ANN + GC method”.
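To make Eqs. (6) and (7) concrete, the following sketch evaluates one forward pass of a network with a single hidden layer. It is an illustration under stated assumptions: the output layer is taken to be linear, the inputs are assumed to be already scaled, and the weights are made-up numbers for a two-neuron toy network rather than the reported 16-neuron weights.

```python
import math


def tansig(z):
    """Eq. (6): 2 / (1 + exp(-2z)) - 1, mathematically identical to tanh(z)."""
    if z < -350.0:          # avoid overflow in exp(); the limit value is -1
        return -1.0
    return 2.0 / (1.0 + math.exp(-2.0 * z)) - 1.0


def forward(inputs, w_hidden, b_hidden, w_out, b_out):
    """One hidden tansig layer followed by a linear output neuron (assumption)."""
    hidden = []
    for weights_j, bias_j in zip(w_hidden, b_hidden):
        z = sum(w * p for w, p in zip(weights_j, inputs)) + bias_j   # Eq. (7)
        hidden.append(tansig(z))                                     # Eq. (6)
    return sum(w * n for w, n in zip(w_out, hidden)) + b_out


# Toy example: 4 scaled inputs (T, omega, MW, V_c) and 2 hidden neurons.
# All numbers below are hypothetical.
x = [0.5, -0.1, 0.3, 0.8]
w_hidden = [[0.1, -0.2, 0.05, 0.3],
            [0.0, 0.4, -0.1, 0.2]]
b_hidden = [0.1, -0.3]
w_out = [1.5, -0.7]
b_out = 0.2

y = forward(x, w_hidden, b_hidden, w_out, b_out)
print(y)
```

With the real weights, `w_hidden` would have 16 rows of four weights each, and any input or output scaling applied during training would have to be reproduced before and after the call.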

Machine learning model

Machine learning is a subset of artificial intelligence (AI) that focuses on the development of algorithms and statistical models that enable computers to perform tasks without explicit instructions. Instead of being programmed with specific rules, machines learn from data and improve their performance over time. In this study the Categorical Boosting (CatBoost) machine learning approach has been used^59,60. It excels at handling categorical features and often achieves state-of-the-art results in various machine learning tasks, particularly those involving tabular data. It’s designed to be easy to use, robust, and provides excellent accuracy. CatBoost directly handles categorical features without needing extensive preprocessing like one-hot encoding (though it can still benefit from good feature engineering)⁶¹. It uses a special method to encode categorical features called Target Statistics (also known as Ordered Target Encoding). This method reduces overfitting compared to naive target encoding. For each categorical feature, CatBoost calculates the average value of the target variable for each category. However, to prevent overfitting (a common problem with target encoding), it uses a more sophisticated approach to calculate these statistics. It avoids using the target value of the current row when calculating the target statistic for that row. This is done by calculating the target statistic based on the rows that came before the current row in the dataset’s order. Random permutations of the data are used to further reduce overfitting. CatBoost implements a variation of gradient boosting known as ordered boosting. This helps to address gradient bias that can occur in traditional gradient boosting algorithms, especially when dealing with categorical features. Gradient bias arises because the model is trained using the same data that was used to calculate the gradients, leading to an overestimation of the model’s performance. 
Ordered boosting aims to correct this bias. Like other gradient boosting methods, CatBoost iteratively builds an ensemble of decision trees. Each tree is trained to correct the errors made by the previous trees, and the optimization is driven by gradients calculated from a loss function. Categorical columns are handled either by one-hot encoding, for features whose number of distinct values does not exceed the one hot max size (OHMS) setting, or by target-based statistics computed over random permutations. Feature combinations are built greedily at each new split of the current tree, which lets CatBoost exploit combinations of categorical features without enumerating their exponentially growing number⁶¹. The CatBoost method follows these steps:

Formation of a random subset of the records
Converting labels to integers
Transforming categorical features into numeric values, as outlined below:
$${\text{Average}}\,{\text{Target}} = \frac{{{\text{Count in Class }} + {\text{ Prior}}}}{{{\text{Total Count }} + { }1}}$$
(8)

In Eq. (8), “Count in Class” is the number of preceding records with the same category value whose target equals one, “Prior” is a preset starting value for the numerator, and “Total Count” is the number of preceding records that share the same category value⁶². Figure 2 shows a schematic of the CatBoost tree construction.
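The ordered target statistic of Eq. (8) can be sketched in a few lines. This is a simplified illustration, not CatBoost's implementation: it uses the records in the order given instead of random permutations, assumes binary (0/1) targets, and uses hypothetical category labels and a hypothetical prior of 0.5. The four inputs used in this work are numeric, so the categories below are purely illustrative.

```python
def ordered_target_statistic(categories, targets, prior=0.5):
    """Encode each record's category using only the records that precede it (Eq. 8)."""
    count_in_class = {}   # per category: preceding records with target == 1
    total_count = {}      # per category: all preceding records
    encoded = []
    for category, target in zip(categories, targets):
        c = count_in_class.get(category, 0)
        t = total_count.get(category, 0)
        encoded.append((c + prior) / (t + 1))
        # update the running counts only after encoding the current record,
        # so a record never sees its own target value
        count_in_class[category] = c + target
        total_count[category] = t + 1
    return encoded


categories = ["a", "b", "a", "a", "b"]
targets = [1, 0, 1, 0, 1]
encoded = ordered_target_statistic(categories, targets)
print(encoded)
```

Because each record is encoded before its own target is counted, the first occurrence of every category receives the value prior / 1, which is how the ordering prevents target leakage.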

Statistical error analysis

Statistical error analysis quantifies the errors and uncertainties in model predictions and helps assess the reliability of conclusions drawn from data. In this work the model results were evaluated using the average relative deviation percent (ARD%), root mean square error (RMSE), standard deviation (SD), mean absolute error (MAE), and R², defined as follows:

$${\text{ARD}}\left( \% \right) = \frac{100}{N}\mathop \sum \limits_{i = 1}^{N} \left| {\frac{{U_{i}^{exp} - U_{i}^{calc} }}{{U_{i}^{exp} }}} \right|$$

(9)

$${\text{RMSE}} = \sqrt {\frac{{\mathop \sum \nolimits_{i = 1}^{N} (U_{i}^{exp} - U_{i}^{calc} )^{2} }}{N}}$$

(10)

$${\text{SD}} = \sqrt {\frac{{\mathop \sum \nolimits_{i = 1}^{N} \frac{{(U_{i}^{exp} - U_{i}^{calc} )^{2} }}{{U_{i}^{exp} }}}}{N - 1}}$$

(11)

$${\text{MAE}} = \frac{1}{N}\mathop \sum \limits_{i = 1}^{N} \left| {U_{i}^{exp} - U_{i}^{calc} } \right|$$

(12)

$$R^{2} = 1 - \frac{{\mathop \sum \nolimits_{i = 1}^{N} \left( {U_{i}^{exp} - U_{i}^{calc} } \right)^{2} }}{{\mathop \sum \nolimits_{i = 1}^{N} (U_{i}^{exp} - \overline{U}^{exp} )^{2} }}$$

(13)

where $U_{i}^{exp}$ and $U_{i}^{calc}$ are the experimental and estimated speed of sound of the DESs, $\overline{U}^{exp}$ is the average of the experimental values, and N is the number of data points. The next section presents the results of the ANN + GC and ML + GC approaches.
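The five metrics can be computed directly from paired experimental and calculated values, as in the sketch below. The function name and the three sample points are hypothetical. SD follows Eq. (11) as written (squared deviation divided by the experimental value), and R² is computed in its standard form with the individual calculated values in the numerator.

```python
import math


def error_metrics(u_exp, u_calc):
    """Return ARD%, RMSE, SD, MAE and R^2 for paired speed-of-sound values."""
    n = len(u_exp)
    diffs = [e - c for e, c in zip(u_exp, u_calc)]
    mean_exp = sum(u_exp) / n
    ss_res = sum(d * d for d in diffs)
    ss_tot = sum((e - mean_exp) ** 2 for e in u_exp)
    return {
        "ARD%": 100.0 / n * sum(abs(d / e) for d, e in zip(diffs, u_exp)),      # Eq. (9)
        "RMSE": math.sqrt(ss_res / n),                                          # Eq. (10)
        "SD": math.sqrt(sum(d * d / e for d, e in zip(diffs, u_exp)) / (n - 1)),  # Eq. (11)
        "MAE": sum(abs(d) for d in diffs) / n,                                  # Eq. (12)
        "R2": 1.0 - ss_res / ss_tot,                                            # Eq. (13)
    }


# Hypothetical values in m/s, for illustration only
u_exp = [1500.0, 1600.0, 1700.0]
u_calc = [1501.0, 1598.0, 1700.0]
metrics = error_metrics(u_exp, u_calc)
print(metrics)
```

Evaluating the function separately on the training, testing, and complete data sets gives the kind of per-subset breakdown reported in the error analysis tables.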

Results and discussion

The previous section described the ANN and ML methodologies for predicting the speed of sound of DESs, illustrated in Figs. 1 and 2. This section discusses the ANN + GC and ML + GC approaches separately.

ANN + GC method

The number of independent variables (input layer) plays a crucial role in ANN and ML methods⁶³. Table 2 reports the thermodynamic properties of the DESs, including T_c, V_c, MW, and ω. For the ANN method, different input properties and different numbers of neurons in one and two hidden layers were considered, since a single hidden layer is not always adequate⁶⁴,⁶⁵,⁶⁶,⁶⁷. The number of hidden layers and the number of neurons in each were obtained by trial and error, and the Levenberg–Marquardt algorithm⁶⁸,⁶⁹ was used for the optimization. The results show that one hidden layer containing 16 neurons, with four inputs (critical volume V_c, molecular weight MW, temperature T, and acentric factor ω), is the optimum configuration. As described in the section “Theory and methodology”, 300 training and 115 testing data points of the speed of sound were used.

As mentioned in Eq. (7), the weight parameters between neurons in hidden layers are essential to develop the network. In Table 4 the weight parameters of neurons have been reported.

Table 4 The weight of hidden and output layers for 16 neurons.

The optimum values of neurons in the hidden layer are evaluated using the average relative deviation percent (ARD%). When the optimum network architecture was determined, the input data of the ten DESs were fed to the network to predict their speed of sound. In Fig. 3 the flowchart of the proposed ANN model has been depicted.

As shown in Fig. 3, the model can predict the speed of sound of DESs using independent variables containing T, V_c, ω, and MW. The inputs containing V_c, and ω can be estimated using GC approaches^45,46. Experimental literature data is used as the training dataset for sound speed. An ANN with one hidden layer containing sixteen neurons is employed to develop the model. Four input variables can be fed into the saved file to generate predicted sound speed. Using the “saved network” and these four inputs, the sound speed of DESs can be accurately predicted. The complete MATLAB code, including all source files used in the programming, is provided in the Supplementary Material. The correlated and predicted results of the ANN + GC approach have been shown in Fig. 4.

Figure 4 shows that the ANN + GC approach correlates the speed of sound of DESs satisfactorily over a wide range of temperatures. The ARD% and R² of the correlated speed of sound are 0.032% and 0.9988, respectively. Figure 4b shows the prediction of the speed of sound of ten DESs using the ANN + GC approach; the results are in good agreement with experimental data. Figure 5 compares the experimental data with the ANN + GC results.

As shown in Fig. 5, the ANN + GC method can correlate the experimental data accurately. Distributions of the deviation points for the ANN + GC method are shown in Fig. 6.

As shown in Fig. 6, the deviations between the ANN + GC predictions and experimental data do not exceed 4 m/s. Error analysis indicates that the proposed network is suitable for engineering calculations. In this study, the predictive performance of the ANN + GC model was assessed using R², ARD%, SD, MAE, and RMSE metrics; see Table 5.

Table 5 Statistical error analysis for the ANN + GC model.

As shown in Table 5, the overall ARD%, MAE, SD, RMSE, and R² values are 0.032%, 1.5656, 0.0549, 2.227, and 0.9988, respectively. The results of the ANN + GC approach show good agreement with experimental data. Models with R² values close to 1 and low values of ARD%, RMSE, MAE, and SD are considered more accurate in predicting the speed of sound. In this study, the ARD% values for the training and testing phases of the ANN + GC model were 0.024% and 0.053%, respectively, and the overall R² value was 0.9988. These results indicate that the ANN + GC model can accurately correlate the speed of sound in DESs across a wide temperature range. The next section examines the ML + GC model.

ML + GC method

Similar to the ANN + GC method, the inputs for the ML + GC model included V_c, ω, Mw, and T. Additionally, 300 training and 115 testing data points of the speed of sound were used to develop the machine learning approach. The statistical metrics for the CatBoost model are summarized in Table 6, which presents evaluations on the training and testing subsets (300 and 115 data points, respectively), as well as the complete dataset consisting of 415 points. In this study, the predictive performance of the models was assessed using R², ARD%, SD, MAE, and RMSE metrics. Comparing model predictions with experimental data across both training and testing datasets provides valuable insights into the models’ accuracy and generalization capability; see Table 6.

Table 6 Statistical error analysis for the ML + GC model using the CatBoost approach.

The greater the alignment between the predicted value and the experimental data, the higher the accuracy of the predictive model. In Figs. 7 and 8, the error distribution plot of the presented model vs the predicted speed of sound has been depicted. This visual representation demonstrates the robust agreement between the experimental data and the forecasts produced by the CatBoost ML method.

Figures 7 and 8 illustrate a strong correlation between the model-predicted data and the experimental data across both the training and testing datasets, with a very close alignment between the model predictions and the experimental points. In this research, graphical analysis complemented the statistical metrics to provide a more comprehensive evaluation of the models’ accuracy and reliability. The percentage distribution of the relative error against the experimental values is presented in Fig. 7, where relative error values are plotted against experimental output values. The closer the data points are to the zero-error line, the more accurate the model. Data points scattered far from the zero line indicate a significant difference between the predicted values and the experimental data, and therefore a high model error. The proximity of the data points to the zero line for the ML + GC model thus indicates its high accuracy. Figure 8 shows the cross-plot, which compares predicted and experimental values. A closer alignment of data points with the unit slope line (Y = X) signifies higher accuracy of the model. The ML + GC model performs well, with most of the data points lying close to the Y = X line. The next section compares the ANN + GC, ML + GC, and correlation-based models.

Comparison between ANN + GC, ML + GC, and the correlation-based models

The ANN + GC and ML + GC results have been compared to five correlation-based models²⁴,⁷⁰,⁷¹,⁷²,⁷³. Singh and Singh proposed a correlation for speed of sound based on surface tension and density⁷⁰. Hekayati and Esmaeilzadeh suggested a novel interrelationship between surface tension (σ), density (ρ), and speed of sound (u) of ILs⁷¹. Gardas and Coutinho proposed a relationship between surface tension (σ), density (ρ), and speed of sound (u) for imidazolium-based ILs over the temperature range 278.15–343.15 K⁷³. Table 7 reports and compares the ARD% of the five correlation-based models, the ML + GC model, and the ANN + GC model.

Table 7 ARD% values of ANN + GC, ML + GC, and five correlation-based models.

The average ARD% of the Peyrovedin et al.²⁴ model is 5.67%. The ARD% values of Haghbakhsh et al.’s model⁷², Hekayati and Esmaeilzadeh’s model⁷¹, and Gardas and Coutinho’s model⁷³ for the 38 DESs are 9.52%, 9.38%, and 9.45%, respectively. The average ARD% of Singh and Singh’s model⁷⁰, which was developed to correlate the speed of sound of ILs using surface tension and density data, is about 39%. As shown in Table 7, the Peyrovedin et al.²⁴ model gives a lower ARD% than the other correlation-based models. The ANN + GC and ML + GC models give lower errors than all of the correlation-based models, and the ARD% of the ANN + GC model is slightly lower than that of the ML + GC model. Figure 9 compares the speed of sound of some DESs calculated with the ANN + GC approach against experimental data.

As shown in Fig. 9, the ANN + GC model correlates the speed of sound of DESs satisfactorily, with an average ARD% of 0.032%. In Fig. 10, the ANN + GC results are compared with experimental data and with the H. Peyrovedin et al. model.

As depicted in Fig. 10, the ANN + GC approach accurately correlates the speed of sound of four DESs at various temperatures. For DES1, the ARD of the H. Peyrovedin et al. model is higher than that of ANN + GC, although their results are still acceptable. Figure 10 also shows that their correlation is accurate at lower temperatures and that its deviations grow with increasing temperature. As reported in Table 7 and Figs. 9 and 10, the average ARD% values of the ANN + GC testing and training results are acceptable. In Fig. 11, the ANN + GC model is compared with the ML + GC and five correlation-based models.

As shown in Fig. 11, the average ARD% of the ANN + GC approach is lower than that of the correlation-based models, while the average errors of the ANN + GC and ML + GC models are comparable. Figure 12 depicts the error distribution for ten DESs.

Cumulative frequency diagrams are a graphical method for evaluating model performance⁷⁴. Figure 13a and b show the cumulative frequency diagrams of the ANN + GC and ML + GC models and of the five correlations reported in Table 7.

As shown in Fig. 13a, approximately 90% of the values estimated by the ANN + GC model have an ARD% below 0.07%. For the ML + GC model, 90% of the ARD% values are below 0.1%. Figure 13b depicts the cumulative frequency of the five correlation-based models, among which the correlation of Singh and Singh⁷⁰ performs poorly. These results show that the ANN + GC model predicts the speed of sound of DESs with higher precision than the five correlation-based models.
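A cumulative frequency diagram of this kind reports, for each error threshold, the percentage of data points whose relative deviation does not exceed that threshold. The following is a minimal sketch of that calculation; the function name `cumulative_frequency` and the pointwise deviations are hypothetical and are not taken from the study's databank.

```python
import numpy as np

def cumulative_frequency(errors, thresholds):
    """Percentage of points whose error is at or below each threshold."""
    errors = np.asarray(errors, dtype=float)
    return np.array([100.0 * np.mean(errors <= t) for t in thresholds])

# Hypothetical pointwise relative deviations (in percent), for illustration only.
point_ard = np.array([0.01, 0.02, 0.02, 0.03, 0.04, 0.05, 0.05, 0.06, 0.07, 0.20])
thresholds = [0.02, 0.05, 0.07, 0.10]

for t, c in zip(thresholds, cumulative_frequency(point_ard, thresholds)):
    print(f"deviation <= {t:.2f}%: {c:.0f}% of points")
```

Plotting the returned percentages against the thresholds gives the curve; a model whose curve rises faster reaches a given share of the data at a smaller error.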

The leverage approach for model analysis

The leverage approach is a valuable tool for assessing the quality and reliability of statistical models. Identifying and addressing high-leverage points can improve model accuracy, enhance understanding of the data, and lead to more informed decision-making⁷⁵. Leverage values identify observations that have a disproportionate influence on the regression coefficients. Points with both high leverage and large residuals are particularly problematic, as they can significantly distort the model fit. Leverage diagnostics are used during model validation to assess the stability and generalizability of the model: if the model is highly sensitive to a few high-leverage points, it may not perform well on new data. High-leverage points often indicate data errors or unusual events, so identifying them allows a targeted investigation to correct errors or to understand the causes of the unusual observations. They can also indicate the need for additional predictor variables, and in some cases transforming the predictor or response variables reduces their influence and improves the model fit. Investigating such points can also provide insight into the data and the processes that generated it.

In this study, the leverage approach was used to examine the ANN + GC model. To this end, standardized residuals (SR) and leverage values, the latter derived from the diagonal elements of the hat matrix, were calculated. The hat matrix is given by:

$$H = X\left( {X^{t} X} \right)^{ - 1} X^{t}$$

(14)

where $X^{t}$ is the transpose of the matrix X. The critical leverage was calculated as 3(n + 1)/m, where m and n are the number of data points and the number of model input variables, respectively. The applicability domain (AD) of a model is the region in which its predictions are considered reliable; extrapolating beyond the AD can lead to inaccurate or unreliable predictions. The AD of the ANN + GC model can be assessed by plotting standardized residuals against leverage values (the Williams plot), which is the most common and direct way to do this.
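As an illustration of Eq. (14), the leverage of each data point is the corresponding diagonal element of H, and the critical leverage follows from 3(n + 1)/m. The sketch below assumes X is the m × n matrix of model inputs exactly as written in Eq. (14), without an added intercept column; the function names and the random matrix are hypothetical.

```python
import numpy as np

def leverage_values(X):
    """Diagonal of H = X (X^t X)^-1 X^t, computed without forming H."""
    X = np.asarray(X, dtype=float)
    # The pseudo-inverse is used so that a nearly singular X^t X does not fail.
    xtx_inv = np.linalg.pinv(X.T @ X)
    # h_ii = x_i^t (X^t X)^-1 x_i for every row x_i of X.
    return np.einsum("ij,jk,ik->i", X, xtx_inv, X)

def critical_leverage(n_inputs, n_points):
    """Critical leverage 3(n + 1)/m."""
    return 3.0 * (n_inputs + 1) / n_points

# Hypothetical input matrix: 50 points, 4 input variables.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 4))
h = leverage_values(X)
print("max leverage:", h.max())
print("critical leverage:", critical_leverage(n_inputs=4, n_points=50))
```

For a full-rank X, the leverage values lie between 0 and 1 and sum to the number of columns, which is a convenient sanity check on the calculation.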

By plotting standardized residuals against leverage values, the Williams plot identifies observations that:

Are outliers (large standardized residuals)
Have high leverage (unusual predictor values)
Are both outliers and have high leverage (potentially very influential and problematic)

If the majority of data points lie within the boundaries 0 ≤ H ≤ critical leverage and −3 ≤ SR ≤ 3, the model is deemed reliable and its predictions fall within the applicability domain⁷⁵. Figure 14 shows the Williams plot.
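The regions of the Williams plot described above can be expressed as simple boolean masks. The sketch below takes precomputed leverage values and standardized residuals as inputs, so it makes no assumption about how the residuals were standardized; the function name `williams_flags` and the sample values are hypothetical.

```python
import numpy as np

def williams_flags(leverage, std_residuals, h_crit, sr_limit=3.0):
    """Classify points by leverage (H) and standardized residual (SR)."""
    h = np.asarray(leverage, dtype=float)
    sr = np.asarray(std_residuals, dtype=float)
    high_leverage = h > h_crit          # unusual predictor values
    outlier = np.abs(sr) > sr_limit     # large standardized residual
    return {
        "inside_domain": ~high_leverage & ~outlier,
        "outlier_only": outlier & ~high_leverage,
        "high_leverage_only": high_leverage & ~outlier,
        "outlier_and_high_leverage": high_leverage & outlier,
    }

# Hypothetical values, for illustration only.
h = [0.01, 0.02, 0.08, 0.09]
sr = [0.5, -3.4, 1.0, 3.2]
flags = williams_flags(h, sr, h_crit=0.05)
for name, mask in flags.items():
    print(name, int(mask.sum()))
```

Counting the points in each mask gives the quantities usually read off a Williams plot, such as the number of suspected outliers.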

As shown in Fig. 14, the critical leverage is about 0.0545. Most of the data points fall within 0 ≤ H ≤ 0.0545 and −3 ≤ SR ≤ 3, which indicates that the ANN + GC model is highly reliable. Only five data points have an SR value outside the range −3 to 3 (SR > 3 or SR < −3) and are therefore classified as questionable data. All data points have H values below 0.0545, that is, satisfactory leverage. The leverage approach thus confirms the accuracy of the databank and the high reliability of the ANN + GC model in estimating the speed of sound of DESs.

The next section presents a sensitivity analysis (SA) of the input variables of the ANN + GC model.

Sensitivity analysis

Sensitivity analysis in ANNs determines how strongly each input variable influences the network's output. It identifies the most important inputs and shows how changes in those inputs affect the model's predictions. Weight-based methods evaluate the influence of the input variables on the output by analyzing the connection weights within the network. They are generally more straightforward and computationally cheaper than perturbation-based methods. Garson proposed the following equation, based on partitioning of the connection weights, for the sensitivity analysis of input variables⁷⁶:

$$IF_{j} = \frac{\sum_{m = 1}^{N_{h}} \left( \frac{\left| w_{jm}^{ih} \right|}{\sum_{k = 1}^{N_{i}} \left| w_{km}^{ih} \right|} \cdot w_{mn}^{ho} \right)}{\sum_{k = 1}^{N_{i}} \left\{ \sum_{m = 1}^{N_{h}} \left( \frac{\left| w_{km}^{ih} \right|}{\sum_{k = 1}^{N_{i}} \left| w_{km}^{ih} \right|} \cdot w_{mn}^{ho} \right) \right\}}$$

(15)

where IF_j is the relative importance of the j^th input variable for the output variable, and N_i and N_h are the numbers of input and hidden neurons, respectively. The superscripts i, h and o denote the input, hidden and output layers, and the subscripts k, m and n index the input, hidden and output neurons, respectively; w denotes a connection weight. The relative importance of each input variable (IF_j) was calculated with Eq. (15). This approach expands on Garson's method by considering the direct and indirect paths from inputs to outputs: the influence of each input is propagated across the network layers to the final output. Figure 15 depicts the importance of the input variables as normalized percentages.
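The weight-partitioning idea behind Eq. (15) can be sketched for a network with one hidden layer and a single output. The sketch follows the usual statement of Garson's algorithm, in which absolute values are also taken of the hidden-to-output weights; that is an assumption here, since Eq. (15) as printed shows no absolute value on w^{ho}. The function name `garson_importance` and the random weights are hypothetical and are not the trained weights of the ANN + GC model.

```python
import numpy as np

def garson_importance(w_ih, w_ho):
    """Relative importance of each input from connection weights.

    w_ih: input-to-hidden weights, shape (N_i, N_h)
    w_ho: hidden-to-output weights for one output neuron, shape (N_h,)
    """
    w_ih = np.abs(np.asarray(w_ih, dtype=float))
    w_ho = np.abs(np.asarray(w_ho, dtype=float)).ravel()
    # Share of each input in the total weight entering each hidden neuron.
    share = w_ih / w_ih.sum(axis=0, keepdims=True)
    # Weight each share by the hidden neuron's link to the output, then sum
    # over hidden neurons to get one contribution per input.
    contribution = (share * w_ho).sum(axis=1)
    # Normalize so that the importances sum to 1.
    return contribution / contribution.sum()

# Hypothetical weights: 4 inputs, 16 hidden neurons, 1 output.
rng = np.random.default_rng(0)
importance = garson_importance(rng.normal(size=(4, 16)), rng.normal(size=16))
print(np.round(100.0 * importance, 1))
```

Multiplying the result by 100 gives importances as percentages, which is the form used for a bar chart of input importance.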

All selected input variables have a strong influence on the speed of sound, with importance levels ranging from 21 to 29%. Note, however, that highly nonlinear models or coupled input variables can complicate sensitivity analysis. Such results highlight which inputs have the greatest impact on the output, which aids model refinement and feature selection and provides insight into the underlying process. In Fig. 15, the contributions are normalized to sum to 1 (or 100%) to ease interpretation.

In summary, ANN methods have several key advantages and disadvantages⁷⁷. They can model complex nonlinear relationships by selecting an appropriate architecture through trial and error. Once the input layer, the number of neurons, and hidden layers are established, ANNs can predict values beyond those considered during training without reprogramming. However, acquiring large datasets is often challenging and time-consuming. Additionally, the complexity of ANNs can make their implementation difficult. Another drawback is that ANNs require robust central processing units (CPUs) or hardware, which can be resource-intensive. This study demonstrates the strong performance of ANN models in predicting second-order derivative thermodynamic properties, such as the speed of sound in DESs, despite the aforementioned limitations. Traditionally, equations of state (EoS) models have been widely used to estimate the thermo-physical properties of complex systems like ILs and DESs. However, predicting the speed of sound using EoS-based models requires the ideal gas heat capacity of the pure components. Estimating this property using GC models often results in significant deviations in some cases. Consequently, researchers are seeking alternative approaches to predict the speed of sound and specific heat capacity without relying on ideal gas heat capacity estimations. This work shows that the ANN + GC method can be considered a robust and efficient alternative, particularly for predicting second-order derivative thermodynamic properties, such as the speed of sound.

Conclusions

In this work, the Group Contribution (GC) approach was employed to estimate the input variables for the ANN model. Critical properties and the acentric factor were determined based on the molecular structure of DESs, utilizing the Lydersen and Joback–Reid GC methods. The combined ANN + GC model was developed to predict the speed of sound in DESs across a wide temperature range. For model development, 415 data points from 38 DESs were selected. The results indicate that a single hidden layer with sixteen neurons provides optimal values for ARD% and R². The model’s performance was evaluated using several metrics: average relative deviation percent (ARD%), standard deviation (SD), mean absolute error (MAE), root mean square deviation (RMSE), and R². The findings demonstrate that the ANN + GC model can accurately estimate the speed of sound in DESs over a wide temperature range. The obtained values for the metrics were ARD% = 0.032%, SD = 0.0549, MAE = 1.5656, RMSE = 2.227, and R² = 0.9988. The model results were compared to five correlation-based models and a Machine Learning (ML) model. Similar to the ANN + GC approach, the GC method was combined with the ML model. The results indicated that the ML + GC and ANN + GC models were comparable; however, the ANN + GC model showed slightly lower error values. Overall, the error metrics for the ANN + GC approach were lower than those of the five correlation-based models. The cumulative frequency diagrams and the leverage approach were implemented to validate the quality and reliability of the proposed model. The leverage analysis confirmed the accuracy of the data used and the high reliability of the ANN + GC model for estimating the speed of sound in DESs. The primary goal of this study was to predict the speed of sound in DESs based solely on their molecular structure, without relying on any experimental data. 
This work demonstrates that the ANN + GC method can serve as a robust model for predicting second-order derivative thermodynamic properties of DESs in the absence of experimental measurements. The proposed model could be employed in future studies, particularly in the pre-design of new solvents.

Data availability

All data generated or analysed during this study are included in this published article [and its supplementary information files].

Abbreviations

ARD (%): Average relative deviation
$b_{j}$: Bias
IF_j: Relative importance of the j^th input variable on the output variable
i: Input layer (superscript)
h: Hidden layer (superscript)
k: Input layer (subscript)
m: Hidden layer (subscript)
MAE: Mean absolute error
$n_{j}$: Output of the j^th neuron
N_atoms: Total number of atoms
N_i: Number of input neurons
N_h: Number of hidden neurons
n: Output layer (subscript)
o: Output layer (superscript)
$p_{i}$: Output
P_c: Critical pressure
RMSE: Root mean square deviation
R²: Coefficient of determination
SD: Standard deviation
T_b: Normal boiling temperature
T_c: Critical temperature
V_c: Critical volume
$w_{ij}$: Weight from the i^th neuron in the previous layer to the j^th neuron
w: Connection weights
ω: Acentric factor

References

Alhadid, A., Mokrushina, L. & Minceva, M. Design of deep eutectic systems: A simple approach for preselecting eutectic mixture constituents. Molecules 25, 1077 (2020).
Lomba, L. et al. Deep eutectic solvents: Are they safe?. Appl. Sci. 11, 10061 (2021).
Emami, S. & Shayanfar, A. Deep eutectic solvents for pharmaceutical formulation and drug delivery applications. Pharm. Dev. Technol. 25, 779–796. https://doi.org/10.1080/10837450.2020.1735414 (2020).
Chen, Y. et al. Capture of toxic gases by deep eutectic solvents. ACS Sustain. Chem. Eng. 8, 5410–5430 (2020).
Maleki, B., Reza, T., Ali, K. & Ahmadpoor, F. Facile protocol for the synthesis of 2-Amino-4H-chromene derivatives using choline chloride/urea. Org. Prep. Proced. Int. 53, 34–41. https://doi.org/10.1080/00304948.2020.1833623 (2020).
Esfahani, H. S., Khoshsima, A., Pazuki, G. & Hosseini, A. Separation of methanol and ethanol from azeotropic MTBE mixtures using choline chloride-based deep eutectic solvents. J. Mol. Liq. 381, 121641. https://doi.org/10.1016/j.molliq.2023.121641 (2023).
Esfahani, H. S., Khoshsima, A. & Pazuki, G. Choline chloride-based deep eutectic solvents as green extractant for the efficient extraction of 1-butanol or 2-butanol from azeotropic n-heptane + butanol mixtures. J. Mol. Liq. 313, 113524. https://doi.org/10.1016/j.molliq.2020.113524 (2020).
Florindo, C., Monteiro, N. V., Ribeiro, B. D., Branco, L. C. & Marrucho, I. M. Hydrophobic deep eutectic solvents for purification of water contaminated with Bisphenol-A. J. Mol. Liq. 297, 111841. https://doi.org/10.1016/j.molliq.2019.111841 (2020).
Atilhan, M. & Aparicio, S. A review on the thermal conductivity of deep eutectic solvents. J. Therm. Anal. Calorim. 148, 8765–8776. https://doi.org/10.1007/s10973-023-12280-4 (2023).
Bowen, H. et al. Application of deep eutectic solvents in protein extraction and purification. Front. Chem. https://doi.org/10.3389/fchem.2022.912411 (2022).
Yu, Y. et al. Green recycling of end-of-life photovoltaic modules via deep-eutectic solvents. Chem. Eng. J. 499, 155933. https://doi.org/10.1016/j.cej.2024.155933 (2024).
Jahanbakhsh-Bonab, P., Khoshnazar, Z., Sardroodi, J. J. & Heidaryan, E. A computational probe into the physicochemical properties of cyclodextrin-based deep eutectic solvents for extraction processes. Carbohydr. Polym. Technol. Appl. 8, 100596. https://doi.org/10.1016/j.carpta.2024.100596 (2024).
Jahanbakhsh-Bonab, P., Pazuki, G., Sardroodi, J. J. & Dehnavi, S. M. Assessment of the properties of natural-based chiral deep eutectic solvents for chiral drug separation: Insights from molecular dynamics simulation. Phys. Chem. Chem. Phys. 25, 17547–17557. https://doi.org/10.1039/D3CP00875D (2023).
Jahanbakhsh-Bonab, P., Esrafili, M. D., Rastkar Ebrahimzadeh, A. & Jahanbin Sardroodi, J. Exploring the structural and transport properties of glyceline DES-Based boron nitride nanotube Nanofluid: The effects of nanotube diameter. J. Mol. Liquids 341, 117277. https://doi.org/10.1016/j.molliq.2021.117277 (2021).
Jahanbakhsh-Bonab, P., Sardroodi, J. J. & Avestan, M. S. Electric field effects on the structural and dynamical properties of a glyceline deep eutectic solvent. J. Chem. Eng. Data 67, 2077–2087. https://doi.org/10.1021/acs.jced.2c00066 (2022).
Jahanbakhsh-Bonab, P., Esrafili, M. D., Rastkar Ebrahimzadeh, A. & Jahanbin Sardroodi, J. Are choline chloride-based deep eutectic solvents better than methyl diethanolamine solvents for natural gas Sweetening? Theoretical insights from molecular dynamics simulations. J. Mol. Liquids 338, 116716. https://doi.org/10.1016/j.molliq.2021.116716 (2021).
Jahanbakhsh-Bonab, P., Jahanbin Sardroodi, J. & Sadegh Avestan, M. The pressure effects on the amine-based DES performance in NG sweetening: Insights from molecular dynamics simulation. Fuel 323, 124249. https://doi.org/10.1016/j.fuel.2022.124249 (2022).
Jahanbakhsh-Bonab, P. & Sardroodi, J. J. Potential of amine-based DES for separation of CO₂ and H₂S from NG: Study of temperature effect. J. Environ. Chem. Eng. 11, 110517. https://doi.org/10.1016/j.jece.2023.110517 (2023).
Jahanbakhsh-Bonab, P., Jahanbin Sardroodi, J. & Heidaryan, E. Understanding the performance of amine-based DESs for acidic gases capture from biogas. Renewable Energy 223, 120069. https://doi.org/10.1016/j.renene.2024.120069 (2024).
Taherzadeh, M., Haghbakhsh, R., Duarte, A. R. C. & Raeissi, S. Estimation of the heat capacities of deep eutectic solvents. J. Mol. Liq. 307, 112940. https://doi.org/10.1016/j.molliq.2020.112940 (2020).
Leron, R. B. & Li, M.-H. Molar heat capacities of choline chloride-based deep eutectic solvents and their binary mixtures with water. Thermochim. Acta 530, 52–57. https://doi.org/10.1016/j.tca.2011.11.036 (2012).
Naser, J., Mjalli, F. S. & Gano, Z. S. Molar heat capacity of selected type III deep eutectic solvents. J. Chem. Eng. Data 61, 1608–1615. https://doi.org/10.1021/acs.jced.5b00989 (2016).
Zhang, K., Li, H., Ren, S., Wu, W. & Bao, Y. Specific heat capacities of two functional ionic liquids and two functional deep eutectic solvents for the absorption of SO₂. J. Chem. Eng. Data 62, 2708–2712. https://doi.org/10.1021/acs.jced.7b00102 (2017).
Peyrovedin, H., Haghbakhsh, R., Duarte, A. R. C. & Raeissi, S. A global model for the estimation of speeds of sound in deep eutectic solvents. Molecules 25, 1626 (2020).
Lapeña, D., Lomba, L., Artal, M., Lafuente, C. & Giner, B. Thermophysical characterization of the deep eutectic solvent choline chloride: Ethylene glycol and one of its mixtures with water. Fluid Phase Equil. 492, 1–9. https://doi.org/10.1016/j.fluid.2019.03.018 (2019).
Peng, D. & Minceva, M. Predicting the density and viscosity of deep eutectic solvents at atmospheric and elevated pressures. Fluid Phase Equilib. 582, 114086. https://doi.org/10.1016/j.fluid.2024.114086 (2024).
Lashkarbolooki, M., Hezave, A. Z. & Ayatollahi, S. Artificial neural network as an applicable tool to predict the binary heat capacity of mixtures containing ionic liquids. Fluid Phase Equilib. 324, 102–107. https://doi.org/10.1016/j.fluid.2012.03.015 (2012).
Valavi, M., Dehghani, M. R. & Shahriari, R. Application of modified PHSC model in prediction of phase behavior of single and mixed electrolyte solutions. Fluid Phase Equilib. 344, 92–100. https://doi.org/10.1016/j.fluid.2013.01.007 (2013).
Sun, R. & Dubessy, J. Prediction of vapor–liquid equilibrium and PVTx properties of geological fluid system with SAFT-LJ EOS including multi-polar contribution. Part I: Application to H₂O–CO₂ system. Geochim. Cosmochim. Acta 74, 1982–1998. https://doi.org/10.1016/j.gca.2010.01.011 (2010).
Shahriari, R., Dehghani, M. R. & Behzadi, B. A modified polar PHSC model for thermodynamic modeling of gas solubility in ionic liquids. Fluid Phase Equilib. 313, 60–72. https://doi.org/10.1016/j.fluid.2011.09.029 (2012).
Shahriari, R., Dehghani, M. R. & Behzadi, B. Thermodynamic modeling of aqueous ionic liquid solutions using PC-SAFT equation of state. Ind. Eng. Chem. Res. 51, 10274–10282. https://doi.org/10.1021/ie3012984 (2012).
Shahriari, R. & Dehghani, M. R. New electrolyte SAFT-VR Morse EOS for prediction of solid-liquid equilibrium in aqueous electrolyte solutions. Fluid Phase Equilib. 463, 128–141. https://doi.org/10.1016/j.fluid.2018.02.006 (2018).
Ramdan, D. et al. Prediction of CO₂ solubility in electrolyte solutions using the e-PHSC equation of state. J. Supercrit. Fluids https://doi.org/10.1016/j.supflu.2021.105454 (2021).
Khoshsima, A. & Shahriari, R. Molecular modeling of systems related to the biodiesel production using the PHSC equation of state. Fluid Phase Equilib. 458, 58–83. https://doi.org/10.1016/j.fluid.2017.10.029 (2018).
Khoshsima, A. & Dehghani, M. R. Vapor–liquid and liquid–liquid equilibrium calculations in mixtures containing non-ionic glycol ether surfactant using PHSC equation of state. Fluid Phase Equilib. 377, 16–26. https://doi.org/10.1016/j.fluid.2014.05.041 (2014).
Coimbra, P., Duarte, C. M. M. & de Sousa, H. C. Cubic equation-of-state correlation of the solubility of some anti-inflammatory drugs in supercritical carbon dioxide. Fluid Phase Equilib. 239, 188–199. https://doi.org/10.1016/j.fluid.2005.11.028 (2006).
Chapman, W. G., Gubbins, K. E., Jackson, G. & Radosz, M. New reference equation of state for associating liquids. Ind. Eng. Chem. Res. 29, 1709–1721. https://doi.org/10.1021/ie00104a021 (1990).
Chapman, W. G., Gubbins, K. E., Jackson, G. & Radosz, M. New reference equation of state for associating liquids. Ind. Eng. Chem. Res. 29, 1709–1721 (1990).
Bülow, M., Ascani, M. & Held, C. ePC-SAFT advanced—Part I: Physical meaning of including a concentration-dependent dielectric constant in the born term and in the Debye-Hückel theory. Fluid Phase Equilib. 535, 112967. https://doi.org/10.1016/j.fluid.2021.112967 (2021).
Joback, K. G. & Reid, R. C. Estimation of pure-component properties from group-contributions. Chem. Eng. Commun. 57, 233–243. https://doi.org/10.1080/00986448708960487 (1987).
Liang, P. & Bose, N. Neural Network Fundamentals with Graphs, Algorithms, and Applications (McGraw-Hill, 1996).
Taskinen, J. & Yliruusi, J. Prediction of physicochemical properties based on neural network modelling. Adv. Drug Deliv. Rev. 55, 1163–1183. https://doi.org/10.1016/S0169-409X(03)00117-0 (2003).
Ketabchi, S., Ghanadzadeh, H., Ghanadzadeh, A., Fallahi, S. & Ganji, M. Estimation of VLE of binary systems (tert-butanol+ 2-ethyl-1-hexanol) and (n-butanol+ 2-ethyl-1-hexanol) using GMDH-type neural network. J. Chem. Thermodyn. 42, 1352–1355 (2010).
Vrigazova, B. The proportion for splitting data into training and test set for the bootstrap in classification problems. Bus. Syst. Res. J. 12, 228–242 (2021).
Valderrama, J. O., Sanga, W. W. & Lazzús, J. A. Critical properties, normal boiling temperature, and acentric factor of another 200 ionic liquids. Ind. Eng. Chem. Res. 47, 1318–1330. https://doi.org/10.1021/ie071055d (2008).
Valderrama, J. O. & Robles, P. A. Critical properties, normal boiling temperatures, and acentric factors of fifty ionic liquids. Ind. Eng. Chem. Res. 46, 1338–1344. https://doi.org/10.1021/ie0603058 (2007).
Zhu, J., Xu, Y., Feng, X. & Zhu, X. A detailed study of physicochemical properties and microstructure of EmimCl-EG deep eutectic solvents: Their influence on SO2 absorption behavior. J. Ind. Eng. Chem. 67, 148–155. https://doi.org/10.1016/j.jiec.2018.06.025 (2018).
Basaiahgari, A., Panda, S. & Gardas, R. L. Effect of ethylene, diethylene, and triethylene glycols and glycerol on the physicochemical properties and phase behavior of benzyltrimethyl and benzyltributylammonium chloride based deep eutectic solvents at 283.15–343.15 K. J. Chem. Eng. Data 63, 2613–2627. https://doi.org/10.1021/acs.jced.8b00213 (2018).
Basaiahgari, A., Panda, S. & Gardas, R. L. Acoustic, volumetric, transport, optical and rheological properties of Benzyltripropylammonium based deep eutectic solvents. Fluid Phase Equilib. 448, 41–49. https://doi.org/10.1016/j.fluid.2017.03.011 (2017).
Sánchez, P. B., González, B., Salgado, J., José Parajó, J. & Domínguez, Á. Physical properties of seven deep eutectic solvents based on l-proline or betaine. J. Chem. Thermodyn. 131, 517–523. https://doi.org/10.1016/j.jct.2018.12.017 (2019).
Abdel Jabbar, N. M. & Mjalli, F. S. Ultrasonic study of binary aqueous mixtures of three common eutectic solvents. Phys. Chem. Liquids 57, 1–18. https://doi.org/10.1080/00319104.2017.1385075 (2019).
Vuksanović, J., Kijevčanin, M. L. & Radović, I. R. Effect of water addition on extraction ability of eutectic solvent choline chloride+ 1,2-propanediol for separation of hexane/heptane+ethanol systems. Korean J. Chem. Eng. 35, 1477–1487. https://doi.org/10.1007/s11814-018-0030-z (2018).
Sas, O. G., Fidalgo, R., Domínguez, I., Macedo, E. A. & González, B. Physical properties of the pure deep eutectic solvent, [ChCl]:[Lev] (1:2) DES, and its binary mixtures with alcohols. J. Chem. Eng. Data 61, 4191–4202. https://doi.org/10.1021/acs.jced.6b00563 (2016).
Kuddushi, M., Nangala, G. S., Rajput, S., Ijardar, S. P. & Malek, N. I. Understanding the peculiar effect of water on the physicochemical properties of choline chloride based deep eutectic solvents theoretically and experimentally. J. Mol. Liquids 278, 607–615. https://doi.org/10.1016/j.molliq.2019.01.053 (2019).
Shekaari, H., Zafarani-Moattar, M. T., Mokhtarpour, M. & Faraji, S. Volumetric and compressibility properties for aqueous solutions of choline chloride based deep eutectic solvents and Prigogine–Flory–Patterson theory to correlate of excess molar volumes at T = (293.15 to 308.15) K. J. Mol. Liquids 289, 111077. https://doi.org/10.1016/j.molliq.2019.111077 (2019).
Sas, O. G., Castro, M., Domínguez, Á. & González, B. Removing phenolic pollutants using deep eutectic solvents. Separ. Purif. Technol. 227, 115703. https://doi.org/10.1016/j.seppur.2019.115703 (2019).
Abri, A., Babajani, N., Zonouz, A. M. & Shekaari, H. Spectral and thermophysical properties of some novel deep eutectic solvent based on l-menthol and their mixtures with ethanol. J. Mol. Liquids 285, 477–487. https://doi.org/10.1016/j.molliq.2019.04.001 (2019).
Dabiri, M.-S., Hadavimoghaddam, F., Ashoorian, S., Schaffie, M. & Hemmati-Sarapardeh, A. Modeling liquid rate through wellhead chokes using machine learning techniques. Sci. Rep. 14, 6945. https://doi.org/10.1038/s41598-024-54010-2 (2024).
Prokhorenkova, L., Gusev, G., Vorobev, A., Dorogush, A. V. & Gulin, A. CatBoost: unbiased boosting with categorical features. Advances in Neural Information Processing Systems 31 (2018).
Bandpey, A. F., Abdi, J. & Firozjaee, T. T. Improved estimation of dark fermentation biohydrogen production utilizing a robust categorical boosting machine-learning algorithm. Int. J. Hydrogen Energy 52, 190–199 (2024).
Sheikhshoaei, A. H., Khoshsima, A. & Zabihzadeh, D. Predicting the heat capacity of strontium-praseodymium oxysilicate SrPr4(SiO4)3O using machine learning, deep learning, and hybrid models. Chem. Thermodyn. Therm. Anal. 17, 100154. https://doi.org/10.1016/j.ctta.2024.100154 (2025).
Meng, Q. et al. A communication-efficient parallel algorithm for decision tree. Advances in Neural Information Processing Systems 29 (2016).
Abiodun, O. I. et al. State-of-the-art in artificial neural network applications: A survey. Heliyon 4 (2018).
Zou, J., Han, Y. & So, S.-S. Overview of artificial neural networks. Artificial neural networks: methods and applications, 14–22 (2009).
Zhang, Z. & Zhang, Z. Artificial neural network. Multivariate time series analysis in climate and environmental research, 1–35 (2018).
Wu, Y.-C. & Feng, J.-W. Development and application of artificial neural network. Wirel. Pers. Commun. 102, 1645–1656 (2018).
Asadollahfardi, G. in Water Quality Management: Assessment and Interpretation 77–91 (Springer, 2014).
Marquardt, D. W. An algorithm for least-squares estimation of nonlinear parameters. J. Soc. Ind. Appl. Math. 11, 431–441. https://doi.org/10.1137/0111030 (1963).
Levenberg, K. A method for the solution of certain non-linear problems in least squares. Q. Appl. Math. 2, 164–168 (1944).
Singh, M. P. & Singh, R. K. Correlation between ultrasonic velocity, surface tension, density and viscosity of ionic liquids. Fluid Phase Equilib. 304, 1–6. https://doi.org/10.1016/j.fluid.2011.01.029 (2011).
Hekayati, J. & Esmaeilzadeh, F. Predictive correlation between surface tension, density, and speed of sound of ionic liquids: Auerbach model revisited. J. Mol. Liquids 274, 193–203. https://doi.org/10.1016/j.molliq.2018.10.099 (2019).
Haghbakhsh, R., Keshtkari, S. & Raeissi, S. Simple estimations of the speed of sound in ionic liquids, with and without any physical property data available. Fluid Phase Equilib. 503, 112291. https://doi.org/10.1016/j.fluid.2019.112291 (2020).
Gardas, R. L. & Coutinho, J. A. P. Estimation of speed of sound of ionic liquids using surface tensions and densities: A volume based approach. Fluid Phase Equilib. 267, 188–192. https://doi.org/10.1016/j.fluid.2008.03.008 (2008).
Dabiri, M.-S. et al. Artificial Intelligence approaches to modeling equivalent circulating density for improved drilling mud management. ACS Omega https://doi.org/10.1021/acsomega.5c02050 (2025).
Dabiri, M.-S. et al. Modeling drilling fluid density at high-pressure high-temperature conditions using advanced machine-learning techniques. Geoenergy Sci. Eng. 244, 213369. https://doi.org/10.1016/j.geoen.2024.213369 (2025).
Garson, G. D. Interpreting neural-network connection weights. AI Expert 6, 46–51 (Miller Freeman, 1991).
Valderrama, J. O., Faúndez, C. A. & Vicencio, V. J. Artificial neural networks and the melting temperature of ionic liquids. Ind. Eng. Chem. Res. 53, 10504–10511. https://doi.org/10.1021/ie5010459 (2014).


Author information

Authors and Affiliations

Department of Pharmacy, Al-Zahrawi University College, Karbala, Iraq
Ayat Hussein Adhab
College of MLT, Ahl Al Bayt University, Karbala, Iraq
Morug Salih Mahdi
Department of Computer Engineering, Faculty of Engineering and Technology, Marwadi University Research Center, Marwadi University, Rajkot, Gujarat, 360003, India
Hardik Doshi
Department of Computer Engineering and Application, GLA University, Mathura, 281406, India
Anupam Yadav
Department of Data Analytics and Mathematical Sciences, School of Sciences, JAIN (Deemed to be University), Bangalore, Karnataka, India
R. Manjunatha
Department of Computer Application, Chandigarh Engineering College, Chandigarh Group of Colleges-Jhanjeri, Mohali, Punjab, 140307, India
Sushil Kumar
Centre for Research Impact and Outcome, Chitkara University Institute of Engineering and Technology, Chitkara University, Rajpura, Punjab, 140401, India
Debasish Shit
Chitkara Centre for Research and Development, Chitkara University, Baddi, Himachal Pradesh, 174103, India
Gargi Sangwan
Gilgamesh Ahliya University, Baghdad, Iraq
Aseel Salah Mansoor
Collage of Pharmacy, National University of Science and Technology, Dhi Qar, 64001, Iraq
Usama Kadem Radi
Medical Technical College, Al-Farahidi University, Baghdad, Iraq
Nasr Saadoun Abd

Contributions

Ayat Hussein Adhab: Writing – original draft and review, Validation, Investigation, Methodology. Morug Salih Mahdi: Writing – original draft and review, Validation, Investigation, Methodology. Hardik Doshi: Supervision, Administration, Methodology, Conceptualization. Anupam Yadav: Software (programming, software development; designing computer programs; implementation of the computer code and supporting algorithms). R. Manjunatha: Data curation, Investigation (data collection), Writing – review, Validation. Sushil Kumar: Data curation, Investigation (data collection), Writing – review, Validation. Debasish Shit: Data curation, Investigation (data collection), Writing – review, Validation. Gargi Sangwan: Writing – original draft and review, Data collection. Aseel Salah Mansoor: Writing – original draft and review, Data collection. Usama Kadem Radi: Writing – original draft and review, Data collection. Nasr Saadoun Abd: Computational and formal techniques to analyze or synthesize study data.

Corresponding author

Correspondence to Hardik Doshi.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Supplementary Information (DOCX).

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Adhab, A.H., Mahdi, M.S., Doshi, H. et al. Prediction of speed of sound of deep eutectic solvents using artificial neural network coupled with group contribution approach. Sci Rep 15, 29238 (2025). https://doi.org/10.1038/s41598-025-14094-w

Received: 02 December 2024
Accepted: 29 July 2025
Published: 10 August 2025
Version of record: 10 August 2025
DOI: https://doi.org/10.1038/s41598-025-14094-w