Abstract
In this work, an Artificial Neural Network (ANN) and the Perturbed Hard Sphere Chain (PHSC) equation of state (EoS) have been used to estimate the osmotic coefficient, activity coefficient, and water activity of aqueous sugar solutions containing glucose, fructose, fucose, xylose, maltose, mannitol, mannose, sorbitol, xylitol, galactose, lactose, ribose, arabinose, and sucrose. The PHSC model parameters were adjusted using experimental osmotic coefficient data; the water activity and sugar activity coefficient were then predicted. In the ANN approach, six variables were used as the input layer: the critical temperature (Tc), critical volume (Vc), melting temperature (Tm), and melting enthalpy (∆Hm) of the sugar, together with the molality and temperature of the solution. One hidden layer with 32 neurons was used. The Group Contribution (GC) method was used to estimate the critical properties of the sugars. The training correlation coefficient (R²) and the Mean Square Error (MSE) were 0.999 and 2.06 × 10⁻⁶, respectively. The average relative deviations (ARD) of the osmotic coefficient, water activity, and sugar activity coefficient were 0.43%, 0.12%, and 0.66% for the PHSC EoS and 2.1%, 0.89%, and 1.65% for the ANN + GC model. The performance of the models was further evaluated by predicting sugar solubilities in water. The results show that the ANN + GC and PHSC models predict the solubility data satisfactorily. The ANN + GC method can be used to predict the thermodynamic properties of a new aqueous sugar solution from the molecular structure in the absence of experimental data.
Introduction
Sugars are composed of carbon, hydrogen, and oxygen and are found in a wide variety of foods. Sugars are known as a type of carbohydrate. Phase equilibrium calculations of sugars involve understanding the solubility of sugars in liquid solvents under varying conditions of temperature and pressure. This is crucial in fields like food science and pharmaceuticals where the stability and solubility of sugar are important. For example, understanding phase equilibrium helps in designing processes for candy making and sugar refining, ensures the stability and effectiveness of sugar-containing medications and supplements, and helps in formulating new products with desired textural and stability characteristics. In this regard, the estimation of thermodynamic properties of sugar-containing systems plays a crucial role in the phase equilibrium calculations. Estimating the thermodynamic properties of sugar-containing systems involves understanding various characteristics and behaviors of these systems, which can be complex due to interactions between solutes (sugars), solvents (usually water), and other potential components. Higher concentrations of sugar can lead to non-ideal behavior of the system. As well, interactions with acids, bases, or other solutes can alter the properties significantly. Estimating the thermodynamic properties of sugar-containing systems requires a combination of experimental data and theoretical modeling. Each specific system may present unique challenges, so a tailored approach that considers the particular sugars involved, their concentrations, and environmental conditions is crucial. For precise calculations or predictions, computational methods or simulation tools, alongside experimental validation, can provide comprehensive insights. 
In this work, the Artificial Neural Network (ANN) 1 approach and the Perturbed Hard Sphere Chain (PHSC) Equation of State (EoS) 2,3,4,5,6,7 have been used to estimate the osmotic coefficient and activity coefficient of aqueous sugar solutions over a wide range of concentrations. Thermodynamic models such as EoS and activity coefficient-based models are widely used for thermodynamic modeling of sugar-containing systems 8,9,10,11,12,13. Held et al. used the Perturbed-Chain Statistical Associating Fluid Theory (PC-SAFT) EoS to correlate the osmotic coefficient of sugars in aqueous solutions 13. The PC-SAFT model parameters were obtained using experimental osmotic coefficient data. They predicted the solubility of sugars in water and alcohols up to high sugar concentrations. The UNIQUAC Functional-group Activity Coefficients (UNIFAC) model was used to estimate the solubility of sugars in aqueous and non-aqueous solutions 8,12. The osmotic coefficient (or water activity) is an important thermodynamic property in the food industry because many enzymatic reactions that influence food stability depend on the availability of water 14. Ninni et al. estimated the water activity of various aqueous sugar solutions using the UNIFAC model 14. Feng et al. predicted the water activity of aqueous sugar solutions using the Statistical Associating Fluid Theory (SAFT) EoS 15. They estimated the pure-component model parameters using the critical properties of sugars obtained by the Joback Group Contribution (GC) method 15. In this work, the PHSC EoS is used to estimate the activity coefficient and water activity of aqueous sugar solutions. The osmotic coefficient (or activity coefficient) data of sugars in water have been used to optimize the PHSC EoS model parameters. The sugar activity coefficient and the solubility of sugars in water have been predicted without using any additional adjustable parameters.
The results of the PHSC model have been compared to the ANN + GC model to evaluate the ANN performance. The ANNs have emerged as powerful tools for thermodynamic modeling due to their ability to capture complex, non-linear relationships in data. They are increasingly used to model the thermodynamic properties and behavior of systems where traditional methods may be inadequate or cumbersome. ANNs can model highly non-linear relationships between input and output variables, which is common in thermodynamic systems. As well, they rely on empirical data, making them useful when theoretical models are difficult to formulate or are too complex.
The phase behavior of mixtures, such as predicting vapor–liquid equilibrium (VLE) and liquid–liquid equilibrium (LLE), reaction rates and mechanisms in chemical processes, critical properties, enthalpies, entropies, and other thermodynamic properties for complex systems can be predicted using the ANNs. Therefore, ANN in thermodynamic modeling is a promising approach that offers flexibility and powerful predictive capabilities, especially when dealing with complex, multi-component systems. In this work, the ANN + GC approach has been used to estimate the osmotic coefficient, activity coefficient, and water activity of aqueous sugar solutions. The activity coefficient is a crucial parameter in determining how real solutions deviate from ideal behavior. In sugar solutions, these coefficients are influenced by factors such as sugar type, concentration, temperature, and intermolecular interactions. After training, ANNs can provide rapid predictions of activity coefficient (or water activity), significantly speeding up the modeling process compared to classical approaches.
In summary, this work is divided into four sections: first, the sugar’s critical temperature, pressure, and volume have been estimated using the Joback GC method, and the results of the ANN approach have been reported. Then, the PHSC EoS model parameters have been adjusted using experimental osmotic coefficient data. Then, the PHSC model results have been compared to the ANN + GC approach. Finally, the solubility of sugars in water has been predicted using the ANN + GC and PHSC models.
Models descriptions
The PHSC EoS
The PHSC EoS is a theoretical model used to describe the thermodynamic properties of fluids, particularly those composed of chain-like molecules. It combines the hard-sphere model with perturbation theory to account for the interactions between segments of the chain molecules. In the PHSC model, molecules are treated as chains of connected hard spheres (segments), which accounts for the shape and flexibility of real molecules 16,17. The perturbation theory adds corrections to the hard-sphere model to account for attractive forces between molecules and allows more accurate predictions of thermodynamic properties. The PHSC EoS is primarily an extension of the hard-sphere model, one of the simplest models used to describe dense fluids, which assumes that molecules behave as hard, impenetrable spheres that do not interact except through collisions. While the hard-sphere model can predict certain behaviors, it does not capture the complexities of real systems, such as polymer solutions or systems with long-range interactions 2. The PHSC EoS is widely used in industrial and academic fields where accurate thermodynamic predictions of polymeric systems, colloidal suspensions, or soft matter systems are required 3,4,5,18,19. The model assumes that all particles are spherical, which may not apply to systems with elongated or irregularly shaped particles. The perturbation terms are only valid in certain regions (e.g., dilute to moderately dense systems). For very dense systems, the assumptions behind the perturbation may break down, leading to inaccurate predictions.
By modeling chain-like molecules and introducing perturbation terms, this EoS can describe complex behaviors in a variety of physical systems, including polymer solutions, colloidal dispersions, and soft matter systems. However, its accuracy and applicability depend on the assumptions of the model, such as spherical symmetry and the validity of perturbation theory. The PHSC EoS is typically expressed in terms of the compressibility factor as follows:
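The displayed form of Eq. (1) is not reproduced in this text. From the four terms named in the sentence that follows, it is presumably the sum of the three contributions (a reconstruction, not a quotation of the original typeset equation):

\[
Z = Z^{ref} + Z^{pert} + Z^{assoc} \qquad (1)
\]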
where Z, Zref, Zpert, and Zassoc refer to the total compressibility factor, reference, perturbation, and association, respectively. The parameters, a(T) and b(T) were defined as follows 2:
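In the original PHSC formulation, these two parameters are commonly written as shown below. This is a reconstruction from the symbol definitions in the next sentence. The reduced-temperature argument of \(F_a\) and \(F_b\) is assumed here to be \(k_B T/\varepsilon\), without the chain-length scaling used in some later PHSC variants, and should be checked against reference 2:

\[
a(T) = \frac{2\pi}{3}\,\sigma^{3}\,\varepsilon\,F_a\!\left(\frac{k_B T}{\varepsilon}\right) \qquad (2)
\]

\[
b(T) = \frac{2\pi}{3}\,\sigma^{3}\,F_b\!\left(\frac{k_B T}{\varepsilon}\right) \qquad (3)
\]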
where σ and ε refer to segment diameter and segment energy in the PHSC EoS. In Eqs. (2) and (3), T and kB refer to temperature and Boltzmann constant. Fa and Fb are temperature-dependent parameters; refer to original PHSC paper 2. The Boublik–Mansoori–Carnahan–Starling equation is used for the radial distribution function (ɡhs) of hard sphere mixtures 20:
where η is reduced density and ζ refers to packing fraction. \({Z}^{assoc}\) is defined based on the Statistical Associating Fluid Theory (SAFT) model introduced by Chapman et al. 21:
where \(X^{{A_{i} }}\), \({\Delta }^{{A_{i} B_{j} }}\), \(\varepsilon^{{A_{i} B_{j} }}\), and \(\kappa^{{A_{i} B_{j} }}\) refer to the unbonded association fraction, association strength, association energy, and association volume, respectively. This interaction often involves hydrogen bonding or dipole–dipole interactions, where molecules can form associations that impact the overall properties of the fluid. Association interactions in SAFT provide a powerful way to predict the behavior of complex fluids. They highlight the importance of molecular interactions in determining physical properties and phase behavior. Understanding this interaction is critical in various fields, including material science (such as designing polymers with specific properties), biophysics (such as protein interactions), and chemical engineering (such as emulsions and dispersions).
Using Eqs. (1)–(7), the water activity \(a_w\) and the activity coefficient of component i are calculated by:
where \({\varphi }_{w}\) is the fugacity coefficient of water in the mixture, and \({\varphi }_{0w}\) stands for the fugacity coefficient of pure water. The osmotic coefficient is as follows:
where \(M_w\) is the molar mass of water, \(m_i\) is the molality of solute i, and \({\nu }_{i}\) represents the number of species per solute. The symmetric activity coefficient of component i is defined as follows:
The fugacity coefficient of component i is calculated using the PHSC EoS. For more details refer to 3,4,5,6,7,18,19.
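The relations described in this section can be illustrated with a short sketch. It assumes the standard molality-based definition of the osmotic coefficient, \(\phi = -\ln a_w / (M_w \sum_i \nu_i m_i)\) with \(M_w\) in kg/mol, and the ratio-of-fugacity-coefficients form of the symmetric activity coefficient. The function names are hypothetical, and the fugacity coefficients are taken as inputs; they would come from the PHSC EoS, which is not implemented here.

```python
import math

M_WATER = 0.018015  # kg/mol, molar mass of water


def osmotic_coefficient(a_w, molality, nu=1.0):
    """Osmotic coefficient of a single-solute solution from water activity.

    Assumes phi = -ln(a_w) / (M_w * nu * m); nu = 1 for a non-dissociating sugar.
    """
    if molality <= 0.0:
        raise ValueError("molality must be positive")
    return -math.log(a_w) / (M_WATER * nu * molality)


def water_activity(phi, molality, nu=1.0):
    """Inverse relation: a_w = exp(-phi * M_w * nu * m)."""
    return math.exp(-phi * M_WATER * nu * molality)


def symmetric_activity_coefficient(phi_mix, phi_pure):
    """gamma_i = phi_i(mixture) / phi_0i(pure), both supplied by an EoS."""
    return phi_mix / phi_pure


if __name__ == "__main__":
    # Round trip for a hypothetical 2 mol/kg solution with phi = 1.05
    a_w = water_activity(1.05, 2.0)
    print(a_w, osmotic_coefficient(a_w, 2.0))
```

An osmotic coefficient of exactly 1 corresponds to the ideal-dilute limit, so deviations of \(\phi\) from unity give a direct measure of non-ideality at a given molality.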
The ANN models
The ANN algorithm was developed based on the number of input neurons, number of hidden layers, and output neurons. The ANN models can be specified by three entities: interconnections, activation functions, and learning rules. In this work Feedforward Neural Networks (FNN) architecture has been considered. In the case of the FNN approach, the information flows in one direction from input to output and the layers are fully connected. Therefore, each neuron in a layer is connected to all the neurons in the next layer. In the ANN method, the inputs are fed to the input layer and delivered to the hidden layer using a specific transfer function. The converted inputs are sent to the output layer to estimate the desired properties. The estimated values are compared to experimental data to analyze the error using the objective function (OF). Finally, the results are fed back to the system. This process is repeated using trial and error to reach a minimum error value of OF. The number of neurons in the hidden layer in the trial and error approach is varied to reach the best OF outputs. The number of neurons in the hidden layer is obtained using the error analysis. At first, one neuron is considered to estimate the error using the training subset. Then, two neurons are considered to estimate the error. This process is continued to reach the optimum number of neurons using the minimum error value of testing subsets. Therefore, the number of neurons in the hidden layer increases if the desired error is not obtained. In this study the multilayer feed-forward network is utilized; see Fig. 1.
The neuron’s outputs in the network are estimated using the activation function and adjustable weight values. A neural network is a complex network of interconnected nodes, each representing a neuron. The neurons work together to produce output. However, not all connections between neurons are created equal. The weights are defined to determine the strength of the connections between neurons. During the training step of an ANN, the weights are adjusted iteratively to minimize the difference between the predictions and the actual target values. The weights have been optimized by using the Levenberg-Marquardt optimization algorithm 22,23. Weights and biases are critical concepts in ANN. These are the learnable parameters of a neural network that govern the network’s activity. The output of the ANN is more accurate by adjusting the weights and biases during training. Activation functions are used to induce non-linearity in a neuron’s output. A neuron’s output would be a linear function of its inputs without an activation function. In the next section, the results of the PHSC EoS and ANN model have been discussed.
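A forward pass through a fully connected network with one hidden layer, as described above, can be sketched as follows. The hyperbolic-tangent hidden activation and the linear output layer are assumptions, since the text does not name the transfer functions. The weights are random placeholders rather than the trained values, and all function names are hypothetical.

```python
import math
import random


def forward(x, w_ih, b_h, w_ho, b_o):
    """One pass through a fully connected input-hidden-output network.

    x    : list of inputs (assumed already normalized)
    w_ih : one row per hidden neuron, each holding the input-to-hidden weights
    b_h  : hidden-layer biases
    w_ho : one row per output neuron, each holding the hidden-to-output weights
    b_o  : output-layer biases
    """
    # Hidden layer: weighted sum plus bias, passed through tanh (assumed activation)
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(w_ih, b_h)]
    # Output layer: linear combination of hidden activations (assumed linear output)
    return [sum(w * h for w, h in zip(row, hidden)) + b
            for row, b in zip(w_ho, b_o)]


def random_network(n_in=6, n_hidden=32, n_out=3, seed=0):
    """Placeholder weights for a 6-32-3 network; NOT the trained parameters."""
    rng = random.Random(seed)
    w_ih = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_hidden)]
    b_h = [rng.uniform(-1, 1) for _ in range(n_hidden)]
    w_ho = [[rng.uniform(-1, 1) for _ in range(n_hidden)] for _ in range(n_out)]
    b_o = [rng.uniform(-1, 1) for _ in range(n_out)]
    return w_ih, b_h, w_ho, b_o


if __name__ == "__main__":
    net = random_network()
    # Six normalized inputs in, three outputs out
    print(forward([0.1, 0.2, 0.3, 0.4, 0.5, 0.6], *net))
```

With fixed weights and biases the function is deterministic, which is why storing the trained parameters is enough to reproduce a network's predictions.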
Results and discussion
The ANN approach
In this study 840 data points for osmotic coefficient, activity coefficient, and water activity of aqueous sugar solutions have been collected. Data points have been divided into 600 train and 240 test/validation points (about 70% train, 15% test, and 15% validation datapoints). The melting temperature, melting enthalpy, molality, critical temperature, temperature, and critical volume of sugars have been considered as input layer. The critical properties of sugars have been estimated using the Group Contribution (GC) approach such as Joback–Reid 24,25. The Joback GC method is a group contribution method that estimates the critical properties (such as critical temperature and pressure) of organic compounds based on the functional groups that make up the molecule. This method uses a set of empirical correlations derived from experimental data. Sugars (carbohydrates) are generally made up of various groups such as alcohols, aldehydes, ketones, and sometimes ethers. Alcohol group (-OH) is found in the hydroxyl groups of sugars, aldehyde group (-CHO) in aldoses (like glucose), ketone group (-CO-) in ketoses (like fructose), ether group (-O-) in disaccharides and oligosaccharides.
Integrating artificial neural networks with group contribution methods like the Joback GC method presents a promising approach for enhancing the predictive capability of thermophysical properties in chemical engineering. The ANNs can capture non-linear relationships and interactions between various functional groups that traditional group contribution methods might overlook, once trained. As well, the ANN can easily adapt to new data or different classes of compounds. The model can be trained on a broader dataset that minimizes biases present in traditional group contribution table, and once established, ANNs can quickly predict properties for new compounds compared to the iterative nature of traditional methods. The ANN + GC approach can be used for new material design and chemical processes such as: assisting in designing new materials with desired thermophysical properties by predicting those values for combinations of functional groups and improving accuracy in process simulation software by providing better estimates of properties required for thermodynamic calculations. In summary, combining artificial neural networks with group contribution methods like Joback–Reid GC method 26 offers a powerful approach to predicting thermophysical properties. This integrated method leverages the strengths of both traditional empirical methods and modern machine learning techniques, leading to enhanced accuracy and adaptability for complex molecular systems. Using the right data and model design, this combination can significantly improve property estimation in chemical engineering and material science. In Table 1 the groups parameters of the Joback–Reid GC method have been reported.
The normal boiling temperature (Tb), critical temperature (Tc), and critical volume (Vc) are estimated as follows:
where \({N}_{k}\) is the number of occurrences of group k in the molecule, and \({tb}_{k}\), \({tc}_{k}\), and \({vc}_{k}\) are the contributions of group k to the boiling temperature, critical temperature, and critical volume, respectively. In Table 2 the carbohydrate properties have been presented.
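As an illustration of the group-contribution step, the sketch below applies the Joback–Reid correlations in their commonly published form: \(T_b = 198 + \sum_k N_k\,tb_k\), \(T_c = T_b\,[0.584 + 0.965\sum_k N_k\,tc_k - (\sum_k N_k\,tc_k)^2]^{-1}\), and \(V_c = 17.5 + \sum_k N_k\,vc_k\). The three group increments are quoted from the widely tabulated Joback–Reid values and should be checked against Table 1. The group decomposition of xylitol (an open-chain polyol) is an illustrative choice, and the function name is hypothetical.

```python
# Joback-Reid increments: tb in K, tc dimensionless, vc in cm^3/mol.
# Values as commonly tabulated; verify against Table 1 before use.
JOBACK = {
    ">CH2": {"tb": 22.88, "tc": 0.0189, "vc": 56.0},   # non-ring CH2
    ">CH-": {"tb": 21.74, "tc": 0.0164, "vc": 41.0},   # non-ring CH
    "-OH":  {"tb": 92.88, "tc": 0.0741, "vc": 28.0},   # alcohol OH
}


def joback(groups):
    """Return (Tb [K], Tc [K], Vc [cm^3/mol]) from a {group: count} mapping."""
    sum_tb = sum(n * JOBACK[g]["tb"] for g, n in groups.items())
    sum_tc = sum(n * JOBACK[g]["tc"] for g, n in groups.items())
    sum_vc = sum(n * JOBACK[g]["vc"] for g, n in groups.items())
    tb = 198.0 + sum_tb
    tc = tb / (0.584 + 0.965 * sum_tc - sum_tc ** 2)
    vc = 17.5 + sum_vc
    return tb, tc, vc


# Xylitol, HOCH2-(CHOH)3-CH2OH: two CH2, three CH, five OH groups
XYLITOL = {">CH2": 2, ">CH-": 3, "-OH": 5}

if __name__ == "__main__":
    tb, tc, vc = joback(XYLITOL)
    print(f"Tb = {tb:.1f} K, Tc = {tc:.1f} K, Vc = {vc:.1f} cm^3/mol")
```

For thermally unstable compounds such as sugars, these estimates act as structure-derived descriptors for the network rather than measurable critical points.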
The input layer has six neurons containing molality, critical volume, critical temperature, melting temperature, temperature, and melting enthalpy. In Table 3 (in the next sections) the melting enthalpy and melting temperature of studied sugars have been reported.
The trial and error approach is used to find the optimum ANN architecture, since there is no theoretical method for identifying it 27. The results show that the optimum number of hidden layers is one and the optimum number of neurons in the hidden layer is 32. As shown in Fig. 1, three outputs (water activity, sugar activity coefficient, and osmotic coefficient) have been considered in the ANN. The ANN model performance has been evaluated using the Mean Square Error (MSE) and R² as follows:
where \({x}_{i}^{exp}\) and \({x}_{i}^{calc}\) refer to the experimental and calculated osmotic coefficient, water activity, or activity coefficient, and \({\overline{x} }^{exp}\) is the average value of the experimental \(x\). In addition, the average absolute deviation (AAD%) for five sugars that were not included in the training subset is calculated as follows:
In Figure 2a and b, the MSE vs network iterations (epochs) and the regression of data points using the ANN+GC approach have been depicted.
The best MSE value, 2.06 × 10⁻⁶, was obtained at 3000 iterations, and the R² value was about 0.999. In Figure 2b, the training, testing, and validation results of the proposed ANN + GC model have been depicted. As shown in Figure 2b, the model correlates all data points satisfactorily.
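The error measures used to assess the network can be computed as in the following sketch. It assumes the usual definitions: MSE as the mean of squared residuals, R² as one minus the ratio of the residual to the total sum of squares about the experimental mean, and the percentage deviation as the mean of \(|x^{calc} - x^{exp}| / x^{exp}\) times 100. The function names are hypothetical and the numbers in the usage lines are placeholders, not data from this study.

```python
def mse(exp, calc):
    """Mean square error between experimental and calculated values."""
    return sum((e - c) ** 2 for e, c in zip(exp, calc)) / len(exp)


def r_squared(exp, calc):
    """Coefficient of determination relative to the experimental mean."""
    mean_exp = sum(exp) / len(exp)
    ss_res = sum((e - c) ** 2 for e, c in zip(exp, calc))
    ss_tot = sum((e - mean_exp) ** 2 for e in exp)
    return 1.0 - ss_res / ss_tot


def relative_deviation_percent(exp, calc):
    """Average relative deviation in percent (experimental values must be non-zero)."""
    return 100.0 / len(exp) * sum(abs(c - e) / abs(e) for e, c in zip(exp, calc))


if __name__ == "__main__":
    x_exp = [1.00, 1.02, 1.05, 1.09]    # placeholder values
    x_calc = [1.00, 1.03, 1.05, 1.08]   # placeholder values
    print(mse(x_exp, x_calc), r_squared(x_exp, x_calc),
          relative_deviation_percent(x_exp, x_calc))
```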
In Figure 3, osmotic coefficient, water activity, and activity coefficient of ANN model have been compared to the experimental data.
The results show that the ANN + GC approach can estimate the water activity, osmotic coefficient, and activity coefficient of sugars accurately. Trial and error is needed to identify the optimum ANN architecture. A given ANN architecture does not reproduce exactly the same results on every training run 27. Nevertheless, if the weights and biases of the input, hidden, and output layers are known, the network produces the same results on every run. In addition, the matrix of neural network weights can be used to determine the relative importance of the various input variables for the output variable. For this reason, the weight and bias values of the hidden and output layers have been reported in Table 3.
The determination of the importance of each input variable (sensitivity analysis of input variables) can be studied using the weight connection between the input layer-hidden layer and between the hidden layer-output layer; Table 3. Garson suggested an equation based on partitioning of connection weights for sensitivity analysis of input variables as follows 28:
where IFj is the relative importance of the jth input variable for the output variable, and Ni and Nh refer to the number of input and hidden neurons, respectively. The superscripts i, h and o refer to the input, hidden and output layers, respectively. The subscripts k, m and n index the input, hidden and output neurons, respectively, and w denotes a connection weight.
The relative importance of the input variables (IFj) has been calculated by Eq. (17) and depicted in Fig. 4.
It can be seen that all selected input variables have an appreciable effect on the osmotic coefficient, activity coefficient, and water activity, with relative importance between 14% and 19%. As shown in Fig. 4, the melting enthalpy and temperature have the smallest effect on these properties (IF = 14%), while the critical volume and critical temperature have equal relative importance. Using a hybrid method based on a feed-forward (FF) ANN, the osmotic coefficient, activity coefficient, and water activity of aqueous sugar solutions have been estimated up to high sugar concentrations.
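The weight-partitioning idea behind the sensitivity analysis can be sketched as follows for a single output neuron. For each hidden neuron, the absolute input-to-hidden weights are normalized so that they sum to one, scaled by the absolute hidden-to-output weight, and accumulated per input; the totals are then expressed as percentages. This follows the usual statement of Garson's algorithm. The function name and the small weight matrices in the usage lines are hypothetical.

```python
def garson_importance(w_ih, w_ho):
    """Relative importance (%) of each input for one output neuron.

    w_ih : one row per hidden neuron, each holding the input-to-hidden weights
    w_ho : hidden-to-output weights for the chosen output, one per hidden neuron
    """
    n_in = len(w_ih[0])
    contrib = [0.0] * n_in
    for row, w_out in zip(w_ih, w_ho):
        row_sum = sum(abs(w) for w in row)
        if row_sum == 0.0:
            continue  # a hidden neuron with no input weights contributes nothing
        for k, w in enumerate(row):
            # Share of input k in this hidden neuron, scaled by the output weight
            contrib[k] += abs(w) / row_sum * abs(w_out)
    total = sum(contrib)
    return [100.0 * c / total for c in contrib]


if __name__ == "__main__":
    # Two inputs, two hidden neurons (placeholder weights)
    w_ih = [[1.0, 1.0],
            [1.0, 3.0]]
    w_ho = [1.0, 1.0]
    print(garson_importance(w_ih, w_ho))  # [37.5, 62.5]
```

Because only absolute weights enter the calculation, the result indicates the magnitude of each input's influence but not its direction.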
Thermodynamic modeling
In the case of sugar-water solutions, the water parameters were adjusted using experimental vapor pressure and saturated liquid density data. Five model parameters (segment number r, segment diameter σ, segment-segment interaction energy ε, association energy \({\varepsilon }^{AB}\), and association volume \({\kappa }^{AB}\)) have been adjusted for an associating component such as water. In this work, two association sites (one donor and one acceptor) have been considered for each water molecule. The sugar molecules have also been modeled as associating molecules. The number of association sites has been defined based on the OH groups of the molecule: for each OH group, two association sites (one donor and one acceptor) have been considered (refer to Table 3). The experimental osmotic coefficient data of the binary sugar-water systems have been used to optimize the sugar parameters. Eq. (18) has been used to optimize the model parameters:
where i indexes the data points and N refers to the total number of experimental data 29. In Table 4 the model parameters have been reported.
The water activity and activity coefficient of sugars have been predicted using the adjusted parameters in Table 4. The sugar model parameters, number of association sites, maximum molality, number of data points, and temperature have been presented in Table 4.
In Figs. 5, 6 and 7, the model results have been compared to osmotic coefficient, water activity, and activity coefficient experimental data.
Fig. 5. Osmotic coefficient of (○) D-Glucose, (□) Xylose, (∆) L-Arabinose, (◊) Xylitol, (*) D-Ribose, (⨯) D-Fructose, (-) Sucrose. Lines refer to model calculations and symbols refer to experimental data 29 at 298.15 K.
Fig. 6. Water activity of (○) D-Glucose, (□) Xylose, (∆) Sucrose. Lines refer to model calculations and symbols refer to experimental data 29 at 298.15 K.
Fig. 7. Activity coefficient of (○) D-Glucose, (□) Xylose, (∆) Sucrose, (◊) Sorbitol. Lines refer to model calculations and symbols refer to experimental data 29 at 298.15 K.
As shown in Figs. 5, 6 and 7, the PHSC EoS estimates the activity coefficient, water activity, and osmotic coefficient of binary sugar-water systems accurately.
As shown in Fig. 6, the activity of water decreases as the molality of sugar in water increases because the presence of sugar molecules disrupts the water’s structure and reduces the concentration of free water molecules available to participate in reactions. Pure water has an activity of 1.0. Solutions have activities less than 1.0. When you dissolve sugar in water, the sugar molecules (sucrose, for example) also form hydrogen bonds with water molecules. The sugar molecules essentially “tie up” some of the water molecules, reducing the number of water molecules that are free to interact with each other and exert their normal vapor pressure. The sugar molecules physically obstruct the water-water interactions. Therefore, adding sugar to water decreases water activity because the sugar molecules interfere with the water-water hydrogen bonding network, reducing the number of water molecules effectively available.
As shown in Fig. 7, while adding sugar to water always decreases water activity, the activity coefficient of sugar typically increases with increasing molality. This is due to a combination of solute–solute repulsions, changes in solvation, and excluded volume effects, all leading to positive deviations from ideal behavior. The specific behavior depends on the sugar, concentration range, and temperature. It must be noted that the prediction of the activity coefficient of sugars in solvent plays a crucial role in solubility calculations. In this regard, the activity coefficient of carbohydrates in aqueous solutions must be predicted accurately. In the crystallization process for the purification of carbohydrates, the sugar activity coefficient is needed in a wide range of temperatures and sugar concentrations to estimate the solubility of sugars. The use of the PHSC EoS provides a robust framework for estimating the activity coefficients of sugar solutions. By carefully choosing parameters and validating against experimental data, reliable predictions can be obtained. In the next section the PHSC EoS results have been compared to the ANN + GC approach.
Comparison between the PHSC and ANN + GC models
In the case of the ANN + GC method, the input variables and the connection weights between the input-hidden and hidden-output layers can be used to predict the osmotic coefficient, water activity, and activity coefficient of sugars over a wide range of sugar concentrations. On the other hand, the PHSC parameters (Table 3) can be used to predict the activity coefficient and water activity of sugars. In Table 5, the ARD% values of the PHSC EoS and the ANN + GC approach have been reported.
As shown in Table 5, the average ARD% values of the osmotic coefficient, water activity, and activity coefficient of the PHSC EoS have been obtained 0.436%, 0.119%, and 0.657%, respectively.
The prediction capability of the ANN + GC model has also been investigated. For this purpose, the osmotic coefficient, water activity, and sugar activity coefficient of five sugars (L-Arabinose, D-Ribose, D-Mannose, Sorbitol, and Maltose) have been predicted using the ANN + GC model. It should be noted that the experimental data for these sugars were not used in the development of the ANN. The network architecture, biases, and weights of all layers were saved in the “saved network” program code; refer to the Supplementary material. The six inputs for each of these sugars were fed to the saved network, and the osmotic coefficient, sugar activity coefficient, and water activity were predicted satisfactorily. The average ARD% values of the osmotic coefficient, water activity, and sugar activity coefficient were 5.3%, 2.3%, and 4.2%, respectively. The complete MATLAB source code has been provided in the Supplementary material.
In Fig. 8, the ARD% values of sugars osmotic coefficient, and activity coefficient of the PHSC EoS and ANN + GC approach have been compared.
As shown in Fig. 8, the average ARD% values of the osmotic coefficient and activity coefficient predicted by the ANN + GC approach are higher than those of the PHSC model. This is due to the ARD% values of five sugars: L-Arabinose, D-Ribose, D-Mannose, Sorbitol, and Maltose. The reported ARD% values of these sugars are pure predictions, because their experimental data were not used for network development. After training, ANNs can provide rapid predictions, significantly speeding up the modeling process compared to EoS-based approaches. This is the main benefit of the ANN approach compared to thermodynamic models such as the PHSC EoS.
The results show that the proposed ANN architecture can predict the osmotic coefficient, water activity, and activity coefficient of aqueous sugar solutions using a single set of weights and biases for the input, hidden, and output layers. Estimating the critical properties with the GC method makes it possible to apply the ANN approach to newly designed sugars.
Solubility of sugars in water
In this section, the solubility of several sugars in water has been investigated using the ANN + GC and the PHSC EoS. In this regard, the obtained solute activity coefficient must be used to calculate the sugar solubility. The Solid–Liquid Equilibrium (SLE) phase-equilibrium conditions between the liquid and the solid phase are applied to estimate the solubility of sugars in liquid solvents.
The mole fraction of component i in a solvent at equilibrium (solubility) is calculated as follows:
where xi and \({\gamma }_{i}^{solute}\) are the mole fraction and the symmetric activity coefficient of the sugars, respectively. ∆Hm stands for the melting enthalpy, Tm is the melting temperature of the sugar and ∆Cp is the difference in solute heat capacity between liquid and solid at the melting point, R is the gas constant, and T is the temperature. The melting temperature, melting enthalpy, and heat capacity for six samples were reported in literature 8,31,32; see Table 6. In Table 6, the ARD% values of the PHSC and ANN + GC models have been reported.
As shown in Eq. (19), the solubility of sugars in water depends on the solute activity coefficient, melting temperature, \(\Delta {C}_{p}\), and \(\Delta {H}_{m}\). The activity coefficient of solute at desired molality can be estimated using ANN + GC and PHSC approaches in Sections “The PHSC EoS” and “The ANN models”. Therefore, the solubility of sugars in water can be predicted without using any additional adjustable parameters or experimental data. The average ARD% values of the PHSC and ANN + GC approach have been obtained about 9.2% and 11.2%, respectively. In the case of D-glucose and D-galactose, higher ARD% values were observed, especially in the ANN + GC method. In Fig. 9, the PHSC and ANN + GC results have been compared to experimental solubility data.
As shown in Fig. 9, the PHSC and ANN + GC models can predict the solubility of D-Fructose, D-Glucose, and Xylitol in water satisfactorily.
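The solubility calculation can be sketched as follows, assuming the standard solid–liquid equilibrium relation \(\ln(x_i\gamma_i) = -\frac{\Delta H_m}{R T_m}\left(\frac{T_m}{T}-1\right) + \frac{\Delta C_p}{R}\left(\frac{T_m}{T}-1\right) - \frac{\Delta C_p}{R}\ln\frac{T_m}{T}\). Because the activity coefficient itself depends on composition, the mole fraction is found by a damped fixed-point iteration. The `gamma` callable stands in for the PHSC or ANN + GC model and is hypothetical, as are the function names and the numerical inputs in the usage lines.

```python
import math

R = 8.314462618  # J/(mol K), gas constant


def ln_ideal_solubility(T, Tm, dHm, dCp=0.0):
    """Right-hand side of the SLE relation, i.e. ln(x * gamma)."""
    return (-dHm / (R * Tm) * (Tm / T - 1.0)
            + dCp / R * (Tm / T - 1.0)
            - dCp / R * math.log(Tm / T))


def solubility(T, Tm, dHm, dCp=0.0, gamma=lambda x: 1.0,
               tol=1e-10, max_iter=200):
    """Solve x * gamma(x) = exp(rhs) for the solute mole fraction x."""
    target = math.exp(ln_ideal_solubility(T, Tm, dHm, dCp))
    x = min(target, 1.0)  # start from the ideal-solution estimate
    for _ in range(max_iter):
        x_new = min(target / gamma(x), 1.0)
        x_new = 0.5 * (x + x_new)  # damping for stability
        if abs(x_new - x) < tol:
            return x_new
        x = x_new
    raise RuntimeError("solubility iteration did not converge")


if __name__ == "__main__":
    # Hypothetical solute: Tm = 420 K, dHm = 30 kJ/mol, ideal solution
    print(solubility(298.15, 420.0, 30000.0))
```

Replacing the default `gamma` with a composition-dependent model turns the ideal-solubility estimate into the fully predictive calculation described in this section.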
The results of this work show that the ANN and a perturbation-based EoS such as the PHSC model can be used to estimate the thermodynamic properties of aqueous sugar solutions satisfactorily up to high sugar concentrations. The ANN + GC method is a robust option when experimental data are not available for newly designed carbohydrates. In the ANN + GC method, the GC approach estimates the sugar properties from the molecular structure, and the ANN then estimates the osmotic coefficient and sugar activity coefficient satisfactorily. On the other hand, the PHSC model can be extended to high pressures (or high temperatures). In addition, the relationships between thermodynamic properties (first-order derivative properties such as density and second-order derivative properties such as speed of sound or heat capacity) make it possible to estimate or predict other thermodynamic properties over a wide range of pressures and temperatures using the PHSC model.
Conclusion
In this work, the ANN + GC approach has been proposed to predict the osmotic coefficient, activity coefficient, and water activity of aqueous sugar solutions. The critical properties of the sugars used in the ANN input layer were estimated using the GC approach. The ANN + GC model was developed with one hidden layer (32 neurons) and three outputs. The average ARD% values of the predicted osmotic coefficient, water activity, and activity coefficient were 5.3%, 2.3%, and 4.2%, respectively. The PHSC EoS has also been used to study the same thermodynamic properties. The PHSC EoS model parameters were fitted to experimental osmotic coefficient data, and the activity coefficient and water activity of the sugar solutions were then predicted. The average ARD% values of the osmotic coefficient, water activity, and activity coefficient obtained with the PHSC EoS were 0.436%, 0.119%, and 0.657%, respectively. The results show that the PHSC EoS and the ANN + GC approach can predict the activity coefficient, osmotic coefficient, and water activity of aqueous sugar solutions satisfactorily. In the ANN + GC method, the GC approach estimates the sugar properties from the molecular structure. In the case of the PHSC EoS, the relationships between thermodynamic properties allow other thermodynamic properties to be estimated or predicted over a wide range of pressures and temperatures. The solubility of six sugars in water was predicted to evaluate the performance of the ANN + GC and PHSC models. The results show that the ANN + GC approach can predict the solubility of sugars in water in the absence of experimental data. With the PHSC model used in a fully predictive manner, the results were in good agreement with experimental data. The proposed models can be extended to the calculation of thermo-physical properties of sugar-containing systems in future work.
Data availability
All data generated or analysed during this study are included in this published article and its supplementary information files.
References
Liang, P. & Bose, N. Neural Network Fundamentals with Graphs, Algorithms, and Applications (McGraw-Hill, 1996).
Song, Y., Lambert, S. M. & Prausnitz, J. M. A perturbed hard-sphere-chain equation of state for normal fluids and polymers. Ind. Eng. Chem. Res. 33, 1047–1057. https://doi.org/10.1021/ie00028a037 (1994).
Shahriari, R., Dehghani, M. R. & Behzadi, B. A modified polar PHSC model for thermodynamic modeling of gas solubility in ionic liquids. Fluid Phase Equilib. 313, 60–72. https://doi.org/10.1016/j.fluid.2011.09.029 (2012).
Ramdan, D. et al. Prediction of CO2 solubility in electrolyte solutions using the e-PHSC equation of state. J. Supercrit. Fluids 180, 105454. https://doi.org/10.1016/j.supflu.2021.105454 (2022).
Khoshsima, A. & Shahriari, R. Molecular modeling of systems related to the biodiesel production using the PHSC equation of state. Fluid Phase Equilib. 458, 58–83. https://doi.org/10.1016/j.fluid.2017.10.029 (2018).
Valavi, M. & Dehghani, M. R. Application of PHSC equation of state in prediction of gas hydrate formation condition. Fluid Phase Equilib. 333, 27–37. https://doi.org/10.1016/j.fluid.2012.07.011 (2012).
Lee, B.-S. & Kim, K.-C. Phase equilibria of associating fluid mixtures using the perturbed-hard-sphere-chain equation of state combined with the association model. Korean J. Chem. Eng. 24, 133–147. https://doi.org/10.1007/s11814-007-5023-2 (2007).
Spiliotis, N. & Tassios, D. A UNIFAC model for phase equilibrium calculations in aqueous and nonaqueous sugar solutions. Fluid Phase Equilib. 173, 39–55. https://doi.org/10.1016/S0378-3812(00)00387-3 (2000).
Velezmoro, C. E., Oliveira, A. L., Cabral, F. A. & Meirelles, A. J. A. Prediction of water activity in sugar solutions using models of group contribution and equation of state. J. Chem. Eng. Jpn. 33, 645–653. https://doi.org/10.1252/jcej.33.645 (2000).
Van der Sman, R. Predicting the solubility of mixtures of sugars and their replacers using the Flory-Huggins theory. Food Funct. 8, 360–371 (2017).
Tsavas, P., Voutsas, E., Magoulas, K. & Tassios, D. Phase equilibrium calculations in aqueous and nonaqueous mixtures of sugars and sugar derivatives with a group-contribution model. Ind. Eng. Chem. Res. 43, 8391–8399. https://doi.org/10.1021/ie049353n (2004).
Peres, A. M. & Macedo, E. A. A modified UNIFAC model for the calculation of thermodynamic properties of aqueous and non-aqueous solutions containing sugars. Fluid Phase Equilib. 139, 47–74. https://doi.org/10.1016/S0378-3812(97)00196-9 (1997).
Held, C., Sadowski, G., Carneiro, A., Rodríguez, O. & Macedo, E. A. Modeling thermodynamic properties of aqueous single-solute and multi-solute sugar solutions with PC-SAFT. AIChE J. 59, 4794–4805. https://doi.org/10.1002/aic.14212 (2013).
Ninni, L., Camargo, M. & Meirelles, A. J. Water activity in polyol systems. J. Chem. Eng. Data 45, 654–660 (2000).
Feng, W., van der Kooi, H. J. & de Swaan Arons, J. Application of the SAFT equation of state to biomass fast pyrolysis liquid. Chem. Eng. Sci. 60, 617–624. https://doi.org/10.1016/j.ces.2004.08.023 (2005).
Chiew, Y. C. Percus-Yevick integral-equation theory for athermal hard-sphere chains. Mol. Phys. 70, 129–143. https://doi.org/10.1080/00268979000100891 (1990).
Song, Y. & Mason, E. A. Statistical-mechanical theory of a new analytical equation of state. J. Chem. Phys. 91, 7840–7853. https://doi.org/10.1063/1.457252 (1989).
Valavi, M., Dehghani, M. R. & Shahriari, R. Capability of PHSC equation of state for thermodynamic modeling of aqueous amino acid and peptide solutions. J. Mol. Liq. 199, 21–28. https://doi.org/10.1016/j.molliq.2014.08.002 (2014).
Faraz, O. et al. Thermodynamic modeling of pharmaceuticals solubility in pure, mixed and supercritical solvents. J. Mol. Liq. 353, 118809. https://doi.org/10.1016/j.molliq.2022.118809 (2022).
Mansoori, G. A., Carnahan, N. F., Starling, K. E. & Leland, T. W. Jr. Equilibrium thermodynamic properties of the mixture of hard spheres. J. Chem. Phys. 54, 1523–1525. https://doi.org/10.1063/1.1675048 (1971).
Chapman, W. G., Gubbins, K. E., Jackson, G. & Radosz, M. New reference equation of state for associating liquids. Ind. Eng. Chem. Res. 29, 1709–1721. https://doi.org/10.1021/ie00104a021 (1990).
Marquardt, D. W. An algorithm for least-squares estimation of nonlinear parameters. J. Soc. Ind. Appl. Math. 11, 431–441. https://doi.org/10.1137/0111030 (1963).
Levenberg, K. A method for the solution of certain non-linear problems in least squares. Q. Appl. Math. 2, 164–168 (1944).
Valderrama, J. O., Sanga, W. W. & Lazzús, J. A. Critical properties, normal boiling temperature, and acentric factor of another 200 ionic liquids. Ind. Eng. Chem. Res. 47, 1318–1330. https://doi.org/10.1021/ie071055d (2008).
Valderrama, J. O. & Robles, P. A. Critical properties, normal boiling temperatures, and acentric factors of fifty ionic liquids. Ind. Eng. Chem. Res. 46, 1338–1344. https://doi.org/10.1021/ie0603058 (2007).
Joback, K. G. & Reid, R. C. Estimation of pure-component properties from group-contributions. Chem. Eng. Commun. 57, 233–243. https://doi.org/10.1080/00986448708960487 (1987).
Lashkarbolooki, M., Hezave, A. Z. & Ayatollahi, S. Artificial neural network as an applicable tool to predict the binary heat capacity of mixtures containing ionic liquids. Fluid Phase Equilib. 324, 102–107. https://doi.org/10.1016/j.fluid.2012.03.015 (2012).
Garson, G. D. Interpreting neural network connection weights. AI Expert 6, 46–51 (1991).
Ebrahimi, N. & Sadeghi, R. Osmotic properties of carbohydrate aqueous solutions. Fluid Phase Equilib. 417, 171–180. https://doi.org/10.1016/j.fluid.2016.02.030 (2016).
Linstrom, P. J. & Mallard, W. G. NIST Chemistry WebBook, NIST Standard Reference Database Number 69 (National Institute of Standards and Technology, Gaithersburg, 2009).
Roos, Y. Melting and glass transitions of low molecular weight carbohydrates. Carbohydr. Res. 238, 39–48. https://doi.org/10.1016/0008-6215(93)87004-C (1993).
Raemy, A. & Schweizer, T. F. Thermal behaviour of carbohydrates studied by heat flow calorimetry. J. Therm. Anal. 28, 95–108. https://doi.org/10.1007/BF02105282 (1983).
Young, F. E. D-glucose-water phase diagram. J. Phys. Chem. 61, 616–619. https://doi.org/10.1021/j150551a023 (1957).
Jónsdóttir, S. Ó., Cooke, S. A. & Macedo, E. A. Modeling and measurements of solid–liquid and vapor–liquid equilibria of polyols and carbohydrates in aqueous solution. Carbohydr. Res. 337, 1563–1571 (2002).
Wang, S., Li, Q.-S., Li, Z. & Su, M.-G. Solubility of xylitol in ethanol, acetone, N,N-dimethylformamide, 1-butanol, 1-pentanol, toluene, 2-propanol, and water. J. Chem. Eng. Data 52, 186–188 (2007).
Young, F. E., Jones, F. T. & Lewis, H. J. D-fructose–water phase diagram. J. Phys. Chem. 56, 1093–1096 (1952).
Lowe, B. Experimental cookery from the chemical and physical standpoint (1955).
Author information
Contributions
S.K.I.: Designing computer programs, implementation of the computer code and supporting algorithms. R.J.A.: Writing—Review & Editing, Methodology, Conceptualization. D.S.: Supervision, Software (Programming, software development), Methodology, Conceptualization, Supervision. A.Y.: Designing computer programs, implementation of the computer code and supporting algorithms. S.V.M.: Data Curation, Investigation (data collection), Writing—Review & Editing, Programming. D.S.: Data Curation, Investigation (data collection), Writing—Review & Editing. S.S.: Data Curation, Investigation (data collection), Writing—Review & Editing, Programming. R.P.: Data Curation, Investigation (data collection), Writing—Review & Editing. W.M.T.: Data Curation, Investigation (data collection), Writing—Review & Editing, Programming. M.A.: Data Curation, Investigation (data collection), Writing—Review & Editing. M.J.J.: Data Curation, Investigation (data collection), Writing—Review & Editing. H.M.: Data Curation, Investigation (data collection), Writing—Review & Editing.
Ethics declarations
Competing interests
The authors declare no competing interests.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Ibrahim, S.K., Albadr, R.J., Sur, D. et al. Prediction of thermodynamic properties of aqueous carbohydrates solution using the PHSC and ANN models. Sci Rep 15, 21539 (2025). https://doi.org/10.1038/s41598-025-06552-2