Introduction

The key understanding the fracture behavior of asphalt mixtures under mechanical loading, such as single edge notch beams is crucial for ensuring long-term pavement durability performance1. Over the past decades, researchers have employed both experimental and numerical techniques in their researches, to investigate the mechanical and fracture properties of asphalt concrete2,3. Tests in laboratories for material testing, such as the Single Edge Notch Beam (SENB) test, bending, and indirect tensile strength tests have been widely used to quantify fracture toughness, fracture energy, and load-deformation characteristics of asphalt mixtures4,5,6,7. These experimental testing methods provide a quick and direct measurement of asphalt performance but are often take a lot of time, need resources, man power and limited in the number of mixtures and conditions that can be tested8,9,10. Consequently, alternative approaches using numerical modeling, including finite element simulations, have gained traction, enabling researchers to predict fracture response under various loading conditions with reduced experimental effort11,12,13. Finite element methods are extensively used by the researchers in their research studies, specially, those implemented in software such as ABAQUS, have been applied in a very extended way, to check and simulate the stress distribution and crack propagation in notched asphalt beams, which offering a flexible framework for investigating complex material behavior such as asphalt by using Prony series coefficients or using the cohesive zone modeling. These numerical models, however, depend heavily on accurate input parameters and material characterization, which often introduces uncertainty into predictions14,15,16,17,18.

In parallel, the field of materials engineering has witnessed an interest, that is growing in machine learning (ML) and data-driven based methods to model material properties such as Asphalt19,20,21,22. ML models have the capability to capture the most complex, nonlinear relationships between mixture constituents, mechanical properties, and fracture response of asphalt, without relying solely on explicit physical formulations22,23,24. Few studies have applied algorithms in this advance filed simulation techniques, such as Linear Regression, Gradient Boosting, and AdaBoost to predict stiffness modulus, fatigue life, and rutting resistance in asphalt mixtures, demonstrating that data-driven approaches can complement experimental and numerical investigations in traditional laboratories. Yet, while these models provide rapid predictions, their interpretability and direct link to physical fracture parameters remain limited24,25. This has led to the necessity of hybrid physics-guided ML approaches, where surrogate fracture parameters derived from measurable mixture properties such as stability, flow, air voids, and stiffness serve as ML targets and bridging experimental observations and numerical predictions with predictive modeling using machine learning techniques in detail26,27.

Although there are substantial advancements, key research gaps are still present. Although there are substantial advancements, key research gaps are still present. First, few studies have integrated experimental, numerical, and machine learning approaches in a combined framework to systematically predict fracture energy and brittleness in asphalt mixtures26,27,28,29. While some published ML-based studies have examined conventional performance indicators such as stiffness or rutting resistance, while direct prediction of fracture energy for notched beams through machine learning especially when coupled with hybrid physics-guided surrogates remains largely unexplored30,31. Second, while surrogate fracture parameters have been conceptually proposed, there is a lack of practical formulations that are both dimensionally homogeneous and validated against experimental and FEM results30,31. Lastly, there is limited discussion on validating ML-based surrogate predictions against both experimental and numerical benchmarks to quantify their accuracy and robustness32,33.

This study addresses these gaps mentioned in the previous paragraph, by developing a hybrid framework that integrates experimental SENB tests (performed at university of engineering and technology Peshawar, Pakistan), FEM simulation with ABAQUS, and ML models to get the fracture toughness, use the predicted response in conjunction with the Python to further train on the dataset to predict surrogate fracture parameters in asphalt mixtures. Experimental work provides measured fracture response, FEM models simulate numerical behavior for the same specimens, and ML models including Linear Regression, Gradient Boosting, and AdaBoost predict surrogate fracture energy based on mixture constituents and mechanical properties. Overall, A novel surrogate fracture equation is proposed to link mixture properties (stability, flow, ITSM20, and beam dimension) to fracture energy, allowing a physically interpretable, dimensionally consistent prediction. This approach enables a rapid prediction of fracture energy without additional physical testing, followed by cross-validation against both experimental and numerical data, and exploration of parameter sensitivity to improve mixture design.

Methodology

This study was conducted through an integrated, multi-stage methodology shown in Fig. 1 designed to develop a physics-guided machine learning framework for predicting fracture energy of the Single Edge Notch Beam. It started with an experimental program involving Single Edge Notch Beam (SENB) tests on asphalt mixtures to obtain benchmark fracture energy values. Concurrently, the finite element method (FEM) simulations were developed and validated against the experimental results of force displacement curve, to generate numerical fracture energy. A comprehensive dataset of mixture properties was subsequently utilized for machine learning modeling to perform correlation analysis and train multiple machine learning models including Linear Regression, Gradient Boosting, and AdaBoost. The core of the framework involved the development of a novelty based, dimensionally consistent surrogate equation that synthesizes key mixture properties. At the final stages, the predictions from the surrogate model were validated against both experimental and numerical benchmarks to ensure accuracy and physical relevance. To clearly articulate this progression, Sect. 3 is structured to mirror the methodological sequence. Section 3.1 presents the experimental procedures and key fracture measurements that serve as the foundational reference. Section 3.2 then introduces the FEM model development and its validation using experimental force–displacement curves. Section 3.3 focuses on the derivation of the surrogate fracture equation, highlighting its dimensional consistency and its role in linking physical parameters to fracture energy. Finally, Sect. 3.4 describes the machine learning models trained using both raw mixture properties and the surrogate-derived parameters, followed by a comprehensive evaluation against experimental and numerical benchmarks.

Fig. 1
Fig. 1
Full size image

Hybrid framework for predicting fracture energy.

Experimental program

The experimental program was designed to obtained precise data on the fracture behavior of asphalt concrete, which subsequently provided the foundation for finite element analyses and machine learning prediction. Single Edge Notch Beam (SENB) samples were made in a standard rectangular shape with dimensions of 375 mm in length, 75 mm in width, and 100 mm in height. To ensure a controlled crack initiation path, a middle notch of 19 mm depth was made at the center of each sample, as shown in Fig. 2. The aggregates used in this study were obtained from the KARKON Plant located near the Motorway Toll Plaza, with the source material originating from the Margalla region. These aggregates were selected to meet the National Highway Authority (NHA) Class B gradation specifications. Asphalt Cement (Bitumen) of grade 60–70 was procured from Attock Refinery Limited, Rawalpindi.

To study how temperature affects cracking, the samples were kept in in a temperature-controlled chamber for conditioning. Before testing, each beam was kept at the set temperature for 4 h to ensure uniform temperature. The main testing temperature was 20 °C, and tests were performed with a third-point bending setup using a displacement-controlled loading system, as shown in Fig. 3. For each bitumen content level, three specimens were prepared to ensure consistency and allow for reliable averaging of results. A total of five bitumen content levels were tested, resulting in 15 specimens overall (5 bitumen contents × 3 specimens per content level). During testing, the applied load and corresponding mid-span displacement were continuously recorded until complete failure of the specimen. The load–displacement data were then processed to compute fracture energy (Gf), which represents the energy absorbed by the mixture per unit crack surface area. This metric provided a quantitative measure of the mixture’s resistance to crack initiation and propagation, forming the basis for subsequent finite element validation and the development of surrogate machine learning models.

Fig. 2
Fig. 2
Full size image

Single edge notch beam dimensions.

Fig. 3
Fig. 3
Full size image

Single edge notch beam testing.

FEM analyses

For the Finite Element Method (FEM) component of the study, simulations were conducted exclusively at 20 °C to correlate with the experimental data at that temperature. The viscoelastic behavior of the asphalt was modeled using Prony series coefficients. All the inputs and details are tabulated in Table 1.

Table 1 Material properties and constants used in ABAQUS model.

The beam geometry, including the single-edge notch, was first created in Rhinoceros software and subsequently imported into Abaqus/CAE. Within Abaqus, the material properties were assigned, and boundary conditions were established to replicate a third-point loading test setup. A displacement-controlled analysis was performed by applying a 5 mm displacement at the loading points. A comprehensive mesh sensitivity analysis was conducted prior to the final simulations in which five global seed sizes, that are 15 mm, 9 mm, 8 mm, 6 mm, and 5 mm were tested. The comparison was based on the force–displacement curves obtained from the SENB simulations. The corresponding curves to the 5 mm and 6 mm mesh, as shown in Fig. 4, were nearly identical in both peak load and overall curve shape, indicating mesh-independent behavior beyond this refinement level. Consequently, a 5 mm global mesh size was selected for all subsequent analyses as it provided sufficient accuracy with an optimal computational cost. A meshed model is shown in Fig. 5 for visualization purpose.

Fig. 4
Fig. 4
Full size image

Force–displacement curve comparison for different mesh densities (Mesh sensitivity study showing overlapping curves for 5 mm and 6 mm meshes, confirming mesh-independent behavior).

Fig. 5
Fig. 5
Full size image

Meshed ABAQUS model.

Following the FEM simulations, the focus shifted to the fracture mechanics and machine learning (ML) phase of the study. The force-displacement data and the evolution of stress fields from the Abaqus models were extracted to calculate key fracture parameters, specifically the fracture energy. This computationally derived fracture energy was subsequently utilized for two primary purposes, that are, validation through comparison with experimentally obtained values, and to serve as the target variable for developing a surrogate model via machine learning for the data set downloaded from Mendeley website (https://data.mendeley.com/datasets/yb4brz3sdx/3) .

Interpretation of correlation matrix (w.r.t ITSM20)

Before starting machine learning modeling, the correlation analysis was performed which showed that ITSM20 is strongly related to stability which confirms the relationship between stiffness and load bearing capacity. Positive correlations with maximum specific gravity and viscosity suggest that dense aggregate packing and binder stiffness enhance mixture rigidity in resisting the load, which was the key parameter in the experimental program as well. Stability to flow ratio also increases with stiffness which points to higher brittleness in stiffer mixtures, which is opposite of experimental program. In contrast asphalt content and effective asphalt content are negatively correlated with ITSM20 reflecting the tradeoff between binder richness and stiffness where richer mixtures are softer but more ductile, as observed in experimental program. Flow also shows a weak negative link with stiffness supporting the idea that more deformable mixtures absorb more energy but resist less load. Overall, the results confirm that stiffness and stability govern toughness while binder content influences fracture energy and ductility. All correlation coefficients are shown in Fig. 6. Table 2 presents the Pearson correlation matrix p-values showing the relationships among key asphalt mixture parameters and the Indirect Tensile Stiffness Modulus at 20 °C (ITSM20).

Fig. 6
Fig. 6
Full size image

Pearson correlation.

Table 2 Pearson correlations P-values.

Machine learning framework

After performing correlation analyses, three machine learning algorithms were implemented to predict fracture related responses from the asphalt mixture dataset, details of which are tabulated in Table 3. In the machine learning framework, the predictive models were trained using a comprehensive set of mixture design, physical, mechanical, and gradation parameters as input features, namely viscosity (Vis), asphalt content (AC), maximum theoretical specific gravity (Gmm), effective asphalt content (EffAC), air voids (Va), unit weight (UWt), stability (Stab), flow value (Flow), stability-to-flow ratio (SFR), percent passing sieve #200 (P200), percent retained on sieve #4 (P4), percent retained on sieve #3/8 (P38), and the coarse-to-fine ratio (CF). These inputs collectively capture the effects of binder rheology, mixture volumetrics, and aggregate structure on mechanical response. The target output for all machine learning models was the Indirect Tensile Stiffness Modulus at 20 °C (ITSM20), which was selected as the primary response variable due to its relevance in characterizing mixture stiffness under service conditions. No models were trained using data at 30 °C. Three algorithms such as Linear Regression, Gradient Boosting, and AdaBoost were evaluated using this same input–output structure across multiple cross-validation folds. Gradient Boosting was applied using scikit learn with 100 trees, a learning rate of 0.1, maximum depth of 3 for individual trees, minimum split size of 2, and a full training fraction to ensure replicable performance. AdaBoost was employed with decision tree estimators as the base learner, 50 estimators in total, a learning rate of 1.0, and linear regression loss for continuous target prediction. A standard Linear Regression model without regularization was also used as a baseline due to its interpretability and direct mapping of input parameters to output response. These models were trained under identical conditions and later validated using k fold cross validation to assess robustness and generalization capability.

Table 3 Machine learning models and their parameter configurations.

Cross validation procedure and error analysis

Following The correlation analyses, A k fold cross validation procedure was applied to check the effectiveness of the machine learning models. Different fold numbers including 2, 3, 5, 10, and 20 were tested in all the chosen ML algorithms as shown in Table 4, in order to examine the effect of fold selection on the stability of the performance metrics. This approach ensured that the models were assessed under varying validation schemes and reduced the risk of overfitting to a specific data split. Figure 7 illustrates a cross-validation-based error analysis of three machine learning models across four key metrics (MSE, RMSE, MAE, and R2) as the number of folds varies from 2.5 to 20. Both Linear Regression and Gradient Boosting were superior models, with Gradient Boosting providing a slight edge in minimizing absolute error at higher fold counts, while AdaBoost was the least suitable choice. The 10-fold cross-validation was finally adopted, as it provided the most consistent R² and error metrics across models, ensuring a balanced and reliable evaluation framework.

Table 4 K-Fold study performed for all ML models.
Fig. 7
Fig. 7
Full size image

Error analysis.

Surrogate fracture parameters

After selection of machine learning algorithms, surrogate fracture parameters were defined based on key mixture properties from the dataset. These parameters integrate stability, flow, and stiffness modulus to represent fracture energy in a physically meaningful and dimensionally consistent manner. A characteristic beam dimension (k) is incorporated as a scaling factor to relate the surrogate parameter to single-edge notch beam mechanics based on pure novelty. This formulation allows estimation of fracture energy for each dataset sample without performing full experimental tests thus bridging the gap between observed material behavior and predictive modeling with machine learning modeling. The surrogate parameters in this study, serve as target responses for the machine learning models, ensuring that predictions remain grounded in interpretable material behavior observed in experimental program, while enabling rapid, reliable assessment of fracture performance across different asphalt mixtures. This approach adopted in this study, provides a novel, hybrid framework that combines experimental work performed, numerical simulation done, and data-driven insights for practical pavement design and mixture optimization.

Discussion

Load–displacement behavior of FEM and experimental result

After completing the experimental program and finite element analyses, the comparison between the finite element model (shown in Fig. 5) and the experimental single edge notched beam (Shown in Fig. 3) results showed good agreement in terms of overall load–displacement behavior as shown in Fig. 9. Both approaches captured the initial linear response followed by gradual softening after peak load, which is characteristic of quasi brittle fracture in asphalt mixtures. The FEM predicted a peak load of around 8.3 kN occurring at a mid-span displacement close to 0.5 mm, which closely matched the experimental measurements. The failure mode is shown in Fig. 8 which matches the experimental failure observed in Fig. 3.

After the peak, the load carrying capacity gradually decreased with increasing displacement gradually through solver controls, and the simulation reproduced the tail of the curve with reasonable accuracy, although it exhibited slightly stiffer post peak behavior compared to the experimental beam as shown in Fig. 9. This difference may be attributed to material heterogeneity and micro cracking in the laboratory specimens that are not fully represented in the numerical model but overall, the close alignment of peak load, stiffness, and softening trend confirmed the reliability of the FEM results.

Fig. 8
Fig. 8
Full size image

Deformed shape of asphalt (Contours showing the strain).

Fig. 9
Fig. 9
Full size image

Comparison of actual vs. numerical response of FD Curve.

Experimental program insights

The experimental program provided clear evidence of how asphalt mixture composition influences fracture behavior observed in the testing evaluation. Mixtures with higher stability values exhibited steeper load displacement slopes, reflecting increased stiffness but also a tendency toward brittle failure once the peak capacity of the single notch beam is obtained. Flow values revealed the role of asphalt content, where higher binder levels led to greater deformation capacity and energy absorption, but reduced overall stiffness as shown in Fig. 10. The balance between stability and flow highlighted the classic trade-off between strength and ductility in asphalt systems as visible by the work done highlighted in the Fig. 10. These findings validate the use of surrogate fracture parameters and form the basis for integrating experimental evidence with the FEM predictions and the machine learning framework.

Fig. 10
Fig. 10
Full size image

Fracture energy comparison derived from experimental and numerical load–displacement data.

Correlation matrix interpretation

The correlation matrix in Fig. 6, highlighted the dominant role of stiffness modulus at 20 °C in relation to mixture properties. A strong positive correlation with stability (0.695) confirmed that mixtures capable of carrying higher loads also exhibited greater stiffness. Moderate positive links with maximum theoretical specific gravity (0.539) and viscosity (0.513) suggested that dense aggregate packing and stiffer binders contribute to higher modulus values. But on the other hand, asphalt content (–0.500) and effective asphalt content (–0.481) showed negative correlations, reflecting the well-known trade off where richer binder mixtures are softer but more ductile. The stability to flow ratio (0.462) also increased with stiffness, indicating more brittle behavior in stiffer mixtures. Air voids (0.342) and fines content (0.241 with P200) properly showed a weaker positive association, while flow (–0.191) and unit weight (–0.127) showed weak negative relationships. Overall, the correlations reinforced that stiffness and stability govern load resistance, whereas binder content and flow control the energy absorption capacity of asphalt mixtures.

Surrogate fracture parameters and physical justification

The novelty based surrogate fracture energy (\(\:{G}_{f}\)) was formulated, through Python, to capture the fracture behavior of asphalt mixtures using readily same available material properties from the dataset, which were used in the machine learning algorithms. Key parameters are mixture stability, flow, and stiffness modulus were selected based on their established influence on load-bearing capacity, brittleness, and energy dissipation. The surrogate formulation is shown in Eq. 1 combines these factors to approximate fracture energy in a dimensionally consistent manner, where the characteristic beam dimension is also included as a physical scaling factor.

$$\:\begin{array}{c}{G}_{f}=k*\left\{Stab\left(1+D\right)+Flow\left(ITSM20\right)\right\}\end{array}$$
(1)

Where: \(\:{G}_{f}\) = surrogate fracture energy (N·mm], \(\:k\) = scaling constant (dimensionless), \(\:Stab\) = mixture stability [N], \(\:D\) = characteristic beam dimension [mm], \(\:Flow\) = mixture flow [mm], \(\:ITSM20\) = stiffness modulus at 20 °C [N/mm²].

This approach is physically justified: higher stability and stiffness increase mixture rigidity and fracture resistance, while flow captures deformability under loading. Figure 11 presents four 3D surface plots (A–D) illustrating the variation of surrogate fracture energy with key mixture parameters across the dataset, enabling visualization of the combined effects of stability, flow, and stiffness modulus.

Table 5 Surrogate fracture energy values computed using only Stab, Flow, ITSM20.

Table 5 contains the surrogate fracture energy values computed using only Stab, Flow, ITSM20, and the characteristic beam dimension D, along with the corresponding input ranges for these parameters, clearly documenting the domain used for Gf estimation. With these parameters tabulated in Table 5, the surrogate parameter provides an interpretable estimate of fracture energy that aligns with single-edge notch beam mechanics and bridges experimental and computational observations. The proposed surrogated equation is applicable within the calibrated range of the experimental data, representing asphalt mixtures tested at 20 °C, and its extrapolation beyond this range may lead to reduced accuracy.

Fig. 11
Fig. 11
Full size image

Variation of surrogate fracture energy with key mixture parameters across the dataset.

Machine learning model performance

The predictive performance of the three machine learning models after correlation analyses (tabulated in Table 4) was evaluated across the dataset using ITSM20 as the targeted feature. Linear Regression tended to overpredict for lower stiffness values and underpredict for mid-range values, resulting in larger deviations for certain samples. AdaBoost exhibited higher variability, particularly underestimating mid-range stiffness cases. In contrast, Gradient Boosting provided more balanced predictions across the full ITSM20 range, closely capturing the trend of the target fracture response while avoiding extreme deviations. The summarization of the comparison of performance is shown in Fig. 12. This consistency suggests that Gradient Boosting is the most reliable model among the three for predicting \(\:{G}_{f}\), offering a best framework for subsequent comparisons with experimental and numerical data.

Fig. 12
Fig. 12
Full size image

ML model predictions across dataset samples.

Integration of physics, experiments, and ML

The present study demonstrates a hybrid and novelty-based approach, applied exclusively to baseline asphalt mixtures (National Highway Authority Class B gradation, 60–70 penetration grade bitumen, no additives) by integrating physical modeling, experimental measurements, and machine learning. The surrogate \(\:{G}_{f}\) equation, Eq. 1, was derived based on physical reasoning from the single-edge notch beam concept, incorporating key material properties such as stiffness (ITSM20), mixture stability, and flow characteristics. Experimental fracture tests provided reference energy values (2659.75 N·mm), while FEM simulations offered numerical benchmarks (2910.97 N·mm). The machine learning framework then predicted \(\:{G}_{f}\) for each dataset sample using the surrogate equation, capturing trends across the domain of input parameters. Comparison of the mean surrogate Gf (2602.26 N·mm) with experimental and numerical results demonstrates excellent agreement within 2% tolerance, validating the integration of physics-guided modeling with data-driven predictions. A chart in Fig. 13 represents the comparison of all integrated results of \(\:{G}_{f}\). This novelty-based approach, adopted in this study, not only ensures dimensional consistency and interpretability but also enables the extrapolation of fracture behavior to untested mixtures, providing a robust framework for performance prediction.

Fig. 13
Fig. 13
Full size image

Comparison of all fracture energies.

Conclusion and recommendations

Based on extensive research, a physics-guided surrogate equation was developed to accurately estimate the fracture energy (Gf) of asphalt mixtures using key mixture parameters. Based on extensive research, a physics-guided surrogate equation was developed to accurately estimate the fracture energy (Gf) of baseline asphalt mixtures (National Highway Authority Class B gradation, 60–70 penetration grade bitumen, no additives) using key mixture parameters. The proposed approach was validated through experimental testing and Finite Element Method (FEM) simulations, demonstrating strong agreement between predicted and observed results. Correlation and Machine Learning (ML) analyses confirmed that mixture stability, flow, and stiffness modulus (ITSM20) are the most influential parameters governing the fracture response of asphalt mixtures.

In practical engineering terms, the proposed surrogate model provides a cost-effective and time-efficient tool for estimating fracture energy without requiring exhaustive laboratory testing. This makes it suitable for routine quality control, mixture design optimization, and performance-based pavement evaluation in road construction projects. By predicting fracture resistance at the design stage, engineers can better select bitumen grades, aggregate blends, and binder contents to enhance pavement durability and minimize cracking risks under varying service conditions.

It is recommended that this surrogate fracture approach be further extended to other asphalt mixture types or composite materials exhibiting similar viscoelastic characteristics. Future studies should aim to incorporate temperature- and loading-frequency–dependent effects to improve the model’s generality. Moreover, coupling the surrogate model with advanced ML algorithms and field performance databases could support data-driven pavement management systems, enabling more reliable and sustainable infrastructure design.