Table 6 Summary of error evaluation criteria used in the study.

From: Interpretable machine learning approaches to assess the compressive strength of metakaolin blended sustainable cement mortar

Metric

Significance

Formula

Criteria

Reference

Mean absolute error (MAE)

This reflects the average absolute deviation of the predicted values from the actual ones.

\(\frac{\sum |x - y|}{n}\)

Close to zero

93

It is the most commonly used metric for accuracy assessment.

Its value should be as close to zero as possible.

Root mean square error (RMSE)

Another widely used metric for assessing ML models.

\(\sqrt{\frac{\sum (x - y)^{2}}{n}}\)

Close to zero

94

It gives greater weight to large errors because the residuals are squared before the mean is taken, so it serves as an indicator of large errors.

It should be as small as possible for a good model.

Coefficient of determination (\(R^{2}\))

It serves to measure the general accuracy of models based on regression.

\(1 - \frac{\sum (x - y)^{2}}{\sum (y - y_{\text{mean}})^{2}}\)

\(R^{2} > 0.8\)

95

It cannot be used as a sole indicator of accuracy since it is unaffected by dividing or multiplying the result by a constant.

Generally, an \(R^{2}\) value greater than 0.8 is considered acceptable.

a20-index (a20)

A newly introduced metric employed to evaluate the deviations of the predictions.

\(\frac{n_{20}}{n}\)

Close to 1

96

It quantifies the proportion of predictions that deviate by no more than ±20% from the actual values.

Its value should be equal to 1 for an ideal model.

Performance index (PI)

It simultaneously evaluates both the correlation coefficient R and the relative root mean squared error (RRMSE).

\(\frac{\text{RRMSE}}{1 + R}\)

\(\text{PI} < 0.2\)

97

A PI value below 0.2 is widely used as a threshold for overall model accuracy. However, PI should be as close to zero as possible.

Objective function (OF)

OF combines RRMSE, correlation, and data points in training and testing sets.

\(\left(\frac{n_{\text{Training}} - n_{\text{Testing}}}{n}\right)\text{PI}_{\text{Training}} + 2\left(\frac{n_{\text{Testing}}}{n}\right)\text{PI}_{\text{Testing}}\)

\(\text{OF} < 0.2\)

98

It is used to check the performance of ML models as a whole.

It should also be less than 0.2 for a good model.
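
The formulas in the table can be sketched in plain Python as follows. This is a minimal illustration, not the code used in the study, and it rests on several assumptions the table does not state: x is taken as the predicted value and y as the actual value; R is taken as the Pearson correlation coefficient; RRMSE is taken as RMSE divided by the absolute mean of the actual values; a prediction counts toward n20 when it lies within ±20% of the actual value; and n in the OF formula is the total number of training and testing points. All function names are hypothetical.

```python
import math


def _mean(values):
    return sum(values) / len(values)


def mae(pred, actual):
    """Mean absolute error: sum(|x - y|) / n."""
    return sum(abs(x - y) for x, y in zip(pred, actual)) / len(actual)


def rmse(pred, actual):
    """Root mean square error: sqrt(sum((x - y)^2) / n)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(pred, actual)) / len(actual))


def r2(pred, actual):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_mean = _mean(actual)
    ss_res = sum((x - y) ** 2 for x, y in zip(pred, actual))
    ss_tot = sum((y - y_mean) ** 2 for y in actual)
    return 1 - ss_res / ss_tot


def a20(pred, actual):
    """Share of predictions within +/-20% of the actual value (assumed definition)."""
    n20 = sum(1 for x, y in zip(pred, actual) if abs(x - y) <= 0.2 * abs(y))
    return n20 / len(actual)


def pearson_r(pred, actual):
    """Pearson correlation coefficient, assumed to be the R used in PI."""
    mx, my = _mean(pred), _mean(actual)
    cov = sum((x - mx) * (y - my) for x, y in zip(pred, actual))
    var_x = sum((x - mx) ** 2 for x in pred)
    var_y = sum((y - my) ** 2 for y in actual)
    return cov / math.sqrt(var_x * var_y)


def performance_index(pred, actual):
    """PI = RRMSE / (1 + R), with RRMSE assumed to be RMSE / |mean(actual)|."""
    rrmse = rmse(pred, actual) / abs(_mean(actual))
    return rrmse / (1 + pearson_r(pred, actual))


def objective_function(pred_train, actual_train, pred_test, actual_test):
    """OF = ((n_train - n_test) / n) * PI_train + 2 * (n_test / n) * PI_test."""
    n_train, n_test = len(actual_train), len(actual_test)
    n = n_train + n_test  # assumed: n is the total number of data points
    pi_train = performance_index(pred_train, actual_train)
    pi_test = performance_index(pred_test, actual_test)
    return ((n_train - n_test) / n) * pi_train + 2 * (n_test / n) * pi_test


if __name__ == "__main__":
    # Made-up numbers, for illustration only.
    predicted = [10, 20, 30, 40]
    measured = [12, 18, 33, 39]
    print("MAE :", mae(predicted, measured))
    print("RMSE:", rmse(predicted, measured))
    print("R2  :", r2(predicted, measured))
    print("a20 :", a20(predicted, measured))
    print("PI  :", performance_index(predicted, measured))
```

Because OF weights the testing PI by a factor of 2, a model that fits the training set well but generalises poorly is penalised more heavily than PI alone would suggest.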