Abstract
The internal combustion engine is a widely used power equipment in various fields, and its energy utilization is measured using brake specific fuel consumption (BSFC). BSFC map plays a crucial role in the analysis, optimization, and assessment of internal combustion engines. However, due to cost constraints, some values on the BSFC map are estimated using techniques like K-nearest neighbor, inverse distance weighted interpolation, and multi-layer perceptron, which are recognized for their limited accuracy, particularly when dealing with distributed sampled data. To address this, an improved random forest method is proposed for the estimation of BSFC. Polynomial features are employed to increase higher dimensions of features for random forest by nonlinear transformation, and critical parameters are optimized by particle swarm optimization algorithms. The performance of different methods was compared on two datasets to estimate 20%, 30%, and 40% of BSFC data, and the results reveal that the method proposed in this paper outperforms other common methods and is suitable for estimating the BSFC map.
Similar content being viewed by others
Introduction
The internal combustion engine finds extensive application in automobiles, ships, agriculture, modern industry, and construction machinery. It operates by converting gas expansion into mechanical energy and is considered the most promising product for energy conservation and emission reduction. To measure its energy efficiency, the brake specific fuel consumption (BSFC) is used. This refers to the fuel consumption of the engine per kilowatt-hour of work and is crucial for improving the engine's economy and thermal efficiency as a heat engine1,2. The BSFC map is generated by plotting the fuel consumption against the engine speed and load on the X and Y axes, respectively, over the engine’s operating range. The map serves as an important tool for evaluating engine performance and enhancing its design and efficiency3,4.
The BSFC map is a widely used tool in the analysis, optimization, and control of internal combustion engines. It serves multiple purposes, the first of which is to analyze engine performance and predict fuel consumption. This is exemplified by the analysis of a two-circuit bottom cycle system for a diesel engine5, where fine-grain fuel consumption is predicted using the BSFC map6. The BSFC map can also be used to optimize fuel consumption and reduce engine emissions. For instance, in the automotive industry, the BSFC map is utilized to control diesel engines for minimum fuel consumption7 and to obtain the optimal operating mode for the highest economic standard8. In addition, the BSFC map is helpful in studying the overall arrangement and system design of internal combustion engines, such as in the modeling and scheduling of fuel-efficient ships9.
The BSFC map is an essential tool for internal combustion engine research, and its accurate representation is crucial for further development in this field. Accurate mapping of the BSFC map requires precise measurement of the BSFC, but in practice, some values can only be estimated due to cost and other constraints. Common calculation methods include the K-nearest neighbor (KNN) method, polynomial regression, inverse distance weighted (IDW) method, ordinary kriging (OK) method, and multi-layer perceptron (MLP) method. However, these methods are known to have large errors in estimating uniformly distributed data10. This is particularly problematic when drawing high-resolution prediction maps for weather data11, where accuracy is essential. Research has shown that these common methods are insufficient for data estimation, especially when the sampled data is intimidatingly distributed, as reported in many fields such as agriculture and mining12,13,14. Recently, machine learning-based methods have become increasingly popular in various fields, including medical imaging and energy15. The random forest (RF) method, as an ensemble learning method, uses a decision tree classifier to achieve integrated decision-making16. Compared to other machine learning methods, this method has low computation requirements and high precision and is not sensitive to multicollinearity. It also demonstrates good robustness to missing and unbalanced data10,11,12,13. In this regard, an improved RF method has been introduced in this study to enhance the accuracy of BSFC estimation, marking the first application of this method in estimating the BSFC map. The results show that it outperforms other common methods on two different datasets. Therefore, it is a suitable method for estimating the BSFC map.
Methods for the estimation of BSFC
The aim of calculating BSFC is to predict the fuel consumption rate of an internal combustion engine under unknown operating conditions by learning the relationship between the fuel consumption rate and the known operating conditions. The relationship between the operating conditions and fuel consumption rate is usually determined through experiments. However, experimental limitations such as cost and conditions result in a limited amount of data. Accurately estimating fuel consumption under different operating conditions is crucial for fuel consumption control, which is a critical task in practical internal combustion engine work. Therefore, predicting fuel consumption under varying operating conditions is a fundamental task for optimizing internal combustion engines.
Consider the operating state of an internal combustion engine is represented by the combined measured state variables such as speed and power, denoted as x. The corresponding fuel consumption rate of the engine, denoted as y, and \(D = \{ ({\varvec{x}}_{1} ,y_{1} ),({\varvec{x}}_{2} ,y_{2} ),...,({\varvec{x}}_{N} ,y_{N} )\}\). The task of estimating BSFC involves determining the fuel consumption rate, denoted as y, that corresponds to the unmeasured state variables, denoted as x, based on the dataset D. This problem involves establishing a mapping between the input variable and the output variable, and then using this mapping to predict the output value y for a given input value x. This problem can be classified as either a regression problem or an interpolation problem. There are several methods commonly used to solve this problem, including the KNN method, IDW method, OK method, and MLP method. The introductions are provided for each of these methods below.
KNN method
The KNN method is a conventional and efficient machine learning technique that operates on a simple concept. It calculates the average values of the points located in close proximity to the estimated points in the known dataset. Due to its speed and simplicity, the KNN method has found its application in various interpolation scenarios, such as cloud edge computing17.
IDW method
The IDW method is a conventional and efficient technique for interpolation. Its fundamental concept involves assigning higher weights to the points in the training set closer to the interpolation points. Let the coordinates of n known points be (\(X_{i} ,Y_{i} ,Z_{i}\)), and i = 1, 2, 3, …, n, then the z value at the point (x, y, z) is given as
where \({d}_{i}^{-2}\) is the inverse of the Euclidean distance from (x, y) to (\(X_{i} ,Y_{i}\)) squared. The weight in this method follows a normalization condition, and it is evident that the closer a point is to the interpolation point, the higher the weight assigned to it.
OK method
The OK method is based on the assumption that the data space has uniform expectations and variance. It uses optimal estimation to obtain the data for unknown points. This geostatistical technique is widely applied in fields such as geographical sciences, environmental sciences, and atmospheric sciences. The OK method has been utilized for deposit Cu concentration14 and has been reported to provide high-fidelity uncertainty quantification in composite shell dynamics18.
MLP method
The MLP method employs cascaded neurons that use a sigmoid nonlinear function to map the input to output, enabling the approximation of any nonlinear function. Thus, the neural network can approximate any given multivariable continuous function, including drawing characteristic curves for power machines. This method is highly flexible and possesses a strong nonlinear mapping ability, making it a broadly applicable computational technique. It has found use in numerous applications, such as predicting macroclimate index runoff in atmospheric science19 and assessing the sensitivity to flood temperature in geographical research20.
Improved RF method for the estimation of BSFC
KNN method
RF is a regression method based on trees and has the benefits of strong prediction ability, low overfitting risk, and high interpretability8,9. This method is computationally efficient and exhibits superior speed and accuracy14,15. It has been widely applied in various fields, including environmental science, agriculture, and engineering. For instance, it has been utilized to classify medical images21 and predict indoor radon concentration22.
RF is one of the widely used ensemble learning methods. It employs a large number of regression trees for ensemble learning, with random attribute selection during the training process. The regression tree serves as the fundamental learner for RF regression. As with other machine learning techniques, in RF, features and labels, are referred to as X and Y, respectively, while N represents the sample number and D represents the training data set. The representation is as follows: \(X\, = \,\{ x_{1} ,x_{2} , \ldots ,x_{N} \} ,Y\, = \,\{ y_{1} ,y_{2} , \ldots ,y_{N} \} ,D\, = \,\{ \left( {x_{1} ,y_{1} } \right), \, \left( {x_{2} ,y_{2} } \right), \ldots ,(x_{N} ,y_{N} )\}\). A regression tree corresponds to a partition of the feature space and labels on the partitioned units. Dividing the feature space into M units \(R_{1} , \, R_{2} , \ldots ,\;R_{M}\), each unit RM with a fixed label Cm, the regression tree model can be represented as
The square error (E) is used to express the prediction error of the regression tree for the training data in the feature space whose partitioning method has been determined.
This error is used to determine the optimal output value on each unit. In the RF method, the following algorithm is used to generate a regression tree.
-
Step 1: Select the j-th variable and its value s as the segmentation variable and segmentation point, respectively. The two regions are defined as follows:
$$\begin{gathered} R_{1} (j,s) = \left\{ {\left. {\left. {\varvec{x}} \right|{\varvec{x}}^{(j)} \le s} \right\}} \right. \hfill \\ R_{2} (j,s) = \left\{ {\left. {\left. {\varvec{x}} \right|{\varvec{x}}^{(j)} > s} \right\}} \right. \hfill \\ \end{gathered}$$(5) -
Step 2: Solve the following problem to obtain the optimal j and s values. These values divide the input space into two regions, R1 and R2.
$$\mathop {min}\limits_{j,s} \left[ {\mathop {min}\limits_{{c_{1} }} \sum\limits_{{{\varvec{x}}_{i} \in R_{1} (j,s)}} {(y_{i} - c_{1} )^{2} + \mathop {min}\limits_{{c_{2} }} \sum\limits_{{{\varvec{x}}_{i} \in R_{2} (j,s)}} {(y_{i} - c_{2} )^{2} } } } \right]$$(6)
It is easy to understand that the optimal value \(\hat{c}_{m}\) of \(c_{m}\) on a unit \(R_{m}\) is the mean of the outputs \(y_{i}\) corresponding to all input instances \(x_{i}\) in the unit, which can be expressed as
-
Step 3: Repeat steps 1 and 2 for R1 and R2, respectively, until the termination condition is reached. The termination condition can be that each interval contains one sample, all samples have been used, or the number of units has reached a specified number.
The RF method involves creating a training subset by randomly sampling D and evaluating the error of the remaining samples. Multiple random trees are then generated using the same method for generating random trees, except that instead of using all features, a specified number of features are randomly selected. A total of NT regression trees were generated, denoted as \(\{ f_{1} ({\varvec{x}}),f_{2} ({\varvec{x}}), \ldots ,f_{{N_{T} }} ({\varvec{x}})\}\). If the weight is set to \(W = \{ w_{1} ,w_{2} , \ldots ,w_{{N_{T} }} \}\) and \(w_{1} = w_{2} = \cdots = w_{{N_{T} }} = 1/N{\text{T}}\), the regression prediction result for feature x is
It is evident that the diversity in RF integration arises not only from sample disturbances but also from attribute disturbances. This results in a greater variation between individuals, leading to strong adaptability and anti-interference ability toward the data.
Improved RF method
In the RF algorithm, decision trees are generated directly from the features. In machine learning, adding some nonlinear features of input data can be an effective way to increase the complexity of the model. Therefore, this study introduces polynomial features that can generate higher dimensions of features and terms related to each other. Polynomial features are a method of increasing dimensionality and performing nonlinear transformations in machine learning. It combines and expands the original features, which improves the model's expression ability and fitting effect.
Let the feature vector be \({\varvec{x}} = [x_{1} ,x_{2} , \ldots ,x_{m} ]\), and define the feature of a polynomial of degree 0 as \(\phi_{0} ({\varvec{x}}) = 1\). The d-th polynomial feature can be represented by the following iterative formula.
\(\phi_{d} ^{\prime}({\varvec{x}})\) is the row vector, that contains one or more variables from all possible \(x_{1} ,x_{2} , \ldots ,x_{m}\) variables, with a degree of d as a monomial expression.
When the RF method is used to estimate BSFC, the d-degree polynomial feature \(\phi_{d} ({\varvec{x}})\) of feature x serves as the input feature of the RF regression model. This can incorporate more combinations of original features into the consideration of generating decision trees, enhancing their fitting and expression abilities.
The polynomial feature \(\phi_{d} ({\varvec{x}}_{i} )\) of each feature \({\varvec{x}}_{i}\) is used to form a new training set \(\Phi_{d} (D) = \{ (\phi_{d} ({\varvec{x}}_{1} ),y_{1} ),(\phi_{d} ({\varvec{x}}_{2} ),y_{2} ), \ldots ,(\phi_{d} ({\varvec{x}}_{N} ),y_{N} )\}\). The RF model F is trained with \(\Phi_{d} (D)\), and the features with a proportion of p in all features are employed when the nodes split.
For a given feature vector \({\varvec{x}}\), and its polynomial feature, denoted by \(\phi_{d} ({\varvec{x}})\), the predicted result value \(\hat{y} = F(\phi_{d} ({\varvec{x}}))\) is obtained using model F. The map from feature vector \({\varvec{x}}\) to \(\hat{y}\) is called a polynomial feature RF model \(f_{(d,p)} ({\varvec{x}})\) with hyperparameters \((d,p)\).
Parameter optimization based on particle swarm algorithm
When polynomial features are introduced, the feature dimension for the feature vector \({\varvec{x}} = [x_{1} ,x_{2} , \ldots ,x_{m} ]\) and polynomial feature \(\phi_{d} ({\varvec{x}})\) increases from m to \(C_{m + d}^{d} = {{(m + d)!} \mathord{\left/ {\vphantom {{(m + d)!} {(d!m!)}}} \right. \kern-0pt} {(d!m!)}}\). However, too many polynomial features can cause slow training due to a large number of feature dimensions and may lead to overfitting, while too few features can result in underfitting. Thus, the degree d of the polynomial feature needs to be selected carefully.
Similarly, in decision tree generation, the parameter p represents the proportion of features considered to the total number of features. Too many features can lead to model complexity, which can be affected by noise and randomness, while too few features may cause under-fitting, making it difficult to capture complex relationships in the data. Therefore, when polynomial features are introduced, p needs to be selected more carefully.
Since both p and d are critical parameters, particle swarm optimization algorithms can be considered to optimize their combination. The object function is as follows.
The optimization process begins with initialization, where the total number of particles and the number of iterations are specified. Each particle is randomly assigned a position pi = {pi, di} and a velocity vi = {vpi, vdi}. The objective function of each particle is then calculated to obtain the individual optimal solution of that particle, and the position of the particle with the smallest objective function is considered the global optimal solution.
In each iteration, the following calculations are performed.
For the i-th particle, the objective function of its particle is calculated. If the objective function result is less than the objective function at the position \({\varvec{g}}_{i}^{best} = \{ g_{pi}^{best} ,g_{di}^{best} \}\) of the individual optimal solution, update the individual optimal solution to the current position. If the objective function result is less than the objective function at the global optimal solution position \({\varvec{g}}^{best} = \{ g_{p}^{best} ,g_{d}^{best} \}\), update the global optimal solution to the current position. The velocity and position of the particles are updated as
In the above equation, \(\omega\) is the inertia weight, generally set to 0.9. c1 and c2 are the acceleration coefficients, generally set to 2.0. r1 and r2 are randomly selected from [0, 1] at each update.
When the maximum number of iterations is reached, \({\varvec{g}}^{best}\) is the optimal parameter of p and d.
Experiments and results
Experimental data
The data sets used in this paper were obtained from references23,24, The data sets actual measurements of two gasoline internal combustion engines, including speed, power and fuel consumption rate. The two engines produced a total of 52 and 80 measured data points, respectively. Tables 1 and 2 show the Speed, power and fuel consumption rate of the engines.
Figures 1 and 2 show the distribution of the first fuel engine in the speed-power plane and the distribution of the speed-power-fuel consumption in the three-dimensional space.
Figures 3 and 4 show the distribution of the second fuel engine in the speed-power plane and the distribution of the speed-power-fuel consumption in the three-dimensional space.
Evaluation index
In this paper, the following indicators are used for evaluation: root mean square error (RMSE), normalized mean square error (NMSE), mean absolute error (MAE), mean absolute percentage error (MAPE), and R-squared (R2). Each indicator is calculated as follows.
The RMSE is defined as:
where n is the total number of the data to be estimated, \(yi\) is the real value to be estimated, and \(yl\) is the estimated value.
To compare the accuracy and degree of variation of different datasets, the NMSE) is proposed to compare the methods on different datasets. The calculation of NMSE is as follows.
The MAE is also used to compare estimation errors. The calculation of MAE is
To evaluate and compare the accuracy of different algorithms and data sets, the MAPE is utilized in this study. The MAPE is considered more robust than the MAE, as it normalizes the error of each data point and can be used as an evaluation indicator. It is defined as
R2 is also used to evaluate different estimation methods, representing the proportion of estimated data information to original data information. The calculation of R2 is as follows.
where \(\overline{y }\) represents the average value of all the data to be estimated. The value range of R2 is \(( - \;\infty ,1]\). The closer R2 is to 1, the more accurate the estimation method's results are. On the contrary, the farther R2 is from 1, the greater the result error of the estimation method. When R2 is less than 1, it indicates that the estimation error of the method is significant, even greater than using the mean as the estimation value.
In this paper, five indicators are used for evaluation, they are RMSE, NMSE, MAE, MAPE, and R2. RMSE represents the standard deviation between the estimated value and the true value error, while NMSE represents the percentage of error. MAE represents the average error between the estimated value and the true value, while MAPE represents the percentage of this error. R2 expresses the degree of fit between the data and the regression model. NMSE and MAPE can serve as the primary performance indicators, while other indicators can serve as secondary indicators.
Experimental results
To compare different estimation methods, the known data in this study were randomly divided into two groups at a ratio of 4:1, with 80% of the data being known and the remaining 20% being used for estimation. The data estimation methods compared in this study include KNN, IDW, OK, MLP, RF, and the proposed RF. The performance indicators compared in this study include RMSE, NMSE, MAE, MAPE, and R2. To reduce the impact of grouping randomness on statistical results, the experiment was repeated 10 times, using the same ratio for random grouping each time. After each grouping, the known sample dataset and the estimated dataset used for testing have different data. The average of the performance metrics of the 10 experiments is used as the final indicator for performance comparison.
Estimating 20% of BSFC data
Tables 3 and 4 present the performance metrics of various estimation methods on Dataset 1 and Dataset 2 for estimating 20% of BSFC data, respectively. The reported values in these tables are the average results from ten experiments. Figures 5 and 6 display the estimated values of different methods for the BSFC of Datasets 1 and 2, respectively. These figures show the actual estimated result data and real data of a single experiment in the ten experiments.
The results of the experiment conducted on Dataset 1 indicate that the proposed RF method described in this paper outperforms RF method with an RMSE of 0.46 lower, and it outperforms other methods with an RMSE of 5.05 lower. And additionally, the other errors are similar, and the R2 value of RF is closest to 1. These indexes show that the proposed RF has a minimal error and the highest accuracy. Similar results were observed on Dataset 2, the proposed method outperforms other methods with an RMSE of 9.71 lower.
Estimating 20% of BSFC data
Tables 5 and 6 present the average performance metrics of various estimation methods on Dataset 1 and Dataset 2 for estimating 30% of BSFC data after ten experiments, respectively. Figures 7 and 8 display the estimated values of different methods for the BSFC of Datasets 1 and 2 in a single experiment, respectively. The results of the experiment conducted on Dataset 1 indicate that the proposed RF method described in this paper outperforms other methods with an RMSE of 0.66 lower. The proposed method outperforms other methods with an RMSE of 23.84 lower on Dataset 2. All the indexes show that the proposed RF method has a minimal error and the highest accuracy.
Estimating 40% of BSFC data
Tables 7 and 8 present the average performance metrics of various estimation methods on Dataset 1 and Dataset 2 for estimating 40% of BSFC data after ten experiments, respectively. Figures 9 and 10 display the estimated values of different methods for the BSFC of Datasets 1 and 2 in a single experimen, respectively. The proposed RF method described in this paper outperforms other methods with an RMSE of 1.39 lower on Dataset 1. The proposed method outperforms other methods with an RMSE of 18.25 lower on Dataset 2. All the indexes show that the proposed RF has a minimal error and the highest accuracy.
The performance of different methods was compared on two datasets to estimate 20%, 30%, and 40% of BSFC data. All performance indicators indicate that the improved method proposed in this paper is the most accurate.
Comparison of NMSE and MAPE
In order to analyze the distribution of performance indicators, the standard deviation of evaluation indicators is calculated. The two most important indicators, NMSE and MAPE, were selected to draw Fig. 11 and added to the paper. In this graph, bar charts with different methods, datasets, and estimated proportions of data were drawn, especially with standard deviations marked in the graph.
From Fig. 11, it can be seen that the improved random forest method proposed in this paper has the minimum average NMSE and MAPE on both dataset 1 and dataset 2. The sample standard deviations of NMSE and MAPE are also shown in the figure, indicating that the proposed method also has the smallest sample standard deviation. From Fig. 11, it can be seen that the improved random forest method proposed in this paper has the minimum average NMSE and MAPE on both dataset 1 and dataset 2. The sample standard deviations of NMSE and MAPE are also shown in the figure, indicating that the proposed method also has the smallest sample standard deviation. This analysis result is consistent with the previous analysis results.
Conclusions
The random forest method was introduced as an alternative approach for estimating brake-specific fuel consumption and was compared to commonly used calculation methods, such as the K-nearest neighbor method, inverse distance weighted method, ordinary kriging method, and multi-layer perceptron. The experimental results indicated that the proposed RF method outperformed the other methods in accuracy and precision. Therefore, it was concluded that the proposed RF method is more suitable for estimating the BSFC map compared to the other methods.
Data availability
All data generated or analyzed during this study are included in this published article.
Abbreviations
- E :
-
(Effective) work potential
- E 0 :
-
Exergy
- E 00 :
-
Energy of a system
- K:
-
Kelvin temperature scale
- S:
-
Entropy
- T:
-
Temperature or Celsius temperature scale
- W:
-
Effective work
References
Quan, R., Yue, Y. S., Huang, Z. K., Chang, Y. F. & Deng, Y. D. Effects of backpressure on the performance of internal combustion engine and automobile exhaust thermoelectric generator. J. Energy Resour. Technol. Trans. ASME 144, 092301 (2022).
Bayat, Y. & Ghazikhani, M. Experimental investigation of compressed natural gas using in an indirect injection diesel engine at different conditions. J. Clean. Prod. 271, 122450 (2020).
Mizythras, P., Boulougouris, E. & Theotokatos, G. A novel objective oriented methodology for marine engine-turbocharger matching. Int. J. Engine Res. 23, 2105–2127 (2022).
Li, Y. Y. et al. Multi-objective energy management for Atkinson cycle engine and series hybrid electric vehicle based on evolutionary NSGA-II algorithm using digital twins. Energy Convers. Manag. 230, 113788 (2021).
Zhang, H. G., Wang, E. H. & Fan, B. Y. A performance analysis of a novel system of a dual loop bottoming organic Rankine cycle (ORC) with a light-duty diesel engine. Appl. Energy 102, 1504–1513 (2013).
Fang, C., Song, S., Chen, Z. et al. Fine-grained fuel consumption prediction. In Proceedings of the 28th ACM International Conference on Information and Knowledge Management 2783–2791 (2019).
Iwanski, G., Bigorajski, Ł & Koczara, W. Speed control with incremental algorithm of minimum fuel consumption tracking for variable speed diesel generator. Energy Convers. Manag. 161, 182–192 (2018).
Durković, R. & Grujičić, R. An approach to determine the minimum specific fuel consumption and engine economical operation curve model. Measurement 132, 303–308 (2019).
Satpathi, K., Balijepalli, V. & Ukil, A. Modeling and real-time scheduling of DC platform supply vessel for fuel efficient operation. IEEE Trans. Transp. Electrification 3(3), 762–778 (2017).
Akar, A., Konakoğlu, B. & Akar, Ö. Prediction of geoid undulations: Random forest versus classic interpolation techniques. Concurr. Comput. Pract. Exp. 2022, e7004 (2022).
Sekulić, A. et al. Random forest spatial interpolation. Remote Sens. 12(10), 1687 (2020).
Silva, C. et al. Random forest techniques for spatial interpolation of evapotranspiration data from Brazilian’s northeast. Comput. Electron. Agric. 166, 105017 (2019).
Mariano, C. & Monica, B. A random forest-based algorithm for data-intensive spatial interpolation in crop yield mapping. Comput. Electron. Agric. 184, 106094 (2021).
Daya, A. A. & Bejari, H. A comparative study between simple kriging and ordinary kriging for estimating and modeling the Cu concentration in Chehlkureh deposit, SE Iran. Arab. J. Geosci. 8(8), 6003–6020 (2015).
Ma, L. Y., Liu, X. W., Zhang, Y. & Jia, S. L. Visual target detection for energy consumption optimization of unmanned surface vehicle. Energy Rep. 8(4), 363–369 (2022).
Breiman, L. Random forests. Mach. Learn. 45, 5–32 (2001).
Zhang, W. et al. A distributed storage and computation k-nearest neighbor algorithm based cloud-edge computing for cyber-physical-social systems. IEEE Access 8, 50118–50130 (2020).
Mukhopadhyay, T. et al. A critical assessment of Kriging model variants for high-fidelity uncertainty quantification in dynamics of composite shells. Arch. Comput. Methods Eng. 24(3), 495–518 (2017).
Panahi, F. et al. Streamflow prediction with large climate indices using several hybrid multilayer perceptions and copula Bayesian model averaging. Ecol. Indic. 133, 108285 (2021).
Hadian, S., Shahiri, T. E. & Pham, Q. B. Multi attributive ideal-real comparative analysis (MAIRCA) method for evaluating flood susceptibility in a temperate Mediterranean climate. Hydrol. Sci. J. 67(3), 401–418 (2022).
Zhu, M. Y. et al. Elastography ultrasound with machine learning improves the diagnostic performance of traditional ultrasound in predicting kidney fibrosis. J. Formos. Med. Assoc. 121(6), 1062–1072 (2022).
Gummadi, J. et al. Interpolation techniques for modeling and estimating indoor radon concentrations in Ohio: Comparative study. Environ. Prog. Sustain. Energy 34(1), 169–177 (2015).
Wan, D. Y., Liu, J. Q. & Ren, C. Y. Research of universal characteristics curve of internal combustion engines. Middle South Autom. Transp. (3), 5–8 (1998).
Zhou, G. M. et al. Universal characteristics curve plotting method based on MATLAB. I.C.E. Powerpl. 110(2), 34–36 (2009).
Acknowledgements
This work supported by Research Program supported by the Department of Education and Technology (program name), Country Name.
Author information
Authors and Affiliations
Contributions
QY wrote the main manuscript text, XW responsible for architecture design and overall review, CY responsible for diagram and drawing, algorithm verification, HW responsible for the design and implementation of algorithms. All authors reviewed the manuscript.
Corresponding authors
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher's note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Yun, Q., Wang, X., Yao, C. et al. Random forest method for estimation of brake specific fuel consumption. Sci Rep 13, 17741 (2023). https://doi.org/10.1038/s41598-023-45026-1
Received:
Accepted:
Published:
Version of record:
DOI: https://doi.org/10.1038/s41598-023-45026-1













