Fig. 1: Performance evaluation of the Stacking Technique for Ensemble Modeling of Particle Number Concentration (Stem-PNC).
From: Machine learning-enhanced high-resolution exposure assessment of ultrafine particles

a Comparison of measured and Stem-PNC-predicted weekly mean particle number concentrations in 2020 at the five stations: Basel-Binningen (BAS), Bern-Bollwerk (BER), Härkingen-A1 (HAE), Lugano-University (LUG), and Rigi-Seebodenalp (RIG). The shaded area represents the standard deviation. The daily and monthly results for 2020 can be found in Supplementary Note 3. Quantitative comparisons are presented for different averaging periods: b hourly results; c daily average results; d weekly average results; e monthly average results. Dashed lines represent the 1:2 and 2:1 ratio lines. f Comparison of different data-driven machine learning models for particle number concentration: Linear Models, Support Vector Machines, Neighbor-based Models, Tree-based Models, Deep Learning Models, and Stem-PNC. The error bars represent the standard deviation. Statistical metrics: Coefficient of Determination (R²), Mean Bias (M-Bias), Root Mean Square Error (RMSE), and Mean Absolute Error (MAE). Details of the metric formulas, stacking structure, model categories, and hyper-parameterization, together with further prediction results for each category, are given in Supplementary Note 2. Stem-PNC was trained on hourly PNC measurements from 2016–2019 (78% of the data, with 5% used for hyperparameter tuning) and tested on 2020 data (22%), using pollutant data from the National Air Pollution Monitoring Network (NABEL), meteorological data from MeteoSwiss, traffic data, and temporal features.
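
The four metrics named in the caption, and the share of points lying between the 1:2 and 2:1 ratio lines, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the exact formulas are defined in Supplementary Note 2, so the choices here (R² as 1 − SS_res/SS_tot, M-Bias as the mean of predicted minus measured) are conventional definitions assumed for illustration, and the function name `evaluate_pnc` and the `within_factor_2` key are hypothetical.

```python
import math


def evaluate_pnc(measured, predicted):
    """Compare measured and predicted concentrations of equal length.

    Assumed (conventional) definitions; the paper's own formulas are in
    its Supplementary Note 2 and may differ, e.g. in the sign of the bias.
    """
    if len(measured) != len(predicted) or len(measured) == 0:
        raise ValueError("inputs must be non-empty and of equal length")

    n = len(measured)
    errors = [p - m for m, p in zip(measured, predicted)]  # predicted minus measured
    mean_measured = sum(measured) / n

    ss_res = sum(e * e for e in errors)                        # residual sum of squares
    ss_tot = sum((m - mean_measured) ** 2 for m in measured)   # total sum of squares

    # Points between the 1:2 and 2:1 ratio lines (measured values must be positive).
    inside = sum(
        1 for m, p in zip(measured, predicted) if m > 0 and 0.5 <= p / m <= 2.0
    )

    return {
        "R2": 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan"),
        "M-Bias": sum(errors) / n,
        "RMSE": math.sqrt(ss_res / n),
        "MAE": sum(abs(e) for e in errors) / n,
        "within_factor_2": inside / n,
    }
```

The same function can be applied to hourly values or to daily, weekly, or monthly means, mirroring the averaging periods of panels b to e; the averaging itself is left to the caller.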