Fig. 1: Flowchart outlining the complete model development process. | Communications Medicine

From: Predicting arrhythmia recurrence post-ablation in atrial fibrillation using explainable machine learning

A To address multicollinearity among the 89 risk factors compiled for this study, LASSO regression removed the least important and collinear features. B The product of this first stage was the 27-element LASSO-optimized feature set (LOFS). C Subsequently, the LOFS was used to train and test either random forest machine learning (ML) or logistic regression models using five-fold cross-validation (80:20 split). The random forest and logistic regression models were then tested on data from a never-before-seen 15-patient holdout cohort. To assess model explainability, the marginal contributions of individual LOFS values to overall random forest model predictions in the original and holdout cohorts were evaluated by SHAP analysis. SHAP analysis was not needed for the logistic regression model, as each feature had a coefficient explicitly describing its impact on model predictions. Holdout and explainability tests were always performed on the single best logistic regression or random forest model, as assessed during the predictive efficacy stage by the area under the receiver operating characteristic curve (AUROC). The Output panels link to the later figures in which specific results are presented.
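
The cross-validation and model-selection step in panel C can be illustrated with a minimal, standard-library-only sketch. It is not the authors' code: the function names (`five_fold_splits`, `auroc`, `best_fold`), the random shuffling, the fixed seed, and the rule of picking the fold with the highest AUROC are all assumptions made for illustration. The caption states only that five-fold, 80:20 cross-validation was used and that the best model was chosen by AUROC.

```python
import random


def five_fold_splits(n_samples, n_folds=5, seed=0):
    """Yield (train_idx, test_idx) pairs for k-fold cross-validation.

    With five folds, each split holds out about 20% of the samples,
    which gives the 80:20 ratio. Shuffling and the seed are assumptions.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into n_folds nearly equal groups.
    folds = [indices[i::n_folds] for i in range(n_folds)]
    for k in range(n_folds):
        test_idx = sorted(folds[k])
        train_idx = sorted(
            i for j, fold in enumerate(folds) if j != k for i in fold
        )
        yield train_idx, test_idx


def auroc(labels, scores):
    """AUROC as the probability that a positive case outscores a negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg
    )
    return wins / (len(pos) * len(neg))


def best_fold(fold_aurocs):
    """Return the index of the fold whose model scored the highest AUROC."""
    return max(range(len(fold_aurocs)), key=lambda k: fold_aurocs[k])
```

In use, a model would be fitted on each `train_idx`, scored on the matching `test_idx` with `auroc`, and the fold returned by `best_fold` would identify the model carried forward to the holdout and explainability tests. A stratified split, which keeps the recurrence rate similar across folds, may be preferable for a small cohort; the caption does not say whether one was used.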
