Fig. 2: Prediction performance in HCPD for the non-leaky pipelines: the gold standard, omitting covariate regression, omitting site correction, and omitting both covariate regression and site correction.
From: Data leakage inflates prediction performance in connectome-based machine learning models

Rows represent different non-leaky analysis choices, and columns show different phenotypes. The black bar represents the median performance of the gold standard models across random iterations, and the exact value of the bar is shown as the median r, r_med. The histograms show prediction performance across 100 iterations of 5-fold cross-validation. See also Supplementary Fig. 1. HCPD: Human Connectome Project Development.
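
As a rough illustration of how a histogram of r values and its median r_med could be produced from 100 iterations of 5-fold cross-validation, the sketch below uses only the Python standard library. Everything in it is an assumption made for illustration: the data are synthetic, the one-feature least-squares model is a stand-in for the connectome-based models, the helper names (`pearson_r`, `fit_line`, `repeated_kfold_r`) are hypothetical, and r is computed once per iteration on predictions pooled across folds, which may differ from the paper's exact procedure.

```python
import random
import statistics


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def fit_line(x, y):
    """One-feature least squares (illustrative stand-in model)."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx
    return slope, my - slope * mx


def repeated_kfold_r(x, y, n_iter=100, k=5, seed=0):
    """Return one r per iteration of k-fold cross-validation."""
    rng = random.Random(seed)
    n = len(x)
    scores = []
    for _ in range(n_iter):
        idx = list(range(n))
        rng.shuffle(idx)  # a new random split on every iteration
        folds = [idx[i::k] for i in range(k)]
        preds = [0.0] * n
        for test in folds:
            held_out = set(test)
            train = [i for i in idx if i not in held_out]
            # The model is fit on training participants only.
            slope, intercept = fit_line([x[i] for i in train],
                                        [y[i] for i in train])
            for i in test:
                preds[i] = slope * x[i] + intercept
        # Assumption: r is computed on predictions pooled over folds.
        scores.append(pearson_r(preds, y))
    return scores


# Synthetic data: 200 "participants", one feature, a noisy phenotype.
data_rng = random.Random(42)
x = [data_rng.gauss(0, 1) for _ in range(200)]
y = [0.5 * a + data_rng.gauss(0, 1) for a in x]

scores = repeated_kfold_r(x, y)        # 100 values, one per iteration
r_med = statistics.median(scores)      # the value a median bar would mark
print(f"r_med = {r_med:.3f}")
```

The list `scores` corresponds to the values that would be binned into one histogram panel, and `r_med` to the position of the median bar for that panel.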