The effects of data leakage on predictive models in neuroimaging studies are not well understood. Here, the authors show that data leakage via feature selection and repeated subjects drastically inflates prediction performance, whereas other forms of leakage have only minor effects.
- Matthew Rosenblatt
- Link Tejavibulya
- Dustin Scheinost
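
To illustrate how feature-selection leakage can inflate performance, here is a minimal sketch on synthetic data. It is not the authors' pipeline: the pure-noise features, the nearest-centroid classifier, the fold assignment, and all function names (`class_means`, `select_top_k`, `cv_accuracy`) are assumptions made for this example. The only difference between the two runs is whether features are selected on all subjects, test subjects included, or inside each training fold.

```python
import random


def class_means(X, y, rows):
    """Per-feature mean of each class, computed only over the given rows."""
    n_feat = len(X[0])
    means = {}
    for label in (0, 1):
        members = [i for i in rows if y[i] == label]
        means[label] = [sum(X[i][j] for i in members) / len(members)
                        for j in range(n_feat)]
    return means


def select_top_k(X, y, rows, k):
    """Indices of the k features with the largest class-mean gap on `rows`."""
    m = class_means(X, y, rows)
    gap = [abs(a - b) for a, b in zip(m[1], m[0])]
    return sorted(range(len(gap)), key=gap.__getitem__, reverse=True)[:k]


def cv_accuracy(X, y, k, n_folds, leaky):
    """Cross-validated accuracy of a nearest-centroid classifier."""
    rows = list(range(len(X)))
    # Leaky variant: features are chosen once using every row, test rows too.
    leaked = select_top_k(X, y, rows, k) if leaky else None
    correct = 0
    for fold in range(n_folds):
        train = [i for i in rows if i % n_folds != fold]
        test = [i for i in rows if i % n_folds == fold]
        # Honest variant: features are re-selected from the training fold only.
        feats = leaked if leaky else select_top_k(X, y, train, k)
        # The classifier itself is always fit on training rows only.
        centroids = class_means(X, y, train)
        for i in test:
            dist = {c: sum((X[i][j] - centroids[c][j]) ** 2 for j in feats)
                    for c in (0, 1)}
            correct += (min(dist, key=dist.get) == y[i])
    return correct / len(X)


random.seed(0)
n_subjects, n_features = 40, 2000

# Pure noise: no feature carries any real information about the label.
X = [[random.gauss(0, 1) for _ in range(n_features)] for _ in range(n_subjects)]
y = [0, 1] * (n_subjects // 2)
random.shuffle(y)

leaky_acc = cv_accuracy(X, y, k=20, n_folds=5, leaky=True)
honest_acc = cv_accuracy(X, y, k=20, n_folds=5, leaky=False)
print(f"feature selection on all subjects:   {leaky_acc:.2f}")
print(f"feature selection inside each fold:  {honest_acc:.2f}")
```

Because the features are noise, an unbiased estimate should sit near chance. The leaky run typically reports a much higher accuracy because the selected features were picked partly for how well they separate the held-out subjects. Leakage from repeated subjects is a separate mechanism and is not covered by this sketch.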