Fig. 3: As scan time increases, sample size becomes more important than scan time.
From: Longer scans boost prediction and cut costs in brain-wide association studies

a, The prediction accuracy of the HCP cognition factor score when total scan duration is fixed at 6,000 min, while varying the scan time per participant. N refers to the sample size and T refers to the scan time per participant. We repeated a tenfold cross-validation 50 times. Each violin plot shows the distribution of prediction accuracies across 50 random repetitions (that is, there were 50 datapoints in each violin with each dot corresponding to the average accuracy for a particular cross-validation split). The boxes inside violins represent the interquartile range (IQR; from the 25th to 75th percentile) and whiskers extend to the most extreme datapoints not considered outliers (within 1.5× IQR). Two-tailed paired-sample corrected resampled t-tests (ref. 58) were performed between the largest sample size (N = 600, T = 10 min) and the other sample sizes. Each corrected resampled t-test was performed on 500 pairs of prediction accuracy values. P values were as follows: 7.9 × 10⁻³ (N = 600 versus N = 120) and 9.8 × 10⁻⁴ (N = 600 versus N = 100). The asterisks indicate statistical significance after false discovery rate (FDR) correction; q < 0.05. P values of all tests and details of the statistical tests are provided in Supplementary Table 1. b, Prediction accuracy against total scan duration for the cognition factor score in the HCP dataset. The curves were obtained by fitting a theoretical model to the prediction accuracies of the cognition factor score. There are 174 datapoints in the panel. The theoretical model explains why the sample size is more important than scan time (see the main text).
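The caption does not spell out the corrected resampled t-test. The sketch below assumes it is the Nadeau-Bengio variance correction that usually goes by that name, which inflates the variance of the paired differences by the test-to-train size ratio because cross-validation folds are not independent. The function name `corrected_resampled_t` and the toy accuracy values are hypothetical and are not taken from the paper. With tenfold cross-validation repeated 50 times there would be 500 paired values and a test-to-train ratio of 1/9.

```python
import math
import statistics


def corrected_resampled_t(acc_a, acc_b, test_train_ratio):
    """Return (t statistic, degrees of freedom) for paired accuracies.

    Assumed form (Nadeau-Bengio correction, not confirmed by the caption):
        t = mean(d) / sqrt((1/J + n_test/n_train) * var(d))
    where d are the J paired differences and var is the sample variance.
    """
    if len(acc_a) != len(acc_b):
        raise ValueError("accuracy lists must be paired (same length)")
    diffs = [a - b for a, b in zip(acc_a, acc_b)]
    n_pairs = len(diffs)
    if n_pairs < 2:
        raise ValueError("need at least two paired values")
    mean_diff = statistics.fmean(diffs)
    var_diff = statistics.variance(diffs)  # sample variance (divides by J - 1)
    # The extra test_train_ratio term widens the variance estimate to
    # account for overlapping training sets across folds.
    corrected_var = (1.0 / n_pairs + test_train_ratio) * var_diff
    t_stat = mean_diff / math.sqrt(corrected_var)
    return t_stat, n_pairs - 1


# Toy paired accuracies (hypothetical values, four folds only).
toy_a = [0.5, 0.6, 0.7, 0.8]
toy_b = [0.4, 0.4, 0.6, 0.6]
t_stat, dof = corrected_resampled_t(toy_a, toy_b, test_train_ratio=1 / 9)
print(f"t = {t_stat:.3f}, degrees of freedom = {dof}")
```

A two-tailed P value would then come from a Student t distribution with the returned degrees of freedom. The corrected statistic is always smaller in magnitude than the uncorrected paired t statistic, so the test is more conservative.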