Main

Advances and transparency in neuroimaging are propelled by the rapidly growing amounts of publicly available data from large-scale projects (for example, the UK Biobank (UKBB) with 50,000+ scans1), data-sharing initiatives (for example, OpenNeuro2) and international consortia (for example, ENIGMA3) (Extended Data Fig. 1). Neuroimaging data typically require complex, multistage preprocessing pipelines to enable accurate brain tissue segmentation, spatial normalization and other essential steps. While prevailing preprocessing pipelines such as FreeSurfer4, fMRIPrep5, QSIPrep6 and ASLPrep7 have led to numerous important findings, their original design for relatively small sample sizes makes it difficult for them to meet scalability demands in the era of big data. Concurrently, clinical applications of neuroimaging, such as imaging-guided neuromodulation8, require fast turnaround times and robustness at the individual level. This becomes particularly challenging when handling patients with brain distortion induced by trauma, gliomas or strokes9. To fulfill these emerging requirements, a computationally efficient, scalable and robust preprocessing pipeline is needed.

Hence, we developed DeepPrep, an efficient, scalable and robust preprocessing pipeline for neuroimaging powered by deep learning algorithms and a workflow manager. Deep learning-based algorithms for neuroimaging have shown potential in improving computational efficiency and robustness10,11,12,13, but an end-to-end pipeline empowered by these algorithms is still lacking. Workflow managers are widely used to manage complex workflows for big data processing in various fields, such as bioinformatics, owing to their scalability, portability and computational resource efficiency14. However, they have not been fully exploited in the field of neuroimaging. Here, we applied DeepPrep to structural and functional magnetic resonance imaging (MRI). We integrated multiple learning-based modules, including FastCSR11, the spherical ultrafast graph attention framework for cortical surface registration (SUGAR)12, SynthMorph13 and FastSurferCNN10 (see Fig. 1a and Supplementary Table 1 for all dependencies). These deep learning modules improve computational efficiency in typically time-consuming operations, such as cortical surface reconstruction (CSR), surface registration, anatomical segmentation and volumetric spatial normalization. All software modules are linked as 83 discrete yet interdependent task processes, which are packaged into a Docker or Singularity container along with all dependencies15. Our incorporation of Nextflow, a reproducible, scalable and portable workflow manager14, enables the pipeline to maximize computational resource utilization through dynamic scheduling of parallel task processes. The workflow manager also makes it convenient to deploy the pipeline in diverse computing environments, including local computers, high-performance computing (HPC) clusters and cloud computing (Fig. 1b). Importantly, DeepPrep is a brain imaging data structure (BIDS) app that automatically configures appropriate preprocessing workflows based on the metadata stored in the BIDS layout16.
In addition to providing preprocessed structural and functional data (see Supplementary Fig. 1 for an example), DeepPrep generates a visual report for each participant and a summary report for a group of participants, adapted from MRIQC17, to facilitate data quality assessment (Supplementary Fig. 2), as well as a detailed runtime report (Supplementary Fig. 3). The documentation (https://deepprep.readthedocs.io) and code base (https://github.com/pBFSLab/DeepPrep) of DeepPrep are version controlled and actively updated, enabling users and developers to participate in the further development of the pipeline.

Fig. 1: A computationally efficient and scalable neuroimaging pipeline is empowered by deep learning algorithms and the workflow manager.

a, DeepPrep workflow. DeepPrep accepts both anatomical and fMRI data for preprocessing. White boxes highlight deep learning algorithms for brain tissue segmentation, CSR, cortical surface registration and volumetric spatial normalization, which replace the most time-intensive modules in conventional pipelines. T1w, T1 weighted. BOLD, blood oxygenation level dependent. b, The preprocessing pipeline is organized into multiple relatively independent yet integrated processes. Imaging data adhering to the BIDS format are preprocessed through a structured workflow managed by Nextflow14. Nextflow schedules task processes and allocates computational resources across diverse infrastructures, including local computers, HPC clusters and cloud computing environments. The pipeline yields standardized preprocessed imaging and derivative files, quality control reports and runtime reports as its outputs.

To demonstrate the performance of DeepPrep, we applied it to over 55,000 scans from seven datasets with diverse populations, scanners and imaging parameters (Supplementary Note 1 and Supplementary Table 2). These datasets are distinct from those used for training the deep learning models (Supplementary Table 3). The evaluation datasets include structural MRI and functional MRI (fMRI) from the UKBB1 for evaluating computational efficiency and scalability, one manually labeled brain dataset (the Mindboggle-101 dataset18) for evaluating brain segmentation, two precision neuroimaging datasets (the Midnight Scan Club (MSC) dataset19 and the Consortium for Reliability and Reproducibility Hangzhou Normal University (CoRR-HNU) dataset20) for accuracy and reliability assessment, and three clinical datasets for robustness assessment. These data were also preprocessed using the state-of-the-art pipeline fMRIPrep5 for comparison.

On a local workstation equipped with central processing units (CPUs) and a graphics processing unit (GPU), DeepPrep preprocessed single participants from the UKBB dataset in 31.6 ± 2.4 min (mean ± s.d.), which was 10.1 times faster than fMRIPrep (Fig. 2a; 318.9 ± 43.2 min, two-tailed paired t-test, t(99) = 67.0, P = 2.6 × 10−84). DeepPrep demonstrated efficient batch-processing capability, preprocessing 1,146 participants per week (average processing time of 8.8 min per participant), which was 10.4 times more efficient than fMRIPrep, which processed 110 participants in the same time frame (Fig. 2b). Moreover, when we performed separate preprocessing for anatomical and functional scans, DeepPrep exhibited acceleration factors of 13.3 and 7.7 times relative to fMRIPrep, respectively (Extended Data Fig. 2). In the HPC environment, the trade-off between preprocessing time and computational resource utilization, measured in terms of CPU hours and associated expenses, becomes a critical consideration. Achieving shorter preprocessing times by recruiting more CPUs comes at a higher hardware expense (Fig. 2c), which can be a substantial concern for users. To explore this trade-off, we conducted an analysis recruiting one, two, four, eight and 16 CPUs to process a participant. fMRIPrep exhibited a characteristic curve illustrating the trade-off between processing time and expense (Fig. 2c). By contrast, DeepPrep demonstrated stability in both processing time and expense owing to its flexibility in dynamically allocating computational resources to match the specific requirements of individual task processes. Importantly, the computational expenses associated with DeepPrep were at least 5.8 times and up to 22.1 times lower than those of fMRIPrep (Fig. 2c).
Additionally, we demonstrated the scalability of DeepPrep by preprocessing the entire UKBB neuroimaging dataset, consisting of 54,515 scans, within 6.5 d on an HPC cluster.

Fig. 2: Performance of DeepPrep in speed and robustness.

a, Runtimes for fMRIPrep and DeepPrep when processing each participant’s data sequentially on a local workstation (n = 100 participants from the UKBB dataset). Data are presented as mean ± s.d. Error bars represent s.d. b, Number of participants that can be processed by fMRIPrep and DeepPrep at the indicated time intervals. c, Trade-off curve between preprocessing time and hardware expense measured by CPU hours. Data are presented as mean ± s.d. d, Completion and acceptable ratios when preprocessing 53 clinical samples that could not be processed by FreeSurfer. e, Preprocessing errors are categorized into three types, with three representative clinical samples shown.

The DeepPrep and fMRIPrep outputs were also compared in the Mindboggle-101, MSC and CoRR-HNU datasets, covering a range of aspects including anatomical parcellation and segmentation (Extended Data Fig. 3), anatomical morphometrics (Extended Data Fig. 4), temporal signal-to-noise ratio (tSNR) (Extended Data Fig. 5), spatial normalization (Extended Data Fig. 6), task-evoked responses (Extended Data Fig. 7), functional connectivity (Extended Data Fig. 8) and test–retest reliability in functional connectomes (Extended Data Fig. 9). Overall, these comparisons indicated that DeepPrep consistently yielded preprocessing results that were either similar or superior to those generated by fMRIPrep when assessed using various metrics. These results support the accuracy of DeepPrep while concurrently maximizing efficiency.

Next, we screened 53 clinical samples from three clinical datasets that could not be successfully processed by FreeSurfer within 48 h due to distorted brains or imaging noise. These samples were preprocessed using both DeepPrep and fMRIPrep. DeepPrep exhibited a substantially higher pipeline completion ratio (100.0%), a higher acceptable ratio (58.5%; see Methods for definitions of completion and acceptable ratios) and shorter processing time per participant (27.4 ± 5.2 min) than fMRIPrep (Fig. 2d; completion ratio of 69.8%, χ2 test: χ2 = 16.6, P = 4.7 × 10−5; acceptable ratio of 30.2%, χ2 = 8.6, P = 0.003; processing time of 369.6 ± 205.0 min, two-tailed paired t-test, t(36) = 10.3, P = 3.0 × 10−12). The occurrences of fMRIPrep preprocessing failures and errors could be attributed to three main causes: segmentation errors, surface reconstruction failures and surface registration errors (Fig. 2e). These causes correspond to the processing modules in which deep learning algorithms replace conventional algorithms, indicating the robustness of DeepPrep in handling complicated clinical samples.

In summary, DeepPrep demonstrates high efficiency and robustness in processing large-scale neuroimaging datasets and complex clinical samples. This success can be attributed to the integration of workflow managers for handling big data and the utilization of deep learning algorithms. With time, we plan to expand DeepPrep into a comprehensive platform for processing multimodal neuroimaging. While the current version focuses primarily on the most time-consuming workflows in structural MRI and fMRI, future versions will integrate additional modalities, such as arterial spin labeling and diffusion imaging. These enhancements will benefit the broader neuroimaging community.

Methods

DeepPrep offers a computationally efficient, robust and scalable neuroimaging preprocessing pipeline, incorporating recent advancements in deep learning for neuroimaging processing and the workflow manager. To achieve optimal acceleration, it is recommended to deploy DeepPrep in a computing environment equipped with GPUs, although it remains executable on CPU-only systems (see Supplementary Note 2 for minimum system requirements). Additionally, version control of the project's code and documentation is managed through GitHub.

Preprocessing workflow for anatomical images

Overall description

The anatomical preprocessing workflow in DeepPrep closely follows the FreeSurfer pipeline while efficiently replacing several of the most time-consuming steps with deep learning algorithms. Specifically, volumetric anatomical segmentation, CSR and cortical surface registration are accomplished using FastSurferCNN10, FastCSR11 and SUGAR12, respectively. FastSurferCNN is a deep learning architecture designed for anatomical segmentation of brain tissues, enabling computationally efficient segmentation of the entire brain into 95 distinct cortical and subcortical regions10. FastCSR is a deep learning model designed for the time-consuming task of CSR through the utilization of implicit representations of cortical surfaces11. SUGAR is a deep learning framework designed for both rigid and nonrigid cortical surface registration12. Detailed information on the training and test procedures for each algorithm is provided in Supplementary Note 3. The remaining steps are consistent with fMRIPrep, including spherical mapping, morphometric estimation and statistics. These steps ensure the continued reliability and accuracy of the overall process while harnessing the benefits of deep learning algorithms to enhance computational efficiency. The preprocessing workflow consists of the following essential steps:

Motion correction

If multiple T1w images are available for a participant or session, FreeSurfer's command 'recon-all -motioncor' is employed to correct for head motion across the scans. This process yields an average T1w image to minimize the impact of head motion on data quality.

Segmentations

The whole brain is segmented into 62 cortical and 33 subcortical regions using FastSurferCNN10, a deep learning model for accurate and rapid anatomical segmentations. In addition, FreeSurfer’s command ‘mri_cc’ is used to segment the corpus callosum into five subcortical regions, yielding a total of 38 subcortical regions.

Skull stripping and bias field correction

A brain mask is generated by combining all of the segmented brain regions to achieve accurate and robust skull stripping. The T1w images undergo N4 bias field correction using SimpleITK with reference to the brain mask. The normalized and skull-stripped T1w images are then fed into the subsequent steps.
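The mask-combination step can be sketched as follows; this is an illustrative numpy snippet (not DeepPrep's actual implementation), assuming the segmentation is an integer label volume in which 0 marks background:

```python
import numpy as np

def brain_mask_from_labels(seg, exclude=(0,)):
    """Union of all labeled brain regions from an integer segmentation
    volume (0 = background by assumption); the resulting boolean mask can
    guide skull stripping and serve as the reference region for N4 bias
    field correction."""
    return ~np.isin(np.asarray(seg), list(exclude))
```

In practice, the mask would be derived from the FastSurferCNN labels described above, with the skull-stripped image obtained by zeroing voxels outside the mask.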

Cortical surface reconstruction

The white matter and pial cortical surfaces are reconstructed based on the anatomical segmentation. This process uses FastCSR11, a deep learning-based model designed to accelerate CSR. FastCSR leverages an implicit representation of the cortical surface through the level-set representation and uses a topology-preserving surface extraction method to yield white and pial surfaces represented by triangulated meshes.

Cortical surface registration

The reconstructed surface is inflated to a sphere with minimal distortion using the FreeSurfer command ‘mris_sphere’. Both rigid and nonrigid registrations for the spherical cortical surface are performed to align anatomical landmarks and morphometrics, facilitated by the deep learning cortical surface registration framework SUGAR12. Individual spherical surfaces are aligned to the fsaverage template surfaces by default.

Cortical surface parcellation

The cortical surface parcellation is generated based on the cortical surface registration using the FreeSurfer command ‘recon-all -cortparc’, yielding 68 cortical regions. Subsequently, the cortical parcellation is projected to the volumetric segmentation by assigning voxels their closest cortical labels using the command ‘mri_surf2volseg’, thereby replacing the cortical parcellation derived from FastSurferCNN.

Preprocessing workflow for functional images

Overall description

The functional preprocessing workflow in DeepPrep incorporates advanced registration methods, SynthMorph13,21, to replace the most time-consuming step, spatial normalization. SynthMorph is a deep learning-based approach tailored for volumetric registration, also known as spatial normalization, aiming to achieve a robust, accurate and computationally efficient alignment of previously unseen brain images. The workflow is also complemented by modules from existing tools, including AFNI22, FSL23 and fMRIPrep5, to form a comprehensive functional image-preprocessing method. The fMRI preprocessing workflow consists of several essential steps as follows:

Motion correction and slice-timing correction

The head motion parameters of the BOLD fMRI signals are estimated with MCFLIRT from FSL24, with the reference volume generated by averaging non-steady-state volumes, or dummy scans. Slice-timing correction is performed using 3dTshift from AFNI22 when slice-timing information is available in the BIDS metadata. This step is optional and can be deactivated if the BIDS metadata do not specify slice times.

Susceptibility distortion correction

DeepPrep incorporates susceptibility distortion correction workflows (SDCflows)25 to correct susceptibility distortions. SDCflows offers versatile workflows designed to preprocess various MRI schemes, enabling the estimation of B0 field inhomogeneity maps directly associated with distortion. This distortion correction is applied to the fMRI data when the appropriate field map information is available within the BIDS metadata. Distortion correction is an optional step.

Co-registration

A rigid registration is performed using FreeSurfer’s boundary-based registration to align motion-corrected fMRI volumes to native T1w images for each participant. The registration optimizes the boundary-based loss function to align the boundary between gray and white matter across different imaging modalities.

Spatial normalization

The spatial normalization step aims to normalize individual brain images to a standard template, such as the MNI152NLin6Asym volumetric template and FreeSurfer’s fsaverage6 surface template. Traditionally, this step is highly time-consuming due to the requirement of nonrigid surface and volumetric registration, often taking hours for computations. However, in DeepPrep, a deep learning algorithm called SynthMorph13,21 is used to achieve both rigid and nonrigid volumetric registration within 1 min. Additionally, for both rigid and nonrigid cortical surface registration, SUGAR12 is used to achieve accurate alignment in anatomical landmarks and morphometrics in seconds. Subsequently, preprocessed BOLD fMRI volumes are projected to the MNI152NLin6Asym template and fsaverage6 template surfaces by default by applying deformation matrices derived from the registrations. The pipeline also flexibly supports normalization to other volumetric templates of human brains managed by TemplateFlow26.
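To illustrate how a dense displacement field produced by a nonrigid registration is applied to a volume, here is a minimal numpy sketch; nearest-neighbour resampling is used for brevity (production pipelines typically interpolate trilinearly), and the field is assumed to be in voxel units:

```python
import numpy as np

def apply_warp_nn(moving, disp):
    """Resample `moving` through a dense displacement field `disp` of shape
    (3, X, Y, Z): out(x) = moving(x + u(x)), with nearest-neighbour
    interpolation and clamping at the volume boundary."""
    grid = np.indices(moving.shape)            # identity voxel coordinates
    src = np.rint(grid + disp).astype(int)     # warped sample points
    for axis, size in enumerate(moving.shape):
        np.clip(src[axis], 0, size - 1, out=src[axis])  # clamp to volume
    return moving[tuple(src)]
```

A zero field reproduces the input; the same mechanism, with the composed rigid and nonrigid transforms, projects the BOLD volumes into template space.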

In summary, this preprocessing workflow uses a combination of conventional methods and advanced deep learning algorithms to efficiently and accurately preprocess structural and functional images for neuroimaging analysis.

The workflow manager and containers

In DeepPrep, we used Nextflow14,27 (https://www.nextflow.io), an open-source workflow manager widely used in life sciences, and Docker (https://www.docker.com) and Singularity15 containers to establish a scalable, portable and reproducible neuroimaging preprocessing pipeline across diverse computing environments.

Nextflow enhances DeepPrep by optimizing computational resource utilization, especially in HPC clusters and cloud infrastructures, while ensuring cross-platform portability and reproducibility. Specifically, DeepPrep established and executed the pipeline using key Nextflow components, including the process, channel, workflow and executor, with some modifications aimed at improving GPU utilization efficiency. Initially, we disassembled the pipeline into 83 discrete minimal executable steps, enabling parallelization and substantial improvements in computational efficiency. The disassembly also allowed for precise monitoring and control of system resource consumption for each individual step. Each step was defined as a Nextflow process, ensuring that input and output data adhered to expected standards and specifying the system resource requirements for each process. Data flow pipelines were delineated using the Nextflow channel component, and the complete workflow was specified through the Nextflow workflow component, integrating predefined processes and data flow pipelines.

Once the pipeline was established, DeepPrep employed the Nextflow executor component, serving as an abstraction layer between the processes and the underlying execution system. This abstraction facilitated the portable execution of the pipeline across diverse platforms and infrastructures. However, when the execution system was set to ‘local’, the existing version of Nextflow lacked GPU resource-monitoring capabilities. Because DeepPrep derives much of its computational efficiency from multiple GPU-intensive processes, their concurrent execution on a single GPU-equipped machine could exhaust GPU video memory (VRAM) and result in errors. To address this challenge, we devised a GPU scheduling module using two distributed mutual exclusion locks based on Redis (https://redis.io). This module categorized GPU processes into two groups: those consuming less than half of the VRAM and those requiring more than half. Processes with lower VRAM requirements could proceed upon acquiring a single lock, while those with higher demands needed to secure both locks simultaneously. This module effectively resolved GPU-related resource conflicts and substantially improved GPU utilization efficiency before Nextflow implemented GPU monitoring.

In addition, we prioritized the reproducibility of our preprocessing pipeline. To this end, DeepPrep underwent rigorous version control and is fully compatible with execution in both Docker and Singularity containers. Singularity, designed for scientific computational tasks such as neuroimaging processing, offers the flexibility to execute across a spectrum of computing infrastructures and platforms, ranging from local machines to HPC clusters and cloud environments. Notably, Singularity emphasizes security, a particular concern in scientific computing, especially in HPC settings. It operates within a user-space model, obviating the need for elevated privileges and thereby substantially reducing the security risks typically associated with superuser access.

DeepPrep report

DeepPrep automatically generates a descriptive HTML report for each participant and session (Supplementary Fig. 2). The report commences with a concise summary of key imaging parameters extracted from the BIDS meta-information. Subsequently, the report provides an overview of the overall CPU and GPU processing times for data preprocessing. Key processing steps and results for structural images are visually presented, including segmentation, parcellation, spatial normalization and co-registration. The normalization and co-registration outcomes are demonstrated through dynamic ‘before’ versus ‘after’ animations. Additionally, the report includes a carpet plot, showcasing both the raw and preprocessed fMRI data, along with a tSNR map. Finally, the report concludes with a comprehensive boilerplate method text, offering a clear and consistent description of all preprocessing steps employed, accompanied by appropriate citations.

Standard output

The preprocessed structural MRI data are organized to align with FreeSurfer's outputs, encompassing the normalized and skull-stripped brain, reconstructed cortical surfaces and morphometrics, volumetric segmentation, cortical surface parcellation and their corresponding statistics. Transformation files for surface spherical registration are also included; the data structure is illustrated in Supplementary Fig. 1. For the preprocessed fMRI data, the naming adheres to the BIDS specification for derived data. The default output spaces for the preprocessed fMRI consist of three options: (1) the native BOLD fMRI space, (2) the MNI152NLin6Asym space and (3) the fsaverage6 surface space. However, users have the flexibility to specify other output spaces, including the native T1w space and various volumetric and surface templates available on TemplateFlow. The main outputs of the preprocessed data include:

1. Preprocessed fMRI data
2. Reference volume for motion correction
3. Brain masks in the BOLD native space, including nuisance masks, such as ventricle and white matter masks
4. Transformation files between T1w and the fMRI reference and between T1w and the standard templates
5. Head motion parameters and the tSNR map
6. Confound matrix.
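As a rough illustration of the BIDS-derivatives naming convention followed by these outputs, a hypothetical helper (the entity order and labels below reflect common BIDS practice, not DeepPrep's internal code):

```python
def bids_deriv_name(sub, task, space=None, desc=None, suffix="bold", ext=".nii.gz"):
    """Assemble a BIDS-derivatives style filename from key-value entities.
    Entity order follows the common sub -> task -> space -> desc convention."""
    parts = [f"sub-{sub}", f"task-{task}"]
    if space:
        parts.append(f"space-{space}")
    if desc:
        parts.append(f"desc-{desc}")
    return "_".join(parts) + f"_{suffix}{ext}"
```

For example, a preprocessed BOLD run in the default volumetric output space would be named along the lines of `sub-01_task-rest_space-MNI152NLin6Asym_desc-preproc_bold.nii.gz`.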

Evaluation data

We employed over 55,000 scans from seven datasets1,8,18,19,20,28,29 to evaluate the performance of DeepPrep (Supplementary Table 2). The evaluation datasets are distinct from the training datasets of the deep learning models to avoid information leakage; the training datasets are listed in Supplementary Table 3. A large-scale dataset (the UKBB dataset1; 49,300 participants, 5,215 of whom had repeated scans, for a total of 54,515 anatomical and functional scans; 25,639 women; age = 64.37 ± 7.98 years (mean ± s.d.)) was used to evaluate computational efficiency; a dataset with manual annotations of anatomical regions (the Mindboggle-101 dataset18; n = 101, 44 women, age = 28.4 ± 8.6 years) was used to evaluate anatomical parcellation; and two repeated-measures datasets (the MSC dataset19 (ten participants, ten sessions per participant, five women, age = 29.1 ± 3.3 years) and the CoRR-HNU dataset20 (30 participants, ten sessions per participant, 15 women, age = 20–30 years)) were used to evaluate accuracy and test–retest reliability. Moreover, to evaluate robustness in preprocessing clinical samples with anatomical distortions, we included three in-house patient datasets: the BTH-Glioma dataset (n = 19, six women, age = 43.2 ± 11.3 years), the SHH-DoC dataset (n = 15, eight women, age = 41.2 ± 7.2 years) and the CRRC-Stroke dataset (n = 19, six women, age = 54.0 ± 11.7 years). Written informed consent was obtained from all participants. See Supplementary Note 1 for more detailed descriptions of each dataset.

Comparison to alternative preprocessing tools

We compared the processing time, along with the anatomical and functional outcomes, of DeepPrep (version 24.1.1) and fMRIPrep (version 24.0.0).

Processing time and computational costs

The preprocessing times of both DeepPrep and fMRIPrep were recorded for comparison and evaluated in an identical hardware environment (a workstation with an Intel Core i9-10980XE 3.00 GHz × 36 CPU, an NVIDIA RTX 3090 GPU with 24 GB of VRAM and 64 GB of RAM). We processed 100 participants from the UKBB dataset and then assessed the serial processing time for single participants (Fig. 2a and Extended Data Fig. 2). To ensure a fair comparison, we allocated identical hardware resources (eight CPUs, one GPU and 16 GB of RAM) to both pipelines. Additionally, to evaluate batch-processing capabilities on a single workstation, we preprocessed the maximum possible number of participants on the same workstation, without any hardware limitations, for both pipelines (Fig. 2b). To examine the scalability of DeepPrep on large-scale datasets, we also preprocessed 54,515 scans from 49,300 participants in the UKBB dataset using our pipeline on an HPC cluster of Changping Laboratory. The cluster is equipped with 1,920 CPU cores (2.90 GHz) and 20 NVIDIA RTX 3090 GPUs. To estimate computational expenses in the HPC environment, we calculated CPU hours, derived by multiplying the number of recruited CPU cores by the duration of usage. Of note, because DeepPrep relies on GPU processing, we empirically converted one GPU hour into an equivalent of 15 CPU hours to enable a standardized comparison, according to cloud computing charges (see https://aws.amazon.com/ec2/pricing/on-demand/ for pricing of r7g.medium instances for CPU processing and g5.xlarge instances for GPU processing; last visited 21 October 2024).
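The expense calculation reduces to simple arithmetic; a sketch using the conversion rate stated above (function name is illustrative):

```python
def expense_cpu_hours(cpu_cores, hours, gpu_hours=0.0, gpu_to_cpu=15.0):
    """Computational expense in CPU hours: recruited cores x wall-clock
    hours, plus GPU time converted at ~15 CPU hours per GPU hour (the
    empirical rate derived from the cloud pricing cited in the text)."""
    return cpu_cores * hours + gpu_hours * gpu_to_cpu
```

For example, a job using eight CPU cores for half an hour plus six minutes of GPU time costs 8 × 0.5 + 0.1 × 15 = 5.5 CPU hours.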

Assessments of structural imaging preprocessing

Structural images were preprocessed using both DeepPrep and fMRIPrep. The preprocessed structural results were compared between the two pipelines, including anatomical segmentation and parcellation and similarities in the morphometrics, as elaborated below.

Accuracy of anatomical parcellation and segmentation

The surface-based parcellation and volumetric segmentation of structural images were automatically delineated by the different pipelines. We first assessed the similarity, measured by the Dice coefficient, between the parcellations and segmentations automatically generated by each preprocessing pipeline and manually delineated annotations that were considered the ‘ground truth’. Higher Dice coefficients indicate greater similarity and higher accuracy. To visually demonstrate differences in parcellations and segmentations, we directly estimated the percentage differences in Dice coefficients between the two pipelines.
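The Dice coefficient for a single label can be computed as follows (a standard formulation; per-region scores are obtained by looping over labels):

```python
import numpy as np

def dice(a, b):
    """Dice coefficient between two binary masks: 2|A∩B| / (|A| + |B|),
    where 1.0 indicates identical labelings."""
    a, b = np.asarray(a, bool), np.asarray(b, bool)
    denom = a.sum() + b.sum()
    if denom == 0:
        return 1.0  # both masks empty: treat as perfect agreement
    return 2.0 * np.logical_and(a, b).sum() / denom
```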

Similarity to the morphometric atlas

To assess surface registration performance, we compared similarities in morphometrics, including sulcal depth and curvature, between the aligned surfaces and the FreeSurfer ‘fsaverage’ template surface. We employed the mean squared error to measure dissimilarity, with lower values indicating better alignment. We used these metrics to assess registration performance because the primary objective of surface registration is to align these anatomical features between individual cortical surfaces and atlases12,30,31.
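The dissimilarity measure is a vertex-wise mean squared error between the registered individual map and the template map, for example:

```python
import numpy as np

def morphometric_mse(aligned, template):
    """Vertex-wise mean squared error between a registered individual's
    morphometric map (e.g. sulcal depth or curvature) and the template's;
    lower values indicate better alignment."""
    aligned = np.asarray(aligned, dtype=float)
    template = np.asarray(template, dtype=float)
    return float(np.mean((aligned - template) ** 2))
```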

Assessments of functional imaging preprocessing

Preprocessed functional images were compared between DeepPrep and fMRIPrep in multiple aspects, including spatial normalization, tSNR, task activations and functional connectivity. To ensure a fair comparison, we applied identical post-processing pipelines for task and resting state fMRI for both preprocessing pipelines. For the task fMRI, we used FSL FILM32 (FMRIB’s improved linear model) to perform a standard general linear model for the first-level analysis after high-pass temporal filtering (100 s). Second-level inference was performed using FLAME33 (FMRIB’s local analysis of mixed effects), based on the first-level outcomes. For post-processing resting state fMRI, we included bandpass filtering (0.01–0.08 Hz) and regression of nuisance variables, including six motion parameters, white matter signal, ventricular signal, whole-brain signal and their first-order temporal derivatives.
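The resting-state post-processing steps can be sketched in numpy; the hard FFT mask below is a simple stand-in for bandpass filtering, and the nuisance regression is ordinary least squares (actual filter and regression implementations in the pipelines may differ):

```python
import numpy as np

def regress_nuisance(bold, confounds):
    """Remove nuisance variables from BOLD series by OLS.
    bold: (n_timepoints, n_voxels); confounds: (n_timepoints, n_regressors),
    e.g. motion parameters, tissue signals and their temporal derivatives."""
    X = np.column_stack([np.ones(len(confounds)), confounds])  # intercept
    beta, *_ = np.linalg.lstsq(X, bold, rcond=None)
    return bold - X @ beta  # residuals, orthogonal to the confounds

def bandpass(bold, tr, low=0.01, high=0.08):
    """Keep frequencies in [low, high] Hz via a hard FFT mask."""
    freqs = np.fft.rfftfreq(bold.shape[0], d=tr)
    spec = np.fft.rfft(bold, axis=0)
    spec[(freqs < low) | (freqs > high)] = 0
    return np.fft.irfft(spec, n=bold.shape[0], axis=0)
```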

Temporal signal-to-noise ratio

The voxel-wise tSNR was calculated to assess the amount of informative signal relative to the noise level in preprocessed images, following a previous report34. The tSNR was estimated using the TSNR interface of Nipype, which first applies quadratic detrending and then divides the temporal mean by the temporal standard deviation for each voxel of each participant. The individual voxel-wise tSNR maps were averaged across all participants.
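A minimal numpy version of this computation, following the description above (quadratic detrending, then temporal mean divided by the residual standard deviation; Nipype's actual implementation may differ in detail), assuming BOLD data arranged as voxels by timepoints:

```python
import numpy as np

def tsnr(bold):
    """tSNR per voxel: temporal mean of the signal divided by the standard
    deviation of the quadratically detrended residual.
    bold: (n_voxels, n_timepoints)."""
    t = np.arange(bold.shape[-1])
    coef = np.polynomial.polynomial.polyfit(t, bold.T, 2)  # per-voxel fit
    trend = np.polynomial.polynomial.polyval(t, coef)      # (n_vox, n_t)
    resid = bold - trend
    return bold.mean(axis=-1) / resid.std(axis=-1)
```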

Spatial normalization

After normalizing the BOLD fMRI images to the MNI152NLin6Asym volumetric template, we calculated the standard deviation map of averaged BOLD time series across participants to examine the performance of spatial normalization. A higher standard deviation around the brain outline indicates a lower performance of spatial normalization.

Motor task activation map

The MSC motor task19 is a block-design fMRI task with tongue, left-hand, right-hand, left-foot and right-foot conditions, adapted from the paradigm used in the Human Connectome Project35. Task-evoked activations were modeled by a general linear model for each voxel and each session in the first-level analyses. The second-level analyses averaged data across sessions for each participant based on a fixed-effects model. The third-level analyses pooled data across participants based on a mixed-effects model. The task post-processing analyses were performed using FSL23.

Seed-based cortico-cortical and cortico-cerebellar functional connectivity

To assess the similarity in resting state functional connectivity (RSFC) derived from different pipelines, we examined typical long-range RSFC of a cortico-cortical and a cortico-cerebellar circuit. Specifically, we identified cortical and cerebellar seeds from previous studies36. For the cortico-cortical circuit, a seed in the left-hemisphere posterior cingulate cortex (MNI coordinates = −2, −53, 26) was selected. For the cortico-cerebellar circuit, a seed in the left angular gyrus (MNI coordinates = −49, −63, 45) was selected. Each seed region of interest (ROI) was a 6-mm sphere centered around the MNI coordinates of each seed. The RSFC was estimated by Pearson’s correlation coefficient between the average BOLD time series within each seed ROI and the time series of every other vertex or voxel. To normalize the distribution of correlation coefficients, these r values were converted to z values with Fisher’s r-to-z transformation. To demonstrate both the surface-based and volumetric RSFC, we performed surface-based analyses for the posterior cingulate cortex seed and volumetric analyses for the angular gyrus seed. The similarity in RSFC was assessed in the MSC dataset.
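The seed-based RSFC computation can be sketched as follows; the array names and mask representation are illustrative, and seed extraction from a 6-mm sphere is assumed to happen upstream.

```python
# Seed-based RSFC as described above: Pearson correlation between the seed
# ROI's average time series and every other voxel/vertex, followed by
# Fisher's r-to-z transformation. Input layout is an assumption.
import numpy as np

def seed_rsfc(bold, seed_mask):
    """bold: (T, V); seed_mask: boolean length-V array marking the seed ROI."""
    seed_ts = bold[:, seed_mask].mean(axis=1)
    # Pearson correlation of the seed time series with each voxel/vertex.
    b = bold - bold.mean(axis=0)
    s = seed_ts - seed_ts.mean()
    r = (b * s[:, None]).sum(axis=0) / (
        np.sqrt((b**2).sum(axis=0)) * np.sqrt((s**2).sum()) + 1e-12
    )
    # Clip before arctanh so r = +/-1 (e.g. at the seed itself) stays finite.
    return np.arctanh(np.clip(r, -0.999999, 0.999999))
```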

Test–retest reliability in functional connectomes

To assess test–retest reliability in functional connectomes, we used a volumetric atlas comprising 300 cortical and subcortical ROIs37 and a cortical surface atlas comprising 300 cortical ROIs38 to estimate the whole-brain and cortical functional connectomes, respectively. Test–retest reliability was measured by calculating the similarity between functional connectomes obtained from two different segments in the repeated measures datasets, namely, the MSC dataset and the CoRR-HNU dataset. Furthermore, we investigated the dependence of reliability on scanning duration by randomly selecting and concatenating two or more sessions to measure the functional connectome and its corresponding test–retest reliability, following the procedure described previously19.

Application to data in the clinical setting

To assess the pipelines’ robustness in handling clinical samples with distorted brains, we included a total of 53 samples from three clinical datasets8,28,29, comprising patients with various conditions, including stroke, glioma and disorders of consciousness (see Supplementary Note 1 for more details). Notably, these clinical samples all failed to be completely preprocessed by FreeSurfer within 48 h on a CPU. To compare the pipelines’ performance, we employed both DeepPrep and fMRIPrep to preprocess all samples and recorded the processing time, completion and acceptable ratios. The reasons for errors were also analyzed.

Completion and acceptable ratios

Completion is defined as successfully finishing the process within 48 h on eight CPUs and one GPU, with interpretable outputs such as meaningful reconstructed surfaces, registration files and preprocessed volumetric data, although the outputs are not guaranteed to be accurate. The completion ratio was determined as the proportion of clinical samples that successfully completed all preprocessing steps, resulting in interpretable output files. The acceptable ratio, in turn, was defined as the proportion of samples with preprocessing results deemed acceptable for subsequent processing, considering the rationality of surface reconstruction, anatomical segmentation, registration quality and functional connectivity. The assessment of preprocessing results was conducted independently through visual inspection by three experienced neuroimaging experts (J.R., W.C. and H.L.). The experts were blinded to the label of the preprocessing pipeline to minimize subjective biases. A sample was classified as acceptable when at least two of the three experts reached a consensus on its output quality.
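The two ratios and the two-of-three consensus rule can be made concrete with a toy example; the counts and votes below are fabricated placeholders, not study data.

```python
# Toy illustration of the completion ratio, the two-of-three expert
# consensus rule, and the acceptable ratio. All values are placeholders.
def acceptable(ratings):
    """ratings: three binary expert judgments for one sample."""
    return sum(ratings) >= 2

# Hypothetical per-sample outcomes (not the study's actual data).
completed = [True, True, False, True]
expert_votes = [[1, 1, 0], [0, 0, 1], [1, 1, 1], [1, 0, 0]]

completion_ratio = sum(completed) / len(completed)
acceptable_ratio = sum(acceptable(v) for v in expert_votes) / len(expert_votes)
```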

Preprocessing error categories

During the assessment of failed and inaccurate samples, we conducted a systematic step-by-step inspection of the preprocessing outputs to identify the underlying reasons for preprocessing errors. Based on this analysis, the failures were categorized into three distinct groups: segmentation errors, surface reconstruction failures and surface registration errors. Segmentation errors were observed in samples where clear inaccuracies were present in the segmentation of anatomical tissues. These errors were more commonly found in samples with extensive lesions or substantial head motion during T1w imaging scanning. Surface reconstruction failures were characterized by either the failure to reconstruct cortical surfaces or an excessively prolonged processing time (exceeding 48 h) to produce results. These issues often arose due to challenges in accurately fixing surface topology in distorted brains. Surface registration errors indicated misalignment of morphometric features, such as sulci and gyri. Such misalignments could lead to inaccuracies in the registration of brain structures.

Meta-analysis of large-scale neuroimaging studies

To summarize recent trends in big data of neuroimaging, we conducted a meta-analysis of large-scale neuroimaging studies. First, we performed a PubMed (http://www.pubmed.gov) search on studies published before 30 June 2023, using the keywords (large scale OR large sample size) AND (neuroimage OR *MRI). Next, we collected further studies by reviewing the reference lists of relevant papers, including studies that employed open-source big data or reviews focusing on large-sample-size studies. The inclusion criteria for large-scale datasets were as follows: (1) the study recruited a sample larger than 1,000 healthy participants or 400 participants from specific populations, such as patients, twins or preterm brains; (2) the study was peer reviewed; and (3) the study contained MRI data with at least one type of structural image and one type of functional image. These literature searches yielded a total of 42 studies, comprising over 200,000 participants. Each sample was then categorized into one of three types of projects based on its funding sources. The first type was named ‘big project’, as it was supported by a single funding source. The second type was named ‘consortium’, as it involved international collaboration with multiple funding sources. The last type was named ‘data sharing platform’, as it shared data on the same or different topics from different groups.

Statistical analysis

The χ2 test was used for comparisons of completion and acceptable ratios. Processing time, tSNR and similarities in functional connectomes were statistically compared between the two pipelines using two-tailed paired-sample t-tests. False discovery rate correction was used to account for multiple testing.
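The tests named above can be sketched with SciPy on placeholder data; the contingency counts and tSNR arrays are fabricated for illustration, and the Benjamini–Hochberg step-up procedure is one common implementation of false discovery rate correction (the study's exact correction method is not specified beyond "false discovery rate").

```python
# Sketch of the statistical analyses: chi-squared test on completion
# counts, paired t-test on per-participant measures, and Benjamini-
# Hochberg FDR adjustment. All input values are placeholders.
import numpy as np
from scipy.stats import chi2_contingency, ttest_rel

def fdr_bh(pvals):
    """Benjamini-Hochberg adjusted p values (step-up procedure)."""
    p = np.asarray(pvals, dtype=float)
    order = np.argsort(p)
    ranked = p[order] * len(p) / (np.arange(len(p)) + 1)
    # Enforce monotonicity from the largest rank downwards.
    ranked = np.minimum.accumulate(ranked[::-1])[::-1]
    out = np.empty_like(ranked)
    out[order] = np.clip(ranked, 0, 1)
    return out

# Chi-squared test on placeholder completed/failed counts per pipeline.
table = np.array([[50, 3], [41, 12]])
chi2, p_chi2, dof, _ = chi2_contingency(table)

# Two-tailed paired-sample t-test on placeholder per-participant tSNR.
rng = np.random.default_rng(0)
tsnr_a = rng.normal(60, 5, 20)
tsnr_b = tsnr_a + rng.normal(1, 2, 20)
t_stat, p_t = ttest_rel(tsnr_a, tsnr_b)

p_adjusted = fdr_bh([p_chi2, p_t])
```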

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.