Abstract
Adolescence is a pivotal phase of rapid brain development shaped by genetic and environmental factors, offering a critical window for identifying early indicators of psychiatric disorders. The Chongqing Adolescent Twin Study (CATS) explores genetic and environmental influences in 136 typically developing twins aged 12 to 19. This dataset includes multimodal MRI scans (structural, resting-state functional, and diffusion MRI) alongside extensive questionnaires on cognitive abilities, emotional and social behaviors, familial and parenting dynamics, sleep wellness, stress, anxiety, and depression. We describe the dataset in detail and systematically assess its quality. When benchmarked against the Lifespan Human Connectome Project Development (HCP-D) dataset, CATS meets or exceeds HCP-D standards in signal quality, tissue contrast, image sharpness, and head motion control. Preprocessing and imaging phenotype extraction facilitate broad reuse, and high phenotype correlations with HCP-D confirm reliability. This high-quality, multimodal resource provides a unique opportunity to investigate how genetic and environmental factors, along with age-related changes, shape adolescent brain structure, connectivity, and behavior, offering critical insights for precision medicine and early interventions in psychiatry.
Background & Summary
Imaging genetics integrates neuroimaging and genetic data to investigate how genetic variation influences brain structure, connectivity, and behavior. This approach has demonstrated its utility in enhancing the early diagnosis and prognosis of psychiatric disorders, including attention-deficit/hyperactivity disorder1, bipolar disorder2, autism spectrum disorder3, depression4, and schizophrenia5. While it is challenging to directly observe the genetic origins of many psychiatric disorders from phenotypes6, neuroimaging-derived measures can serve as quantifiable intermediates between underlying genetic changes and external manifestations of diseases7, known as endophenotypes.
In this context, twin studies are particularly useful in imaging genetics for investigating the genetic and environmental influences on these endophenotypes. Monozygotic (MZ) twins share nearly 100% of their genetic material, while dizygotic (DZ) twins share on average 50% of their segregating genes, and both types of twins share their early environment if reared together. This natural experimental design enables researchers to disentangle the effects of genes and environment on brain structure and connectivity, providing critical insights into the heritability of endophenotypes8.
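As a rough illustration of how the MZ/DZ contrast separates genetic from environmental variance, the classic Falconer approximation can be sketched as below. This is a textbook shortcut and not necessarily the model used in the CATS analyses cited here; the function name and the example correlations are hypothetical.

```python
def falconer_estimates(r_mz: float, r_dz: float) -> dict:
    """Falconer's approximation from within-pair twin correlations.

    r_mz, r_dz: correlation of a phenotype within MZ and DZ pairs.
    Assumes additive genetic effects and equally shared environments
    for MZ and DZ pairs (assumptions of the approximation itself).
    """
    h2 = 2.0 * (r_mz - r_dz)  # additive genetic share (A)
    c2 = 2.0 * r_dz - r_mz    # shared-environment share (C)
    e2 = 1.0 - r_mz           # unshared environment plus measurement error (E)
    return {"h2": h2, "c2": c2, "e2": e2}


# Hypothetical correlations, for illustration only
est = falconer_estimates(r_mz=0.80, r_dz=0.50)
```

By construction the three components sum to one, which makes the sketch a convenient sanity check before fitting a full variance-component model.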
Studying imaging genetics in the adolescent brain is crucial because this period represents a unique convergence of rapid brain development and the initial onset of many psychiatric disorders. During adolescence, the brain undergoes significant structural and functional changes, particularly in the regions responsible for emotion regulation, decision-making, and social interactions9. These neurodevelopmental processes provide a critical window to observe how genetic variations influence brain maturation and how these interactions contribute to the emergence of psychiatric conditions10.
Adolescence spans a formative period during which environmental factors—such as stress, peer relationships, and family dynamics—profoundly influence mental health, often interacting with genetic predispositions in complex ways11. Investigating the genetic and neurobiological mechanisms underlying these interactions could help identify biomarkers for predicting the progression of psychiatric disorders. This, in turn, could enable earlier, more targeted interventions12, ultimately leading to precision medicine strategies tailored to an individual’s genetic and neurodevelopmental profile13,14. Moreover, examining these interactions at different stages of adolescence may reveal whether heritable and environmental factors shape brain structure, connectivity, and behavior differently as adolescents mature15, offering valuable insights into typical developmental trajectories and the identification of early risk factors.
The Chongqing Adolescent Twin Study (CATS) was designed and implemented to examine these genetic and environmental effects on brain structure, connectivity, and behavior during this critical developmental stage16,17,18. This dataset includes both brain imaging and non-imaging data collected from typically developing MZ and DZ adolescent twins. The brain imaging data includes structural MRI (sMRI), resting-state functional MRI (rsfMRI), and diffusion MRI (dMRI), while the non-imaging data contains over 600 items across a wide range of questionnaires, providing rich information on cognitive abilities, emotional and social behaviors, familial and parenting dynamics, sleep wellness, stress, anxiety, and depression.
Initial analyses of this comprehensive dataset have yielded significant advancements in revealing the heritability of cortical surface area, resting-state functional connectivity, and behavior during adolescence. Specifically, high heritability was found for total cortical surface area18. Furthermore, studies confirmed that the phenotypic correlation between regional cortical surface area and verbal intelligence was largely due to genetic effects rather than unshared environmental influences or measurement error19. Regarding resting-state functional connectivity, analyses revealed that heritability varies across different resting-state networks, with stronger genetic control in sensory networks and relatively weaker control in cognitive networks17. In terms of adolescent behavior, emotional problems were largely influenced by genetics, while peer problems were predominantly shaped by environmental factors16.
While the CATS dataset has already provided valuable insights into genetic and environmental influences on adolescent brains and behavior, it remains a rich resource with significant untapped potential for future research. Priority areas include examining the genetic underpinnings of complex structural metrics20, the dynamic nature of functional connectivity21, and the interplay between brain development and cognitive, emotional, and social behaviors. Another important direction involves investigating how these genetic and environmental factors evolve across adolescence, which can be pursued by analyzing the CATS dataset on its own or in combination with other adolescent or adult datasets20,21,22.
In the present work, we provide a comprehensive description of the CATS dataset. We assessed the quality of the data and conducted a comparison with a subset of age-matched data from the Lifespan Human Connectome Project Development (HCP-D)20. To facilitate future reuse of the CATS dataset, we have performed preprocessing and extracted imaging phenotypes across multiple modalities, using two widely recognized pipelines for each modality. This dataset, along with its derivatives, offers an invaluable resource for investigating the interplay of genetic and environmental factors on various dimensions of brain structure, connectivity, and behavior in Chinese adolescents.
Methods
Participants
A total of 136 typically developing twins aged 12–19 years (mean ± SD = 15.71 ± 1.61 years, 54% female) from local schools in Chongqing, China, participated in this study (see Fig. 1 for a schematic overview). Written informed consent, including permission for data sharing, was obtained from all participants and their parents. Ethics approval for the CATS was granted by Chongqing Medical University, and the reanalysis of the CATS dataset was approved by the Research Ethics Review Committee of ShanghaiTech University. All participants were free from psychiatric disorders, nervous system diseases, and severe physical diseases. All twin pairs were reared together except for one pair. Each subject completed questionnaires for non-imaging phenotyping, assessing cognitive abilities, emotional and social behaviors, familial and parenting dynamics, sleep wellness, stress, anxiety, and depression. Among the 136 participants, 128 had zygosity test results based on short tandem repeats and amelogenin23, comprising 34 pairs of MZ twins and 30 pairs of DZ twins. Of these, 120 participants (33 MZ pairs and 27 DZ pairs) had complete sMRI, rsfMRI, and dMRI data. Detailed demographic information is provided in Table 1.
Overview of the Chongqing Adolescent Twin Study (CATS). (a) Participants and data collection. We enrolled 136 adolescent twins; zygosity was confirmed for 128 individuals, comprising 34 monozygotic (MZ) and 30 dizygotic (DZ) pairs. The data collection includes structural MRI (sMRI), resting-state functional MRI (rsfMRI), diffusion MRI (dMRI), and questionnaires covering cognition, emotional and social behaviors, family environment, sleep, stress, anxiety, and depression. (b) MRI data quality assessment: Image quality was evaluated with MRIQC for sMRI and rsfMRI data, and with QSIPrep for dMRI. Key quality metrics for each modality are displayed. (c) Imaging phenotype extraction: sMRI data were processed with FreeSurfer and Ciftify; rsfMRI data with fMRIPrep and XCP-D; and dMRI data with QSIPrep followed by Tract-Based Spatial Statistics (TBSS). The resulting phenotypic measures for each modality are listed.
MRI acquisition
Brain MRI data for the CATS subjects were collected using a 3 T scanner (Signa, GE Medical Systems, Waukesha, WI). All participants were asked to lie still with their eyes closed. The T1w sMRI data were acquired using the 3DT1 sequence with the following parameters: repetition time (TR) = 6.1 ms; echo time (TE) = 2.8 ms; flip angle = 12°; matrix size = 256 × 256; field of view (FOV) = 24 cm × 24 cm; slice number = 166; slice thickness = 1.2 mm. The rsfMRI data were acquired using a gradient-echo echo-planar imaging (EPI) sequence with the following parameters: TR = 3000 ms; TE = 30 ms; flip angle = 90°; matrix size = 64 × 64; FOV = 24 cm × 24 cm; slice number = 33; slice thickness = 3.5 mm. 240 volumes were acquired for each rsfMRI run. Subjects were instructed to respond after the rsfMRI run, which served to confirm wakefulness and to discourage them from falling asleep during the scan. The dMRI data were acquired using a spin-echo EPI sequence with the following parameters: TR = 13.8 s; TE = 86.7 ms; flip angle = 90°; matrix size = 256 × 256; FOV = 24 cm × 24 cm; slice number = 37; slice thickness = 3 mm. Each run consisted of 26 volumes: one unweighted reference volume (b = 0 s/mm²) and 25 diffusion-weighted volumes acquired at b = 1000 s/mm². The 25 gradient vectors were unit-length, non-collinear, and uniformly distributed on the sphere, providing angular coverage suitable for reliable tensor estimation.
Non-imaging phenotyping
Table 2 catalogs various questionnaires employed to assess CATS participants, encompassing cognitive abilities, emotional and social behaviors, familial and parenting dynamics, sleep wellness, stress, as well as anxiety and depression. The table also distinguishes between the versions of questionnaires, specifying whether they are self-reported by the adolescents, completed by their parents, or jointly by both parties. The details of these questionnaires are as follows:
(1) Cognitive abilities
- The Chinese revision of the Wechsler Intelligence Scale for Children (C-WISC)24 is an assessment of cognitive functioning and intelligence, including verbal comprehension, working memory, processing speed, attention, and reasoning skills.
- The Wisconsin Card Sorting Test (WCST)25 is an assessment of executive functions of the frontal lobe, particularly cognitive flexibility and working memory.
(2) Emotional and social behaviors
- The Strengths and Difficulties Questionnaire (SDQ)26 evaluates emotional and behavioral issues for children aged 11 and above.
- The Everyday Feelings Questionnaire (EFQ)27 assesses the subject’s feelings, and a higher score represents a better feeling.
- The Index of General Affect (IGA)28 assesses an individual’s sense of happiness and life satisfaction, with higher scores indicating a stronger sense of happiness experienced by the individual.
- The Child Behavior Checklist (CBCL)29 is a comprehensive scale to assess behaviors and emotional issues in adolescents, identifying potential problems such as anxiety, depression, aggression, and attention issues.
- The Risk Behavior Questionnaire-Adolescent (RBQ-A)30 captures both the incidence and frequency of risky behaviors over the preceding month. Consequently, a higher total score on the RBQ-A correlates with a greater level of risk-taking behavior among adolescents.
- The Children’s version of the Eysenck Personality Questionnaire (EPQ)31 is for assessing the personality characteristics of adolescents.
(3) Familial and parenting dynamics
- The Parenting Styles and Dimensions Questionnaire (PSDQ)32 assesses various parenting styles, including authoritative, permissive, and authoritarian. A higher score signifies a stronger alignment with the corresponding parenting style.
- The McMaster Family Assessment Device - General Functioning Scale (FAD-GFS)33 evaluates key aspects of family dynamics, such as the coordination of activities, resilience during crises, and levels of mutual trust. Higher scores on this scale typically indicate poorer family functioning.
- The Family Life Questionnaire (FLQ)27 is crafted to evaluate the behavioral dynamics among family members as well as the approaches to encouragement and discipline within the family. A higher score on the questionnaire reflects a more harmonious family environment.
- The Family Adaptability and Cohesion Evaluation Scale, 2nd edition, Chinese version (FACESII-CV)34 measures the adaptability and cohesion aspects of family functioning. It is designed to accurately and effectively evaluate these dimensions in Chinese families, with higher scores representing greater adaptability and cohesion.
(4) Sleep wellness
- The Pittsburgh Sleep Quality Index (PSQI)35 is a self-assessment scale that enables individuals to rate their sleep quality, encompassing various aspects such as sleep duration, sleep disturbances, and daytime dysfunction.
(5) Stress
- The General Health Questionnaire-12 (GHQ-12)36 evaluates the psychological well-being of the parents. This widely-used screening tool is designed to identify short-term changes in mental health, specifically focusing on the ability to carry out daily activities and the presence of distressing psychological symptoms.
- The Family Stress Questionnaire (FSQ)27 is an assessment designed to capture a spectrum of stressors experienced by family members, including occupational pressures, economic status, psychological and physiological well-being, as well as the strain arising from interactions with the external environment. Higher scores on the FSQ indicate elevated levels of stress.
- The Simplified Coping Style Questionnaire (SCSQ)37 measures whether individuals adopt positive or negative coping styles when facing stress.
(6) Anxiety and depression
- The Self-Rating Anxiety Scale (SAS)38 measures anxiety levels through self-report, helping individuals reflect on their feelings and symptoms.
- The Beck Depression Inventory (BDI)39 is a self-assessment tool aimed at identifying the presence and severity of depressive symptoms in individuals.
MRI data quality assessment
The quality of the CATS dataset was assessed by comparing 17 data quality metrics to those from the Lifespan Human Connectome Project Development (HCP-D)20, a well-established adolescent neuroimaging dataset. To ensure an age-matched comparison, 336 subjects aged 12 to 19 years (mean ± SD = 15.13 ± 1.93 years, 52% female) were selected from the HCP-D dataset, and their image quality metrics were derived accordingly. Statistical analyses were conducted using OriginPro (v2024; https://www.originlab.com), with Welch’s t-test applied for metrics following a normal distribution with unequal variance, and the Mann-Whitney U test for metrics with non-normal distributions. The image quality metrics are described as follows.
For the sMRI data, five quality metrics computed by MRIQC (v0.16.1)40 were used for quality assessment:
(1) Signal-to-noise ratio (SNR)41. SNR measures the level of the brain signal relative to background noise (e.g., air in the image). Higher SNR indicates that the brain signal is much stronger than noise, reflecting better image quality.
(2) Full width at half maximum (FWHM) of image smoothness42. MRIQC computes FWHM by calling AFNI’s 3dFWHMx43, which fits a Gaussian model to the volume’s spatial autocorrelation. Lower FWHM indicates a sharper image with more distinct anatomical boundaries.
(3) Contrast-to-noise ratio (CNR)44. CNR is used to evaluate the separation between the tissue distributions of gray matter and white matter. A higher CNR indicates a clearer distinction between tissues, implying better image quality.
(4) Foreground-background energy ratio (FBER)45. FBER is the ratio of the mean energy within the head to the mean energy in the surrounding air mask. A higher FBER means the signal within the head is relatively stronger than the background air, suggesting better image quality.
(5) Entropy-focus criterion (EFC)46. EFC utilizes the Shannon entropy of voxel intensities to detect ghosting and blurring resulting from head motion. Lower EFC indicates fewer artifacts such as ghosting or blurring, hence better image quality.
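To make two of these definitions concrete, the sketch below computes FBER and EFC on flat lists of voxel intensities. It follows the verbal definitions given above and the standard normalized-entropy form of EFC; MRIQC's own implementation may differ in details such as masking or the use of medians, and the function names and toy inputs are illustrative assumptions.

```python
import math


def fber(head_voxels, air_voxels):
    """Mean energy (squared intensity) inside the head over mean energy in air."""
    e_head = sum(v * v for v in head_voxels) / len(head_voxels)
    e_air = sum(v * v for v in air_voxels) / len(air_voxels)
    return e_head / e_air


def efc(voxels):
    """Entropy of intensities normalized by the total image energy,
    divided by the entropy of a perfectly uniform image (range 0 to 1).
    Requires at least two voxels and at least one non-zero intensity."""
    n = len(voxels)
    x_max = math.sqrt(sum(v * v for v in voxels))
    entropy = -sum((v / x_max) * math.log(v / x_max) for v in voxels if v > 0)
    max_entropy = -n * (1.0 / math.sqrt(n)) * math.log(1.0 / math.sqrt(n))
    return entropy / max_entropy


# Toy inputs: a uniform image has maximal entropy, a single bright voxel minimal
efc_uniform = efc([5.0] * 16)
efc_focused = efc([0.0, 0.0, 0.0, 9.0])
fber_demo = fber([10.0, 10.0], [1.0, 1.0])
```

The two toy images bracket the range of EFC: energy spread evenly over the field of view (as ghosting and blurring tend to produce) pushes the value toward one, while energy concentrated in few voxels pushes it toward zero.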
For the rsfMRI data, eight quality metrics computed by MRIQC (v0.16.1)40 were used for quality assessment:
(1) SNR. This metric was described in the sMRI data quality assessment paragraph above.
(2) Temporal SNR (tSNR)47. The tSNR is calculated by first generating a voxelwise tSNR map, which is obtained by dividing the mean BOLD signal across time by its corresponding temporal standard deviation map. Then, the median value of this tSNR map is determined to establish the tSNR. A higher tSNR indicates that the signal is more consistent over time relative to fluctuations, implying better data quality.
(3) FWHM of image smoothness. This metric was described in the sMRI data quality assessment paragraph above.
(4) Average framewise displacement (Average FD)48. FD is calculated by summing the absolute magnitudes of translational and rotational head movements between each consecutive fMRI frame. Rotational movements are transformed into their translational equivalents across the curvature of a standard sphere, which is assumed to be 50 mm in radius, roughly approximating the human brain size. For each subject, an average FD value is computed and reported to provide a measure of head motion during the rsfMRI session. A lower average FD indicates less head motion, which typically corresponds to better data quality.
(5) Derivatives of variance (DVARS)49. DVARS quantifies the rate of change in BOLD signal intensity across the whole brain between successive time points. It is used to assess data quality by highlighting sudden signal intensity changes, aiding in the detection of motion-related and physiological noise in the rsfMRI data. For each subject, the mean DVARS across time was reported. Lower DVARS generally suggests fewer abrupt signal changes.
(6) Global correlation (GCOR)50. GCOR is the average correlation between all pairs of voxel time series within the brain. It highlights differences in the data caused by motion, physiological noise, or imaging artifacts. A smaller GCOR indicates that the data are less dominated by global artifacts, implying better quality.
(7) FBER. This metric was described in the sMRI data quality assessment paragraph above.
(8) EFC. This metric was described in the sMRI data quality assessment paragraph above.
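The FD and tSNR definitions above translate directly into a few lines of code. The following is a minimal sketch rather than MRIQC's implementation: it assumes motion parameters are given per volume as three translations in mm followed by three rotations in radians, uses the population standard deviation for tSNR, and works on made-up values.

```python
from statistics import mean, median, pstdev


def framewise_displacement(motion, radius_mm=50.0):
    """FD between consecutive volumes.

    motion: list of [tx, ty, tz, rx, ry, rz] per volume (mm, radians).
    Rotations become arc lengths on a sphere of the given radius.
    """
    fd = []
    for prev, cur in zip(motion, motion[1:]):
        delta = [abs(c - p) for p, c in zip(prev, cur)]
        fd.append(sum(delta[:3]) + radius_mm * sum(delta[3:]))
    return fd


def tsnr_median(timeseries_by_voxel):
    """Median over voxels of temporal mean divided by temporal SD.
    Assumes no voxel has a constant time series (SD of zero)."""
    return median(mean(ts) / pstdev(ts) for ts in timeseries_by_voxel)


# Made-up motion parameters for three volumes
motion_demo = [
    [0.0, 0.0, 0.0, 0.000, 0.0, 0.0],
    [0.1, 0.0, 0.0, 0.001, 0.0, 0.0],
    [0.1, 0.2, 0.0, 0.001, 0.0, 0.0],
]
fd_demo = framewise_displacement(motion_demo)
average_fd = mean(fd_demo)

# Made-up BOLD time series for two voxels
tsnr_demo = tsnr_median([[98.0, 100.0, 102.0], [49.0, 50.0, 51.0]])
```

In this toy case a rotation of 0.001 rad contributes 0.05 mm to FD, which shows why the 50 mm radius matters when translations and rotations are combined into a single number.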
For the dMRI data, four quality metrics computed by QSIPrep (v0.17.0)51 were used for quality assessment:
(1) Neighboring diffusion-weighted imaging correlation (NDC). NDC measures the spatial correlations between dMRI volumes that are acquired from nearby sampling points in q-space. High NDC values indicate that neighboring volumes are consistent with each other, reflecting better data quality.
(2) Average FD. This metric was described in the rsfMRI data quality assessment paragraph above.
(3) Number of bad slices (NBS)52. NBS is detected by identifying signal dropouts and comparing each slice with its neighboring slices. A smaller NBS indicates fewer corrupted slices, and thus better data quality.
(4) Fiber coherence index (FCI)53. FCI quantifies the alignment of neighboring fiber orientations, weighted by anisotropy values, and is used to ensure correct b-vector orientation for accurate fiber tractography. A higher FCI indicates that the estimated fibers are more coherently aligned, suggesting better dMRI quality.
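The group comparisons in this section were run in OriginPro. For readers who wish to recompute the two test statistics in another environment, a minimal standard-library sketch is given here. The function names and sample values are illustrative, and p-values are deliberately left out because they require a t or U distribution from a statistics package.

```python
import math
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    na, nb = len(a), len(b)
    se2_a = variance(a) / na  # squared standard error of each group mean
    se2_b = variance(b) / nb
    t = (mean(a) - mean(b)) / math.sqrt(se2_a + se2_b)
    df = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (na - 1) + se2_b ** 2 / (nb - 1))
    return t, df


def mann_whitney_u(a, b):
    """U statistic for sample a: each pair with a > b counts 1, each tie 0.5."""
    return sum(1.0 if x > y else 0.5 if x == y else 0.0 for x in a for y in b)


# Made-up metric values for two small groups (illustration only)
group_a = [1.0, 2.0, 3.0, 4.0, 5.0]
group_b = [2.0, 4.0, 6.0, 8.0, 10.0]
t_stat, dof = welch_t(group_a, group_b)
u_stat = mann_whitney_u(group_a, group_b)
```

Welch's test does not pool the two variances, which suits a comparison between datasets of different size and spread such as 136 CATS participants and 336 HCP-D participants.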
Extraction of imaging phenotypes
To facilitate further reuse of the CATS dataset, we extracted five sMRI-derived imaging phenotypes, one rsfMRI-derived imaging phenotype, and two dMRI-derived imaging phenotypes. Before extraction, data quality outliers for each modality were first excluded. Since rsfMRI and dMRI preprocessing depend on usable sMRI images, subjects with poor sMRI quality were excluded from further analyses of sMRI, rsfMRI, and dMRI. The sMRI quality was evaluated using FreeSurfer’s Euler number (FreeSurfer v6)54, which summarizes cortical surface reconstruction quality and reflects the quality of the sMRI image55. Subjects with Euler numbers exceeding the third quartile (Q3) plus three times the interquartile range (IQR) were excluded as extreme outliers56,57. For rsfMRI, subjects with an average FD greater than 0.2 mm were excluded49. For dMRI, subjects who were extreme outliers in terms of NDC or average FD were excluded.
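The extreme-outlier rule described above (values beyond Q3 + 3 × IQR) can be sketched as follows. The quartile interpolation method is an assumption, since the text does not state which one was used, and the input values are invented.

```python
from statistics import quantiles


def extreme_upper_outliers(values):
    """Return indices of values above Q3 + 3 * IQR, plus the cutoff itself."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    cutoff = q3 + 3.0 * (q3 - q1)
    flagged = [i for i, v in enumerate(values) if v > cutoff]
    return flagged, cutoff


# Invented quality scores for nine subjects; the last one is far from the rest
scores = [10, 12, 14, 16, 18, 20, 22, 24, 200]
flagged, cutoff = extreme_upper_outliers(scores)
```

Using three times the IQR rather than the more common 1.5 keeps the rule conservative, so only subjects far from the bulk of the distribution are dropped.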
The raw sMRI data were processed using FreeSurfer. The individual-level sMRI-derived imaging phenotypes extracted from FreeSurfer outputs included estimated total intracranial volume (ICV), cortical thickness (CTh), cortical surface area (SA), gray matter volume (GMV), and local gyrification index (LGI). ICV encompasses all intracranial volumes of gray matter, white matter, and cerebrospinal fluid. CTh measures the distance between gray-white matter and pial boundaries in specific cortical regions. SA measures the area of the outer (pial) surface of specific cortical regions. GMV quantifies the volume of gray matter within specific brain regions. LGI quantifies cortical folding at each cortical location by measuring the ratio of the buried (sulcal) SA to the outer (pial) SA. These metrics provide insights into cortical structure, allowing investigations into developmental and pathological changes in brain morphology. The Destrieux atlas was used to obtain regional values for CTh, SA, GMV, and LGI58. To enable comparison with the reference dataset (the age-matched subgroup of 336 subjects from the HCP-D dataset), we transformed FreeSurfer outputs to standard CIFTI grayordinate space using Ciftify (v2.3.3)59. The HCP-D sMRI data were preprocessed using the HCP structural pipeline (v4.3.0)60. We calculated the mean and variance of CTh for both datasets and assessed their relationship using Pearson correlation.
The raw rsfMRI data were preprocessed using fMRIPrep (v22.1.1)61 and XCP-D (v0.4)62. The fMRIPrep steps here included brain extraction, head motion correction, slice timing correction, susceptibility distortion correction, and brain normalization. The XCP-D steps included nuisance signal regression (six head motion parameters, cerebrospinal fluid signal, white matter signal, and global signal) and bandpass filtering (0.01–0.08 Hz). The HCP-D rsfMRI data were preprocessed using the HCP functional pipeline (v4.3.0)60. Preprocessed rsfMRI time series from the XCP-D or HCP outputs in the standard CIFTI grayordinate space were used to calculate individual-level resting-state functional connectivity (RSFC) matrices based on Gordon’s parcellation of 333 regions63. RSFC measures the statistical relationships (e.g., correlations) between different brain regions over time, providing insight into functional communication across the brain. We then calculated the mean and variance of RSFC for each dataset and computed the Pearson correlation of mean RSFC between the two datasets.
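As an illustration of the RSFC step, the sketch below builds a correlation matrix from parcel-averaged time series and extracts its upper triangle, which is the vector one would correlate between two datasets. It is a simplified stand-in for the actual pipeline outputs; the function names and the three toy time series are assumptions.

```python
import math


def pearson_r(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def rsfc_matrix(parcel_timeseries):
    """Symmetric parcel-by-parcel correlation matrix with a unit diagonal."""
    n = len(parcel_timeseries)
    fc = [[1.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            r = pearson_r(parcel_timeseries[i], parcel_timeseries[j])
            fc[i][j] = fc[j][i] = r
    return fc


def upper_triangle(fc):
    """Unique off-diagonal connections as a flat list."""
    n = len(fc)
    return [fc[i][j] for i in range(n) for j in range(i + 1, n)]


# Three toy parcels: the second tracks the first, the third is anticorrelated
toy = [[1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0], [4.0, 3.0, 2.0, 1.0]]
edges = upper_triangle(rsfc_matrix(toy))
```

With the 333 parcels used here, the upper triangle holds 333 × 332 / 2 = 55,278 unique connections per subject; averaging these vectors across subjects in each dataset and passing the two group means to `pearson_r` gives a between-dataset correlation of the kind reported in the Technical Validation.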
The raw dMRI data were reoriented to standard orientation and preprocessed using the QSIPrep pipeline (v0.17.0)51. Scilpy (v1.1.0) was used to extract the b = 1000 s/mm² shell, and diffusion tensor model fitting was conducted using FSL (v6.0)64. The resulting fractional anisotropy (FA) and mean diffusivity (MD) images were analyzed using Tract-Based Spatial Statistics (TBSS)65, and mean values were extracted within the white matter regions defined by the Johns Hopkins University atlas66. The MD values were multiplied by 10,000 to convert to units of 10⁻⁴ mm²/s. FA reflects the degree to which diffusion is directionally constrained, while MD measures the overall magnitude of diffusion. Both are commonly used to examine the microstructural properties of white matter. For the HCP-D dMRI data, similar preprocessing steps were used, but the b = 1500 s/mm² shell was utilized instead. We calculated the mean and variance of FA and MD for each dataset and computed the Pearson correlation of mean FA and MD between the two datasets.
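For reference, FA and MD follow from the three eigenvalues of the fitted diffusion tensor using the standard formulas. The sketch below is not the FSL implementation; the function name and the eigenvalues are illustrative, with the example values chosen to be in a plausible range for white matter.

```python
import math


def fa_md(l1, l2, l3):
    """Fractional anisotropy and mean diffusivity from tensor eigenvalues.
    Assumes at least one eigenvalue is non-zero."""
    md = (l1 + l2 + l3) / 3.0
    num = (l1 - md) ** 2 + (l2 - md) ** 2 + (l3 - md) ** 2
    den = l1 ** 2 + l2 ** 2 + l3 ** 2
    fa = math.sqrt(1.5 * num / den)
    return fa, md


# Illustrative eigenvalues in mm^2/s
fa_demo, md_demo = fa_md(1.7e-3, 0.3e-3, 0.3e-3)
md_scaled = md_demo * 10_000  # same scaling as in the text: units of 1e-4 mm^2/s

# Limiting cases: isotropic diffusion and diffusion along a single axis
fa_iso, _ = fa_md(1.0e-3, 1.0e-3, 1.0e-3)
fa_line, _ = fa_md(1.0, 0.0, 0.0)
```

The two limiting cases show the range of FA, from zero for isotropic diffusion to one for diffusion restricted to a single direction.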
To provide users with alternative preprocessing options, we employed additional processing pipelines for each imaging modality. For the raw sMRI data, we used the Computational Anatomy Toolbox (v12)67 to extract CTh and GMV. The rsfMRI data were alternatively processed using DPABI (v8.2)68, which performed slice timing correction, motion correction, sMRI-rsfMRI co-registration, brain segmentation, nuisance signal regression (the parameters of Friston 24-parameter head motion model69, cerebrospinal fluid signal, white matter signal, and global signal), brain normalization, spatial smoothing, bandpass filtering, and RSFC calculation using the same parcellation scheme as our primary processing methods. For the raw dMRI data, we implemented the FSL diffusion tensor imaging pipeline, which included reorientation to standard orientation, eddy current correction70, motion correction, diffusion tensor model fitting (using the b = 1000 s/mm² shell), and TBSS analysis70. Regional FA and MD values were calculated using the Johns Hopkins University atlas66, consistent with our primary processing approach.
Data anonymization
To protect participant privacy, we thoroughly de-identified the CATS dataset. First, we manually checked all data to ensure the absence of personally identifiable information (e.g., name, date of birth, or date of scan). For imaging data, we performed facial anonymization using FSL’s fsl_deface and then visually inspected the resulting images71. Any images that were not successfully defaced were processed with pydeface instead72, ensuring that facial features could not be reconstructed while preserving critical brain structures for analysis. We also carefully reviewed metadata (including NIfTI file headers and JSON files) to confirm that no personally identifiable information remained in either raw or derived imaging data. Finally, we conducted thorough manual checks of non-imaging phenotyping data to remove any direct identifiers or potentially identifying details, thereby maintaining strict confidentiality throughout the dataset.
Data Records
Dataset access and organization
The CATS dataset is accessible through the Brain Science Data Center website of the Chinese Academy of Sciences (https://doi.org/10.12412/BSDC.1736128526.40001)73. Instructions for requesting access are provided in the Usage Notes section of this manuscript. Due to the dataset’s substantial size, users are advised to utilize download tools optimized for handling large files.
Fig. 2 presents the top-level folder hierarchy of the dataset. It includes raw (unprocessed) imaging data, derived imaging data, and non-imaging phenotyping data. The detailed structure of each subfolder within the derived imaging data directory is illustrated in Figs. S1–S8.
Imaging data
The imaging data adhere to the Brain Imaging Data Structure (BIDS) specification74. Basic demographic information such as age, sex, handedness, and family ID of the participants can be found in the file participants.tsv, with column descriptions provided in participants.json. Participant data are organized in subfolders named after their corresponding subject IDs. Within each subject folder, raw data from different imaging modalities are stored in the anat (sMRI), func (rsfMRI), and dwi (dMRI) subfolders, respectively (Fig. 2). Each scan within these modality folders is stored as a NIfTI file (.nii), accompanied by a JSON file (.json) that provides MRI acquisition parameters. For the sMRI data, facial features have been obscured to protect participant privacy75. For the dMRI data, gradient orientation information is provided in files named *_dwi.bvec and *_dwi.bval. All available data are included in the dataset, regardless of the quality assessment results.
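Because twin analyses start from pairing co-twins, a short sketch of reading participants.tsv and grouping subjects by family may be useful. The column name `participant_id` is the BIDS convention; the family column name `family_id`, the subject labels, and the demo rows are hypothetical and should be checked against participants.json.

```python
import csv
import io
from collections import defaultdict


def twin_pairs(tsv_text, family_col="family_id"):
    """Group participant IDs by family and keep families with two members.

    family_col is a hypothetical column name; see participants.json
    for the actual header used in the dataset.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    families = defaultdict(list)
    for row in reader:
        families[row[family_col]].append(row["participant_id"])
    return {fam: ids for fam, ids in families.items() if len(ids) == 2}


# Invented rows standing in for the contents of participants.tsv
demo_tsv = (
    "participant_id\tage\tsex\tfamily_id\n"
    "sub-001\t15\tF\tfam01\n"
    "sub-002\t15\tF\tfam01\n"
    "sub-003\t13\tM\tfam02\n"
)
pairs = twin_pairs(demo_tsv)
```

Families with a single listed member are dropped here, which matters in practice because not every participant has complete data for every modality.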
Folder structure of the Chongqing Adolescent Twin Study (CATS) dataset. Raw imaging data—together with quality control outputs and preprocessed images—are organized according to the Brain Imaging Data Structure (BIDS) specification. The file CATS_DerivedImagingData_Compact.zip is a lightweight compressed archive that contains only the key subject-level imaging phenotype files, allowing rapid download and inspection of derived metrics. Extracted imaging phenotype values are stored in CATS_imaging_phenotype_data.xlsx, while non-imaging measures are provided in CATS_nonimaging_phenotype_data.xlsx.
The image quality control results and the preprocessed outputs from the primary and alternative processing pipelines are provided in subfolders under the derivatives directory. The compressed archive CATS_DerivedImagingData_Compact.zip, a slimmed-down bundle of derived imaging data, contains only the essential subject-level imaging phenotype files (Fig. S1), making it a lightweight download for rapid access to derived imaging metrics. The compressed archive MRIQC.zip contains image quality control results for the sMRI and rsfMRI data using MRIQC (Fig. S2). Within the compressed archive Freesurfer.zip, the T1w subfolder contains FreeSurfer outputs of the sMRI data in their native space, and the CIFTI subfolder includes FreeSurfer outputs resampled to the standard CIFTI grayordinate space (Fig. S3). The compressed archive CAT12.zip contains outputs from the Computational Anatomy Toolbox for sMRI data (Fig. S4). The compressed archive fMRIPrep.zip stores the fMRIPrep-preprocessed results of the rsfMRI data (Fig. S5), while the compressed archive DPABI.zip contains the DPABI-preprocessed results of the rsfMRI data (Fig. S6). The compressed archive QSIPrep.zip includes dMRI image quality control results and primary processing pipeline outputs generated using QSIPrep, Scilpy, and TBSS (Fig. S7). The compressed archive FSL.zip stores dMRI alternative processing pipeline outputs generated using FSL modules (Fig. S8). Extracted imaging phenotype values are also stored in the file CATS_imaging_phenotype_data.xlsx, with variables and properties described in CATS_imaging_phenotype_descriptions.xlsx.
Non-imaging phenotyping data
The non-imaging phenotyping data include zygosity information and questionnaire results related to cognitive abilities, emotional and social behaviors, familial and parenting dynamics, sleep wellness, stress, as well as anxiety and depression. These data are provided in the file CATS_nonimaging_phenotype_data.xlsx, with variables and properties described in CATS_nonimaging_phenotype_descriptions.xlsx (Fig. 2). All participant-identifiable information has been removed to ensure privacy.
Technical Validation
Imaging data
We validated the quality of the CATS imaging dataset by comparing it with the HCP-D dataset using established metrics. For sMRI, CATS demonstrated significantly higher SNR and CNR and significantly lower FWHM (Fig. 3a), indicating better image quality and sharper anatomical delineation. For rsfMRI, CATS showed significantly better SNR, tSNR, FWHM, average FD, and EFC (Figs. 3b and S9b), reflecting enhanced image quality, sharpness, and head motion control. For dMRI, CATS exhibited significantly better NDC, average FD, and NBS than HCP-D (Fig. 3c), suggesting improved spatial consistency among neighboring diffusion-weighted volumes, reduced head motion artifacts, and fewer corrupted slices. Although HCP-D outperformed CATS in certain sMRI (FBER, EFC; Figs. 3a, S9a), rsfMRI (DVARS, GCOR, FBER; Fig. S9b), and dMRI metrics (FCI; Fig. 3c), CATS matched or exceeded HCP-D on the majority of quality metrics in every imaging modality.
Data quality comparison between the Chongqing Adolescent Twin Study (CATS) and Lifespan Human Connectome Project Development (HCP-D) datasets. This figure compares the data quality metrics of the CATS with those of the HCP-D datasets. For each quality control metric, the beeswarm plot circled with a dashed box indicates the dataset with superior quality. An asterisk (*) signifies that the means of the two distributions are significantly different (p < 0.05). (a) Structural MRI (sMRI) quality control metrics: Signal-to-noise ratio (SNR), full width at half maximum (FWHM), contrast-to-noise ratio (CNR), and foreground-background energy ratio (FBER). (b) Resting-state functional MRI (rsfMRI) quality control metrics: SNR, temporal signal-to-noise ratio (tSNR), FWHM, and average framewise displacement (Average FD). (c) Diffusion MRI (dMRI) quality control metrics: Neighboring diffusion-weighted imaging correlation (NDC), Average FD, number of bad slices (NBS), and fiber coherence index (FCI).
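The per-metric comparison described above can be illustrated with a short sketch. The text reports only that the means of the two distributions differ at p < 0.05 and does not name the statistical test, so the two-sided permutation test below is an assumption chosen because it needs no distributional model. The function name `permutation_test_means`, the variable names, and all SNR values are hypothetical and are not taken from either dataset.

```python
import random
from statistics import mean


def permutation_test_means(a, b, n_perm=2000, seed=0):
    """Two-sided permutation test for a difference in group means.

    Assumption: the source does not state which test was used; this is
    one generic way to test whether two metric distributions differ.
    """
    rng = random.Random(seed)
    observed = abs(mean(a) - mean(b))
    pooled = list(a) + list(b)
    n_a = len(a)
    hits = 0
    for _ in range(n_perm):
        # Randomly reassign the pooled values to the two groups.
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            hits += 1
    # The add-one correction keeps the estimated p-value above zero.
    return (hits + 1) / (n_perm + 1)


# Hypothetical per-subject SNR values for two datasets (illustration only).
cats_snr = [10.2, 11.1, 10.8, 11.5, 10.9, 11.3, 10.7, 11.0]
hcpd_snr = [9.1, 9.4, 8.9, 9.8, 9.3, 9.0, 9.6, 9.2]

p_value = permutation_test_means(cats_snr, hcpd_snr)
print(f"p = {p_value:.4f}, significant at 0.05: {p_value < 0.05}")
```

Whether a larger or smaller value counts as better depends on the metric, so the direction of a significant difference has to be interpreted separately for each one.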
We also evaluated extracted imaging phenotypes from the three modalities. Comparing a representative CATS subject to a representative HCP-D subject revealed similar spatial patterns across sMRI, rsfMRI, and dMRI measures (Fig. 4a). At the group level, the mean and variance of CTh, RSFC, FA, and MD showed strong concordance with the HCP-D dataset. Specifically, the spatial correlations between CATS and HCP-D for the group-mean maps (Fig. 4b) and the variance maps (Fig. 4c) were 0.89 and 0.76 for CTh, 0.83 and 0.85 for RSFC, 0.87 and 0.73 for FA, and 0.87 and 0.94 for MD, respectively. These high inter-dataset correlations demonstrate that the CATS dataset reliably captures well-established neuroimaging phenotypes.
Comparison of extracted imaging phenotypes between the Chongqing Adolescent Twin Study (CATS) and the Lifespan Human Connectome Project Development (HCP-D) datasets. (a) Maps of individual-level cortical thickness (CTh; first column), resting-state functional connectivity (RSFC; second column), fractional anisotropy (FA; third column), and mean diffusivity (MD; fourth column) for a representative subject from the CATS dataset and a representative subject from the HCP-D dataset. (b) Group-average maps of CTh (first column), RSFC (second column), FA (third column), and MD (fourth column) for the CATS and HCP-D datasets. The spatial correlations between the two datasets for these group-average maps are: CTh, 0.89; RSFC, 0.83; FA, 0.87; MD, 0.87. (c) Variance maps of CTh (first column), RSFC (second column), FA (third column), and MD (fourth column) for the CATS and HCP-D datasets. The spatial correlations between the variance maps of the two datasets are: CTh, 0.76; RSFC, 0.85; FA, 0.73; MD, 0.94. In this figure, the unit of MD is 10⁻⁴ mm²/s. For visualization purposes, FA and MD maps display each white matter region of interest (ROI) in full size instead of only the Tract-Based Spatial Statistics (TBSS) skeleton. Abbreviations: Undef., undefined; DMN, default-mode network; PON, parieto-occipital network; FPN, frontoparietal network; SN, salience network; CON, cingulo-opercular network; PMN, parietal memory network; DAN, dorsal attention network; VAN, ventral attention network; VN, visual network; SMN, somatomotor network; L-SMN, lateral somatomotor network; AN, auditory network.
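The group-level comparison can be sketched as follows: for each dataset, compute a mean and a variance across subjects for every region, then correlate the resulting maps between datasets. This is a minimal illustration that assumes Pearson correlation and sample variance, neither of which is specified in the text; the function names and the toy values are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def pearson(x, y):
    """Pearson correlation of two equal-length sequences.

    Both inputs must vary; a constant map would give a zero denominator.
    """
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def group_maps(subjects):
    """Return (mean map, variance map) across subjects.

    `subjects` is a list of per-subject lists holding one value per region.
    Sample variance (n - 1 denominator) is an assumption.
    """
    regions = list(zip(*subjects))
    return [mean(r) for r in regions], [variance(r) for r in regions]


# Toy cortical thickness values (mm): 3 subjects x 4 regions per dataset.
dataset_a = [[2.5, 3.0, 2.0, 2.8], [2.7, 3.2, 2.1, 2.6], [2.6, 3.1, 2.3, 2.7]]
dataset_b = [[2.4, 3.1, 2.1, 2.9], [2.8, 3.3, 2.0, 2.5], [2.6, 3.0, 2.4, 2.6]]

mean_a, var_a = group_maps(dataset_a)
mean_b, var_b = group_maps(dataset_b)
print("mean-map correlation:", round(pearson(mean_a, mean_b), 3))
print("variance-map correlation:", round(pearson(var_a, var_b), 3))
```

With real data, each per-subject list would hold one phenotype value per atlas region (or per connection for RSFC), in the same region order for both datasets.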
The CATS dataset also displayed biological patterns consistent with known developmental and sex-related differences, further supporting its quality. ICV remained stable across the studied age range (Fig. 5a), while CTh, SA, and LGI decreased with age, mirroring established developmental trajectories76,77 (Fig. 5b-d). Additionally, sex differences in sMRI measures aligned with known patterns76,77, with males exhibiting larger values typically associated with larger head sizes (Fig. 5).
Relationship between structural MRI (sMRI) imaging phenotypes, age, and sex. Scatter plots illustrating the correlations between age (x-axis) and various sMRI imaging phenotypes (y-axis), with data points colored by sex. (a) Estimated total intracranial volume (ICV). (b) Mean cortical thickness (CTh). (c) Mean cortical surface area (SA). (d) Mean local gyrification index (LGI).
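As a rough illustration of how such an age trend can be quantified, the sketch below fits an ordinary least-squares line to a phenotype as a function of age. The text does not describe the model behind Fig. 5, so simple linear regression is an assumption, sex is not modeled, and the ages and thickness values are hypothetical.

```python
from statistics import mean


def ols_slope_intercept(x, y):
    """Least-squares slope and intercept of y regressed on x."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


# Hypothetical ages (years) and mean cortical thickness values (mm).
ages = [12, 13, 14, 15, 16, 17, 18, 19]
cth = [2.90, 2.88, 2.85, 2.84, 2.80, 2.79, 2.76, 2.74]

slope, intercept = ols_slope_intercept(ages, cth)
print(f"Estimated CTh change per year: {slope:.3f} mm")
```

A negative slope corresponds to a decrease with age, and a slope near zero corresponds to a stable measure. Fitting the two sexes separately, or adding sex as a covariate, would be the natural extension.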
Non-imaging phenotyping data
The CATS non-imaging data were collected using 18 standardized questionnaires (Table 2) comprising 673 items in total. The overall missing data rate was low (7.42%), reflecting high participant compliance. To assess data quality, we examined several representative phenotypes. The total behavior problem score (TBPS) from the Youth Self-Report questionnaire (i.e., the adolescent version of the CBCL)29,78, which consists of 112 items rated over the past six months, averaged 53.0 (SD = 20.4). This closely matches the HCP-D sample of a similar age group (mean ± SD = 51.7 ± 17.3), indicating good comparability (Fig. 6a). Sleep quality, assessed via global PSQI scores, was poor in 18.2% (24/132) of participants79, a rate in line with previous reports on Chinese adolescents80,81,82 (Fig. 6b). Additionally, C-WISC IQ scores were more similar within MZ twin pairs than within DZ twin pairs (Fig. 6c), consistent with known genetic influences on IQ83. Collectively, these comparisons against an established dataset and the literature support the robustness and reliability of the CATS non-imaging phenotyping data.
Technical validation of non-imaging phenotypes in the Chongqing Adolescent Twin Study (CATS) dataset. (a) Distribution of Total Behavior Problem Scores (TBPS): This panel displays the distribution of TBPS from the Youth Self-Report questionnaire (the adolescent version of the Child Behavior Checklist) within the CATS dataset. For comparison with the Lifespan Human Connectome Project Development (HCP-D) dataset, the red dashed line indicates the mean TBPS of the CATS dataset (mean = 53.0), while the blue dashed line represents the mean TBPS of the HCP-D dataset (mean = 51.7). (b) Distribution of Global Pittsburgh Sleep Quality Index (PSQI) Scores: This panel illustrates the distribution of global PSQI scores in the CATS dataset. A PSQI score above 7 is classified as indicative of poor sleep quality, which accounts for 18.2% of the subjects in the CATS dataset. (c) Scatter plots of Full-Scale Intelligence Quotient (IQ): This panel presents scatter plots of IQ scores obtained from the Chinese revision of the Wechsler Intelligence Scale for Children (C-WISC) questionnaire. The left plot corresponds to monozygotic (MZ) twins, and the right plot corresponds to dizygotic (DZ) twins, highlighting the relationship between IQ scores within twin pairs.
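Two of the checks above lend themselves to a short sketch: the proportion of participants with poor sleep (global PSQI score above 7, as stated in the caption) and the within-pair similarity of IQ scores for MZ and DZ twins. The text does not say how twin similarity was quantified, so the double-entry Pearson correlation below is an assumption; it was chosen because the result does not depend on which twin is listed first. All names and values are hypothetical.

```python
from math import sqrt
from statistics import mean


def poor_sleep_rate(psqi_scores, threshold=7):
    """Fraction of valid global PSQI scores strictly above the threshold.

    Missing scores are passed as None and excluded from the denominator.
    """
    valid = [s for s in psqi_scores if s is not None]
    return sum(1 for s in valid if s > threshold) / len(valid)


def twin_pair_correlation(pairs):
    """Double-entry Pearson correlation across twin pairs.

    Each pair is entered twice, once in each order (an assumption, not
    necessarily the statistic used in the source).
    """
    x = [a for a, b in pairs] + [b for a, b in pairs]
    y = [b for a, b in pairs] + [a for a, b in pairs]
    mx, my = mean(x), mean(y)
    cov = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sx = sqrt(sum((xi - mx) ** 2 for xi in x))
    sy = sqrt(sum((yi - my) ** 2 for yi in y))
    return cov / (sx * sy)


# Counts chosen to mirror the reported 24 of 132 valid PSQI scores.
psqi = [8] * 24 + [5] * 108 + [None] * 4
print(f"Poor sleep: {poor_sleep_rate(psqi):.1%}")

# Hypothetical full-scale IQ scores for a few MZ and DZ pairs.
mz_pairs = [(102, 104), (95, 97), (118, 115), (88, 90), (110, 108)]
dz_pairs = [(102, 95), (95, 108), (118, 104), (88, 99), (110, 101)]
print("MZ r:", round(twin_pair_correlation(mz_pairs), 2))
print("DZ r:", round(twin_pair_correlation(dz_pairs), 2))
```

A higher correlation in MZ pairs than in DZ pairs is the pattern expected when a trait is under genetic influence.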
Usage Notes
To protect the confidentiality and privacy of adolescent participants, access to the CATS dataset is governed by a data use agreement (DUA)84. Prospective users should complete the following steps:
(1) Create an account on the Brain Science Data Center website of the Chinese Academy of Sciences (https://www.braindatacenter.cn/).
(2) Download, sign, and countersign the DUA. The lead recipient (typically the principal investigator) must sign the form and obtain a countersignature from an authorized institutional official. Email the signed DUA to the corresponding author of this manuscript.
(3) Submit an online access request. On the dataset’s repository webpage73, click “Apply for access” in the File Downloads box. In the Describe your request field, please include a brief statement confirming that the signed DUA has been emailed to the corresponding author.
Access is typically granted within one week of fulfilling these requirements. All data recipients are expected to adhere to the terms specified in the DUA.
Code availability
The scripts used in this study are available online (https://github.com/layerConnectome/CATS).
References
Cortese, S., Aoki, Y. Y., Itahashi, T., Castellanos, F. X. & Eickhoff, S. B. Systematic Review and Meta-analysis: Resting-State Functional Magnetic Resonance Imaging Studies of Attention-Deficit/Hyperactivity Disorder. J Am Acad Child Psy 60, 61–75, https://doi.org/10.1016/j.jaac.2020.08.014 (2021).
Cattarinussi, G. et al. Cortical alterations in relatives of patients with bipolar disorder: A review of magnetic resonance imaging studies. Journal of Affective Disorders (2023).
Hashem, S. et al. Genetics of structural and functional brain changes in autism spectrum disorder. Transl Psychiat 10, 229 (2020).
Gong, J. Y. et al. Common and distinct patterns of intrinsic brain activity alterations in major depression and bipolar disorder: voxel-based meta-analysis. Transl Psychiat 10, 353, https://doi.org/10.1038/s41398-020-01036-5 (2020).
Arslan, A. Imaging genetics of schizophrenia in the post-GWAS era. Progress in Neuro-Psychopharmacology and Biological Psychiatry 80, 155–165 (2018).
Sullivan, P. F. et al. Psychiatric Genomics: An Update and an Agenda. Am J Psychiat 175, 15–27, https://doi.org/10.1176/appi.ajp.2017.17030283 (2018).
Perna, G., Alciati, A., Sangiorgio, E., Caldirola, D. & Nemeroff, C. B. Personalized Clinical Approaches to Anxiety Disorders. Adv Exp Med Biol 1191, 489–521, https://doi.org/10.1007/978-981-32-9705-0_25 (2020).
Strike, L. T. et al. Genetic Complexity of Cortical Structure: Differences in Genetic and Environmental Factors Influencing Cortical Surface Area and Thickness. Cereb Cortex 29, 952–962, https://doi.org/10.1093/cercor/bhy002 (2019).
Crone, E. A. & Dahl, R. E. Understanding adolescence as a period of social-affective engagement and goal flexibility. Nat Rev Neurosci 13, 636–650, https://doi.org/10.1038/nrn3313 (2012).
Paus, T., Keshavan, M. & Giedd, J. N. Why do many psychiatric disorders emerge during adolescence? Nat Rev Neurosci 9, 947–957, https://doi.org/10.1038/nrn2513 (2008).
Lo Iacono, L. & Carola, V. The impact of adolescent stress experiences on neurobiological development. Semin Cell Dev Biol 77, 93–103, https://doi.org/10.1016/j.semcdb.2017.09.040 (2018).
Powers, A. & Casey, B. J. The Adolescent Brain and the Emergence and Peak of Psychopathology. Journal of Infant, Child, and Adolescent Psychotherapy 14, 3–15, https://doi.org/10.1080/15289168.2015.1004889 (2015).
Insel, T. R. & Cuthbert, B. N. Brain disorders? Precisely. Science 348, 499–500, https://doi.org/10.1126/science.aab2358 (2015).
Hampel, H. et al. The foundation and architecture of precision medicine in neurology and psychiatry. Trends in Neurosciences 46, 176–198, https://doi.org/10.1016/j.tins.2022.12.004 (2023).
van Drunen, L., Dobbelaar, S., Crone, E. A. & Wierenga, L. M. Genetic and environmental influences on structural brain development from childhood to adolescence: A longitudinal twin study on cortical thickness, surface area, and subcortical volume. Dev Cogn Neuros-Neth 68, 101407, https://doi.org/10.1016/j.dcn.2024.101407 (2024).
Fu, Y. et al. Study on the Influence of Genetic and Environmental Factor on Personality of Chongqing Twins Children. Eur Psychiat 28 (2013).
Fu, Y. et al. Genetic influences on resting-state functional networks: A twin study. Hum Brain Mapp 36, 3959–3972, https://doi.org/10.1002/hbm.22890 (2015).
Ma, X. S. et al. Regional Cortical Surface Area in Adolescents: A Preliminary MRI Twin Study of Genetic and Environmental Contributions. Behav Genet 46, 205–216, https://doi.org/10.1007/s10519-015-9755-1 (2016).
Yang, S. et al. A Bivariate Twin Study of Cortical Surface Area and Verbal and Nonverbal Intellectual Skills in Adolescence. Neuroscience 530, 173–180, https://doi.org/10.1016/j.neuroscience.2023.04.009 (2023).
Somerville, L. H. et al. The Lifespan Human Connectome Project in Development: A large-scale study of brain connectivity development in 5-21 year olds. Neuroimage 183, 456–468, https://doi.org/10.1016/j.neuroimage.2018.08.050 (2018).
Van Essen, D. C. et al. The WU-Minn Human Connectome Project: an overview. Neuroimage 80, 62–79, https://doi.org/10.1016/j.neuroimage.2013.05.041 (2013).
Miller, K. L. et al. Multimodal population brain imaging in the UK Biobank prospective epidemiological study. Nat Neurosci 19, 1523–1536, https://doi.org/10.1038/nn.4393 (2016).
Yang, M. J., Tzeng, C. H., Tseng, J. Y. & Huang, C. Y. Determination of twin zygosity using a commercially available STR analysis of 15 unlinked loci and the gender-determining marker amelogenin–a preliminary report. Hum Reprod 21, 2175–2179, https://doi.org/10.1093/humrep/del133 (2006).
Gong, Y. & Cai, T. Manual of Wechsler Intelligence Scale for Children, Chinese revision (C-WISC). (1993).
Grant, D. A. & Berg, E. A. A behavioral analysis of degree of reinforcement and ease of shifting to new responses in a Weigl-type card-sorting problem. Journal of Experimental Psychology 38, 404–411 (1948).
Reading, R. Characteristics of the strengths and difficulties questionnaire in preschool children. Child: Care, Health and Development 39, 457–458, https://doi.org/10.1111/cch.12057_3 (2013).
Fu, Y. et al. 1393 – The study on validity and reliability of questionnaires related to family environment. European Psychiatry 28, 1, https://doi.org/10.1016/S0924-9338(13)76435-6 (2013).
Campbell, A., Converse, P. E. & Rodgers, W. L. The Quality of American Life: Perceptions, Evaluations, and Satisfactions. (Russell Sage Foundation, 1976).
Achenbach, T. M. & Edelbrock, C. Child behavior checklist. Burlington (vt) 7, 371–392 (1991).
Auerbach, R. P., Claro, A., Abela, J. R. Z., Zhu, X. & Yao, S. Understanding Risky Behavior Engagement Amongst Chinese Adolescents. Cognitive Therapy and Research 34, 159–167, https://doi.org/10.1007/s10608-009-9238-x (2010).
Gong, Y. Eysenck Personality Questionnaire Revised in China. Information on Psychological Sciences 4, 11–18 (1984).
Fu, Y. et al. Can parenting styles and dimensions questionnaire (PSDQ) be used in China? Psychology 4, 535 (2013).
Byles, J., Byrne, C., Boyle, M. H. & Offord, D. R. Ontario Child Health Study: reliability and validity of the general functioning subscale of the McMaster Family Assessment Device. Family process 27, 97–104 (1988).
Phillips, M. R. Preliminary evaluation of Chinese version of FACES II and FES: comparison of normal families and families of schizophrenic patients. Chinese Mental Health Journal (1991).
Buysse, D. J., Reynolds, C. F., Monk, T. H., Berman, S. R. & Kupfer, D. J. The Pittsburgh sleep quality index: A new instrument for psychiatric practice and research. Psychiatry Research 28, 193–213 (1989).
Goldberg, D. P. & Williams, P. A User’s Guide to the General Health Questionnaire. (NFER-Nelson, 1988).
Xie, Y. Reliability and validity of the simplified Coping Style Questionnaire. Chinese Journal of Clinical Psychology 6, 114–115 (1998).
Zung, W. W. K. A rating instrument for anxiety disorders. Psychosomatics 12, 371–379 (1971).
Beck, A. T., Ward, C. H., Mendelson, M., Mock, J. & Erbaugh, J. An inventory for measuring depression. Archives of general psychiatry 4, 561–571 (1961).
Esteban, O. et al. MRIQC: Advancing the automatic prediction of image quality in MRI from unseen sites. Plos One 12, e0184661, https://doi.org/10.1371/journal.pone.0184661 (2017).
Dietrich, O., Raya, J. G., Reeder, S. B., Reiser, M. F. & Schoenberg, S. O. Measurement of signal-to-noise ratios in MR images: influence of multichannel coils, parallel imaging, and reconstruction filters. J Magn Reson Imaging 26, 375–385, https://doi.org/10.1002/jmri.20969 (2007).
Friedman, L. et al. Test-retest and between-site reliability in a multicenter fMRI study. Hum Brain Mapp 29, 958–972, https://doi.org/10.1002/hbm.20440 (2008).
Cox, R. W. AFNI: Software for Analysis and Visualization of Functional Magnetic Resonance Neuroimages. Computers and Biomedical Research 29, 162–173, https://doi.org/10.1006/cbmr.1996.0014 (1996).
Magnotta, V. A. & Friedman, L. Measurement of Signal-to-Noise and Contrast-to-Noise in the fBIRN Multicenter Imaging Study. J Digit Imaging 19, 140–147, https://doi.org/10.1007/s10278-006-0264-x (2006).
Zarrar, S., Steven, G., Qingyang, L., Yassine, B. & Cameron, C. The Preprocessed Connectomes Project Quality Assessment Protocol - a resource for measuring the quality of MRI data. Front Neurosci-Switz 9 (2015).
Atkinson, D., Hill, D. L., Stoyle, P. N., Summers, P. E. & Keevil, S. F. Automatic correction of motion artifacts in magnetic resonance images using an entropy focus criterion. IEEE Trans Med Imaging 16, 903–910, https://doi.org/10.1109/42.650886 (1997).
Krüger, G. & Glover, G. H. Physiological noise in oxygenation-sensitive magnetic resonance imaging. Magnetic Resonance in Medicine 46, 631–637, https://doi.org/10.1002/mrm.1240 (2001).
Jenkinson, M., Bannister, P., Brady, M. & Smith, S. Improved optimization for the robust and accurate linear registration and motion correction of brain images. Neuroimage 17, 825–841, https://doi.org/10.1016/s1053-8119(02)91132-8 (2002).
Power, J. D., Barnes, K. A., Snyder, A. Z., Schlaggar, B. L. & Petersen, S. E. Spurious but systematic correlations in functional connectivity MRI networks arise from subject motion. Neuroimage 59, 2142–2154, https://doi.org/10.1016/j.neuroimage.2011.10.018 (2012).
Saad, Z. S. et al. Correcting brain-wide correlation differences in resting-state FMRI. Brain Connect 3, 339–352, https://doi.org/10.1089/brain.2013.0156 (2013).
Cieslak, M. et al. QSIPrep: an integrative platform for preprocessing and reconstructing diffusion MRI data. Nat Methods 18, 775–778, https://doi.org/10.1038/s41592-021-01185-5 (2021).
Yeh, F.-C. et al. Differential tractography as a track-based biomarker for neuronal injury. Neuroimage 202, 116131, https://doi.org/10.1016/j.neuroimage.2019.116131 (2019).
Schilling, K. G. et al. A fiber coherence index for quality control of B-table orientation in diffusion MRI scans. Magnetic Resonance Imaging 58, 82–89, https://doi.org/10.1016/j.mri.2019.01.018 (2019).
Fischl, B. FreeSurfer. Neuroimage 62, 774–781, https://doi.org/10.1016/j.neuroimage.2012.01.021 (2012).
Rosen, A. F. G. et al. Quantitative assessment of structural image quality. Neuroimage 169, 407–418, https://doi.org/10.1016/j.neuroimage.2017.12.059 (2018).
Tukey, J. W. Exploratory Data Analysis. (Addison-Wesley Pub. Co., 1977).
Ma, Z., Reich, D. S., Dembling, S., Duyn, J. H. & Koretsky, A. P. Outlier detection in multimodal MRI identifies rare individual phenotypes among more than 15,000 brains. Hum Brain Mapp 43, 1766–1782, https://doi.org/10.1002/hbm.25756 (2022).
Destrieux, C., Fischl, B., Dale, A. & Halgren, E. Automatic parcellation of human cortical gyri and sulci using standard anatomical nomenclature. Neuroimage 53, 1–15, https://doi.org/10.1016/j.neuroimage.2010.06.010 (2010).
Dickie, E. W. et al. Ciftify: A framework for surface-based analysis of legacy MR acquisitions. Neuroimage 197, 818–826, https://doi.org/10.1016/j.neuroimage.2019.04.078 (2019).
Glasser, M. F. et al. The minimal preprocessing pipelines for the Human Connectome Project. Neuroimage 80, 105–124, https://doi.org/10.1016/j.neuroimage.2013.04.127 (2013).
Esteban, O. et al. fMRIPrep: a robust preprocessing pipeline for functional MRI. Nat Methods 16, 111–116, https://doi.org/10.1038/s41592-018-0235-4 (2019).
Mehta, K. et al. XCP-D: A robust pipeline for the post-processing of fMRI data. Imaging Neuroscience 2, 1–26, https://doi.org/10.1162/imag_a_00257 (2024).
Gordon, E. M. et al. Generation and Evaluation of a Cortical Area Parcellation from Resting-State Correlations. Cereb Cortex 26, 288–303, https://doi.org/10.1093/cercor/bhu239 (2016).
Jenkinson, M., Beckmann, C. F., Behrens, T. E., Woolrich, M. W. & Smith, S. M. FSL. Neuroimage 62, 782–790, https://doi.org/10.1016/j.neuroimage.2011.09.015 (2012).
Smith, S. M. et al. Tract-based spatial statistics: voxelwise analysis of multi-subject diffusion data. Neuroimage 31, 1487–1505, https://doi.org/10.1016/j.neuroimage.2006.02.024 (2006).
Mori, S. et al. Stereotaxic white matter atlas based on diffusion tensor imaging in an ICBM template. Neuroimage 40, 570–582, https://doi.org/10.1016/j.neuroimage.2007.12.035 (2008).
Gaser, C. et al. CAT: a computational anatomy toolbox for the analysis of structural MRI data. Gigascience 13, https://doi.org/10.1093/gigascience/giae049 (2024).
Yan, C.-G., Wang, X.-D., Zuo, X.-N. & Zang, Y.-F. DPABI: data processing & analysis for (resting-state) brain imaging. Neuroinformatics 14, 339–351 (2016).
Friston, K. J., Williams, S., Howard, R., Frackowiak, R. S. & Turner, R. Movement-related effects in fMRI time-series. Magn Reson Med 35, 346–355 (1996).
Andersson, J. L. & Sotiropoulos, S. N. An integrated approach to correction for off-resonance effects and subject movement in diffusion MR imaging. Neuroimage 125, 1063–1078 (2016).
Jenkinson, M., Beckmann, C. F., Behrens, T. E., Woolrich, M. W. & Smith, S. M. FSL. Neuroimage 62, 782–790, https://doi.org/10.1016/j.neuroimage.2011.09.015 (2012).
Gulban, O. F. et al. poldracklab/pydeface: PyDeface v2.0.2 (v2.0.2). (Zenodo, 2022).
Ma, Z. & Fu, Y. An Integrative Dataset from the Chongqing Adolescent Twin Study. Brain Science Data Center, Chinese Academy of Sciences, https://doi.org/10.12412/bsdc.1736128526.40001 (2025).
Gorgolewski, K. J. et al. The brain imaging data structure, a format for organizing and describing outputs of neuroimaging experiments. Sci Data 3, 160044, https://doi.org/10.1038/sdata.2016.44 (2016).
Milchenko, M. & Marcus, D. Obscuring Surface Anatomy in Volumetric Imaging Data. Neuroinformatics 11, 65–75, https://doi.org/10.1007/s12021-012-9160-3 (2013).
Tamnes, C. K. et al. Development of the Cerebral Cortex across Adolescence: A Multisample Study of Inter-Related Longitudinal Changes in Cortical Volume, Surface Area, and Thickness. J Neurosci 37, 3402–3412, https://doi.org/10.1523/jneurosci.3302-16.2017 (2017).
Bethlehem, R. A. I. et al. Brain charts for the human lifespan. Nature 604, 525–533, https://doi.org/10.1038/s41586-022-04554-y (2022).
Achenbach, T. M. & Rescorla, L. Multicultural supplement to the Manual for the ASEBA school-age forms & profiles: child behavior checklist for ages 6-18, teacher’s report form, youth self-report: an integrated system of multi-informant assessment. (ASEBA distributor, 2007).
Liu, X. C. & Tang, M. Q. Reliability and validity of the Pittsburgh Sleep Quality Index. Chinese Journal of Psychiatry 29, 103–107 (1996).
Chung, K. F. & Cheung, M. M. Sleep-wake patterns and sleep disturbance among Hong Kong Chinese adolescents. Sleep 31, 185–194, https://doi.org/10.1093/sleep/31.2.185 (2008).
Liu, X., Zhao, Z., Jia, C. & Buysse, D. J. Sleep Patterns and Problems Among Chinese Adolescents. Pediatrics 121, 1165–1173, https://doi.org/10.1542/peds.2007-1464 (2008).
Xu, Z. et al. Sleep quality of Chinese adolescents: Distribution and its associated factors. Journal of Paediatrics and Child Health 48, 138–145, https://doi.org/10.1111/j.1440-1754.2011.02065.x (2012).
Heutink, P., Verhulst, F. C. & Boomsma, D. I. A longitudinal twin study on IQ, executive functioning, and attention problems during childhood and early adolescence. Acta Neurol Belg 106, 191–207 (2006).
Ma, Z. CATS Data Use Agreement. Zenodo https://doi.org/10.5281/zenodo.15335779 (2025).
Acknowledgements
This work was supported in part by funding from the Natural Science Foundation of Shanghai (24ZR1451500, Z.M.), Shanghai Scientific Instruments and Chemical Reagents Project (24142201100, Z.M.), the startup grant from ShanghaiTech University (Z.M.), and the National Natural Science Foundation of China (81101025, Y.F.). Research reported in this publication was also supported by the National Institute of Mental Health of the National Institutes of Health under Award Number U01MH109589 and by funds provided by the McDonnell Center for Systems Neuroscience at Washington University in St. Louis. The HCP-Development 2.0 Release data used in this report came from DOI: 10.15154/1520708. This work used the high-performance computing platform at ShanghaiTech University.
Author information
Contributions
Y.F. and Z.M. designed the study; Y.F., X.M., X.H. and T.L. collected the data; Y.Z., R.W. and Z.M. performed the analyses; Y.Z., J.H. and Z.M. drafted the manuscript and all the authors revised the manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Zhu, Y., Fu, Y., Han, J. et al. The Chongqing Adolescent Twin Study: An Integrative Multimodal Brain Imaging and Non-imaging Dataset. Sci Data 12, 1211 (2025). https://doi.org/10.1038/s41597-025-05449-z
DOI: https://doi.org/10.1038/s41597-025-05449-z