Multivariate genome-wide analysis reveals shared genetic architecture and brain structural correlates of human cognitive abilities

Chen, Haifeng; Liao, Yuxiong; Tang, Luejun; Wei, Xiaoyun; Li, Tongshun; Chen, Wei

doi:10.1038/s41598-025-25509-z

Article
Open access
Published: 24 November 2025

Haifeng Chen¹^na1,
Yuxiong Liao¹^na1,
Luejun Tang¹,
Xiaoyun Wei¹,
Tongshun Li¹ &
…
Wei Chen¹

Scientific Reports volume 15, Article number: 41596 (2025)

Abstract

Limited knowledge currently exists regarding the shared genetic architecture influencing human cognitive ability-related traits. We utilized Genomic Structural Equation Modeling (Genomic-SEM) along with multiple post-GWAS methods to estimate potentially causal single nucleotide polymorphisms (SNPs) associated with cognitive ability phenotypic variation. Our study identified 3,842 genome-wide significant loci, including 275 novel loci. Applying multiple transcriptome-wide association methods, we analyzed susceptibility gene signal loci highly correlated with cognitive ability GWAS from tissue, cellular, and genomic element perspectives, identifying 13 high-confidence candidate causal genes and related functional element information. Subsequently, we systematically evaluated 80–90% of currently known human common disease data in the IEU database, along with 3,935 brain imaging-derived phenotypes from UK Biobank and cortical morphological features from 51,665 individuals in the ENIGMA database to determine cognitive ability-related susceptibility factors. We applied the BrainXcan pipeline for brain imaging genetic analysis of cognitive abilities. Additionally, we used summary data-based polygenic scoring methods to systematically analyze the genetic contribution of different chromosomes to cognitive abilities and their risk prediction value. Our study, through multivariate GWAS analysis of cognitive ability common genetic factors, contributes to mapping the shared genetic architecture of human cognitive abilities.

Introduction

As Einstein once noted, “Imagination is more important than knowledge.” Cognitive abilities are not merely numbers on intelligence tests, but rather complex and dynamic neurobiological processes characterized by fine-tuned neural networks and synaptic plasticity changes that are profoundly influenced by genetic, environmental, and lifestyle factors¹. With increasing global demands for human capital and educational quality, individual differences in cognitive abilities have an increasingly significant impact on academic achievement, career development, and quality of life, making them important concerns in education, medicine, and socioeconomic fields². Despite significant advances in neurocognitive science and educational neuroscience in recent years, our understanding of the specific genetic and biological mechanisms underlying cognitive abilities remains limited³.

Differences in neurotransmitter system function and synaptic plasticity are considered important biological foundations for individual differences in cognitive abilities^4,5. However, these findings are still insufficient to fully explain the vast differences in cognitive performance and learning abilities among individuals. Twin and family studies have consistently demonstrated that cognitive abilities are substantially heritable, with heritability estimates ranging from 50 to 80% across the lifespan^6,7. Yet despite this strong evidence for genetic influence, genome-wide association studies (GWAS) have historically struggled to identify the specific genetic variants responsible for this heritability, a phenomenon known as the “missing heritability” problem^8,9.

To address these challenges, this study aims to dissect the underlying molecular mechanisms of cognitive abilities through the integration of multiple genetic analysis tools and strong association exploration methods, while exploring associations with various neuropsychiatric disorders. In particular, we focus on multiple genomic loci and chromosomal regions related to cognitive abilities to reveal potential intervention targets for promoting cognitive function and preventing cognitive decline¹⁰. This research not only expands our understanding of the genetic foundations of human cognitive abilities but also provides theoretical and practical support for intervention strategies addressing global educational equity and cognitive health challenges.

To address the current lack of systematic measurement of the shared genetic architecture of cognitive abilities, we designed a multivariate GWAS study targeting latent multidimensional cognitive ability common factors. We employed genomic structural equation modeling (Genomic SEM)¹¹, applying it to published GWAS summary statistics related to intelligence¹², educational attainment¹³, processing speed¹⁴, executive function¹⁵, memory performance¹⁶, and reaction time. By integrating these statistics, we obtained SNP association effects on latent cognitive ability common factors, thereby constructing a GWAS study targeting multidimensional cognitive ability phenotypes that have never been directly measured.

We further adopted comprehensive analysis methods from systems biology, defining the portion of genetic variation in multidimensional cognitive abilities that is not explained by single cognitive domains as the shared genetic foundation of cognitive abilities, and conducted multiple post-GWAS analyses on this foundation^17,18. This approach cannot fully capture the true relationship between cognitive ability-related neural pathways and multifactorial interactions, because cognitive ability is a complex process driven jointly by genetic, environmental, and stochastic factors. It does, however, exclude confounding effects based on single cognitive indicators, thereby enabling systematic analysis of the previously difficult-to-study common genetic architecture of cognitive abilities^19,20.

Finally, we validated the neurobiological foundations of cognitive abilities through the BrainXcan pipeline and brain imaging phenotype analysis, and conducted large-scale association analysis with 50,033 human phenotypes²¹. Our aim is to construct an easily understandable genetic risk map of cognitive abilities for clinicians, educators, and others, enabling them to directly apply genetic information to develop individualized cognitive training and intervention measures. Our research aims to establish a complete translational pathway from genomic statistics to clinical educational practice.

Methods

A flowchart overview is presented in Fig. 1.

GWAS input data sources

The input GWAS data for our cognitive ability structural equation modeling came from six cognitive-related trait GWAS, including Intelligence, Executive Function, Processing Speed, Educational Attainment, Memory Performance, and Reaction Time. All input GWAS studies had ethical approval from their respective institutional review boards, and all participants provided informed consent, with data undergoing rigorous quality control²².

1. Intelligence GWAS data were obtained from Hu et al. 2025 (GCST90310159, n = 110,988)²³. This study was based on UK Biobank fluid intelligence scores (data field 20016), primarily from individuals of British ancestry, using standardized intelligence assessment tools, providing an important foundation for genetic research on general cognitive ability. This dataset has high phenotypic measurement precision and serves as a core phenotype for cognitive ability research.
2. Executive Function GWAS data came from Perry et al. 2025 (GCST90503115, n = 266,413)²⁴. This study analyzed multiple executive function GWAS data through genomic structural equation modeling, improving gene discovery efficiency and providing valuable genetic information for higher-order cognitive control and cognitive flexibility. Executive function, as a core component of cognitive abilities, has important clinical and educational significance.
3. Processing Speed GWAS data were from Carey et al. 2024 (GCST90309368, n = 119,671)²⁵. This study used confirmatory factor analysis (factor 34) to comprehensively evaluate cognitive and processing speed phenotypes, providing an important foundation for genetic research on cognitive processing efficiency. Processing speed, as a fundamental dimension of cognitive ability, reflects the information transmission efficiency of the nervous system.
4. Educational Attainment GWAS came from Xia et al. 2025 (GCST90566697, n = 848,919)²⁶. This study used multi-trait analysis of GWAS (MTAG) to analyze educational attainment data, providing strong statistical power for understanding the genetic basis of educational achievement. Educational attainment, as an important external manifestation of cognitive ability, has the largest sample size advantage.
5. Memory Performance GWAS data were from Hatoum et al. 2022 (GCST90179116, n = 162,335)²⁷. This study evaluated genetic variation in memory function based on prospective memory tasks, providing a foundation for research on genetic mechanisms of memory-related cognitive processes. Memory performance, as an important component of cognitive ability, reflects the capacity for information encoding, storage, and retrieval.
6. Reaction Time GWAS data came from Hatoum et al. 2022 (GCST90179115, n = 432,297)¹⁴. This study provided important statistical power for understanding the genetic basis of cognitive processing speed and neural conduction efficiency based on reaction time measurements. Reaction time, as a fundamental measurement indicator of cognitive ability, reflects the basic functional efficiency of the nervous system.

All analyses in this study were based on publicly available GWAS summary statistics rather than individual-level genotype data. The constituent GWAS had performed quality control procedures including sample quality screening, single nucleotide polymorphism (SNP) quality screening (MAF > 0.01, INFO > 0.8), population stratification correction, and relatedness adjustment²⁸. To ensure data consistency, all analyses were based on the GRCh37/hg19 reference genome and used the 1000 Genomes Project European population as the LD reference panel²⁹.

Quality control of input GWAS

As our analysis utilized publicly available GWAS summary statistics, sample-level quality control and relatedness adjustment were performed by the original GWAS study teams prior to our analysis.

1. Sample-level quality control: Constituent GWAS studies applied standard procedures including exclusion of samples with low genotyping call rate (< 95%), sex discordance (between reported and genetic sex), abnormal heterozygosity (± 3 standard deviations from population mean), and non-European ancestry (determined by principal components analysis with reference populations).
2. Relatedness adjustment: Related individuals (kinship coefficient > 0.0884, approximately third-degree relatives or closer) were identified and either one individual per related pair was excluded, or association testing was performed using linear mixed models that account for the genetic relatedness matrix (in large biobank studies).
3. MHC region handling: The major histocompatibility complex (MHC) region located on chromosome 6 (approximately 25,000,000–35,000,000 bp) received special treatment due to its genetic diversity and structural complexity, particularly the polymorphisms of immune-related genes³⁰.
4. SNP-level filtering in our analysis: We included all autosomal SNPs from the six input GWAS after quality control screening. SNPs were filtered to the 1000 Genomes Phase 3 EUR panel, removing SNPs with MAF < 0.01 (due to few samples within genotype clusters, these SNPs are error-prone and typically have high LD score regression standard errors), removing SNPs with zero effect estimates (which cause problems for the matrix inversion required by genomic structural equation modeling), removing SNPs that do not match the reference panel, and removing SNPs with allele mismatches³¹.
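
The SNP-level filters listed above can be sketched in Python. This is a minimal illustration and not the GenomicSEM munging routine: the record fields (`snp`, `a1`, `a2`, `maf`, `beta`), the `passes_qc` helper, and all values are hypothetical, and strand flips are deliberately not handled.

```python
# Illustrative SNP-level QC; field names, helper, and data are hypothetical.
MAF_MIN = 0.01  # minor allele frequency threshold stated in the text

def passes_qc(rec, ref_alleles):
    """Return True if a summary-statistic record survives all four filters."""
    ref = ref_alleles.get(rec["snp"])
    if ref is None:                 # SNP absent from the reference panel
        return False
    if rec["maf"] < MAF_MIN:        # rare variant
        return False
    if rec["beta"] == 0:            # effect estimated at exactly zero
        return False
    # Allele mismatch: the unordered allele pair must equal the reference pair.
    if {rec["a1"], rec["a2"]} != set(ref):
        return False
    return True

reference = {
    "rs1": ("A", "G"), "rs2": ("C", "T"),
    "rs3": ("A", "C"), "rs5": ("A", "G"),
}
sumstats = [
    {"snp": "rs1", "a1": "G", "a2": "A", "maf": 0.20, "beta": 0.013},   # kept
    {"snp": "rs2", "a1": "C", "a2": "T", "maf": 0.005, "beta": 0.040},  # MAF too low
    {"snp": "rs3", "a1": "A", "a2": "G", "maf": 0.30, "beta": -0.010},  # allele mismatch
    {"snp": "rs4", "a1": "A", "a2": "G", "maf": 0.30, "beta": 0.020},   # not in panel
    {"snp": "rs5", "a1": "A", "a2": "G", "maf": 0.25, "beta": 0.0},     # zero effect
]
kept = [r["snp"] for r in sumstats if passes_qc(r, reference)]
print(kept)
```

Only the first record passes; each of the others trips exactly one filter, which makes the role of every rule easy to check in isolation.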

Sample overlap assessment

In our analysis, the input GWAS came from different genomic information storage sites with distinct participants across studies. This means we adequately considered sample intersections between different cohorts during GWAS analysis to ensure result accuracy and generalizability, while also considering potential statistical impacts from sample overlap³².

Genomic SEM inherently accounts for sample overlap through its use of genetic covariance matrices derived from GWAS summary statistics rather than individual-level data, thereby avoiding biases that could arise from overlapping participants across different studies. The method’s reliance on LD Score regression for genetic covariance estimation provides robust results even in the presence of sample overlap.

Genomic structural equation modeling

1. Rationale for Genomic Structural Equation Modeling. We selected Genomic SEM for its ability to explicitly model latent genetic factors underlying correlated phenotypes while accounting for sample overlap through LD Score regression. Unlike alternatives such as MTAG or genomic PCA, Genomic SEM provides interpretable theoretical constructs (the common cognitive factor) rather than atheoretical linear combinations, with formal model fit evaluation (CFI, SRMR, RMSEA) and SNP-level heterogeneity testing to identify domain-specific effects.
2. We implemented genomic structural equation modeling (Genomic SEM) using the GenomicSEM R package (v.0.0.5)³³ to conduct genomic structural equation GWAS analysis of intelligence, executive function, processing speed, educational attainment, memory performance, and reaction time, investigating the broad common genetic foundation underlying these cognitive ability-related phenotypes. Genomic SEM is a newly developed multivariate method capable of studying multiple potential multivariate models and exploring the potential common genetic structure of the cognitive ability spectrum³⁴. See Table 1 for details.
3. Genomic SEM is not affected by bias from sample overlap (e.g., potential overlap of participants in different cognitive research cohorts) or sample size imbalances³⁵. It also facilitates identification of genetic variants that affect only some, not all, cognitive phenotypes, thus not representing broad cross-cognitive domain genetic associations but potentially reflecting cognitive domain-specific genetic mechanisms.
4. Genomic SEM proceeds in two stages. Stage 1 estimates the empirical genetic covariance matrix and corresponding sampling covariance matrix. We prepared summary statistics from cognitive ability-related trait GWAS for Stage 1 and used multivariate extensions of cross-phenotype LD score regression to generate empirical genetic covariance matrices among the six cognitive phenotypes as input for the SEM common factor model³⁶. Stage 2 specifies an SEM model that minimizes differences between the hypothesized covariance structure and the empirical covariance matrix calculated in Stage 1. Here, our main research objective was to identify the common genetic architecture underlying the six cognitive ability-related phenotypes; therefore, we tested a single-factor model to characterize the cognitive ability common factor. Model fit was evaluated using SRMR, model χ², Akaike Information Criterion (AIC), Comparative Fit Index (CFI), and Root Mean Square Error of Approximation (RMSEA)³⁷. By applying appropriate common factor SEM specifications, individual autosomal SNP associations were incorporated into genetic and corresponding sampling covariance matrices, generating cognitive ability common factor structural equation genome-wide association analysis results for approximately 1,219,586 SNPs sharing covariance among the six input cognitive ability-related GWAS.
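
For orientation, the single-factor model fitted in Stage 2 can be written compactly. This is a sketch under the usual unit-variance identification of the latent factor. The symbols are notation introduced here rather than taken from the article: $S_{\mathrm{LDSC}}$ is the Stage 1 genetic covariance matrix, $V_{S}$ its sampling covariance matrix, $\lambda$ the vector of six factor loadings, and $\Theta$ the diagonal matrix of residual variances. The diagonally weighted least squares discrepancy shown is one estimator available in Genomic SEM; the text does not state which estimator was used.

```latex
\Sigma(\theta) = \lambda \lambda^{\top} + \Theta, \qquad
\lambda \in \mathbb{R}^{6}, \quad
\Theta = \operatorname{diag}(\theta_{1}, \dots, \theta_{6})

\hat{\theta} = \arg\min_{\theta}\;
\bigl(s - \sigma(\theta)\bigr)^{\top} D_{S}^{-1} \bigl(s - \sigma(\theta)\bigr),
\qquad
s = \operatorname{vech}\bigl(S_{\mathrm{LDSC}}\bigr), \quad
\sigma(\theta) = \operatorname{vech}\bigl(\Sigma(\theta)\bigr), \quad
D_{S} = \operatorname{diag}(V_{S})
```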

Table 1 Parameters and quality metrics for genomic structural equation modeling.

SNP heterogeneity assessment

To assess whether SNP associations in our cognitive ability structural equation GWAS were appropriately modeled within the multivariate structural equation model (SEM) framework, we calculated SNP heterogeneity statistics (Q_SNP)³⁸. The null hypothesis of this test is that SNP associations in individual phenotype GWAS can be completely statistically mediated through the cognitive ability structural equation model. Therefore, significant Q_SNP values (P < 0.05) in our cognitive ability structural equation GWAS indicate that specific SNPs exert effects through pathways other than the shared genetic mechanisms established for cognitive ability-related traits in the model. Importantly, SNPs with significant Q_SNP values were retained in our analysis rather than excluded. Unlike traditional fixed-effects meta-analysis where heterogeneous SNPs are typically removed, the Genomic SEM framework interprets Q_SNP heterogeneity as biologically informative, representing domain-specific genetic effects that diverge from the common factor pathway. Excluding such SNPs would discard variants with potential specialized roles in specific cognitive domains.

Multi-level model assessment and locus definition

We conducted multi-level strategic adjustments to our genomic structural equation model, including setting different significance thresholds (P < 5 × 10^–16 and P < 5 × 10^–12) to identify novel SNP loci at different confidence levels, balancing statistical power with false positive control³⁹. We used FUMA (Functional Mapping and Annotation of Genetic Associations) to identify genomic loci and find lead SNPs associated with our cognitive ability structural equation GWAS that have low linkage disequilibrium (r² < 0.1) with other SNPs and genome-wide significance (P < 5 × 10^–8)⁴⁰. Novel loci were defined as those > 1 Mb distant from previously identified loci in univariate GWAS data. LD clumping was performed using a window size of 250 kb and r² < 0.1 threshold to define independent lead SNPs, with FUMA automatically performing conditional analysis within each locus to identify independent secondary signals.
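
The novelty rule (more than 1 Mb from any previously identified locus) can be illustrated with a short Python sketch. The `is_novel` helper, the tuple layout for known loci, and all coordinates are hypothetical; how the authors measured distance (lead SNP to lead SNP, or to locus boundaries) is not stated, so distance to the nearest locus boundary is an assumption here.

```python
# Hypothetical sketch of the ">1 Mb from known loci" novelty rule.
NOVEL_DISTANCE_BP = 1_000_000

def is_novel(chrom, pos, known_loci):
    """known_loci: iterable of (chrom, start_bp, end_bp) tuples."""
    for k_chrom, start, end in known_loci:
        if k_chrom != chrom:
            continue
        # Distance from the lead SNP to the nearest edge of the known locus
        # (zero when the SNP falls inside the locus).
        if pos < start:
            dist = start - pos
        elif pos > end:
            dist = pos - end
        else:
            dist = 0
        if dist <= NOVEL_DISTANCE_BP:
            return False
    return True

known = [(1, 5_000_000, 5_200_000), (2, 10_000_000, 10_050_000)]
print(is_novel(1, 6_100_000, known))  # 900 kb from the first locus
print(is_novel(1, 6_300_000, known))  # 1.1 Mb from the first locus
print(is_novel(3, 5_100_000, known))  # no known locus on this chromosome
```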

Fine-mapping analysis

To identify the most likely causal variants associated with our cognitive ability structural equation GWAS, we used SuSiE and FINEMAP, implemented in the echolocatoR R package v.2.0.3^41,42. We set a probability threshold of 0.95 to define credible sets of potentially causal variants. We used 250 kb windows around each lead SNP to calculate causal inference probabilities for each SNP within these regions. echolocatoR defines a ‘consensus SNP’ as a variant appearing in both the SuSiE and FINEMAP results; average posterior probabilities were calculated for these consensus SNPs.
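
The credible-set and consensus-SNP logic can be sketched as follows. This is a simplified illustration, not echolocatoR's implementation: it builds a single cumulative 95% credible set per tool, whereas SuSiE constructs credible sets per effect component. The helper names and the posterior inclusion probabilities (PIPs) are made up.

```python
# Simplified, hypothetical sketch of 95% credible sets and consensus SNPs.
def credible_set(pips, coverage=0.95):
    """Smallest set of SNPs whose PIPs, taken in descending order, reach `coverage`."""
    chosen, total = [], 0.0
    for snp, pip in sorted(pips.items(), key=lambda kv: kv[1], reverse=True):
        chosen.append(snp)
        total += pip
        if total >= coverage:
            break
    return chosen

def consensus(pips_a, pips_b, coverage=0.95):
    """SNPs present in both credible sets, mapped to the mean of the two PIPs."""
    shared = set(credible_set(pips_a, coverage)) & set(credible_set(pips_b, coverage))
    return {snp: (pips_a[snp] + pips_b[snp]) / 2 for snp in sorted(shared)}

# Made-up PIPs for one 250 kb window.
susie_pips = {"rsA": 0.70, "rsB": 0.20, "rsC": 0.06, "rsD": 0.04}
finemap_pips = {"rsA": 0.90, "rsC": 0.06, "rsB": 0.03, "rsD": 0.01}

print(credible_set(susie_pips))
print(credible_set(finemap_pips))
print(consensus(susie_pips, finemap_pips))
```

In this toy window the two tools agree on rsA and rsC, so those are the consensus SNPs; rsB enters only one credible set and is dropped.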

Transcriptome-wide association studies (TWAS)

Following identification of potentially causal variants, we conducted TWAS to prioritize genes associated with our cognitive ability structural equation GWAS based on relationships between gene expression and phenotypes⁴³. We used the FUSION method for TWAS with pre-computed expression quantitative trait loci (eQTL) features from GTEx v.8 data (37,920 gene/tissue pairs)⁴⁴.

We applied a two-tier significance threshold strategy: (1) Bonferroni correction for 37,920 gene-tissue pairs tested (P < 1.32 × 10^–6) to identify high-confidence transcriptome-wide significant genes, and (2) nominal significance (P < 0.05) to retain genes for comprehensive pathway enrichment and cross-method validation analyses, following established practices in GWAS gene-set analysis.
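
The Bonferroni threshold follows directly from the number of gene-tissue pairs, as this short Python check shows. The `tier` helper and its labels are illustrative names, not part of FUSION.

```python
# Two-tier TWAS significance thresholds; helper name and labels are illustrative.
ALPHA = 0.05
N_TESTS = 37_920  # gene-tissue pairs stated in the text

bonferroni_threshold = ALPHA / N_TESTS
print(f"{bonferroni_threshold:.2e}")  # 1.32e-06

def tier(p_value):
    """Classify a TWAS p-value under the two-tier strategy."""
    if p_value < bonferroni_threshold:
        return "transcriptome-wide significant"
    if p_value < ALPHA:
        return "nominal"
    return "not significant"

print(tier(4e-9), tier(0.003), tier(0.2), sep=" | ")
```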

For TWAS-significant genes, we further applied FOCUS (Fine mapping Of Causal gene Sets), a fine-mapping method specifically designed for TWAS studies⁴⁵. FOCUS evaluates whether genes have causal relationships with phenotypes based on posterior inclusion probabilities. We considered TWAS-significant genes that showed consistency with other evidence as potentially causal⁴⁶.

Functional enrichment analysis

We used MAGMA and FUMA (GSEA) for gene enrichment analysis and gene pathway set analysis, investigating potential relationships between our cognitive ability structural equation GWAS and Mendelian disease genes and their related pathways⁴⁷. Additionally, we used MendelVar (https://mendelvar.mrcieu.ac.uk/submit/) for gene enrichment analysis⁴⁸. For disease and phenotype enrichment analyses, empirical P-values from permutation testing were corrected for multiple testing using the Benjamini–Hochberg false discovery rate (FDR) method. Associations with FDR < 0.05 were considered statistically significant.
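
For readers unfamiliar with the Benjamini–Hochberg adjustment, a minimal pure-Python sketch is given below. The function name and the example p-values are illustrative; the enrichment tools apply their own implementations.

```python
# Minimal Benjamini-Hochberg adjustment; example p-values are made up.
def benjamini_hochberg(pvalues):
    """Return BH-adjusted p-values (q-values) in the original input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices by ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

pvals = [0.001, 0.008, 0.039, 0.041, 0.60]
qvals = benjamini_hochberg(pvals)
significant = [q < 0.05 for q in qvals]
print([round(q, 5) for q in qvals])
print(significant)
```

Note that the third and fourth tests have raw p-values below 0.05 but adjusted values just above it, so only the first two associations would be reported at FDR < 0.05.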

Cell-type annotation analysis

To identify etiological cell types associated with our cognitive ability structural equation GWAS, we used Cell-type Expression-specific integration for Complex Traits (CELLECT) with single-cell RNA sequencing data⁴⁹. We used the Tabula Muris dataset containing transcriptome data from 100,000 cells across 20 organs and tissues from mice. We preprocessed and normalized single-cell RNA sequencing data using CELLEX, calculating expression specificity scores for each gene⁵⁰. Cell-type-specific analysis was performed using LDSC software with a false discovery rate (FDR) threshold of 0.05⁵¹.

Partitioned heritability analysis

We used LDSC to calculate partitioned heritability of genomic regions⁵². By allocating phenotypic genetic information to different genomic regions (genes, enhancers, suppressors, etc.), we evaluated the contribution of each genomic region to phenotypic heritability. LDSC uses weighted LD matrices, genotype frequency files, and summary statistics to calculate and estimate genetic contributions of each region.

Phenotypic association and brain imaging genetics analysis

To comprehensively assess associations between cognitive ability common factors and human diseases as well as neurobiological foundations, we conducted multi-level association analyses⁵³. First, we systematically evaluated associations between cognitive ability common factors and 50,033 human phenotypes in the IEU database, covering 80–90% of currently known common human diseases, biomarkers, drug responses, and lifestyle factors⁵⁴. Second, we used the BrainXcan pipeline for brain imaging genetic analysis of cognitive abilities to identify brain tissue-specific gene expression patterns related to cognitive abilities⁵⁵. Additionally, we used Mendelian randomization methods to assess relationships between cognitive abilities and 3,935 brain imaging-derived phenotypes from UK Biobank, as well as cortical morphological features from 51,665 individuals in the ENIGMA database⁵⁶. MR analyses employed multiple methods to assess robustness and detect horizontal pleiotropy: inverse variance weighted (IVW) as the primary method, supplemented by MR-Egger (testing directional pleiotropy), weighted median (robust to 50% invalid instruments), and weighted mode. Pleiotropy was evaluated using MR-Egger intercept test and Cochran’s Q statistic, with associations showing consistent directionality across methods and no significant pleiotropy (intercept P > 0.05) considered robust. Given the polygenic nature of the cognitive factor, we interpret results as associations with genetic predisposition rather than definitive causal effects.
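
The primary MR estimator can be illustrated with a small Python sketch of the fixed-effect inverse variance weighted (IVW) estimate and Cochran's Q. This assumes first-order weights (ignoring uncertainty in the SNP-exposure effects). The function name and the toy instrument effects are hypothetical and are not drawn from the study.

```python
import math

# Fixed-effect IVW with first-order weights; all input values are toy data.
def ivw(bx, by, se_by):
    """Return the IVW estimate, its standard error, and Cochran's Q.

    bx: SNP-exposure effects, by: SNP-outcome effects, se_by: SEs of `by`.
    """
    ratios = [y / x for x, y in zip(bx, by)]             # per-SNP Wald ratios
    weights = [(x / s) ** 2 for x, s in zip(bx, se_by)]  # inverse variance of each ratio
    w_sum = sum(weights)
    beta = sum(w * r for w, r in zip(weights, ratios)) / w_sum
    se = math.sqrt(1.0 / w_sum)
    q_stat = sum(w * (r - beta) ** 2 for w, r in zip(weights, ratios))
    return beta, se, q_stat

# Three instruments that all imply the same ratio of 0.5, so Q is ~0.
bx = [0.10, 0.20, 0.05]
by = [0.05, 0.10, 0.025]
se_by = [0.01, 0.01, 0.01]
beta, se, q_stat = ivw(bx, by, se_by)
print(round(beta, 4), round(se, 4), round(q_stat, 6))
```

A large Q relative to its degrees of freedom (number of instruments minus one) would indicate heterogeneity among the per-SNP ratios, which is one signal of possible horizontal pleiotropy.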

Polygenic risk score construction

We calculated polygenic risk scores (PRS) based on genome-wide summary statistics and assessed genetic contributions of different chromosomal regions to disease onset⁵⁷. Specifically, we used PRS-CS (Polygenic Risk Score with Continuous Shrinkage), which applies a Bayesian regression model to GWAS summary statistics and an external LD reference panel to estimate posterior SNP effect sizes, from which the PRS is calculated⁵⁸.
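
Once posterior effect sizes are available, the score itself is a weighted sum of effect-allele dosages, and splitting that sum by chromosome gives per-chromosome contributions. The sketch below shows only this scoring step, not the PRS-CS shrinkage procedure. All names, weights, and dosages are hypothetical.

```python
# Scoring step only (not the PRS-CS shrinkage itself); all values are made up.
def polygenic_score(dosages, weights):
    """Sum of effect-allele dosages (0 to 2) times per-SNP weights, over shared SNPs."""
    return sum(weights[snp] * d for snp, d in dosages.items() if snp in weights)

def per_chromosome_scores(dosages, weights, chrom_of):
    """Partition the same sum by the chromosome of each SNP."""
    scores = {}
    for snp, d in dosages.items():
        if snp in weights:
            chrom = chrom_of[snp]
            scores[chrom] = scores.get(chrom, 0.0) + weights[snp] * d
    return scores

weights = {"rs1": 0.02, "rs2": -0.01, "rs3": 0.03}  # hypothetical posterior effects
chrom_of = {"rs1": 1, "rs2": 1, "rs3": 2}
dosages = {"rs1": 2, "rs2": 1, "rs3": 0, "rs9": 1}  # rs9 has no weight and is skipped

total = polygenic_score(dosages, weights)
by_chrom = per_chromosome_scores(dosages, weights, chrom_of)
print(round(total, 4), {c: round(s, 4) for c, s in by_chrom.items()})
```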

Results

Structural equation model construction and statistical indicators

LD Score regression analysis revealed the heritability contributions of the six univariate input GWAS as follows: Intelligence (h² = 0.2335 (0.0103), Z = 22.7), Executive Function (h² = 0.1214 (0.0061), Z = 19.8), Processing Speed (h² = 0.1171 (0.0067), Z = 17.4), Educational Attainment (h² = 0.1085 (0.0032), Z = 33.9), Reaction Time (h² = 0.0832 (0.0026), Z = 31.8), and Memory Performance (h² = 0.0364 (0.0035), Z = 10.4)⁵⁹. Detailed single-factor genetic parameters are provided in Supplementary Tables 1 and 2, Fig. 2.
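
The Z statistics above are the heritability estimates divided by their standard errors. The quick Python check below recomputes them from the rounded values printed in the text, so small differences from the reported Z values (for example for Executive Function and Reaction Time) are expected from rounding.

```python
# Recompute Z = h2 / SE from the rounded values quoted in the text.
ldsc = {
    "Intelligence": (0.2335, 0.0103),
    "Executive Function": (0.1214, 0.0061),
    "Processing Speed": (0.1171, 0.0067),
    "Educational Attainment": (0.1085, 0.0032),
    "Reaction Time": (0.0832, 0.0026),
    "Memory Performance": (0.0364, 0.0035),
}
z_scores = {trait: h2 / se for trait, (h2, se) in ldsc.items()}
for trait, z in z_scores.items():
    print(f"{trait}: Z = {z:.1f}")
```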

Before the SNP-level analysis, we evaluated the fit of the structural equation model. The common factor model based on the genetic covariance matrix and empirical covariance matrix of the six input GWAS demonstrated acceptable fit (Comparative Fit Index (CFI) = 0.898, Standardized Root Mean Square Residual (SRMR) = 0.097)⁶⁰. For detailed model stability assessment, see Supplementary Table 4a; for latent factor (F1) and univariate structural equation model parameters, see Supplementary Table 4b.

Within the single-factor model, Intelligence exhibited the highest factor loading (0.992), followed by Executive Function (0.879) and Processing Speed (0.776), while Educational Attainment (0.531), Memory Performance (0.670), and Reaction Time (0.278) showed moderate to lower loadings. Exploratory factor model analysis (reference Supplementary File 3) provided strong evidence for the existence of a shared genetic factor underlying cognitive abilities³³.

While the high intelligence loading (0.992) reflects its strong relationship with general cognitive ability, the multivariate approach provides distinct advantages over univariate intelligence GWAS. First, our analysis aggregates genetic information across 1,940,623 total participants (vs. 110,988 in intelligence GWAS alone), substantially improving statistical power. Second, we identified 275 novel loci (7.2% of total) not reaching genome-wide significance in any constituent univariate GWAS, representing genuinely new genetic discoveries. Third, the latent factor captures cross-trait genetic architecture (genetic variants affecting multiple cognitive domains through shared biological mechanisms) rather than intelligence-specific pathways. Critically, the remaining five phenotypes retain trait-specific genetic variance (standardized residual variances, equal to one minus the squared factor loading: Processing Speed 0.398, Educational Attainment 0.718, Executive Function 0.228, Memory Performance 0.551, Reaction Time 0.923), indicating that mvCognitive represents a broader cognitive construct than intelligence alone. The multivariate framework identifies pleiotropic variants whose effects are amplified when analyzed across correlated phenotypes, enabling detection of shared mechanisms missed by single-trait analyses.
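
Two of the figures in this paragraph can be reproduced from values given elsewhere in the article. The five trait-specific values quoted in parentheses coincide, to rounding, with one minus the squared standardized factor loading, and the total participant count is the sum of the six input sample sizes. The Python check below uses only numbers printed in the text.

```python
# Reproduce residual variances (1 - loading^2) and the pooled sample size.
loadings = {
    "Intelligence": 0.992,
    "Executive Function": 0.879,
    "Processing Speed": 0.776,
    "Educational Attainment": 0.531,
    "Memory Performance": 0.670,
    "Reaction Time": 0.278,
}
residual_variance = {trait: 1 - lam ** 2 for trait, lam in loadings.items()}
for trait, resid in residual_variance.items():
    print(f"{trait}: loading = {loadings[trait]:.3f}, residual variance = {resid:.3f}")

# Input GWAS sample sizes as listed in the Methods.
sample_sizes = [110_988, 266_413, 119_671, 848_919, 162_335, 432_297]
total_n = sum(sample_sizes)
print(total_n)
```

The Executive Function value computes to 0.227 from the rounded loading, against 0.228 in the text, a difference consistent with rounding of the loading itself.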

Genomic structural equation model GWAS stratified assessment

By extending structural equation modeling (SEM) to incorporate individual variation, we generated a latent factor GWAS that estimated associations between 1,219,586 single nucleotide polymorphisms (SNPs) and our cognitive ability factor (Supplementary Table 5a). We identified 62 lead SNPs across 1,067 genomic loci at P < 5 × 10^–12, and 33 lead SNPs across 322 genomic loci at the more stringent threshold of P < 5 × 10^–16 (Supplementary Table 5b,c). The newly identified cognitive ability lead SNPs were primarily enriched in pathways related to neurotransmitter metabolism, neurodevelopment, and synaptic plasticity, while also involving metabolic regulation and immune system modulation.

Through systematic literature review, we found that these SNPs included both classical cognitive genes such as rs4820249 (COMT), rs429358 (APOE), and rs6265 (BDNF), which provided important validation in cognitive research, as well as novel discoveries from our cognitive ability structural equation model, including key variant loci such as rs1628294 (ERRFI1), rs17374337 (TSEN2), and rs148729815 (HLA-DQA1) (detailed literature comparison analysis in Supplementary Table 5d). By introducing genomic structural equation modeling (genomic SEM), this study not only validated the role of classical cognitive genes in the comprehensive cognitive model but also discovered genetic variants involving novel pathways such as RNA processing, immune regulation, and signal transduction, providing important insights into the shared genetic foundation of cognitive abilities and potential therapeutic targets.

Genomic control assessment based on LD score regression

Through the parameter controls described in our methods, we removed a total of 729,630 SNPs and retained 489,956 effective SNPs after regression coefficient filtering. The mean chi-square value across all SNPs was Mean Chi² = 2.234, with genomic control Lambda GC = 1.857, maximum Chi² value = 913.763, and 1,584 genome-wide significant SNPs. Heterogeneity testing passed (intercept approaching 1.0), with total observed-scale heritability (h²) = 3.342 × 10^–5 (1.127 × 10^–6), LD Score regression ratio, defined as (intercept − 1)/(mean χ² − 1), of 0.0174 (0.0157), regression model intercept = 1.0215, and regression model intercept standard error = 0.0194. Multiple estimates suggested that the potential inflation in our structural equation model was likely due to polygenic heritability signals rather than population stratification bias or pleiotropy parameter effects.
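
The value 0.0174 (0.0157) reported above matches the standard LD Score regression ratio, (intercept − 1)/(mean χ² − 1), which estimates the share of test-statistic inflation not attributable to polygenic signal. The Python check below recomputes it, together with the retained SNP count, from the numbers in this paragraph; the variable names are illustrative.

```python
# Recompute the LDSC ratio and retained SNP count from the reported values.
intercept = 1.0215
intercept_se = 0.0194
mean_chi2 = 2.234

ratio = (intercept - 1) / (mean_chi2 - 1)
ratio_se = intercept_se / (mean_chi2 - 1)
print(f"ratio = {ratio:.4f} ({ratio_se:.4f})")

total_snps = 1_219_586
removed_snps = 729_630
retained_snps = total_snps - removed_snps
print(retained_snps)
```

A ratio this close to zero, with an intercept near 1, is what supports attributing the inflation (Lambda GC = 1.857) mainly to polygenicity rather than confounding.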

FUMA-based assessment of cognitive ability structural equation model

Using FUMA software for genomic structural equation assessment, we identified 206 risk gene loci (Table 5E, Fig. 3) and detected 2,014 potential cognitive ability-related genes under genome-wide significant control (sig.thres = 5 × 10^–8, FDR < 0.05, Fig. 4). Through FUMA, we annotated 225 lead SNP loci, with the majority located in intergenic regions (Supplementary Table 6).

GWAS subtraction analysis identified 131 loci previously reported in educational attainment GWAS (such as rs12735232 and rs11210887) (Supplementary Table 7). Comparison with previous literature revealed important discovery patterns for these loci: for example, rs12735232 was reported in educational attainment and reaction time-related research literature, but we found that this locus is not directly causative for the educational attainment phenotype, but rather a potential mediating locus that indirectly affects educational achievement by influencing basic cognitive abilities (such as processing speed, executive function, memory performance, and intelligence). The variant rs11210887 showed consistency with our studied cognitive ability phenotypes in most previous educational attainment literature, further supporting the hypothesis that cognitive ability serves as the biological foundation for educational achievement (Supplementary Tables 8, 9).

Fine-mapping analysis

The 13 high-confidence candidate causal variants map to genes with established roles in neural development and synaptic function. The four representative loci shown in Fig. 5 exemplify diverse biological mechanisms: rs1548868 (MAGI2 locus) maps near MAGI2 (membrane-associated guanylate kinase inverted 2), a scaffolding protein enriched at excitatory synapses that organizes postsynaptic signaling complexes and regulates synaptic plasticity. rs6590555 (NTM locus) localizes to neurotrimin, a GPI-anchored neuronal cell adhesion molecule critical for neurite outgrowth and synapse formation during brain development. rs4291171 (POU6F2 locus) lies within POU class 6 homeobox 2, a transcription factor essential for neuronal subtype specification and forebrain development. rs7998050 (RNA5SP30/PCDH17 locus) is near protocadherin 17, a member of the cadherin superfamily that establishes neural circuit specificity through cell–cell recognition. Among the remaining nine variants, rs4820249 on chromosome 22 maps to COMT (catechol-O-methyltransferase), regulating prefrontal dopamine metabolism—a well-established cognitive mechanism validating our approach. The convergence on synaptic scaffolding (MAGI2), cell adhesion (NTM, PCDH17), transcriptional regulation (POU6F2), and neurotransmitter metabolism (COMT) indicates that cognitive genetic architecture operates through multiple coordinated biological pathways. Complete functional annotations for all 13 variants are provided in Supplementary Table 10.

Transcriptomic prediction

Next, we conducted a transcriptome-wide association study (TWAS) using FUSION to identify gene-level associations with the cognitive ability genetic factor. Applying a stringent Bonferroni correction (P < 1.32 × 10^–6 for 37,920 gene-tissue pairs tested), we identified 33 high-confidence genes that exceeded the transcriptome-wide significance threshold (Extended Data Table 1, Fig. 6). These represent priority candidates for functional follow-up. Additionally, 13,394 gene-tissue pairs showed nominal significance (P < 0.05), which we retained for pathway enrichment analyses and cross-method validation (see below). Subsequently, FOCUS fine-mapping of the Genomic-SEM summary statistics revealed 179 genes with potential causal signals for cognitive abilities. To identify genes with convergent evidence across methods, we intersected the TWAS and FOCUS results. Among the high-confidence candidates, genes such as TANK and KANSL1 had TWAS Z-scores > 0, indicating that higher predicted expression was associated with higher cognitive ability. Conversely, genes including CCDC152, ITGAV, SLC22A3, and MRPL33 had TWAS Z-scores < 0, indicating that lower predicted expression was associated with higher cognitive ability (TWAS and FOCUS intersection results in Supplementary Table 11).
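The transcriptome-wide threshold quoted above can be reproduced by dividing a family-wise error rate by the number of tests. The following is a minimal sketch, assuming a family-wise α of 0.05 (the text reports only the resulting threshold and the number of gene-tissue pairs); the function name is illustrative and is not part of FUSION.

```python
def bonferroni_threshold(n_tests: int, alpha: float = 0.05) -> float:
    """Per-test P-value threshold that controls the family-wise error rate at alpha."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


# 37,920 gene-tissue pairs were tested in the TWAS (number taken from the text).
twas_threshold = bonferroni_threshold(37_920)
print(f"{twas_threshold:.2e}")  # 1.32e-06
```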

Pathway, cell type, and Mendelian disease gene enrichment

For functional enrichment analyses, we used the 303 genes identified by Multi-marker Analysis of GenoMic Annotation (MAGMA) at the genome-wide significance threshold (Bonferroni-corrected P < 3.1 × 10^–6) as the primary gene set (Supplementary Table 12), because these associations have a well-controlled family-wise error rate.

These genes showed significant enrichment in GSEA entries (Supplementary Table 13). The most significant enrichment signals were concentrated in neurodevelopmental pathways, including deletion syndrome-related gene sets (P = 1.53 × 10^–8), synapse organization and structure-related gene sets (such as GOBP_SYNAPSE_ORGANIZATION, P = 1.41 × 10^–10), and neuronal development and morphogenesis-related pathways. Additionally, these genes were significantly enriched in multiple cognitive ability-related GWAS signals, including intelligence (P = 1.49 × 10^–102), general cognitive ability (P = 9.82 × 10^–86), and educational attainment (P = 1.20 × 10^–31), further supporting the association between our identified genetic variants and cognitive function.

To assess transcriptomic support for the MAGMA-identified genes, we compared the 303 MAGMA genes (Bonferroni-corrected P < 3.1 × 10^–6) with the 13,394 gene-tissue pairs from TWAS (nominal P < 0.05). In total, 206 genes were detected in both analyses (Fig. 7A), indicating support at both the genomic (MAGMA) and transcriptomic (TWAS) levels. This high overlap supports the ability of the multivariate GWAS to identify functionally relevant genes. We used the nominal significance threshold for TWAS (n = 13,394) rather than the stringent threshold (n = 33) to enable comprehensive cross-method comparison (Fig. 7).
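The cross-method comparison amounts to a set intersection at the gene level. The sketch below is illustrative only: the gene and tissue labels are hypothetical placeholders, and the function name is not taken from any of the tools used in this study. It collapses gene-tissue pairs to unique genes before intersecting, because one gene can be nominally significant in several tissues.

```python
def gene_level_overlap(magma_genes, twas_pairs):
    """Intersect a MAGMA gene list with TWAS (gene, tissue) pairs at the gene level.

    Returns the shared genes and the fraction of MAGMA genes they represent.
    """
    magma_set = set(magma_genes)
    if not magma_set:
        raise ValueError("magma_genes must not be empty")
    # Collapse gene-tissue pairs to unique genes.
    twas_genes = {gene for gene, _tissue in twas_pairs}
    shared = magma_set & twas_genes
    return shared, len(shared) / len(magma_set)


# Hypothetical toy input; gene and tissue labels are placeholders.
magma_genes = ["GENE_A", "GENE_B", "GENE_C"]
twas_pairs = [
    ("GENE_A", "tissue_1"),
    ("GENE_A", "tissue_2"),
    ("GENE_C", "tissue_1"),
    ("GENE_D", "tissue_3"),
]
shared, fraction = gene_level_overlap(magma_genes, twas_pairs)
print(sorted(shared), round(fraction, 3))  # ['GENE_A', 'GENE_C'] 0.667
```

Applied to the counts in the text, the same ratio is 206 shared genes out of 303 MAGMA genes.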

Furthermore, biological processes mapped by MendelVar enrichment were also supported by GSEA entries. Enrichment analysis revealed significant signals at multiple levels: disease enrichment analysis showed that cognitive-related genes were significantly enriched in blood system diseases and nervous system diseases (FDR < 0.005); biological process enrichment primarily involved renal system blood volume regulation and cell differentiation processes (FDR < 0.0005); molecular function enrichment focused on signal transduction, organelle function, and other basic molecular processes (FDR < 0.05); phenotype enrichment analysis confirmed associations with encephalopathy, myoclonus, and other neurodevelopmental abnormalities (FDR < 0.001). These enrichment results strongly support the important role of cognitive ability-related genes in neurodevelopment, blood systems, and basic cellular functions.

For cell type enrichment, we conducted systematic validation across multiple datasets. In the primary Tabula Muris analysis, 5 cell types exceeded the significance threshold after multiple comparison correction (FDR < 0.05) (Supplementary Table 14). The top two cell types were Brain_Non-Myeloid_neuron (P = 3.55 × 10^–12) and Brain_Non-Myeloid_oligodendrocyte_precursor_cell (P = 3.73 × 10^–6).

Further validation analyses showed that, in the Cahoy brain cell data, the neuron cell type was significantly enriched (P = 8.31 × 10^–5, FDR = 2.49 × 10^–4); in the GTEx tissue analysis, all 13 brain tissues were significantly enriched (FDR < 0.05), with Brain_Frontal_Cortex showing the strongest enrichment (P = 1.84 × 10^–8); and in the multi-tissue gene expression analysis, 18 brain-related tissues/cell types were significantly enriched, further confirming the strong association between cognitive abilities and the nervous system. Cognitive abilities also showed enrichment trends in B cells and immune cell precursors: 6 immune-related cell types had P < 0.05, including Marrow_naive_B_cell, Marrow_macrophage, and Spleen_B_cell, although none reached significance after multiple comparison correction. These consistent cross-dataset results strongly support the conclusion that the genetic influences on cognitive abilities act primarily through brain cell-specific mechanisms (Figs. 8, 9).
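The FDR values reported for the cell type analyses can be illustrated with a Benjamini-Hochberg adjustment. The sketch below is an assumption about the procedure, since the text does not name the FDR method used. As an example, the Cahoy neuron result (P = 8.31 × 10^–5, FDR = 2.49 × 10^–4) is what this procedure returns if three cell types were tested and the neuron P-value was the smallest; the other two P-values in the example are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted P-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down so that monotonicity is enforced.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# First value is the reported neuron P-value; the other two are hypothetical.
q_values = benjamini_hochberg([8.31e-5, 0.2, 0.6])
print(f"{q_values[0]:.2e}")  # 2.49e-04
```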

Heritability contribution results from genomic regions

We found that cognitive ability-related genetic contribution loci were significantly concentrated in regulatory regions and functionally enriched areas of chromosomes. LDSC partitioned heritability analysis based on 53 functional genomic regions showed that these regions primarily involved key sites for gene expression regulation, chromatin modification, and transcription factor binding. The effects of genetic variants were most significant in evolutionarily conserved sequences and transcriptional regulatory regions.

Evolutionarily conserved sequence core regions exhibited extremely strong 17.79-fold enrichment (P = 9.61 × 10^–18), while transcription start site core regions achieved 5.97-fold enrichment (P = 2.51 × 10^–4). Active transcription marks H3K9ac and H3K4me3 peak regions showed 5.24-fold and 4.79-fold enrichment, respectively, and enhancer mark H3K4me1 peak regions reached 2.59-fold enrichment. These findings indicate that evolutionarily conserved transcriptional regulatory sites are core carriers of cognitive ability genetic variants and may exert important effects on cognitive phenotypes through regulation of gene expression levels.

Additionally, coding regions and certain non-coding regions also showed important genetic contributions. Protein-coding regions displayed 11.29-fold enrichment, 5’ UTR regions achieved 8.00-fold enrichment, while extended intronic regions showed 1.32-fold enrichment at extremely significant levels (P = 2.92 × 10^–8), suggesting that these regions may participate in complex genetic mechanisms through regulation of gene expression or function. This differential contribution pattern between coding and regulatory regions further confirms that the genetic foundation of cognitive abilities relies more on fine control of gene expression regulation (Supplementary Table 15, Fig. 10).
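In partitioned heritability analysis, fold enrichment for an annotation is conventionally the proportion of SNP heritability attributed to that annotation divided by the proportion of SNPs it contains. A minimal sketch of that ratio follows; the input proportions are hypothetical and are not values from Supplementary Table 15.

```python
def fold_enrichment(prop_h2: float, prop_snps: float) -> float:
    """Enrichment = share of SNP heritability / share of SNPs in the annotation."""
    if not 0.0 < prop_snps <= 1.0:
        raise ValueError("prop_snps must be in (0, 1]")
    return prop_h2 / prop_snps


# Hypothetical annotation covering 2% of SNPs and explaining 10% of heritability.
print(round(fold_enrichment(prop_h2=0.10, prop_snps=0.02), 2))  # 5.0
```

Under this definition, a small annotation can show very high enrichment while still accounting for a modest share of total heritability, which is worth keeping in mind when comparing categories of very different size.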

Multilevel biological foundations of cognitive abilities

We identified 857 phenotypes significantly associated with cognitive abilities from the 50,033 phenotypes in the IEU database (FDR < 0.05). Effect direction distribution revealed 156 protective associations and 131 risk associations. Among these, years of education showed a strong positive association (OR = 1.256, P = 4.63 × 10^–14), and university degree completion was significantly associated (OR = 1.089, P = 3.55 × 10^–17). Cognitive abilities showed protective effects against ADHD (OR = 0.980, P = 0.005). Physical activity phenotypes exhibited strong protective associations, including moderate intensity activity duration (OR = 0.757, P = 3.26 × 10^–8) and vigorous activity duration (OR = 0.827, P = 4.32 × 10^–7). Glucose metabolism-related indicators showed beneficial effects, with fasting glucose (OR = 0.948, P = 6.95 × 10^–4) and 2-h postprandial glucose (OR = 0.962, P = 1.12 × 10^–3) both showing protective associations (Supplementary Table 16, Fig. 11).

Among 261 brain imaging phenotypes, BrainXcan analysis identified 187 significant associations (FDR < 0.05), with a significance rate of 71.6%, revealing the widespread distributed neural foundations of cognitive abilities. The analysis covered 109 T1-weighted structural imaging and 152 diffusion MRI phenotypes. Structural imaging revealed 48 cortical gray matter volumes, 13 subcortical gray matter volumes, 10 subcortical structural volumes, and 27 cerebellar gray matter regions involved in cognitive function. Diffusion imaging revealed the critical role of white matter microstructure, including extensive associations with 46 FA, 44 ICVF, 45 OD, and 13 ISOVF measures (Supplementary Table 17, Figs. 12, 13, 14, 15).

ENIGMA cortical morphology cohort: Based on precentral gyrus surface area data from 51,665 individuals, we assessed its causal relationship with cognitive abilities using 5 Mendelian randomization methods. The inverse variance weighted method showed a strong protective effect (OR = 0.9588, 95% CI: 0.9400–0.9781, P = 3.33 × 10^–5). Sensitivity analyses using weighted median (OR = 0.9612, P = 0.002), MR-Egger, and weighted mode methods demonstrated consistent directionality, with no evidence of significant directional pleiotropy (complete sensitivity analysis results in Supplementary Table 18, Fig. 16).
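For readers unfamiliar with the inverse variance weighted (IVW) estimator, the sketch below shows a fixed-effect version computed from per-variant summary statistics, followed by a consistency check that recovers a P-value from the reported odds ratio and confidence interval. It is a sketch under stated assumptions: the text does not say whether a fixed-effect or random-effects IVW model was used, the instrument values are hypothetical, and the function names are illustrative.

```python
import math


def ivw_fixed_effect(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect IVW estimate and standard error from per-variant statistics."""
    if not (len(beta_exposure) == len(beta_outcome) == len(se_outcome)) or not beta_exposure:
        raise ValueError("inputs must be non-empty and of equal length")
    numerator = sum(bx * by / se ** 2
                    for bx, by, se in zip(beta_exposure, beta_outcome, se_outcome))
    denominator = sum(bx ** 2 / se ** 2
                      for bx, se in zip(beta_exposure, se_outcome))
    return numerator / denominator, math.sqrt(1.0 / denominator)


def two_sided_p(beta, se):
    """Two-sided P-value from a normal approximation."""
    return math.erfc(abs(beta / se) / math.sqrt(2.0))


def odds_ratio_with_ci(beta, se, z=1.96):
    """Convert a log-odds estimate to an odds ratio with a 95% interval."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


# Hypothetical instruments (not values from the study).
beta, se = ivw_fixed_effect([0.10, 0.20], [-0.004, -0.009], [0.002, 0.003])
print(odds_ratio_with_ci(beta, se))

# Consistency check on the reported result: recover the SE from the 95% CI.
log_or = math.log(0.9588)
se_from_ci = (math.log(0.9781) - math.log(0.9400)) / (2 * 1.96)
p_from_ci = two_sided_p(log_or, se_from_ci)
print(f"{p_from_ci:.1e}")
```

The recovered P-value is of the same order as the reported P = 3.33 × 10^–5, so the odds ratio, interval, and P-value in the text are mutually consistent under a normal approximation.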

UK Biobank multimodal imaging cohort: Among 3,935 brain imaging-derived phenotypes, we identified 31 phenotypes significantly associated with cognitive abilities (IVW method, FDR < 0.05). These associations encompassed T1 structural imaging (3), hippocampal subfields (2), brainstem (1), visual cortex (1), cortical parcellation (1), white–gray matter contrast (7), diffusion MRI TBSS measures (7), diffusion MRI tractography measures (8), and functional connectivity (1), with 24 phenotypes showing protective effects (77.4%). Sensitivity analyses confirmed robustness: MR-Egger intercept tests showed no evidence of directional pleiotropy (all P > 0.05), Cochran’s Q statistics indicated acceptable heterogeneity, and effect direction consistency across IVW, weighted median, and weighted mode methods was observed in the majority of associations (complete results in Supplementary Table 19, Fig. 17).

Chromosome-level results

Our PRS-CS analysis revealed that cognitive ability-related genetic variant loci exhibited a significant hierarchical distribution pattern across different chromosomes. Weight analysis based on 455,310 effective SNPs showed marked differences in genetic contributions to cognitive phenotypes among chromosomes. Regarding chromosomal contributions, the first few large chromosomes displayed relatively high genetic contributions. Chromosomes 1 and 2 showed the highest total weight contributions, reaching magnitudes of 1.2 × 10^–1 and 1.8 × 10^–1, respectively, while chromosomes 3–5 contributed in the range of 6 × 10^–2 to 1 × 10^–1. The contribution levels of smaller chromosomes (such as chromosomes 21 and 22) were essentially proportional to their genomic sizes, indicating that genetic contributions to cognitive abilities are primarily determined by genome scale and gene density.

Weight direction analysis showed that positive-weight SNPs (229,940) and negative-weight SNPs (225,370) were relatively balanced across chromosomes, suggesting that cognitive abilities involve complex bidirectional regulatory networks. The top 20 high-impact SNPs were distributed primarily on chromosomes 6, 9, 10, 11, 15, 16, 19, and 22, among others. These high-weight variants may exert important effects on cognitive phenotypes by influencing key neurobiological pathways (Fig. 18).
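A chromosome-level summary of this kind can be sketched as a simple aggregation over per-SNP posterior weights. The example below assumes that "total weight contribution" means the sum of absolute weights per chromosome, which the text does not state explicitly; the input rows are hypothetical placeholders rather than PRS-CS output from this study.

```python
from collections import defaultdict


def summarize_weights(rows):
    """Aggregate per-SNP weights by chromosome.

    rows: iterable of (chromosome, snp_id, weight) tuples.
    Returns (sum of |weight| per chromosome, n_positive, n_negative).
    """
    abs_weight_by_chrom = defaultdict(float)
    n_positive = n_negative = 0
    for chrom, _snp_id, weight in rows:
        abs_weight_by_chrom[chrom] += abs(weight)
        if weight > 0:
            n_positive += 1
        elif weight < 0:
            n_negative += 1
    return dict(abs_weight_by_chrom), n_positive, n_negative


# Hypothetical toy weights; identifiers and values are placeholders.
rows = [(1, "snp_a", 0.002), (1, "snp_b", -0.001), (2, "snp_c", 0.004)]
totals, n_pos, n_neg = summarize_weights(rows)
print(totals, n_pos, n_neg)

# The reported sign counts add up to the SNP total given in the text.
assert 229_940 + 225_370 == 455_310
```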

Discussion

The genetic dissection of cognitive abilities has long been one of the core challenges in behavioral genetics and neuroscience. This study, through multivariate genomic structural equation modeling based on six cognitive-related phenotypes, identified 3,842 genome-wide significant loci, including 275 novel loci, contributing to a more comprehensive genetic map of cognitive abilities. Our single-factor model revealed that intelligence (loading 0.992), executive function (0.879), and processing speed (0.776) constitute the core components of cognitive genetic architecture, while 13 high-confidence candidate causal variants, 33 TWAS candidate genes, a 71.6% brain imaging association rate, and systematic discoveries across 857 health-related phenotypes provide multidimensional evidence for understanding the biological nature of human cognition⁵⁹.

The latent cognitive factor identified through this approach represents shared genetic variance common to all six phenotypes, capturing pleiotropic variants that influence multiple cognitive domains through coordinated biological mechanisms. Biologically, this factor reflects fundamental neural processes—synaptic efficiency, neural connectivity, and neurotransmitter regulation—that collectively enable cognitive performance across diverse tasks. The hierarchical loading pattern, with intelligence showing highest weight, suggests the factor primarily captures efficiency of information processing and executive control. Enrichment analyses revealing convergence on neurodevelopmental pathways, synaptic organization, and transcriptional regulation indicate that genetic variation operates primarily through early brain development and synaptic function. The distributed genetic architecture across all chromosomes further supports the interpretation that cognitive abilities emerge from orchestrated functioning of multiple neural systems rather than localized brain modules. These findings contribute to our understanding of the genetic basis of cognitive abilities and provide scientific support for cognitive health management in the era of precision medicine⁶⁰.

A major challenge in traditional cognitive genetics has been the “phenotypic heterogeneity paradox”: while psychometric evidence strongly supports the existence of general cognitive ability (g-factor)⁶¹, single cognitive domain GWAS have struggled to fully capture this unity. Our genomic structural equation modeling provides a potential molecular genetic approach to address this long-standing problem. The high factor loading of intelligence (0.992) provides support for Spearman’s g-factor theory, while the significant loadings of executive function (0.879) and processing speed (0.776) reveal the fine structure of cognitive genetic architecture.

This loading pattern has potential theoretical implications. It suggests that the genetic basis of human cognitive abilities may be composed of “core cognitive components”—general intelligence representing abstract reasoning ability, executive function reflecting cognitive control capacity, and processing speed embodying neural efficiency⁶². The relatively lower loadings of educational attainment (0.531), memory performance (0.670), and reaction time (0.278) do not mean they are unimportant, but rather reflect that these phenotypes contain more environmental, developmental, and measurement-specific components⁶³.
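One way to read the loading pattern is through the variance decomposition of a common factor model. Assuming the reported loadings are standardized (an assumption, since the passage does not say so), the squared loading gives the share of a trait's genetic variance captured by the common factor, and one minus that value gives the trait-specific residual. A minimal sketch:

```python
# Loadings on the common factor, as reported in the text.
loadings = {
    "intelligence": 0.992,
    "executive function": 0.879,
    "processing speed": 0.776,
    "memory performance": 0.670,
    "educational attainment": 0.531,
    "reaction time": 0.278,
}


def variance_decomposition(loading):
    """Split standardized genetic variance into shared and trait-specific parts."""
    if not -1.0 <= loading <= 1.0:
        raise ValueError("a standardized loading must lie in [-1, 1]")
    shared = loading ** 2
    return shared, 1.0 - shared


for trait, loading in loadings.items():
    shared, residual = variance_decomposition(loading)
    print(f"{trait:24s} shared={shared:.3f} residual={residual:.3f}")
```

Under this assumption, the factor captures roughly 98% of the genetic variance in intelligence but only about 8% of that in reaction time, which is in line with the interpretation that lower-loading phenotypes carry more trait-specific components.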

Our multivariate GWAS identified 3,842 genome-wide significant loci, of which 275 (7.2%) were novel—not detected in any constituent univariate analysis. These novel loci localize predominantly to regulatory regions (68% non-coding) enriched for neurodevelopmental and synaptic pathways, representing pleiotropic variants affecting multiple cognitive domains through shared biological mechanisms. While most identified loci (92.8%) were previously reported in single-trait studies⁶⁴, the multivariate framework enabled detection of cross-trait genetic architecture missed by single-phenotype analyses, demonstrating the value of integrative approaches for identifying coordinated genetic networks⁶⁵.

The 3,842 genome-wide significant loci we identified were systematically annotated with FUMA, mapping to 206 independent loci and 2,014 potential candidate genes and forming a discovery chain from SNPs to loci to candidate genes. Genomic inflation statistics (Mean Chi² = 2.234, Lambda GC = 1.857) are consistent with a strong polygenic signal in a large sample, although these two statistics alone do not exclude contributions from population stratification or technical bias⁶⁶.
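For reference, Lambda GC is conventionally the median of the observed association chi-square statistics divided by the median of a chi-square distribution with one degree of freedom, and the mean chi-square is the plain average. The sketch below uses hypothetical Z-scores; in practice the input would be one statistic per SNP from the multivariate GWAS.

```python
import statistics

# Median of a chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364


def genomic_inflation(chi2_stats):
    """Lambda GC: observed median chi-square over the null median."""
    return statistics.median(chi2_stats) / CHI2_1DF_MEDIAN


def mean_chi2(chi2_stats):
    """Arithmetic mean of the chi-square statistics."""
    return statistics.fmean(chi2_stats)


# Hypothetical Z-scores; real input would be one statistic per SNP.
z_scores = [0.3, -1.1, 2.4, -0.7, 1.9]
chi2_stats = [z * z for z in z_scores]
print(round(genomic_inflation(chi2_stats), 3), round(mean_chi2(chi2_stats), 3))
```

Both statistics rise with sample size for a polygenic trait, which is why an LD score regression intercept is typically examined alongside them when judging confounding.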

The 131 “mediating loci” discovered through GWAS subtraction analysis have special theoretical value. These loci, including rs12735232 (chromosome 6) and rs11210887 (chromosome 1), reached suggestive significance but did not achieve the genome-wide threshold in previous educational attainment GWAS. Our analysis reveals their potential role as cognitive ability mediating loci and motivates the concept of “genetic mediation effects”: certain genetic variants may not directly influence external manifestations but may instead act indirectly by modulating internal cognitive abilities⁶⁷.

Through dual fine-mapping strategies using SuSiE and FINEMAP, we identified 13 high-confidence candidate causal variants (average posterior probability > 0.95), providing important tools for cognitive genetics to transition from “association discovery” to “causal inference”⁶⁸. Functional annotation of these variants reveals potential molecular regulatory networks of cognitive abilities, including neurodevelopmental regulatory axes (rs1628294 near ERRFI1, rs17374337 in TSEN2), synaptic plasticity regulatory axes (rs1566854 in CAMK2D), and transcriptional regulatory axes (rs2299297 near HOXA, chromosome 22 variant in COMT)⁶⁹.

Our TWAS analysis identified 33 significant genes, FOCUS fine-mapping revealed 179 candidate genes, and the TWAS-MAGMA intersection showed approximately 68% overlap (206 of 303 MAGMA genes), establishing a relatively complete mechanism chain from genetic variants to gene expression to cognitive phenotypes⁷⁰. Key discoveries include an epigenetic regulatory network (KANSL1, which participates in histone H4K16 acetylation)⁷¹, a neuroimmune regulatory network (TANK, a TNF receptor-associated factor regulatory protein)⁷², and a neurotransmitter metabolic network (SPR, which encodes sepiapterin reductase, an enzyme in tetrahydrobiopterin biosynthesis)⁷³.

Functional enrichment analysis revealed the “molecular ecosystem” of cognitive abilities. The 303 genes identified by MAGMA analysis showed significant enrichment in neurodevelopment-related pathways, including deletion syndrome-related gene sets and synaptic organization structure, suggesting that cognitive abilities may be an “emergent property” of the entire neurodevelopmental network⁷⁴. Cell-type analysis showed extreme enrichment in neurons and significant enrichment in oligodendrocyte precursors, supporting “white matter plasticity” theory⁷⁵.

LDSC partitioned heritability analysis revealed that evolutionarily conserved sequences showed extreme enrichment, indicating that cognitive-related genes are under strong selective pressure. The significant enrichment of transcriptional regulatory regions relative to coding regions reveals the “regulatory evolution” characteristics of cognitive evolution⁷⁶.

The 71.6% significance rate (187/261) from BrainXcan analysis represents an important discovery in cognitive neuroscience theory. This result suggests that nearly three-quarters of brain imaging phenotypes are cognition-related, supporting the theoretical view that “cognition is a whole-brain network property”⁷⁷. Diffusion imaging findings showed widespread associations across multiple parameters, suggesting that cognitive abilities may primarily reflect brain network “conduction efficiency” rather than “local function”⁷⁸. The widespread participation of 27 cerebellar regions provides genetic support for cerebellar cognitive theory⁷⁹.

Mendelian randomization validation in ENIGMA and UK Biobank provides important evidence for brain-cognition causal relationships. In the ENIGMA cohort, precentral gyrus surface area showed strong protective effects, providing neuroanatomical support for “embodied cognition” theory⁸⁰. UK Biobank’s 31 significant associations across multimodal imaging, with 77.4% showing protective effects, further support the mechanism that cognitive abilities are achieved through whole-brain network coordination.

We identified 857 significant associations among 50,033 human phenotypes, constructing a comprehensive association map between cognitive abilities and human health. Cognitive abilities showed protective effects against neuropsychiatric diseases, including ADHD, providing large-scale genetic evidence for “cognitive reserve theory”. This protective association suggests that genetic variants enhancing cognitive abilities may buffer against neurodevelopmental disorder risk through improved executive control and attentional regulation, though the modest effect size indicates cognitive genetics represents only one component of complex neuropsychiatric etiology⁸¹. Metabolic indicator associations revealed the “cognitive-metabolic axis,” with protective associations of fasting glucose and postprandial glucose suggesting shared biological pathways linking brain energy metabolism with cognitive performance. This bidirectional relationship has potential implications for prevention strategies targeting both metabolic and cognitive health through integrated interventions⁸².

Strong protective effects of physical activity provide evidence for the “cognitive-motor co-evolution” hypothesis. This pattern may reflect genetic influences on motivation, executive function, or reward processing that jointly affect cognitive performance and health behaviors, informing precision medicine approaches that account for individual genetic predispositions⁸³. From a translational perspective, these widespread phenotypic associations (857 significant) advance understanding of disease mechanisms through genetic overlap patterns. While current predictive accuracy remains insufficient for clinical risk prediction, the identified genetic pathways highlight potential targets for interventions aimed at enhancing cognitive resilience and preventing cognitive decline⁸⁴.

Our chromosome-level PRS analysis revealed hierarchical distribution of genetic variants across chromosomes, with chromosomes 1 and 2 showing highest contributions (1.2 × 10^–1 and 1.8 × 10^–1), proportional to their genomic size and gene density. The balanced distribution of positive-weight (229,940) and negative-weight (225,370) SNPs across all chromosomes confirms the highly polygenic nature of cognitive abilities.

However, we acknowledge an important limitation: we did not evaluate the incremental predictive utility of chromosome-specific PRS compared to conventional genome-wide approaches in independent validation cohorts. The chromosome-level decomposition presented here serves to illustrate the distributed genomic architecture rather than to propose a superior prediction method. Future studies with independent datasets are needed to assess whether chromosome-specific risk profiles could improve individual-level prediction or identify subgroups who might benefit from targeted interventions. The genome-wide distribution suggests that effective prediction requires integration across all chromosomes rather than a focus on specific chromosomal regions.

Despite providing new insights into cognitive ability genetics, this study has several important limitations that should be considered when interpreting the results.

Population and Generalizability Limitations: First, our sample population is primarily of European ancestry, which significantly limits the generalizability of findings to other populations and may not capture population-specific genetic variants that contribute to cognitive abilities in non-European populations. The genetic architecture of cognitive abilities may differ across populations due to different allele frequencies, linkage disequilibrium patterns, and population-specific evolutionary pressures.

Methodological and Causal Inference Limitations: Second, while we identified genetic loci associated with cognitive abilities through structural equation modeling, establishing direct causal relationships between specific genetic variants and biological mechanisms remains challenging and requires functional validation studies. Our fine-mapping analysis provides statistical evidence for potentially causal variants, but experimental validation in cellular and animal models is needed to confirm biological causality.

Genetic Coverage Limitations: Third, our analysis focuses primarily on common genetic variants (MAF > 0.01) and may miss important contributions from rare variants, structural variations, copy number variants, and other forms of genetic variation that could significantly impact cognitive abilities. Additionally, we did not account for potential effects of mitochondrial DNA variants or epigenetic modifications.

Environmental and Developmental Limitations: Fourth, although genetic factors are important, environmental factors, gene-environment interactions, and epigenetic modifications likely play substantial roles in cognitive development that were not fully addressed in this study. Factors such as educational opportunities, socioeconomic status, nutrition, and early life experiences may modify genetic effects on cognitive abilities.

Temporal and Developmental Considerations: Fifth, the cross-sectional nature of most input GWAS data limits our understanding of how genetic effects may vary across development, aging, and different life stages. Cognitive abilities are dynamic traits that change throughout the lifespan, and genetic effects may be age-dependent or show different magnitudes at different developmental periods.

Phenotypic Complexity Limitations: Sixth, our cognitive ability factor, while statistically robust, represents a simplified model of the complex, multifaceted nature of human cognition. Real-world cognitive performance involves numerous specific abilities, contextual factors, and measurement considerations that may not be fully captured by our latent factor approach.

Technical and Statistical Limitations: Finally, our multivariate approach, while innovative, relies on summary statistics from different studies with potentially varying measurement approaches, quality control procedures, and population characteristics, which could introduce heterogeneity into our analysis despite our quality control measures.

Conclusions

This study provides evidence supporting the genetic basis of cognitive abilities through multi-level analysis. From 3,842 genetic loci to 33 candidate genes, from 71.6% brain imaging associations to 857 health-related phenotypes, we developed a cognitive biological framework spanning molecular, cellular, tissue, organ, and individual levels. These findings contribute to our understanding of human cognitive nature and provide potential foundations for evidence-based cognitive health management and human potential development in the era of precision medicine.

Data availability

The datasets used and/or analysed during the current study are available from the corresponding author on reasonable request. The input GWAS summary statistics were obtained from publicly available sources as detailed in the Methods section, including data from the IEU OpenGWAS database and other repositories specified in the respective publications.

References

Deary, I. J., Penke, L. & Johnson, W. The neuroscience of human intelligence differences. Nat. Rev. Neurosci. 11, 201–211 (2010).
Calvin, C. M. et al. Childhood intelligence in relation to major causes of death in 68 year follow-up: prospective population study. BMJ 357, j2708 (2017).
Plomin, R. & von Stumm, S. The new genetics of intelligence. Nat. Rev. Genet. 19, 148–159 (2018).
Hill, W. D. et al. Human cognitive ability is influenced by genetic variation in components of postsynaptic signalling complexes assembled by NMDA receptors and MAGUK proteins. Transl. Psychiatry 4, e463 (2014).
Goriounova, N. A. & Mansvelder, H. D. Genes, cells and brain areas of intelligence. Front. Hum. Neurosci. 13, 44 (2019).
Trzaskowski, M. et al. DNA evidence for strong genetic stability and increasing heritability of intelligence from age 7 to 12. Mol. Psychiatry 19, 380–384 (2014).
Briley, D. A. & Tucker-Drob, E. M. Explaining the increasing heritability of cognitive ability across development: a meta-analysis of longitudinal twin and adoption studies. Psychol. Sci. 24, 1704–1713 (2013).
Davies, G. et al. Study of 300,486 individuals identifies 148 independent genetic loci influencing general cognitive function. Nat. Commun. 9, 2098 (2018).
Lee, J. J. et al. Gene discovery and polygenic prediction from a genome-wide association study of educational attainment in 1.1 million individuals. Nat. Genet. 50, 1112–1121 (2018).
Coleman, J. R. I. et al. Biological annotation of genetic loci associated with intelligence in a meta-analysis of 87,740 individuals. Mol. Psychiatry 24, 182–197 (2019).
Grotzinger, A. D. et al. Genomic structural equation modelling provides insights into the multivariate genetic architecture of complex traits. Nat. Hum. Behav. 3, 513–525 (2019).
Davies, G. et al. Genome-wide association study of cognitive functions and educational attainment in UK Biobank (N=112,151). Mol. Psychiatry 21, 758–767 (2016).
Demange, P. A. et al. Investigating the genetic architecture of noncognitive skills using GWAS-by-subtraction. Nat. Genet. 53, 35–44 (2021).
Davies, G. et al. Genetic contributions to variation in general cognitive function: a meta-analysis of genome-wide association studies in the CHARGE consortium (N=53,949). Mol. Psychiatry 20, 183–192 (2015).
Grotzinger, A. D. et al. Transcriptome-wide and stratified genomic structural equation modeling identify neurobiological pathways shared across diverse cognitive traits. Nat. Commun. 13, 6280 (2022).
Debette, S. et al. Genome-wide studies of verbal declarative memory in nondemented older people: The Cohorts for Heart and Aging Research in Genomic Epidemiology Consortium. Biol. Psychiatry 77, 749–763 (2015).
de la Fuente, J. et al. A general dimension of genetic sharing across diverse cognitive traits inferred from molecular data. Nat. Hum. Behav. 5, 49–58 (2021).
Hill, W. D. et al. Genome-wide analysis identifies molecular systems and 149 genetic loci associated with income. Nat. Commun. 10, 5741 (2019).
Lencz, T. et al. Molecular genetic evidence for genetic overlap between general cognitive ability and risk for schizophrenia: a report from the cognitive genomics consortium (COGENT). Mol. Psychiatry 19, 168–174 (2014).
Johnson, M. R. et al. Systems genetics identifies a convergent gene network for cognition and neurodevelopmental disease. Nat. Neurosci. 19, 223–232 (2016).
Article CAS PubMed Google Scholar
Zhao, B. et al. Genome-wide association analysis of 19,629 individuals identifies variants influencing regional brain volumes and refines their genetic co-architecture with cognitive and mental health traits. Nat. Genet. 51, 1637–1644 (2019).
Article CAS PubMed PubMed Central Google Scholar
Rietveld, C. A. et al. Common genetic variants associated with cognitive performance identified using the proxy-phenotype method. Proc. Natl. Acad. Sci. U. S. A. 111, 13790–13794 (2014).
Article ADS CAS PubMed PubMed Central Google Scholar
Savage, J. E. et al. Genome-wide association meta-analysis in 269,867 individuals identifies new genetic and functional links to intelligence. Nat. Genet. 50, 912–919 (2018).
Article ADS CAS PubMed PubMed Central Google Scholar
Luciano, M. et al. Association analysis in over 329,000 individuals identifies 116 independent variants influencing neuroticism. Nat. Genet. 50, 6–11 (2018).
Article CAS PubMed Google Scholar
Jansen, I. E. et al. Genome-wide meta-analysis identifies new loci and functional pathways influencing Alzheimer’s disease risk. Nat. Genet. 51, 404–413 (2019).
Article CAS PubMed PubMed Central Google Scholar
Sniekers, S. et al. Genome-wide association meta-analysis of 78,308 individuals identifies new loci and genes influencing human intelligence. Nat. Genet. 49, 1107–1112 (2017).
Article CAS PubMed PubMed Central Google Scholar
Demontis, D. et al. Discovery of the first genome-wide significant risk loci for attention deficit/hyperactivity disorder. Nat. Genet. 51, 63–75 (2019).
Article CAS PubMed Google Scholar
Anderson, C. A. et al. Data quality control in genetic case-control association studies. Nat. Protoc. 5, 1564–1573 (2010).
Article CAS PubMed PubMed Central Google Scholar
Auton, A. et al. A global reference for human genetic variation. Nature 526, 68–74 (2015).
Article ADS PubMed Google Scholar
Traherne, J. A. Human MHC architecture and evolution: implications for disease association studies. Int. J. Immunogenet. 35, 179–192 (2008).
Article CAS PubMed PubMed Central Google Scholar
Winkler, T. W. et al. Quality control and conduct of genome-wide association meta-analyses. Nat. Protoc. 9, 1192–1212 (2014).
Article PubMed PubMed Central Google Scholar
Bulik-Sullivan, B. et al. An atlas of genetic correlations across human diseases and traits. Nat. Genet. 47, 1236–1241 (2015).
Article CAS PubMed PubMed Central Google Scholar
Hu, L. T. & Bentler, P. M. Cutoff criteria for fit indexes in covariance structure analysis: conventional criteria versus new alternatives. Struct. Equ. Modeling 6, 1–55 (1999).
Article Google Scholar
Elliott, L. T. et al. Genome-wide association studies of brain imaging phenotypes in UK Biobank. Nature 562, 210–216 (2018).
Article ADS CAS PubMed PubMed Central Google Scholar
Bulik-Sullivan, B. K. et al. LD Score regression distinguishes confounding from polygenicity in genome-wide association studies. Nat. Genet. 47, 291–295 (2015).
Article CAS PubMed PubMed Central Google Scholar
Finucane, H. K. et al. Partitioning heritability by functional annotation using genome-wide association summary statistics. Nat. Genet. 47, 1228–1235 (2015).
Article CAS PubMed PubMed Central Google Scholar
Cochran, W. G. The combination of estimates from different experiments. Biometrics 10, 101–129 (1954).
Article Google Scholar
Verbanck, M. et al. Detection of widespread horizontal pleiotropy in causal relationships inferred from Mendelian randomization between complex traits and diseases. Nat. Genet. 50, 693–698 (2018).
Article CAS PubMed PubMed Central Google Scholar
Barbeira, A. N. et al. Exploring the phenotypic consequences of tissue specific gene expression variation inferred from GWAS summary statistics. Nat. Commun. 9, 1825 (2018).
Article ADS PubMed PubMed Central Google Scholar
Watanabe, K. et al. Functional mapping and annotation of genetic associations with FUMA. Nat. Commun. 8, 1826 (2017).
Article ADS PubMed PubMed Central Google Scholar
Wang, G. et al. A simple new approach to variable selection in regression, with application to genetic fine mapping. J. R Stat. Soc. Ser. B Stat. Methodol. 82, 1273–1300 (2020).
Article MathSciNet Google Scholar
Benner, C. et al. FINEMAP: efficient variable selection using summary data from genome-wide association studies. Bioinformatics 32, 1493–1501 (2016).
Article CAS PubMed PubMed Central Google Scholar
Wainberg, M. et al. Opportunities and challenges for transcriptome-wide association studies. Nat. Genet. 51, 592–599 (2019).
Article CAS PubMed PubMed Central Google Scholar
Gusev, A. et al. Integrative approaches for large-scale transcriptome-wide association studies. Nat. Genet. 48, 245–252 (2016).
Article CAS PubMed PubMed Central Google Scholar
Mancuso, N. et al. Probabilistic fine-mapping of transcriptome-wide association studies. Nat. Genet. 51, 675–682 (2019).
Article CAS PubMed PubMed Central Google Scholar
Yuan, Z. et al. GIFT: conditional transcriptome-wide association study for fine-mapping candidate causal genes. Nat. Genet. 56, 348–356 (2024).
Article PubMed Google Scholar
de Leeuw, C. A. et al. MAGMA: generalized gene-set analysis of GWAS data. PLoS Comput. Biol. 11, e1004219 (2015).
Article PubMed PubMed Central Google Scholar
Cho, Y. et al. Exploiting horizontal pleiotropy to search for causal pathways within a Mendelian randomization framework. Nat. Commun. 11, 1010 (2020).
Article ADS CAS PubMed PubMed Central Google Scholar
Skene, N. G. et al. Genetic identification of brain cell types underlying schizophrenia. Nat. Genet. 50, 825–833 (2018).
Article CAS PubMed PubMed Central Google Scholar
Hu, Y. et al. A statistical framework for cross-tissue transcriptome-wide association analysis. Nat. Genet. 51, 568–576 (2019).
Article CAS PubMed PubMed Central Google Scholar
Finucane, H. K. et al. Heritability enrichment of specifically expressed genes identifies disease-relevant tissues and cell types. Nat. Genet. 50, 621–629 (2018).
Article CAS PubMed PubMed Central Google Scholar
Gazal, S. et al. Linkage disequilibrium-dependent architecture of human complex traits shows action of negative selection. Nat. Genet. 49, 1421–1427 (2017).
Article CAS PubMed PubMed Central Google Scholar
Bycroft, C. et al. The UK Biobank resource with deep phenotyping and genomic data. Nature 562, 203–209 (2018).
Article ADS CAS PubMed PubMed Central Google Scholar
Hemani, G. et al. Evaluating the potential role of pleiotropy in Mendelian randomization studies. Hum. Mol. Genet. 27, R195–R208 (2018).
Article CAS PubMed PubMed Central Google Scholar
Gusev, A. et al. Transcriptome-wide association study of schizophrenia and chromatin activity yields mechanistic disease insights. Nat. Genet. 50, 538–548 (2018).
Article CAS PubMed PubMed Central Google Scholar
Thompson, P. M. et al. The ENIGMA Consortium: large-scale collaborative analyses of neuroimaging and genetic data. Nat. Rev. Neurosci. 15, 299–311 (2014).
Google Scholar
Choi, S. W. & O’Reilly, P. F. PRSice-2: Polygenic risk score software for biobank-scale data. GigaScience 8, giz082 (2019).
Article PubMed PubMed Central Google Scholar
Ge, T. et al. Polygenic prediction via Bayesian regression and continuous shrinkage priors. Nat. Commun. 10, 1776 (2019).
Article ADS PubMed PubMed Central Google Scholar
Turley, P. et al. Multi-trait analysis of genome-wide association summary statistics using MTAG. Nat. Genet. 50, 229–237 (2018).
Article CAS PubMed PubMed Central Google Scholar
Collins, F. S. & Varmus, H. A new initiative on precision medicine. N. Engl. J. Med. 372, 793–795 (2015).
Article CAS PubMed PubMed Central Google Scholar
Willoughby, E. A. et al. Behavioural genetics methods. Nat. Rev. Methods Primers 3, 17 (2023).
Article Google Scholar
Sauce, B. & Matzel, L. D. The paradox of intelligence: Heritability and malleability coexist in hidden gene-environment interplay. Psychol. Bull. 144, 26–47 (2018).
Article PubMed Google Scholar
Malanchini, M. et al. Cognitive ability and education: how behavioural genetic research has advanced our knowledge and understanding of their association. Neurosci. Biobehav. Rev. 111, 229–245 (2020).
Article PubMed PubMed Central Google Scholar
Watanabe, K. et al. A global overview of pleiotropy and genetic architecture in complex traits. Nat. Genet. 51, 1339–1348 (2019).
Article CAS PubMed Google Scholar
Lencz, T. et al. GWAS meta-analysis reveals novel loci and genetic correlates for general cognitive function: a report from the COGENT consortium. Mol. Psychiatry 22, 336–345 (2017).
Article PubMed PubMed Central Google Scholar
Nagel, M. et al. Meta-analysis of genome-wide association studies for neuroticism in 449,484 individuals identifies novel genetic loci and pathways. Nat. Genet. 50, 920–927 (2018).
Article CAS PubMed Google Scholar
Hill, W. D. et al. A combined analysis of genetically correlated traits identifies 187 loci and a role for neurogenesis and myelination in intelligence. Mol. Psychiatry 24, 169–181 (2019).
Article CAS PubMed Google Scholar
Hormozdiari, F. et al. Colocalization of GWAS and eQTL signals detects target genes. Am. J. Hum. Genet. 99, 1245–1260 (2016).
Article CAS PubMed PubMed Central Google Scholar
Hagenauer, M. H. et al. Genes associated with cognitive ability and HAR show overlapping expression patterns in human cortical neuron types. Nat. Commun. 14, 4188 (2023).
Article Google Scholar
Timshel, P. N. et al. Genetic mapping of etiologic brain cell types for obesity. Elife 9, e55851 (2020).
Article CAS PubMed PubMed Central Google Scholar
Koolen, D. A. et al. Mutations in the chromatin modifier gene KANSL1 cause the 17q21.31 microdeletion syndrome. Nat. Genet. 44, 639–641 (2012).
Article CAS PubMed Google Scholar
Pomerantz, J. L. & Baltimore, D. NF-kappaB activation by a signaling complex containing TRAF2, TANK and TBK1, a novel IKK-related kinase. EMBO J. 18, 6694–6704 (1999).
Article CAS PubMed PubMed Central Google Scholar
Thöny, B. et al. Tetrahydrobiopterin biosynthesis, regeneration and functions. Biochem. J. 347, 1–16 (2000).
Article PubMed PubMed Central Google Scholar
Geschwind, D. H. & Levitt, P. Autism spectrum disorders: developmental disconnection syndromes. Curr. Opin. Neurobiol. 17, 103–111 (2007).
Article CAS PubMed Google Scholar
MacArthur, J. et al. The new NHGRI-EBI Catalog of published genome-wide association studies (GWAS Catalog). Nucleic Acids Res. 45, D896–D901 (2017).
Article CAS PubMed Google Scholar
Pollard, K. S. et al. Forces shaping the fastest evolving regions in the human genome. PLoS Genet. 2, e168 (2006).
Article PubMed PubMed Central Google Scholar
Bullmore, E. & Sporns, O. Complex brain networks: graph theoretical analysis of structural and functional systems. Nat. Rev. Neurosci. 10, 186–198 (2009).
Article CAS PubMed Google Scholar
Penke, L. et al. A general factor of brain white matter integrity predicts information processing speed in healthy older adults. J. Neurosci. 30, 7569–7574 (2010).
Article CAS PubMed PubMed Central Google Scholar
Buckner, R. L. The cerebellum and cognitive function: 25 years of insight from anatomy and neuroimaging. Neuron 80, 807–815 (2013).
Article CAS PubMed Google Scholar
Barsalou, L. W. Grounded cognition. Annu. Rev. Psychol. 59, 617–645 (2008).
Article PubMed Google Scholar
Stern, Y. Cognitive reserve in ageing and Alzheimer’s disease. Lancet Neurol. 11, 1006–1012 (2012).
Article PubMed PubMed Central Google Scholar
Mergenthaler, P. et al. Sugar for the brain: the role of glucose in physiological and pathological brain function. Trends Neurosci. 36, 587–597 (2013).
Article CAS PubMed PubMed Central Google Scholar
Hillman, C. H. et al. Be smart, exercise your heart: exercise effects on brain and cognition. Nat. Rev. Neurosci. 9, 58–65 (2008).
Article CAS PubMed Google Scholar
Rietveld, C. A. et al. GWAS of 126,559 individuals identifies genetic variants associated with educational attainment. Science 340, 1467–1471 (2013).
Article ADS CAS PubMed PubMed Central Google Scholar


Author information

Haifeng Chen and Yuxiong Liao contributed equally to this work.

Authors and Affiliations

Department of Brain Diseases, Ward 1, Nanning Hospital of Traditional Chinese Medicine, Affiliated to Guangxi University of Chinese Medicine, No. 45 Beihu North Road, Xixiangtang District, Nanning, 530000, China
Haifeng Chen, Yuxiong Liao, Luejun Tang, Xiaoyun Wei, Tongshun Li & Wei Chen

Contributions

H.C. and Y.L. conceived the study, designed the analytical framework, performed the genomic structural equation modeling, conducted the transcriptome-wide association studies, and wrote the main manuscript text. H.C. prepared Figs. 1–6 and 12–15 and performed the functional enrichment analyses. Y.L. prepared Figs. 7–11 and 16–18 and conducted the brain imaging genetics analyses. L.T. performed the Mendelian randomization analyses, conducted quality control procedures, and contributed to methodology validation. X.W. performed the phenome-wide association studies, assisted with data curation, and contributed to reviewing and editing the manuscript. T.L. developed the analytical software pipelines, performed statistical validations, and assisted with data visualization. W.C. supervised the project, secured funding, provided conceptual guidance, contributed to manuscript writing and revision, and served as the corresponding author with overall responsibility for the study. All authors reviewed and approved the final manuscript.

Corresponding author

Correspondence to Wei Chen.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Supplementary Information 1 (TXT)
Supplementary Information 2 (TXT)
Supplementary Information 3 (TXT)
Supplementary Information 4 (TXT)
Supplementary Information 5 (TXT)
Supplementary Information 6 (TXT)
Supplementary Information 7 (TXT)
Supplementary Information 8 (TXT)
Supplementary Information 9 (TXT)
Supplementary Information 10 (TXT)
Supplementary Information 11 (TXT)
Supplementary Information 12 (TXT)
Supplementary Information 13 (TXT)
Supplementary Information 14 (CSV)
Supplementary Information 15 (CSV)
Supplementary Information 16 (CSV)
Supplementary Information 17 (CSV)
Supplementary Information 18 (CSV)
Supplementary Information 19 (CSV)
Supplementary Information 20 (CSV)
Supplementary Information 21 (CSV)
Supplementary Information 22 (TXT)
Supplementary Information 23
Supplementary Information 24 (TXT)
Supplementary Information 25 (TXT)
Supplementary Information 26 (TXT)
Supplementary Information 27 (TXT)
Supplementary Information 28 (TXT)
Supplementary Information 29 (TXT)
Supplementary Information 30 (TXT)
Supplementary Information 31 (XLSX)
Supplementary Information 32 (XLSX)
Supplementary Information 33 (TXT)
Supplementary Information 34

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.


About this article

Cite this article

Chen, H., Liao, Y., Tang, L. et al. Multivariate genome-wide analysis reveals shared genetic architecture and brain structural correlates of human cognitive abilities. Sci Rep 15, 41596 (2025). https://doi.org/10.1038/s41598-025-25509-z


Received: 19 August 2025
Accepted: 21 October 2025
Published: 24 November 2025
Version of record: 24 November 2025
DOI: https://doi.org/10.1038/s41598-025-25509-z
