Genetic Insights into Head-to-Body Ratios Via Deep Learning-Based Image Segmentation and Implications for Common Diseases

Shi, Wei; Dong, Shan-Shan; Zhu, Ren-Jie; Tang, Shi-Hao; Wang, Jia-Hao; Jiang, Feng; Wu, Hao; Duan, Yuan-Yuan; Guo, Jing; Liu, Kai; Li, Zheng-Qiang; Li, Meng; Wang, Jianzhong; Guo, Yan; Yang, Tie-Lin

doi:10.1038/s41467-025-67578-8

Article
Open access
Published: 24 December 2025

Nature Communications volume 17, Article number: 864 (2026)

Abstract

Head-to-body ratios (HBRs) are important anthropometric traits with direct relevance to human growth, development, and disease risk. However, the proportions between head and body remain understudied, and the genetic basis of HBRs is largely unexplored. By applying deep learning models to 38,202 whole-body dual-energy X-ray absorptiometry images from the UK Biobank, we generated 10 distinct HBR phenotypes based on head (length/width) and various body dimensions. Our genome-wide association analyses identify 245 significant loci, with SNP-based heritability estimates ranging from 25% to 43%. Functional annotations show that genes prioritized for HBRs are enriched in chondrocytes in skeletal tissues and oligodendrocytes across multiple brain regions. Polygenic risk score and Mendelian randomization analyses further show that HBRs are significantly associated with risks for cardiovascular, metabolic, musculoskeletal, and neuropsychiatric diseases, underscoring their potential value as health-related biomarkers. Evolutionary analyses show that HBR-associated variants are enriched in conserved genomic regions and human accelerated regions, particularly those influencing brain development. Overall, our study provides insights into the genetic architectures of HBRs, establishes their relevance to major human diseases, and offers evolutionary context for their biological significance.

Introduction

Mammals and birds have evolved a substantially larger brain for a given body size compared to other vertebrates^1,2. Among these, humans exhibit an exceptional degree of brain enlargement relative to body size^3,4. Head-to-body ratio (HBR) is an important morphological indicator for describing relative brain size and reflecting normal growth and development. It is also linked to certain genetic disorders or evolutionary adaptations. A larger HBR is often thought to reflect evolutionary selection for enhanced cognitive abilities⁵. Abnormal HBR is associated with neurological and physical impairments that impact quality of life. For example, autism spectrum disorder is linked to atypical rates of head growth relative to height during early childhood^6,7, and excessive early head size growth may be associated with an increased risk of cancer⁸. Therefore, understanding the factors influencing HBR could provide valuable insights into relevant traits and diseases. Most phenotypes are influenced by both environmental and genetic factors. While evidence from fossil records and global paleoclimatic reconstructions indicated that environmental factors account for only a limited degree of variation in body and brain size⁹, the extent to which genetic factors contribute to variation in HBR remains unclear and warrants further investigation.

The main elements of the HBR include head length, head width, height, shoulder width, trunk length, hip width, and leg length. However, except for height, these elements are rarely measured in large samples. While genome-wide association studies (GWAS) have successfully identified numerous loci associated with height^10,11, the genetic basis of HBR remains largely understudied. Recently, applying deep-learning techniques to noninvasive medical imaging has proven to be an effective approach for accurately and efficiently extracting anthropometric indicators. Furthermore, the combination of genetic, phenotypic, and imaging data in national biobanks facilitates the exploration of image-derived phenotypes (IDPs) with sufficiently large sample sizes. Several genetic studies have successfully applied computer vision to generate IDPs of the retina¹², distribution of body fat¹³, skeletal measures¹⁴, pelvic form¹⁵, and heart structure¹⁶, linking significant loci to various disorders. Building on these advances, we can avoid large-scale manual measurements and obtain skeletal measurements with greater accuracy, enabling a deeper investigation into the genetic architectures of HBR.

In this study, we applied computer vision methods to obtain height-adjusted measurements of human head length, head width, and body dimensions (including height, shoulder width, trunk length, hip width, and leg length) from biobank-scale whole-body dual-energy X-ray absorptiometry (DXA) images. We generated 10 ratio phenotypes between the head and the body measurements. We then performed genome-wide scans on these HBR phenotypes to provide a more comprehensive assessment of the genetic factors contributing to inter-individual variations in head-related body proportions. Functional annotations showed that the prioritized HBR genes are significantly enriched in somatotropes of the pituitary gland and oligodendrocytes in multiple brain regions. Specifically, we found that loci associated with HBR phenotypes are significantly enriched both in human accelerated regions and regulatory elements of differentially expressed genes between humans and great apes during development. Additionally, we evaluated the phenotypic correlations, genetic risks, and causal relationships between HBRs and common diseases, with a focus on cardiovascular, musculoskeletal, metabolic, and neuropsychiatric disorders. The overall design is shown in Fig. 1.

Results

HBRs computation from biobank-scale imaging data using deep-learning

We originally obtained 152,326 whole-body DXA images from UK Biobank (application number 46387). We developed a series of ResNet-152¹⁷ models to perform image quality control procedures, including selecting whole-body transparent images, removing cropping artifacts, excluding images with contrast abnormality, and eliminating non-frontal head view images. To enhance the processing capability of the models, we processed the images with black and white backgrounds separately (detailed in the Methods). After quality control, a total of 48,410 images remained (Fig. 2a). To facilitate subsequent GWAS analysis, we kept only the images derived from white British individuals with available genetic data. Finally, data from 38,202 individuals remained. These individuals were aged between 46 and 86 years, reflecting adult morphology. We report baseline information about this analyzed cohort in Supplementary Data 1.

**Fig. 2: Deep learning extraction of HBR phenotypes and phenotypic feature analysis.**

After data quality control, we manually annotated the head mask, which highlights the region of the head in the image, and six pixel-level landmarks (two shoulder joints, two hip joints and two ankle joints) on 400 images under the guidance of orthopedic doctors as training data. We applied computer vision architectures based on the U-Net framework¹⁸, using ResNet-152¹⁷ as the encoder, for head segmentation and landmark estimation (Fig. 2b). After training, the head segmentation models for both background types achieved a Dice loss below 0.0146 and an average intersection-over-union (IoU) score above 0.9744 on the test set. The six landmark models achieved Dice losses ranging from 0.0615 to 0.0857, and IoU scores ranging from 0.8479 to 0.8849. Each model achieved comparable performance on the validation and test datasets, indicating that the models were not overfitted and generalized well (Supplementary Data 3). The detailed workflow for image processing and segmentation is shown in Supplementary Fig. 1.
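To make the two segmentation metrics reported above concrete, the following is a minimal sketch of how a Dice coefficient (and the corresponding Dice loss, 1 minus Dice) and an IoU score can be computed for a pair of binary masks. It assumes hard 0/1 masks stored as nested lists; the models in the study may well have used a soft Dice loss on predicted probabilities, and the function name `dice_and_iou` and the toy masks are purely illustrative.

```python
from typing import Sequence, Tuple


def dice_and_iou(pred: Sequence[Sequence[int]],
                 truth: Sequence[Sequence[int]]) -> Tuple[float, float]:
    """Return (Dice coefficient, IoU) for two binary masks of equal shape."""
    inter = pred_sum = truth_sum = 0
    for prow, trow in zip(pred, truth):
        for p, t in zip(prow, trow):
            inter += 1 if (p and t) else 0
            pred_sum += 1 if p else 0
            truth_sum += 1 if t else 0
    union = pred_sum + truth_sum - inter
    if union == 0:
        # Both masks are empty: treat as perfect agreement (a convention).
        return 1.0, 1.0
    dice = 2 * inter / (pred_sum + truth_sum)
    iou = inter / union
    return dice, iou


# Toy 3x3 masks (hypothetical data, not from the study).
pred = [[0, 1, 1],
        [0, 1, 1],
        [0, 0, 0]]
truth = [[0, 1, 1],
         [0, 1, 0],
         [0, 0, 0]]

dice, iou = dice_and_iou(pred, truth)
dice_loss = 1 - dice
print(f"Dice={dice:.4f}  Dice loss={dice_loss:.4f}  IoU={iou:.4f}")
```

For a single pair of masks the two scores are monotonically related (IoU = Dice / (2 − Dice)), so each carries the same ranking information for that pair.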

After training and validating the deep-learning model on the 400 manually annotated images, we applied this model to segment the head and 6 landmarks on the remaining 48,410 whole-body DXA images. We defined height as the distance from the upper edge of the head to the ankle landmark. Head length was defined as the distance from the crown to the chin, while head width was measured as the maximum horizontal span across the skull, taken at the point where the head is widest. Shoulder width was defined as the distance between the two shoulder joints, and trunk length was defined as the vertical distance from the chin to the hip joint. Hip width was defined as the distance between the two hip joints, and leg length was defined as the vertical distance from the hip joint to the ankle joint. Using these measurements, we calculated 10 HBR phenotypes, namely the head length to body height ratio (LHR), head width to body height ratio (WHR), head length to shoulder width ratio (LSR), head width to shoulder width ratio (WSR), head length to trunk length ratio (LTR), head width to trunk length ratio (WTR), head length to hip width ratio (LHiR), head width to hip width ratio (WHiR), head length to leg length ratio (LLeR), head width to leg length ratio (WLeR), as shown in Fig. 2c and Supplementary Data 7.
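The measurement definitions above can be sketched as a small function that turns one head mask and six landmarks into the 10 ratios. This is an illustrative sketch, not the authors' pipeline: it assumes pixel coordinates `(x, y)` with `y` increasing downward, averages the left and right hip and ankle heights (an assumption made here), measures head dimensions as inclusive pixel spans, and uses hypothetical landmark keys such as `"shoulder_l"`.

```python
import math
from typing import Dict, List, Tuple

Point = Tuple[float, float]  # (x, y) in pixels; y grows downward


def head_extent(mask: List[List[int]]) -> Tuple[int, int, int]:
    """Return (crown_row, chin_row, max_width_px) of a binary head mask."""
    rows = [r for r, row in enumerate(mask) if any(row)]
    crown, chin = rows[0], rows[-1]
    width = 0
    for r in rows:
        cols = [c for c, v in enumerate(mask[r]) if v]
        width = max(width, cols[-1] - cols[0] + 1)  # widest horizontal span
    return crown, chin, width


def hbr_phenotypes(mask: List[List[int]], lm: Dict[str, Point]) -> Dict[str, float]:
    """Compute the 10 head-to-body ratios from a head mask and 6 landmarks."""
    crown, chin, head_w = head_extent(mask)
    head_l = chin - crown + 1
    # Assumption: left/right joints are averaged for vertical measurements.
    hip_y = (lm["hip_l"][1] + lm["hip_r"][1]) / 2
    ankle_y = (lm["ankle_l"][1] + lm["ankle_r"][1]) / 2
    body = {
        "H": ankle_y - crown,                                # height (to ankle)
        "S": math.dist(lm["shoulder_l"], lm["shoulder_r"]),  # shoulder width
        "T": hip_y - chin,                                   # trunk length
        "Hi": math.dist(lm["hip_l"], lm["hip_r"]),           # hip width
        "Le": ankle_y - hip_y,                               # leg length
    }
    ratios = {}
    for suffix, denom in body.items():
        ratios[f"L{suffix}R"] = head_l / denom  # head length ratios
        ratios[f"W{suffix}R"] = head_w / denom  # head width ratios
    return ratios


# Toy image: 100 x 40 pixels, rectangular "head" of 12 rows x 9 columns.
mask = [[1 if (r < 12 and 15 <= c < 24) else 0 for c in range(40)]
        for r in range(100)]
landmarks = {
    "shoulder_l": (10.0, 16.0), "shoulder_r": (30.0, 16.0),
    "hip_l": (14.0, 50.0), "hip_r": (26.0, 50.0),
    "ankle_l": (15.0, 96.0), "ankle_r": (25.0, 96.0),
}
ratios = hbr_phenotypes(mask, landmarks)
for name in sorted(ratios):
    print(name, round(ratios[name], 4))
```

Because every phenotype is a ratio of two pixel distances from the same image, the result does not depend on the pixel-to-centimetre scale of the scan.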

Validation of HBRs estimates

To validate the robustness of the 10 HBR phenotypes, we first compared predictions of our models against values derived from manually annotated masks on a 40-image test set. This comparison yielded Pearson correlation coefficients all above 0.98 and normalized mean squared errors between 0.0202 and 0.0374 (Supplementary Fig. 3a, Supplementary Data 6). We further evaluated the reproducibility of our measurements through three tests. First, the correlations between the left-side and right-side estimates of trunk length and leg length were 0.9250 and 0.9938, respectively, indicating high internal consistency (Supplementary Fig. 3b). Second, using repeat scans from 3632 individuals obtained at an average interval of two years, the test-retest correlation for HBRs ranged from 0.8435 to 0.9685 (Supplementary Fig. 3c). Third, to assess robustness to technical variations, we examined 500 samples with images containing two different background colors and found that the HBR correlations remained high, ranging from 0.8487 to 0.9728 (Supplementary Fig. 3d). We also downloaded the GWAS summary data for head width generated by Xu et al.¹⁵; genetic correlation analysis against our head width GWAS showed a relatively high genetic correlation (r = 0.7705). These results demonstrate that the IDPs generated by our deep learning model are highly reproducible.
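The two agreement statistics used in the first validation step can be sketched as follows. The Pearson correlation is standard; the normalization used for the mean squared error is not stated in this section, so the sketch assumes division by the variance of the manually derived values, which is one common convention. The function names and toy values are illustrative only.

```python
import math
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def normalized_mse(pred: Sequence[float], truth: Sequence[float]) -> float:
    """MSE divided by the variance of the reference values (assumed convention)."""
    n = len(truth)
    mean_t = sum(truth) / n
    mse = sum((p - t) ** 2 for p, t in zip(pred, truth)) / n
    var_t = sum((t - mean_t) ** 2 for t in truth) / n
    return mse / var_t


# Hypothetical ratios from manual masks vs. model predictions.
manual = [0.10, 0.12, 0.14, 0.16]
model = [0.11, 0.12, 0.14, 0.15]

r = pearson_r(manual, model)
nmse = normalized_mse(model, manual)
print(f"Pearson r = {r:.4f}, normalized MSE = {nmse:.4f}")
```

Reporting both statistics is useful because a high correlation alone would not reveal a systematic offset or scaling error, whereas the error term would.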

With the reliability of the HBR measures established, we next investigated their population-level characteristics. We observed that all the HBR values conform to the characteristics of a normal distribution (Supplementary Fig. 3e), and a moderate negative correlation between HBR phenotypes and height (r² ranged from 0.0447 to 0.4974, Supplementary Fig. 3f).

Genome-wide association analyses on HBRs

We performed GWASs using imputed genotype data in UKB to identify variants associated with the 10 HBR phenotypes. After quality control (Supplementary Data 8), 38,202 individuals of white British ancestry and 7,459,980 common biallelic single-nucleotide polymorphisms (SNPs) were included in our analyses.

Across the 10 HBR phenotypes, our GWAS identified a total of 7394 significant (p < 5 × 10⁻⁸) SNPs located at 245 independent loci (Fig. 3a-b, Supplementary Fig. 4). After conditioning on all lead SNPs of independent loci discovered in a saturated GWAS for height¹¹, 46 loci remained significant (Supplementary Data 10). The minimal deviation of univariate LD Score regression (LDSC)¹⁹ intercepts from 1.0 suggested that the observed inflation of test statistics was attributable to polygenicity rather than to confounding (Fig. 3b). Based on the generated summary statistics for each HBR, the proportion of phenotypic variance explained by SNPs ranged from 25.27% to 42.86% across the HBR phenotypes, indicating that HBRs are moderately heritable (Fig. 3c).

To investigate the extent of genetic overlap among the HBRs, we calculated the genetic correlation between each pair. All HBRs showed positive phenotypic correlations (r ranged from 0.1516 to 0.9436), and the genetic correlations between HBRs were also positive, ranging from 0.1968 to 0.9309 (Supplementary Fig. 5, Supplementary Data 11). We observed that phenotypes divided by trunk length and leg length were highly correlated with those divided by height (r_g > 0.8). For conciseness, the results for the four HBRs based on trunk and leg length are presented only in Supplementary Data 12–25.

To identify causal variants associated with HBRs, we conducted fine-mapping and identified 61 loci that had five or fewer causal variants within the 95% credible set, and two loci that had a single causal variant (Supplementary Fig. 6, Supplementary Data 12). For example, FINEMAP²⁰ nominated rs41271299 as the sole putative causal variant in the 6q22.3 locus associated with LLeR. Moreover, it remained significantly associated with LLeR in height-conditioned analyses, suggesting that its association with LLeR is independent of height.

Gene prioritization of HBRs

We used four methods for gene prioritization: genomic annotation using ANNOVAR²¹, transcriptome-wide association study (TWAS²²), GWAS colocalization analysis (COLOC²³), and summary data-based Mendelian randomization (SMR²⁴), integrating GWAS and eQTL data on the brain and whole blood from the GTEx project (V8). For TWAS and SMR, we retained genes that remained significant after multiple testing correction. For colocalization analysis, we set the threshold of posterior probability of hypothesis 4 (PPH4, indicating shared causal association between GWAS and eQTL) as 0.9. Finally, we identified 608 protein-coding genes associated with HBRs supported by at least one of the four analysis results (Fig. 4a and Supplementary Data 13).

**Fig. 4: Results of gene prioritization and functional interpretation.**

Nearly half (285/608) of these prioritized genes are associated with more than one HBR, and more than 20% of these genes are linked to at least three HBRs. For example, WNT16 was found to be associated with WHR, WHiR, and WSR, as shown in Fig. 4b. WNT16 has been proposed to signal via the non-canonical pathway, influencing bone mineral density, cortical bone thickness, and bone strength²⁵. Additionally, WNT16 harbors lead variants in a GWAS of the brain structural connectome²⁶ and GWAS signals associated with brain volume measurement²⁷.

Biological insights from HBRs associations

To identify relevant biological processes or pathways for the prioritized genes, we performed gene set enrichment analysis for each HBR separately using data from the Gene Ontology (biological process category, GO-BP)²⁸ and Kyoto Encyclopedia of Genes and Genomes (KEGG)²⁹ databases. This analysis revealed that the prioritized genes are enriched in 32 GO-BP terms (Fig. 4c), most of which are related to organ development, skeletal and connective tissue development, and growth hormone signaling (Supplementary Data 14). The prioritized gene sets are also enriched in 11 KEGG pathways directly associated with body development and metabolism (Fig. 4d), such as the growth hormone synthesis, secretion, and action pathway³⁰. These genes are also enriched in the cardiovascular JAK-STAT signaling pathway, whose dysregulation may lead to inflammation and cardiovascular diseases³¹ (Supplementary Data 15).

We further evaluated whether the prioritized genes are associated with skeletal or brain-related phenotypes in the mouse using the Mouse Genome Informatics (MGI) database³². The results indicate that, in addition to being broadly annotated with various phenotypes related to mouse body length, the HBR prioritized genes are also associated with six distinct brain-related phenotypes, including “abnormal forebrain development” and “abnormal neurocranium morphology” (Fig. 4e, Supplementary Data 16).

Cell types associated with HBRs

To identify relevant cells exhibiting excess expression across disease-associated genes implicated by GWAS, we performed single-cell disease-relevance score (scDRS)³³ analysis using single-cell RNA sequencing data from 11 brain regions classified according to conventional brain structures^34,35, as well as five human skeletal tissues (cranium, skull base, shoulder, hip, knee)³⁶. The single-cell data comprised 3750 to 134,262 cells for brain-related datasets and 49,263 to 100,869 cells for skeletal-related datasets.

We identified 44 cell type-phenotype association pairs across brain regions and skeletal tissues (Supplementary Data 13). Notably, oligodendrocytes from five brain regions (cerebral cortex, cerebral nuclei, hippocampal formation, hypothalamus, and thalamic complex) were found to be associated with six HBRs (LHR, LHiR, LLeR, WHR, WHiR and WLeR). In addition, somatotropes in the pituitary tissue were also associated with WHR and WSR. In skeletal tissues, WHR was associated with chondrocytes in all five skeletal tissues, while LSR was linked to early progenitors and endothelial cells in the shoulder. Furthermore, the two types of bone tissue cells were associated with corresponding body proportion phenotypes: LHiR was linked to chondrogenic and osteogenic cells in the hip, and WLeR was associated with chondrogenic cells in the knee (Fig. 5). These findings highlight the critical role of these cell types in the development of HBR phenotypes.

**Fig. 5: Cell type association analysis of the prioritized HBR genes.**

Evolutionary analysis

Both mammals and birds have evolved substantially larger brains relative to their body size compared to other vertebrates³⁷, which might be related to the shared evolutionary characteristics between the two groups. Therefore, we investigated whether the variants associated with HBR phenotypes are enriched in genomic regions conserved between birds and mammals. We excluded genomic regions shared with reptiles and amphibians, focusing only on the genomic regions conserved between mammals and birds³⁸ (Fig. 6a left). By comparing the distribution of HBR-associated variants to the matched randomly selected background SNPs, we found that HBR-associated variants are significantly enriched in the conserved genomic regions of mammals and birds, excluding those shared with reptiles and amphibians (p < 0.002, Fig. 6a right, Supplementary Data 18). These results suggest that mammals and birds may have evolved larger brains independently, but through similar genetic mechanisms, implying that convergent evolution might be at play, where different lineages adopt similar genetic pathways to achieve a common phenotypic trait of an increased brain size relative to body size.
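The comparison against matched background SNPs described above amounts to an empirical enrichment test, which can be sketched as follows. This is an assumption-laden illustration rather than the study's implementation: it treats regions as sorted, non-overlapping half-open intervals on a single chromosome, takes the matched background sets as given (the matching criteria and the number of sets are not specified in this section), and uses the common `(exceed + 1) / (K + 1)` empirical p-value. All positions are made up.

```python
from bisect import bisect_right
from typing import List, Sequence, Tuple


def count_in_regions(positions: Sequence[int],
                     regions: Sequence[Tuple[int, int]]) -> int:
    """Count positions inside sorted, non-overlapping half-open (start, end) regions."""
    starts = [s for s, _ in regions]
    hits = 0
    for pos in positions:
        i = bisect_right(starts, pos) - 1  # last region starting at or before pos
        if i >= 0 and pos < regions[i][1]:
            hits += 1
    return hits


def empirical_p(observed: int, null_counts: Sequence[int]) -> float:
    """One-sided empirical p-value with the usual +1 correction."""
    exceed = sum(1 for c in null_counts if c >= observed)
    return (exceed + 1) / (len(null_counts) + 1)


# Hypothetical conserved regions and SNP positions on one chromosome.
conserved = [(100, 200), (500, 600)]
trait_snps = [150, 180, 550, 900]
background_sets: List[List[int]] = [
    [10, 150, 700, 950],
    [120, 300, 400, 800],
    [50, 250, 650, 999],
]

observed = count_in_regions(trait_snps, conserved)
null_counts = [count_in_regions(bg, conserved) for bg in background_sets]
p_value = empirical_p(observed, null_counts)
print(observed, null_counts, p_value)
```

With this estimator the smallest attainable p-value is 1 / (K + 1), so the number of background sets K bounds how small a reported p-value can be.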

Considering that the relatively larger brain size in humans contributes to superior cognitive abilities compared to non-human primates³⁹, we also investigated whether variants associated with HBR phenotypes overlapped with human accelerated regions (HARs) more than expected. HARs are genomic elements that are highly conserved across vertebrate and great ape evolution but exhibit significantly accelerated substitution rates in humans⁴⁰. The results showed that genetic signals from five HBR phenotypes (WHR, WSR, WTR, LHR, and WLeR), particularly those related to head width, are significantly enriched in HARs (FDR-adjusted p < 0.05, Fig. 6b, Supplementary Data 19). Similarly, traits related to neurological and cognitive disorders (e.g., autism spectrum disorder and cognitive abilities) are also enriched in HARs, although not as significantly as the HBR phenotypes. In contrast, skeletal diseases such as joint pain and rheumatoid arthritis (RA) are not significantly enriched. These results suggest that genetic variants associated with HBR may be more closely linked to the evolution of human brain development.

We further examined the enrichment of genomic annotations reflecting divergence at various evolutionary time points from great apes to Homo sapiens. These annotations include regions that exhibit differences in epigenetic elements (such as enhancers and promoters) between humans and primates during early developmental stages⁴¹, as well as regions that acquired novel functions in the adult brain after the human divergence from rhesus macaques and chimpanzees⁴². Except for LSR, LTR, and WSR, most HBR phenotypes showed extensive enrichment in regulatory regions linked to human-ape differences during both fetal development and adulthood (Fig. 6c, Supplementary Data 20). This enrichment indicates that genetic variants associated with HBR phenotypes may result from evolutionary changes at critical developmental stages, influencing early growth and adult traits in humans compared to great apes. Moreover, we examined enrichment in regions under selection in the modern human lineage since its divergence from the common ancestor with Neanderthals and Denisovans⁴³.

The results showed that all head-width-related HBRs were significantly enriched (adjusted p < 0.05) in depleted regions of archaic human genomes, whereas head-length-related HBRs were not (Fig. 6d, Supplementary Data 20). These results suggest that the further evolution of head-width-related HBRs after the divergence from great apes is associated with specific adaptations in modern humans, while head-length-related HBRs may have a weaker functional relationship with adaptive evolution. This disparity may be due to the more direct influence of head width on brain capacity compared to head length.

Phenotypic and genetic association of HBRs with common diseases

We mainly focused on diseases of five major systems, namely psychiatric^44,45, neurological^44,45, skeletal⁴⁶, cardiovascular^50,51, and metabolic diseases^47,48, which may have potential relationships with HBRs according to published studies. We used the ICD-10 diagnoses from the UK Biobank and included only common diseases with more than 1000 cases in the cohort (Supplementary Data 21).

As shown in Fig. 7a and Supplementary Data 22, among the 133 significant association pairs, 78.95% exhibit negative correlations. For example, all 10 HBRs were negatively associated with atrial fibrillation (AF), and previous research corroborates this by demonstrating an association between AF and smaller brain volume⁴⁹. We also observed that WHR was positively associated with ten diseases, such as hypertension (OR = 1.09, adjusted p = 1.07 × 10⁻⁸) and mononeuropathies of the upper limb (ULMN, OR = 1.26, adjusted p = 1.99 × 10⁻¹¹). These results suggest that the HBR phenotypes are biologically meaningful and can reflect underlying health traits.

**Fig. 7: Association and causality between HBRs and diseases.**

We next investigated whether the PRS generated from our GWAS results could predict the status of common diseases. To avoid overlap between the GWAS cohort and the tested individuals, data from 315,392 UKB participants of white British ancestry who did not have DXA image data were used for PRS analyses. PRS were generated using Bayesian regression with a continuous shrinkage prior⁵⁰. Logistic regressions were then used to examine the relationship between the generated PRS and disease outcomes. As shown in Fig. 7b and Supplementary Data 23, after applying the Benjamini-Hochberg (BH) correction, most of the HBR phenotypes are associated with a trend toward lower disease risk (84.78%). This includes cardiovascular diseases such as AF, heart failure, and varicose veins, as well as metabolic or musculoskeletal disorders like type 2 diabetes and RA. However, eight HBR phenotypes are linked to an increased risk of developing ULMN. The plausibility of this association is supported by previous studies that have described connections between peripheral nerve conditions and brain structure^51,52. These findings suggest that PRS effectively capture genetic risk factors of HBR associated with these diseases.
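The Benjamini-Hochberg step mentioned above can be illustrated with a short, dependency-free sketch that converts raw p-values into BH-adjusted p-values. The function name and the example p-values are hypothetical; in practice an established statistics package would normally be used.

```python
from typing import List, Sequence


def benjamini_hochberg(pvals: Sequence[float]) -> List[float]:
    """Return BH-adjusted p-values in the same order as the input."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])  # indices by ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical p-values from four PRS-disease logistic regressions.
raw_p = [0.001, 0.04, 0.03, 0.2]
adj_p = benjamini_hochberg(raw_p)
significant = [p < 0.05 for p in adj_p]
print(adj_p, significant)
```

In this toy example only the first test survives at an adjusted threshold of 0.05, even though three raw p-values are below 0.05, which is the intended effect of controlling the false discovery rate across many phenotype-disease pairs.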

Mendelian randomization to assess the causal effects of HBRs on disease status

We further applied two-sample Mendelian randomization (MR) to investigate the causal effects of HBR phenotypes on 46 common diseases from five physiological systems. Our analysis identified a total of 46 significant causal associations (Fig. 7c, Supplementary Data 25). Consistent with the phenotype correlation and PRS analysis, the majority (75.76%) of these associations are protective. Specifically, the significant causal effects of HBR phenotypes on cardiovascular, metabolic, and musculoskeletal system diseases were generally protective. LHR, WHR, LSR, and WSR were consistently negatively associated with risks of AF, RA, and osteoarthritis. However, the effects of HBR phenotypes on neuropsychiatric disorders were predominantly risk-increasing. For example, WHR, WHiR, and LTR were risk factors for Alzheimer’s disease, and WSR was positively associated with the risk of bipolar disorder (OR = 1.06, 95% CI: 1.02 to 1.09, adjusted p = 0.002). A notable exception was WHiR, which was negatively associated with the risk of schizophrenia (OR = 0.95, 95% CI: 0.93 to 0.98, adjusted p = 0.006).
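As a rough illustration of how a two-sample MR estimate turns into the odds ratios and confidence intervals quoted above, the sketch below uses the inverse-variance weighted (IVW) estimator under a fixed-effect model. The specific MR estimator used in the study is not stated in this section, so IVW is an assumption here, and all effect sizes in the example are invented.

```python
import math
from typing import Sequence, Tuple


def ivw_mr(beta_exposure: Sequence[float],
           beta_outcome: Sequence[float],
           se_outcome: Sequence[float]) -> Tuple[float, float]:
    """Fixed-effect IVW estimate: returns (causal log-odds estimate, standard error)."""
    num = sum(bx * by / se ** 2
              for bx, by, se in zip(beta_exposure, beta_outcome, se_outcome))
    den = sum(bx ** 2 / se ** 2
              for bx, se in zip(beta_exposure, se_outcome))
    return num / den, math.sqrt(1 / den)


def odds_ratio_ci(beta: float, se: float, z: float = 1.96) -> Tuple[float, float, float]:
    """Convert a log-odds estimate into (OR, lower 95% bound, upper 95% bound)."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


# Hypothetical instruments: SNP effects on an HBR (exposure) and a disease (outcome).
bx = [0.10, 0.20]
by = [-0.005, -0.010]
se_y = [0.01, 0.01]

beta_ivw, se_ivw = ivw_mr(bx, by, se_y)
or_est, ci_low, ci_high = odds_ratio_ci(beta_ivw, se_ivw)
print(f"OR = {or_est:.3f} (95% CI: {ci_low:.3f} to {ci_high:.3f})")
```

An odds ratio below 1 corresponds to a protective direction per unit increase in the exposure, and the association is nominally significant only when the interval excludes 1.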

Discussion

In this study, we analyzed 38,202 whole-body DXA images from the UKB using deep learning to extract HBR data, and explored the genetic basis of HBRs. Our GWAS identified 245 independent genetic loci associated with HBRs. Through integrative analyses, we identified 608 coding genes potentially related to HBRs, characterized their primary functions, and pinpointed the main cell types where these genes are active. Moreover, we linked HBRs to various conserved or specific genomic regions involved in human evolution. Finally, we examined association and causality between HBRs and common diseases such as cardiovascular, metabolic, musculoskeletal, and neuropsychiatric diseases.

To better reflect our primary objective, we analyzed HBR by dividing head size by body measurements. Compared with metrics that examine only the head or the body, HBR provides a more comprehensive perspective. It is important not only as a key morphological trait in human evolutionary adaptation^53,54, but also as a crucial indicator of growth and development, with established associations with a range of psychiatric^44,45, neurological^6,55, and skeletal disorders⁴⁶. We compared our GWAS results on HBRs with the phenotypes adjusting for body measurements as covariates, following previous studies^14,15. Although there are global genetic correlations between the ratio-based and covariate-adjustment phenotypes, the overlapped GWAS significant loci were limited, indicating the divergence at the locus-specific level (Supplementary Data 9).

Our HBR GWAS uncovers genetic insights linked to fundamental biological processes. Notably, over one-sixth of the loci identified in the GWAS on HBRs were conditionally independent of loci in previous GWASs related to height^11,56. Enrichment analysis and MGI phenotype annotation of the prioritized genes highlighted their important roles in body development-related pathways. These genes also exhibited substantial enrichment in key signaling pathways essential for cellular processes such as cell proliferation, differentiation, and survival, and are closely associated with various common human diseases, including cardiovascular diseases⁵⁷, chronic inflammatory diseases³¹, and metabolic disorders^58,59.

We found that HBR-related genes are associated with somatotropes in the pituitary and oligodendrocytes in the hypothalamus. The hypothalamic–neurohypophyseal system regulates essential physiological processes such as growth, metabolism, and reproduction by secreting various hormones, which directly influence cardiovascular diseases^60,61, insulin sensitivity⁶², and inflammation⁶³. Dysfunction of oligodendrocytes has been linked to various psychiatric^64,65,66 and neurological⁶⁷ disorders. HBR-related genes are also associated with early progenitors and endothelial cells in multiple skeletal tissues. Early progenitors provide a direct cellular source for skeletal growth⁶⁸, while endothelial cells play a crucial role in the onset and progression of RA⁶⁹, which may explain the association between HBRs and RA. These findings suggest that HBR-related genes may influence broad regulatory mechanisms underlying human health and diseases by modulating complex physiological processes.

We further investigated the associations and potential causality between HBR and diseases of the five major systems. We found that HBRs generally have a protective effect against cardiovascular, musculoskeletal, and metabolic diseases, which is consistent with previous neurological research^47,48,49. Given the significant inverse correlation between HBR and height, this observation is interesting, as greater height is also reportedly protective against these diseases^70,71,72. A potential explanation is that HBR reflects characteristics or developmental features of the head that directly influence health risks: head size shows a stronger association with these diseases than height (Supplementary Data 24), suggesting that cranial development is the primary driver of this protective effect. Conversely, our study identified a higher HBR as a significant risk factor for certain neuropsychiatric disorders. This is in agreement with existing studies on conditions such as Alzheimer’s disease^73,74,75,76 and ULMN^51,52. These findings indicate that HBRs are complex biomarkers that are associated with beneficial physical outcomes but also with increased risk of neurological problems. They also highlight the potential of HBRs as accessible markers for disease risk stratification, which may support public health efforts through proactive management and targeted prevention strategies.

Compared to other vertebrates, both mammals and birds have evolved substantially larger brains relative to their body size³⁷, a trait associated with their advanced cognitive functions⁷⁷. Specifically, birds with the most complex vocal learning abilities were also the best problem solvers and had the largest brains relative to body size⁷⁸. In addition, both groups can maintain a relatively stable body temperature⁷⁹, which may be associated with the role of HBR in thermoregulation. Our evolutionary analysis revealed that HBR-related variants are enriched in conserved genomic regions shared by mammals and birds, highlighting their crucial role in the evolution of brain development beyond body size constraints. Additionally, HBR-related variants are also enriched in human accelerated regions, indicating that HBR has undergone significant evolutionary divergence from other great apes during the long course of human evolution. We also observed differences in the HBR regulatory elements between humans and great apes during the embryonic stage, suggesting that species-specific early development drives the differences in HBR. Fossil reconstructions reveal that the shape of the modern human brain evolved to become nearly spherical⁸⁰, with the width of the head narrowing throughout evolution^81,82. As human body size increased, the HBR gradually decreased during human evolution, resulting in a substantial difference in HBR between humans and great apes⁸³. These findings underscore the complex evolutionary changes that have shaped human body proportions, particularly the head-to-body ratio.

Our study has several potential limitations. First, the GWASs were restricted to individuals of European ancestry, limiting the generalizability of our findings to other populations. Further studies are needed to evaluate the transferability of the results across diverse ancestries. Second, although we employed deep learning to standardize the image data so that the head faces forward as much as possible, the impact of posture on head measurements could not be entirely eliminated. Additionally, due to challenges in correcting for foot positioning, we did not extract height below the ankles. This resulted in a partial omission of total body height, causing the calculated LHR and WHR to be slightly larger than conventional values, with averages of approximately 0.12 and 0.09, respectively (Supplementary Fig. 3e). However, considering the relationship between ankle height and head length, the HBRs we calculated are consistent with conventional adult proportions⁸⁴.

Taken together, our study systematically investigated the genetic basis of various human HBRs in a large-scale sample. We identified genetic loci associated with HBR phenotypes, along with corresponding coding genes and regulatory pathways. We pinpointed the cell types influencing HBRs and linked these findings to specific diseases. Furthermore, we uncovered genomic evidence of evolutionary changes underlying differences in HBRs between humans and great apes. These findings enhance our understanding of human body structure and its functional significance, highlighting potential impacts on human health and diseases.

Methods

UKB participants and dataset

All analyses were performed using data from the UKB unless otherwise specified. The UKB is a large-scale, prospective, population-based cohort study that recruited 500,000 participants in the United Kingdom via mailed invitations starting in 2006⁸⁵. For this study, we analyzed data from 486,737 participants with available genetic data who had not withdrawn their consent as of August 20, 2020. Among these participants, approximately 80,000 had DXA imaging data released by the UK Biobank under bulk data field ID 20158, with 76,320 having accessible data as of November 13, 2023, along with baseline metadata including age, sex, and other relevant study variables.

The DXA images were acquired using an iDXA instrument (GE-Lunar, Madison, WI). For each participant, up to eight types of images were captured: two whole-body images (one focusing on the skeleton and the other on adipose tissue), as well as images of the lumbar spine, the lateral spine from L4 to T4, each knee, and each hip. The bulk download provided 76,320 zip files, each corresponding to a specific participant identifier (EID). Each zip file contains several, though not necessarily all, of the DXA image types described above.

Sample size selection for deep-learning models

Our sample size selection procedure was conducted as a two-stage process to ensure both statistical validity and practical utility. First, we employed different models to sequentially remove different categories of unqualified images (e.g., images that were incomplete, had abnormal contrast or resolution, or showed a non-frontal head posture). Because the proportions of these categories differ in the dataset, we manually annotated a different number of samples for each model so that at least 50 instances of each unqualified category were included, which supports robust feature learning and follows the guidance of previous studies^86,87. Next, using the manually annotated data, we employed a progressive sampling strategy, iteratively expanding the training set until performance gains plateaued. Specifically, for each sample size, we performed 30 iterations of randomly splitting the dataset (80% training, 10% validation, 10% testing). In each iteration, we recorded the model’s loss and evaluation metric (AUC or IoU) on the test set of each task.
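The repeated-split procedure can be illustrated with a minimal sketch. It is not the code used in this study: `split_indices`, `progressive_sampling`, the candidate sample sizes, and `toy_evaluate` are hypothetical, and `toy_evaluate` merely stands in for training a model and returning its test-set metric.

```python
import random
from statistics import mean

def split_indices(n, seed):
    """Shuffle indices 0..n-1 and split them 80/10/10 (train/validation/test)."""
    rng = random.Random(seed)
    idx = list(range(n))
    rng.shuffle(idx)
    n_train = (n * 8) // 10
    n_val = n // 10
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]

def progressive_sampling(sample_sizes, evaluate, n_iter=30):
    """Average a test metric over n_iter random splits for each candidate sample size."""
    curve = {}
    for n in sample_sizes:
        scores = []
        for it in range(n_iter):
            train, val, test = split_indices(n, seed=it)
            scores.append(evaluate(train, val, test))
        curve[n] = mean(scores)
    return curve

def toy_evaluate(train, val, test):
    # Hypothetical stand-in for "train a model, return test AUC or IoU":
    # a saturating function of the training-set size.
    return 1.0 - 1.0 / (1.0 + len(train) / 20.0)

curve = progressive_sampling([100, 200, 300, 400], toy_evaluate)
for n, score in curve.items():
    print(n, round(score, 3))
```

In practice, the plateau would be read off the resulting curve, and the largest sample size within the stable region would be retained.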

As shown in Supplementary Fig. 2a and Supplementary Fig. 2b, model performance (higher AUC or IoU, lower loss) improved significantly with more data initially but eventually plateaued, indicating that further sample increases yielded negligible gains. The final sample size for each model was chosen as the largest number within this stable performance plateau, ensuring we leveraged our annotated data efficiently. This resulted in the following final sample sizes: 300 for the body-class model, 500 for the crop-class model, 800 for the contrast-class and pose-class models, and 200 for all segmentation models.

Classification to extract whole-body skeleton DXA images

We utilized Pydicom⁸⁸ to extract metadata from the DICOM files, screened 152,326 “Total Body” images from 71,167 participants, converted the images into PNG and NPY formats, and excluded low-quality images with PNG file sizes of less than 40 KB. To extract the skeleton radiographs, we built a U-Net-based binary classification model to distinguish between the two types of whole-body images.

We randomly selected 300 images for manual annotation and randomly divided them into training, validation, and testing datasets in an 8:1:1 ratio (the same dataset partitioning strategy was used for all subsequent models). With ResNet-152¹⁷ from the Python library Torchvision (v0.16.1)⁸⁹ as the encoder and cross-entropy as the loss function, we trained for up to 10 epochs and kept the model that performed best on the validation set, which also achieved 100% accuracy on the test set. This classifier was run on all DXA images obtained from the UKB. After classification and removal of images, we were left with 76,153 whole-body skeleton DXA images (Supplementary Data 2).

Removal of poorly cropped images

After we determined the final set of whole-body DXA images, we performed additional quality control to remove images that were improperly cropped or had portions of the ankles or shoulders cut off. To do this, we created a binary classifier using Torchvision to differentiate between cropped and non-cropped images. We randomly selected 500 images for manual annotation. A CNN based on the ResNet-152 architecture was trained on these data for up to 30 epochs using cross-entropy as the loss function. The classifier achieved an area under the ROC curve (AUC) of 1.00 on both the validation and testing data (Supplementary Data 3). Removal of all the cropped images resulted in a total of 69,025 whole-body images that we used for analysis.

Image standardization and background differentiation

The whole-body DXA images vary in both pixel dimensions and background, which limits the performance of the segmentation model. For all subsequent deep learning models, we therefore trained two separate series of models, one for black-background images and one for white-background images. Broadly, the images came in two main sizes: black-background images were typically 681-811 by 272 pixels, while white-background images were generally 936-943 by 316-372 pixels. All images were padded to a standardized maximum height and width (960 × 384 pixels) using the methods described in ref. ¹⁴. After removing 15 images with abnormal background pixels, we retained 51,432 black-background and 17,593 white-background complete whole-body skeleton DXA images.
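A minimal sketch of padding to a fixed canvas is given below for illustration. The exact padding scheme follows ref. ¹⁴ and is not reproduced here; the centered placement, the constant fill value, and the function name `pad_to_canvas` are assumptions made for this example.

```python
def pad_to_canvas(image, target_h=960, target_w=384, fill=0):
    """Place a 2D image (list of rows) at the center of a target_h x target_w canvas.

    Centered placement and a constant fill value are assumptions of this sketch.
    """
    h, w = len(image), len(image[0])
    if h > target_h or w > target_w:
        raise ValueError("image is larger than the target canvas")
    top = (target_h - h) // 2
    left = (target_w - w) // 2
    canvas = [[fill] * target_w for _ in range(target_h)]
    for r, row in enumerate(image):
        canvas[top + r][left:left + w] = list(row)
    return canvas

# Toy example: a 2 x 3 image padded to a 4 x 5 canvas.
small = [[1, 2, 3],
         [4, 5, 6]]
padded = pad_to_canvas(small, target_h=4, target_w=5, fill=0)
for row in padded:
    print(row)
```

For black-background images the fill value would presumably be the minimum intensity, and for white-background images the maximum, so that the padding blends into the existing background.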

Removal of images with abnormal contrast

In the whole-body images with both backgrounds, we found that some images had excessively high contrast due to the UKB image processing, making it difficult to distinguish the details of the bones. To remove these images, we created a binary classifier using Torchvision for each background type. We randomly selected and manually annotated 800 images for each background. A CNN based on the ResNet-152 architecture was trained on these images for up to 30 epochs, with cross-entropy as the loss function. The classifiers achieved an AUC of 1.00 on the validation and testing data for both backgrounds (Supplementary Data 3). After removing images with contrast anomalies, we retained 45,514 black-background and 14,597 white-background complete whole-body skeleton DXA images.

Classification of head pose

Since the head posture of the participants affects the projected shape of the head when DXA images are acquired, we built a binary classification model to distinguish between images with direct-gaze and non-direct-gaze postures. We again randomly selected and manually annotated 800 images for each background. A CNN based on the ResNet-152 architecture was trained on these images for up to 30 epochs, with cross-entropy as the loss function. For the black-background images, the validation set AUC was 0.9732 and the test set AUC was 0.9683; for the white-background images, the validation set AUC was 0.9769 and the test set AUC was 1.00 (Supplementary Data 3). After removing non-frontal images, we retained 36,319 black-background and 12,091 white-background complete whole-body skeleton DXA images. To ensure that these filtering steps did not introduce selection bias, we then analyzed their effect on key phenotypes. Statistical analysis confirmed a negligible effect on both sex and age distributions (Tables S4 and S5). Although the Chi-squared tests for sex yielded significant p-values, the corresponding Cramér’s V values⁹⁰ were all near or below 0.1. Likewise, Cohen’s d values⁹¹ from t-tests for age were all below 0.2, indicating a negligible impact.
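The two effect-size measures mentioned above can be computed as in the following sketch, which uses the standard textbook definitions (Cramér’s V from an uncorrected Chi-squared statistic, Cohen’s d with a pooled standard deviation). The counts and ages are invented toy values, not data from this study.

```python
import math
from statistics import mean, variance

def cramers_v(table):
    """Cramér's V for an r x c contingency table (no continuity correction)."""
    n = sum(sum(row) for row in table)
    row_tot = [sum(row) for row in table]
    col_tot = [sum(col) for col in zip(*table)]
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_tot[i] * col_tot[j] / n
            chi2 += (observed - expected) ** 2 / expected
    k = min(len(table), len(table[0])) - 1
    return math.sqrt(chi2 / (n * k))

def cohens_d(a, b):
    """Cohen's d for two independent samples, using the pooled standard deviation."""
    na, nb = len(a), len(b)
    pooled_sd = math.sqrt(((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2))
    return (mean(a) - mean(b)) / pooled_sd

# Toy data: rows = retained / removed images, columns = female / male.
sex_table = [[520, 480], [95, 105]]
print("Cramér's V:", round(cramers_v(sex_table), 3))

# Toy ages for retained vs removed participants.
print("Cohen's d:", round(cohens_d([61, 64, 66, 70, 58], [62, 65, 67, 71, 60]), 3))
```

With large samples, a Chi-squared test can be significant even when V is small, which is why the effect size rather than the p-value is used to judge whether filtering shifted the distributions.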

Deep learning-based image segmentation models for identifying head and joint landmarks

To train our deep learning models, we manually annotated a total of 400 images, with 200 on a black background and 200 on a white background, under the guidance of the orthopedic doctors. We used 160 images of each type for training, and the rest were evenly split for validation and testing. The images that were chosen for this training dataset had an equal number of male and female individuals, were from the white British population group, and sampled equally across the age distribution of the UKB cohort.

We used the open-source annotation tool LabelMe⁹² in polygon mode to outline the head and to label six body landmarks (left shoulder, right shoulder, left hip, right hip, left ankle, and right ankle). For masking each of these landmarks, the locations specified below were chosen because they were the easiest and most consistent to identify across all the images, which featured slightly different poses. The center of the head of the humerus was chosen to be masked for each shoulder landmark. The center point of the tibial plateau was selected as the masking point for each hip joint landmark. The point where the ends of the tibia, fibula, and talus converge was chosen to be masked for each ankle landmark. An example of the annotation of one image is shown in Fig. 2b, with landmarks placed at each of the locations listed above.

We applied the U-Net architecture models with ResNet-152 encoders from the library Segmentation Models Pytorch (SMP, v 0.3.3) to perform image segmentation on the head and the six joint landmarks. The default pre-trained weights provided by SMP were used. Dice loss was employed as the loss function, and training was conducted for up to 50 epochs, retaining the model with the best performance on the validation set for testing.
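For readers unfamiliar with the loss and metric used here, the sketch below shows a generic soft Dice loss and the IoU on flattened masks in plain Python. It illustrates the quantities only; it is not the SMP implementation, and the smoothing constant `eps` is an assumption of this example.

```python
def dice_loss(pred, target, eps=1e-7):
    """Soft Dice loss for flattened predictions in [0, 1] and a binary target mask."""
    intersection = sum(p * t for p, t in zip(pred, target))
    denominator = sum(pred) + sum(target)
    return 1.0 - (2.0 * intersection + eps) / (denominator + eps)

def iou(pred_mask, target_mask):
    """Intersection over union for two flattened binary masks."""
    inter = sum(1 for p, t in zip(pred_mask, target_mask) if p and t)
    union = sum(1 for p, t in zip(pred_mask, target_mask) if p or t)
    return inter / union if union else 1.0

# Toy 2 x 2 masks, flattened row by row.
predicted = [1, 1, 0, 0]
annotated = [1, 0, 0, 0]
print("Dice loss:", round(dice_loss(predicted, annotated), 4))
print("IoU:", iou(predicted, annotated))
```

Dice loss rewards overlap relative to the combined mask sizes, which makes it less sensitive than pixel-wise cross-entropy to the large background area surrounding small landmark masks.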

Obtaining head and body measurements and calculating HBRs

A major issue in integrating our analysis across different input pixel ratios was that these ratios corresponded to varying resolution scales, likely due to the differing distances at which the scanner was held above the patient. For instance, in one image, a pixel might represent 0.44 cm, while in another it could represent 0.46 cm. To address this scaling issue and standardize the images, we decided to regress the height directly measured on the image using the top of the head and the midpoint of the two ankle landmarks, which could be consistently identified across all image pixel ratios. We also incorporated overall height data extracted from the DICOM meta information. Although the height measurements we used did not include the area below the ankles, they were relative measurements, and we utilized them to derive a scaling factor for the pixel ratio in each image to achieve standardization.

We calculated the head length and head width based on the bounding box around the head, and the shoulder width, trunk length, hip width, and leg length based on the center points of the landmarks. Individuals whose length measurements deviated from the mean by more than 3 standard deviations were excluded from the analysis. We calculated the ratios of head length and head width to the corresponding body measurements. A list of these HBRs can be found in Supplementary Data 7.
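The measurement step can be sketched as follows. This is an illustration under stated assumptions rather than the study's code: the head bounding box and landmark coordinates are toy values, and the definitions of trunk length (shoulder midpoint to hip midpoint), leg length (hip midpoint to ankle midpoint), and the set of five body dimensions are assumptions of this example. Because each HBR is a ratio of two lengths from the same image, the pixel-to-centimeter scaling factor cancels in the ratio.

```python
import math
from statistics import mean, stdev

def midpoint(p, q):
    return ((p[0] + q[0]) / 2.0, (p[1] + q[1]) / 2.0)

def body_measurements(head_box, lm):
    """head_box = (top, bottom, left, right) in pixels; lm maps landmark names to (row, col)."""
    top, bottom, left, right = head_box
    shoulder_mid = midpoint(lm["left_shoulder"], lm["right_shoulder"])
    hip_mid = midpoint(lm["left_hip"], lm["right_hip"])
    ankle_mid = midpoint(lm["left_ankle"], lm["right_ankle"])
    return {
        "head_length": bottom - top,
        "head_width": right - left,
        "shoulder_width": math.dist(lm["left_shoulder"], lm["right_shoulder"]),
        "hip_width": math.dist(lm["left_hip"], lm["right_hip"]),
        "trunk_length": math.dist(shoulder_mid, hip_mid),  # assumed definition
        "leg_length": math.dist(hip_mid, ankle_mid),       # assumed definition
        "height": ankle_mid[0] - top,                      # top of head to ankle midpoint
    }

def head_body_ratios(m):
    """Ratios of head length and head width to each body dimension."""
    body = ["height", "shoulder_width", "trunk_length", "hip_width", "leg_length"]
    return {f"{h}/{b}": m[h] / m[b] for h in ("head_length", "head_width") for b in body}

def within_3sd(values):
    """Flag values lying within 3 standard deviations of the mean."""
    mu, sd = mean(values), stdev(values)
    return [abs(v - mu) <= 3 * sd for v in values]

landmarks = {
    "left_shoulder": (120, 100), "right_shoulder": (120, 280),
    "left_hip": (420, 130), "right_hip": (420, 250),
    "left_ankle": (820, 160), "right_ankle": (820, 220),
}
m = body_measurements((0, 100, 150, 230), landmarks)
ratios = head_body_ratios(m)
print(round(ratios["head_length/height"], 3), round(ratios["head_width/height"], 3))
```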

Participant and genetic data quality control

For genome-wide association analyses, we filtered participants with correctly labeled whole-body DXA images (FID 20158) to include only Caucasian individuals (FID 22006) from the White British population, as determined by genetic PCA (FID 21000). We removed individuals who were outliers for heterozygosity or genotype missingness rates, as determined by UKB quality control of sample processing and DNA preparation for genotyping (FID 22027), individuals with missingness rates greater than 2% (FID 22005), and individuals with a kinship coefficient greater than 0.0442 (corresponding to at least one third- to fourth-degree relative). A total of 38,202 individuals remained (Supplementary Data 8).

Imputed genetic data for 486,737 individuals were downloaded from UKB for chromosomes 1 to 22 (FID 22828) and subsequently filtered to the quality-controlled subset using PLINK2⁹³. All duplicate SNPs were excluded (--rm-dup exclude-all), only SNPs were retained (--snps-only just-acgt), and sites were restricted to biallelic variants (--max-alleles 2) with a minor allele frequency of at least 1% (--maf 0.01) and genotype missingness no greater than 2% (--geno 0.02). A total of 7,459,980 SNPs remained in the final dataset.
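As an illustration, the filters listed above could be assembled into a single PLINK2 call as sketched below. The filter flags are those stated in the text; the input format (`--pfile`), the sample list (`--keep`), the output options, and all file names are assumptions made for this example.

```python
def build_plink2_qc_command(pfile_prefix, keep_file, out_prefix):
    """Assemble a PLINK2 variant-QC call as an argument list (paths are hypothetical)."""
    return [
        "plink2",
        "--pfile", pfile_prefix,       # assumed input format for this sketch
        "--keep", keep_file,           # quality-controlled participants
        "--rm-dup", "exclude-all",     # drop all duplicate-ID variants
        "--snps-only", "just-acgt",    # keep SNPs with A/C/G/T alleles only
        "--max-alleles", "2",          # biallelic sites
        "--maf", "0.01",               # minor allele frequency >= 1%
        "--geno", "0.02",              # genotype missingness <= 2%
        "--make-pgen",
        "--out", out_prefix,
    ]

cmd = build_plink2_qc_command("ukb_imp_chr1", "qc_samples.txt", "ukb_qc_chr1")
print(" ".join(cmd))
# To execute (requires plink2 on PATH):
#   import subprocess; subprocess.run(cmd, check=True)
```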

GWAS and heritability analysis

Each HBR was used as a ratio phenotype in a GWAS performed with BOLT-LMM⁹⁴. Covariates included the first 20 genetic principal components (FID 22009) provided by UKB, sex (FID 31), age (FID 21003), and “BACKGROUND”. Additionally, the DXA scanner serial number and the software version used for image processing were combined into a single categorical covariate with seven categories. Independent SNPs in each resulting GWAS were identified using GCTA COJO²⁴ (--cojo-slct) with a significance threshold of 5.0 × 10⁻⁸, and a window size of 1 Mb for SNP heritability. Genetic correlations were calculated using LDSC. We then obtained all genome-wide significant, independent loci (n = 7209) from Yengo et al.¹¹, which were used in a conditional analysis with GCTA COJO for each of the HBR phenotypes.

Gene prioritization and functional annotation

For gene prioritization, we employed TWAS, COLOC, and SMR using eQTL data from the GTEx project (v8)⁹⁵. For each of the 10 HBRs, we utilized whole blood and 14 head-related and brain-related tissues (details in Supplementary Data 13). All tissues were pooled and corrected together, and genes that were significant in any tissue at a BH-adjusted p-value below 0.01 were retained. For COLOC, genes with a PPH4 > 0.9 were retained. We performed TWAS, COLOC, and SMR based on inferred cis-regulated gene expression using MR-JTI²², coloc²³, and GCTA²⁴ with default settings. The gene expression weights for TWAS and COLOC were downloaded from https://zenodo.org/records/3842289, and the eQTL summary data for SMR were downloaded from https://yanglab.westlake.edu.cn/software/smr/#DataResource. Pathway enrichment analysis was performed separately for each HBR using the R package clusterProfiler⁹⁶. The resulting terms were adjusted for multiple comparisons using the BH correction method, and terms with fewer than five hit genes were excluded. Relevant MGI phenotypes were manually selected from the mouse phenotype descriptions that included the terms “bone,” “skeletal,” “head,” and “brain.” Subsequently, we counted the number of genes annotated to these phenotypes within each HBR.
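The pooled multiple-testing correction can be illustrated with a short Benjamini-Hochberg (BH) sketch. The gene names, tissues, and p-values are invented for the example; only the pooling across tissues and the 0.01 threshold follow the description above.

```python
def benjamini_hochberg(pvals):
    """Return BH-adjusted p-values in the original order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Toy (gene, tissue) p-values pooled across tissues; all values are made up.
pooled = {
    ("GENE_A", "Whole_Blood"): 0.0001,
    ("GENE_A", "Brain_Cortex"): 0.2,
    ("GENE_B", "Brain_Cortex"): 0.03,
    ("GENE_C", "Whole_Blood"): 0.004,
}
keys = list(pooled)
adjusted = benjamini_hochberg([pooled[k] for k in keys])

# Keep genes that pass the threshold in at least one tissue.
significant_genes = sorted({gene for (gene, _), q in zip(keys, adjusted) if q < 0.01})
print(significant_genes)
```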

Associating cell types with HBR phenotypes

In this study, we reanalyzed the publicly available human embryonic skeletal snRNA-seq data³⁶, human embryonic pituitary scRNA-seq dataset³⁵ and adult human brain snRNA-seq dataset³⁴, retaining the major tissues and cell types as defined in the original study. We used scDRS (v1.0.2) to link the scRNA-seq data with polygenic risk at a single-cell resolution, independent of cell type. For each type of HBR, polygenic scores were computed based on GWAS z-scores and scRNA-seq expression values. Cell-specific association p-values were determined by comparing the normalized disease scores to an empirical distribution of normalized scores generated from all control gene sets across all cells. We retained HBR phenotype–cell type associations with “assoc mcp” values less than 0.05 in each tissue.

Enrichment analysis for regions of evolutionary context

The conserved regions in the bird genome³⁸ were converted to the hg19 reference genome using the UCSC liftOver tool, and the intersecting regions with mammalian conserved regions were identified. We extracted the intersection of alignments between five species of reptiles or amphibians and the human genome from the NCBI Comparative Genome Viewer. These genomic regions in reptiles were then marked within the shared conserved regions between birds and mammals. All other evolutionary genomic annotations were derived from Kun et al.^14,97. Additionally, as a control for HAR enrichment, we selected the psychiatric disorders mentioned earlier and performed the same enrichment permutation experiment on the significant loci.

We then used SNPsnap⁹⁸ to generate 1000 sets of null SNPs matched to the HBR-associated SNPs (p < 5 × 10⁻⁸) on MAF, the number of SNPs in LD under different LD thresholds, distance to the nearest gene, and gene density. We extended both the HBR SNPs and each set of null-matched SNPs by 500 bp upstream and downstream and calculated their intersections with the annotation regions. We counted the number of null sets in which the matched SNPs had at least as many hits as the HBR SNPs, and the ratio of this count to the number of random iterations was used as the p-value for the one-sided permutation test. The odds ratio was calculated as the median value across the permutation tests. Finally, all p-values for HBRs and genomic annotations were adjusted for multiple comparisons using the BH correction method.
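The permutation test can be sketched as below. This is a simplified illustration, not the study's pipeline: it handles a single chromosome, uses invented positions and intervals, and the function names `count_hits` and `permutation_p` are hypothetical.

```python
def count_hits(snp_positions, regions, flank=500):
    """Count SNPs whose +/- flank window overlaps any annotation interval (start, end)."""
    hits = 0
    for pos in snp_positions:
        lo, hi = pos - flank, pos + flank
        if any(start <= hi and end >= lo for start, end in regions):
            hits += 1
    return hits

def permutation_p(observed_hits, null_hits):
    """One-sided p-value: fraction of null sets with at least as many hits as observed."""
    n_at_least = sum(1 for h in null_hits if h >= observed_hits)
    return n_at_least / len(null_hits)

# Toy example on one chromosome.
annotation = [(1000, 2000)]
hbr_snps = [400, 600, 2400, 2600]
null_sets = [[100, 5000, 9000, 12000], [1500, 7000, 8000, 9500],
             [900, 1200, 6000, 7000], [1100, 1900, 2300, 8000]]

observed = count_hits(hbr_snps, annotation)
null_counts = [count_hits(s, annotation) for s in null_sets]
print(observed, null_counts, permutation_p(observed, null_counts))
```

With 1000 null sets, the smallest non-zero p-value this procedure can report is 0.001, which bounds the resolution of the test before BH correction.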

Phenotypic association of HBR with common diseases

We defined diseases based on the ICD-10 diagnosis annotations (Field 41270) in the UKB, in which diseases are categorized into different classes. We selected 46 common diseases from five categories: psychiatric, neurological, musculoskeletal system, circulatory system, and metabolic system. The selection was performed at the second hierarchical level of the ICD-10 structure (three-character codes), except for Pian. Only common diseases with more than 1000 cases in the cohort were included in our analyses. Participants were coded “1” if a disease code appeared in their hospital records and “0” otherwise. The ICD-10 codes and details for the diseases used in our study are shown in Supplementary Data 21. For each disease, we counted the number of cases and performed logistic regression using the “Logit” method of the statsmodels⁹⁹ Python library, with sex, age, imaging equipment, image background, and the first 10 principal components as covariates. We exponentiated the regression coefficients to obtain odds ratios and corrected the regression p-values using the BH method.
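The case coding and the conversion from a regression coefficient to an odds ratio can be sketched as follows. The participant records, the helper names, and the 95% confidence interval are illustrative assumptions; the regression itself (statsmodels “Logit”) is not reproduced here.

```python
import math

def code_disease(records, icd3, min_cases=1000):
    """Return {participant: 0/1} for a three-character ICD-10 code, or None if too rare.

    records maps a participant ID to the list of ICD-10 codes in their hospital records.
    """
    status = {pid: int(any(code[:3] == icd3 for code in codes))
              for pid, codes in records.items()}
    n_cases = sum(status.values())
    return status if n_cases > min_cases else None

def odds_ratio(beta, se, z=1.96):
    """Odds ratio and an approximate 95% confidence interval from a logit coefficient."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)

# Toy records with invented participants; min_cases is lowered for the demonstration.
records = {"p1": ["I10", "E119"], "p2": ["M545"], "p3": []}
print(code_disease(records, "E11", min_cases=0))
print(odds_ratio(-0.105, 0.03))
```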

Polygenic risk score analyses and logistic regression

We utilized PRS-CS⁵⁰ to construct polygenic risk score (PRS) prediction models. PRS-CS is a command-line tool based on Python that uses GWAS summary statistics and external LD reference panels to infer the posterior effect sizes of SNPs under a continuous shrinkage prior. In our model, we incorporated 1,117,425 SNPs from HapMap3 and performed clumping using PLINK2-clump with parameters “r2 = 0.01” and “kb=250”. For the reference LD panel, we used the 1000 Genomes (1000 G) data for the European population. The output from PRS-CS includes the chromosome, rs ID, base position, A1, A2, and posterior effect size estimates for each SNP. We concatenated the output files for all chromosomes and then used the PLINK2 --score command to calculate the PRS for all genotyped non-imaged individuals of white British ancestry (who had also undergone genetic QC), with a sample size of 315,404 participants after excluding related individuals. We subsequently performed logistic regression for 46 manually selected diseases using the same method as for the phenotypic logistic regression, based on the polygenic scores.

Mendelian randomization

We collected summary statistics of GWAS for 46 diseases across five categories from several sources, including FinnGen (release R11)¹⁰⁰, Million Veteran Program (MVP)¹⁰¹, the Psychiatric Genomics Consortium studies (PGC)¹⁰² and Social Science Genetic Association Consortium¹⁰³. For diseases with multiple available summary statistics, the dataset with the largest effective sample size was selected for our analysis (Supplementary Data 21). We investigated the causal association between these diseases and HBRs using two-sample Mendelian randomization.

We first selected independent SNPs for each exposure (r² = 0.001, window size = 1 Mb, and p < 5 × 10⁻⁸) using the clumping algorithm in PLINK2. The 1000 G European data were used as a reference for LD estimation. For the IVs, three key assumptions must hold: 1) the selected IVs must be associated with the exposure (relevance assumption); 2) the selected IVs are not associated with potential confounders (independence assumption); and 3) the IVs affect the outcome only through their effect on the exposure (exclusion restriction assumption). We used the RadialMR¹⁰⁴ package to remove pleiotropic SNPs, and the remaining SNPs were used to perform the MR analysis. We used MR Steiger filtering to check whether the MR estimates reflected the true causal direction¹⁰⁵. We performed MR testing using the “mr_ivw” method of the TwoSampleMR¹⁰⁶ R package, followed by BH correction for all MR test results.
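For orientation, the inverse-variance weighted (IVW) estimate can be written as a weighted average of per-SNP Wald ratios. The sketch below implements the fixed-effect form with first-order weights; it is an illustration rather than the TwoSampleMR implementation (whose standard error may differ because it can account for overdispersion), and the summary statistics are toy values.

```python
import math

def ivw(beta_exp, beta_out, se_out):
    """Fixed-effect IVW estimate from per-SNP exposure and outcome effects."""
    weights = [bx ** 2 / se ** 2 for bx, se in zip(beta_exp, se_out)]
    ratios = [by / bx for bx, by in zip(beta_exp, beta_out)]   # Wald ratios
    estimate = sum(w * r for w, r in zip(weights, ratios)) / sum(weights)
    se = math.sqrt(1.0 / sum(weights))
    p_value = math.erfc(abs(estimate / se) / math.sqrt(2.0))   # two-sided normal test
    return estimate, se, p_value

# Toy summary statistics for three instruments.
estimate, se, p_value = ivw(
    beta_exp=[0.10, 0.20, 0.15],
    beta_out=[0.05, 0.09, 0.08],
    se_out=[0.01, 0.01, 0.02],
)
print(round(estimate, 3), round(se, 3), p_value)
```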

For the significant MR results, we performed a further sensitivity analysis. First, we performed leave-one-out analysis to check whether the causal association was obviously driven by a single SNP (p-value < 0.05 was regarded as an outlier). Second, we conducted MR-PRESSO¹⁰⁷ to detect the presence of horizontal pleiotropy (p-value < 0.05). Third, we executed MR-Egger regression to examine the potential bias of directional pleiotropy. The intercept in the Egger regression indicates the mean pleiotropic effect of all genetic variants, which is interpreted as evidence of directional pleiotropy when the value differs from zero (p-value < 0.05). Cochran’s Q and Rucker’s Qʹ statistics were also calculated to check for the presence of heterogeneity for the Inverse-Variance Weighted and MR-Egger method, respectively.
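Two of these sensitivity statistics have simple closed forms, sketched below with toy data: Cochran’s Q for the IVW model, and the MR-Egger intercept and slope from a weighted regression of outcome effects on exposure effects. The functions are illustrative; standard errors and p-values for the Egger intercept are omitted for brevity.

```python
def cochran_q(beta_exp, beta_out, se_out):
    """Cochran's Q for the IVW model and its degrees of freedom."""
    weights = [bx ** 2 / se ** 2 for bx, se in zip(beta_exp, se_out)]
    ratios = [by / bx for bx, by in zip(beta_exp, beta_out)]
    ivw_est = sum(w * r for w, r in zip(weights, ratios)) / sum(weights)
    q = sum(w * (r - ivw_est) ** 2 for w, r in zip(weights, ratios))
    return q, len(ratios) - 1

def mr_egger(beta_exp, beta_out, se_out):
    """MR-Egger intercept and slope via weighted least squares (weights = 1 / se_out^2)."""
    # Orient every SNP so that its exposure effect is positive.
    x = [abs(bx) for bx in beta_exp]
    y = [by if bx >= 0 else -by for bx, by in zip(beta_exp, beta_out)]
    w = [1.0 / se ** 2 for se in se_out]
    sw = sum(w)
    sx = sum(wi * xi for wi, xi in zip(w, x))
    sy = sum(wi * yi for wi, yi in zip(w, y))
    sxx = sum(wi * xi * xi for wi, xi in zip(w, x))
    sxy = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))
    slope = (sw * sxy - sx * sy) / (sw * sxx - sx * sx)
    intercept = (sy - slope * sx) / sw
    return intercept, slope

# Toy data with a built-in directional pleiotropic offset of 0.02.
intercept, slope = mr_egger([0.1, 0.2, -0.3], [0.07, 0.12, -0.17], [0.01, 0.01, 0.01])
print(round(intercept, 4), round(slope, 4))
print(cochran_q([0.1, 0.1], [0.04, 0.06], [0.01, 0.01]))
```

A Q statistic well above its degrees of freedom indicates heterogeneity among the per-SNP estimates, and an Egger intercept that differs from zero indicates directional pleiotropy.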

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.

Data availability

The data from UK Biobank used in this study are available through a procedure described at: https://www.ukbiobank.ac.uk/enable-your-research. All public data sources are cited in the corresponding sections of the Methods. The GWAS summary statistics used in the MR analysis can be obtained from the URLs listed in Supplementary Data 21. The GWAS summary data generated in this study have been deposited in the Zenodo database under accession code 14835684 https://doi.org/10.5281/zenodo.17421944¹⁰⁸ and GWAS Catalog database under accession IDs GCST90702447-GCST90702456 (https://www.ebi.ac.uk/gwas/).

Code availability

The code used for quality control of the DXA images and for deep learning-based segmentation of the head and key landmarks is available at https://github.com/weishaoxia/head-body-ratio and archived at Zenodo (https://doi.org/10.5281/zenodo.17422184)¹⁰⁹. We carried out all deep learning using the Python programming language (v3.9.23) with the PyTorch (v2.7.1+cu128) and cv2 (v4.12.0) libraries on NVIDIA RTX 5090 GPUs on a CentOS Linux system with the CUDA (v12.8) toolkit.

References

Jerison, H. J. Evolution of the brain and intelligence: Comment on radinsky’s review. Evolution 30, 186–187 (1976).
Striedter, G. F. Precis of principles of brain evolution. Behav. Brain Sci. 29, 12–36 (2006).
Miller, I. F., Barton, R. A. & Nunn, C. L. Quantitative uniqueness of human brain evolution revealed through phylogenetic comparative analysis. Elife 8, e41250 (2019).
Venditti, C., Baker, J. & Barton, R. A. Co-evolutionary dynamics of mammalian brain and body size. Nat. Ecol. Evolution 8, 1534–1542 (2024).
Isler, K. & Van Schaik, C. P. How humans evolved large brains: comparative evidence. Evolut. Anthropol.: Issues, N., Rev. 23, 65–75 (2014).
Lainhart, J. E. et al. Head circumference and height in autism: a study by the collaborative program of excellence in autism. Am. J. Med Genet A 140, 2257–2274 (2006).
Sacco, R., Gabriele, S. & Persico, A. M. Head circumference and brain size in autism spectrum disorder: A systematic review and meta-analysis. Psychiatry Res 234, 239–251 (2015).
Knol, M. J. et al. Genetic variants for head size share genes and pathways with cancer. Cell Rep. Med 5, 101529 (2024).
Will, M., Krapp, M., Stock, J. T. & Manica, A. Different environmental variables predict body and brain size evolution in Homo. Nat. Commun. 12, 4116 (2021).
Lango Allen, H. et al. Hundreds of variants clustered in genomic loci and biological pathways affect human height. Nature 467, 832–838 (2010).
Yengo, L. et al. A saturated map of common genetic variants associated with human height. Nature 610, 704–712 (2022).
Currant, H. et al. Correction: Genetic variation affects morphological retinal phenotypes extracted from UK Biobank optical coherence tomography images. PLoS Genet 17, e1009858 (2021).
Agrawal, S. et al. BMI-adjusted adipose tissue volumes exhibit depot-specific and divergent associations with cardiometabolic diseases. Nat. Commun. 14, 266 (2023).
Kun, E. et al. The genetic architecture and evolution of the human skeletal form. Science 381, eadf8009 (2023).
Xu, L. et al. The genetic architecture of and evolutionary constraints on the human pelvic form. Science 388, eadq1521 (2025).
Bai, W. et al. A population-based phenome-wide association study of cardiac and aortic structure and function. Nat. Med 26, 1654–1662 (2020).
He, K., Zhang, X., Ren, S. & Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE conference on computer vision and pattern recognition 770-778 (IEEE, 2016).
Ronneberger, O., Fischer, P. & Brox, T. U-net: Convolutional networks for biomedical image segmentation. In Medical Image Computing and Computer-Assisted Intervention – MICCAI 2015 (eds Navab, N., Hornegger, J., Wells, W. & Frangi, A.) 234–241 (Springer, 2015).
Bulik-Sullivan, B. K. et al. LD Score regression distinguishes confounding from polygenicity in genome-wide association studies. Nat. Genet 47, 291–295 (2015).
Benner, C. et al. FINEMAP: efficient variable selection using summary data from genome-wide association studies. Bioinformatics 32, 1493–1501 (2016).
Wang, K., Li, M. & Hakonarson, H. ANNOVAR: functional annotation of genetic variants from high-throughput sequencing data. Nucleic Acids Res 38, e164 (2010).
Zhou, D. et al. A unified framework for joint-tissue transcriptome-wide association and mendelian randomization analysis. Nat. Genet 52, 1239–1246 (2020).
Giambartolomei, C. et al. Bayesian test for colocalisation between pairs of genetic association studies using summary statistics. PLoS Genet. 10, e1004383 (2014).
Yang, J., Lee, S. H., Goddard, M. E. & Visscher, P. M. GCTA: a tool for genome-wide complex trait analysis. Am. J. Hum. Genet. 88, 76–82 (2011).
Zheng, H.-F. et al. WNT16 influences bone mineral density, cortical bone thickness, bone strength, and osteoporotic fracture risk. PLoS Genet. 8, e1002745 (2012).
Wainberg, M. et al. Genetic architecture of the structural connectome. Nat. Commun. 15, 1962 (2024).
Smith, S. M. et al. An expanded set of genome-wide association studies of brain imaging phenotypes in UK Biobank. Nat. Neurosci. 24, 737–745 (2021).
Ashburner, M. et al. Gene ontology: tool for the unification of biology. The gene ontology consortium. Nat. Genet 25, 25–29 (2000).
Kanehisa, M., Furumichi, M., Tanabe, M., Sato, Y. & Morishima, K. KEGG: new perspectives on genomes, pathways, diseases and drugs. Nucleic acids Res. 45, D353–D361 (2017).
Bergan-Roller, H. E. & Sheridan, M. A. The growth hormone signaling system: Insights into coordinating the anabolic and catabolic actions of growth hormone. Gen. Comp. Endocrinol. 258, 119–133 (2018).
Shen-Orr, S. S. et al. Defective signaling in the JAK-STAT pathway tracks with chronic inflammation and cardiovascular risk in aging humans. Cell Syst. 3, 374–384. e374 (2016).
Hayamizu, T. F., Mangan, M., Corradi, J. P., Kadin, J. A. & Ringwald, M. The Adult Mouse Anatomical Dictionary: a tool for annotating and integrating data. Genome Biol. 6, 1–8 (2005).
Zhang, M. J. et al. Polygenic enrichment distinguishes disease associations of individual cells in single-cell RNA-seq data. Nat. Genet 54, 1572–1580 (2022).
Siletti, K. et al. Transcriptomic diversity of cell types across the adult human brain. Science 382, eadd7046 (2023).
Zhang, S. et al. Single-cell transcriptomics identifies divergent developmental lineage trajectories during human pituitary development. Nat. Commun. 11, 5275 (2020).
To, K. et al. A multi-omic atlas of human embryonic skeletal development. Nature 635, 657–667 (2024).
Olkowicz, S. et al. Birds have primate-like numbers of neurons in the forebrain. Proc. Natl. Acad. Sci. 113, 7255–7260 (2016).
Zhang, G. et al. Comparative genomics reveals insights into avian genome evolution and adaptation. Science 346, 1311–1320 (2014).
Oishi, T. [Species Differences in the Brain between Human and Non-Human Primates]. Brain Nerve 71, 807–813 (2019).
Doan, R. N. et al. Mutations in human accelerated regions disrupt cognition and social behavior. Cell 167, 341–354. e312 (2016).
Reilly, S. K. et al. Evolutionary genomics. Evolutionary changes in promoter and enhancer activity during human corticogenesis. Science 347, 1155–1159 (2015).
Vermunt, M. W. et al. Epigenomic annotation of gene regulatory alterations during evolution of the primate brain. Nat. Neurosci. 19, 494–503 (2016).
Vernot, B. et al. Excavating Neandertal and Denisovan DNA from the genomes of Melanesian individuals. Science 352, 235–239 (2016).
Jones, G. H. & Lewis, J. E. Head circumference in elderly long-stay patients with schizophrenia. Br. J. Psychiatry 159, 435–438 (1991).
Ward, K. E., Friedman, L., Wise, A. & Schulz, S. C. Meta-analysis of brain and cranial size in schizophrenia. Schizophrenia Res. 22, 197–213 (1996).
Williams, C. A., Dagli, A. & Battaglia, A. Genetic disorders associated with macrocephaly. Am. J. Med. Genet. Part A 146, 2023–2037 (2008).
Bucher, H., Prader, A. & Illig, R. Head circumference, height, bone age and weight in 103 children with congenital hypothyroidism before and during thyroid hormone replacement. Helv. Paediatr. Acta 40, 305–316 (1985).
Renaud, D. L. Leukoencephalopathies associated with macrocephaly. Semin. Neurol. 32, 34–41 (2012).
Stefansdottir, H. et al. Atrial fibrillation is associated with reduced brain volume and cognitive function independent of cerebral infarcts. Stroke 44, 1020–1025 (2013).
Ge, T., Chen, C. Y., Ni, Y., Feng, Y. A. & Smoller, J. W. Polygenic prediction via Bayesian regression and continuous shrinkage priors. Nat. Commun. 10, 1776 (2019).
Maeda, Y. et al. Functional deficits in carpal tunnel syndrome reflect reorganization of primary somatosensory cortex. Brain 137, 1741–1752 (2014).
Maeda, Y. et al. Altered brain morphometry in carpal tunnel syndrome is associated with median nerve pathology. Neuroimage Clin. 2, 313–319 (2013).
Kappelman, J. The evolution of body mass and relative brain size in fossil hominids. J. Hum. Evol. 30, 243–276 (1996).
Grabowski, M. Bigger brains led to bigger bodies? The correlated evolution of human brain and body size. Curr. Anthropol. 57, 174–196 (2016).
Chaste, P. et al. Adjusting head circumference for covariates in autism: clinical correlates of a highly heritable continuous trait. Biol. Psychiatry 74, 576–584 (2013).
Yengo, L. et al. Meta-analysis of genome-wide association studies for height and body mass index in approximately 700000 individuals of European ancestry. Hum. Mol. Genet. 27, 3641–3649 (2018).
Muslin, A. J. MAPK signalling in cardiovascular health and disease: molecular mechanisms and therapeutic targets. Clin. Sci. 115, 203–218 (2008).
Grimaud, E., Heymann, D. & Rédini, F. Recent advances in TGF-β effects on chondrocyte metabolism: potential therapeutic roles of TGF-β in cartilage disorders. Cytokine Growth Factor Rev. 13, 241–257 (2002).
Ansari, M. D., Majid, H., Khan, A. & Sultana, Y. Clinical frontiers of metabolic bone disorders: a comprehensive review. Metab. Target Organ Damage 4, 2 (2023).
Ntali, G., Markussis, V. & Chrisoulidou, A. An overview of cardiovascular risk in pituitary disorders. Medicina 60, 1241 (2024).
Burford, N. G., Webster, N. A. & Cruz-Topete, D. Hypothalamic-pituitary-adrenal axis modulation of glucocorticoids in the cardiovascular system. Int. J. Mol. Sci. 18, 2150 (2017).
Schernthaner-Reiter, M. H., Wolf, P., Vila, G. & Luger, A. The interaction of insulin and pituitary hormone syndromes. Front. Endocrinol. 12, 626427 (2021).
Chrousos, G. P. The hypothalamic–pituitary–adrenal axis and immune-mediated inflammation. N. Engl. J. Med. 332, 1351–1363 (1995).
Zhou, B., Zhu, Z., Ransom, B. R. & Tong, X. Oligodendrocyte lineage cells and depression. Mol. Psychiatry 26, 103–117 (2021).
Liu, S.-H., Du, Y., Chen, L. & Cheng, Y. Glial cell abnormalities in major psychiatric diseases: A systematic review of postmortem brain studies. Mol. Neurobiol. 59, 1665–1692 (2022).
Galvez-Contreras, A. Y., Zarate-Lopez, D., Torres-Chavez, A. L. & Gonzalez-Perez, O. Role of oligodendrocytes and myelin in the pathophysiology of autism spectrum disorder. Brain Sci. 10, 951 (2020).
Spaas, J. et al. Oxidative stress and impaired oligodendrocyte precursor cell differentiation in neurological disorders. Cell. Mol. Life Sci. 78, 4615–4637 (2021).
Ono, N., Balani, D. H. & Kronenberg, H. M. Stem and progenitor cells in skeletal development. Curr. Top. Dev. Biol. 133, 1–24 (2019).
Totoson, P., Maguin-Gate, K., Prati, C., Wendling, D. & Demougeot, C. Mechanisms of endothelial dysfunction in rheumatoid arthritis: lessons from animal studies. Arthritis Res. Ther. 16, 202 (2014).
Nelson, C. P. et al. Genetically determined height and coronary artery disease. N. Engl. J. Med. 372, 1608–1618 (2015).
Chen, Y., Yu, Z., Packham, J. C. & Mattey, D. L. Influence of adult height on rheumatoid arthritis: association with disease activity, impairment of joint function and overall disability. PLoS One 8, e64862 (2013).
Lawlor, D., Ebrahim, S. & Davey Smith, G. The association between components of adult height and Type II diabetes and insulin resistance: British women’s heart and health study. Diabetologia 45, 1097–1106 (2002).
Schofield, P. W., Logroscino, G., Andrews, H. F., Albert, S. & Stern, Y. An association between head circumference and Alzheimer’s disease in a population-based study of aging and dementia. Neurology 49, 30–37 (1997).
Borenstein Graves, A. et al. Head circumference and incident Alzheimer’s disease: Modification by apolipoprotein E. Neurology 57, 1453–1460 (2001).
Perneczky, R. et al. Head circumference, atrophy, and cognition: implications for brain reserve in Alzheimer disease. Neurology 75, 137–142 (2010).
Kwon, O. D., Choi, S.-Y. & Bae, J. Association of head circumference with cognitive decline and symptoms of depression in elderly: a 3-year prospective study. Yeungnam Univ. J. Med. 35, 205–212 (2018).
Emery, N. J. & Clayton, N. S. The mentality of crows: convergent evolution of intelligence in corvids and apes. Science 306, 1903–1907 (2004).
Audet, J.-N., Couture, M. & Jarvis, E. D. Songbird species that display more-complex vocal learning are better problem-solvers and have larger brains. Science 381, 1170–1175 (2023).
Koteja, P. The evolution of concepts on the evolution of endothermy in birds and mammals. Physiol. Biochem. Zool. 77, 1043–1050 (2004).
Neubauer, S., Hublin, J. J. & Gunz, P. The evolution of modern human brain shape. Sci. Adv. 4, eaao5961 (2018).
Bramble, D. M. & Lieberman, D. E. Endurance running and the evolution of Homo. Nature 432, 345–352 (2004).
Lacruz, R. S. et al. The evolutionary history of the human face. Nat. Ecol. Evol. 3, 726–736 (2019).
Grabowski, M., Hatala, K. G., Jungers, W. L. & Richmond, B. G. Body mass estimates of hominin fossils and the evolution of human body size. J. Hum. Evol. 85, 75–93 (2015).
Bogin, B. & Varela-Silva, M. I. Leg length, body proportion, and health: a review with a note on beauty. Int. J. Environ. Res. Public Health 7, 1047–1075 (2010).
Bycroft, C. et al. The UK Biobank resource with deep phenotyping and genomic data. Nature 562, 203–209 (2018).
Dong, Y., Pan, Y., Zhang, J. & Xu, W. Learning to read chest X-ray images from 16000+ examples using CNN. In 2017 IEEE/ACM International Conference on Connected Health: Applications, Systems and Engineering Technologies (CHASE) 51–57 (IEEE, 2017).
Rajpurkar, P. et al. Deep learning for chest radiograph diagnosis: A retrospective comparison of the CheXNeXt algorithm to practicing radiologists. PLoS Med. 15, e1002686 (2018).
Mason, D. SU-E-T-33: pydicom: an open source DICOM library. Med. Phys. 38, 3493–3493 (2011).
Marcel, S. & Rodriguez, Y. Torchvision the machine-vision package of torch. In Proceedings of the 18th ACM International Conference on Multimedia 1485–1488 (ACM, 2010).
Kim, H.-Y. Statistical notes for clinical researchers: Chi-squared test and Fisher’s exact test. Restor. Dent. Endod. 42, 152 (2017).
Goulet-Pelletier, J.-C. & Cousineau, D. A review of effect sizes and their confidence intervals, Part I: The Cohen’s d family. Quant. Methods Psychol. 14, 242–265 (2018).
Russell, B. C., Torralba, A., Murphy, K. P. & Freeman, W. T. LabelMe: a database and web-based tool for image annotation. Int. J. Comput. Vis. 77, 157–173 (2008).
Chang, C. C. et al. Second-generation PLINK: rising to the challenge of larger and richer datasets. Gigascience 4, 7 (2015).
Loh, P. R. et al. Efficient Bayesian mixed-model analysis increases association power in large cohorts. Nat. Genet. 47, 284–290 (2015).
The GTEx Consortium. The GTEx Consortium atlas of genetic regulatory effects across human tissues. Science 369, 1318–1330 (2020).
Yu, G., Wang, L.-G., Han, Y. & He, Q.-Y. clusterProfiler: an R package for comparing biological themes among gene clusters. OMICS: A J. Integr. Biol. 16, 284–287 (2012).
Kun, E., Sohail, M. & Narasimhan, V. M. The trait-specific timing of accelerated genomic change in the human lineage. Cell Genomics 5, 10.1016/j.xgen.2024.100740 (2025).
Pers, T. H., Timshel, P. & Hirschhorn, J. N. SNPsnap: a Web-based tool for identification and annotation of matched SNPs. Bioinformatics 31, 418–420 (2015).
Seabold, S. & Perktold, J. Statsmodels: econometric and statistical modeling with Python. SciPy 7, https://www.statsmodels.org/stable/index.html (2010).
Kurki, M. I. et al. FinnGen provides genetic insights from a well-phenotyped isolated population. Nature 613, 508–518 (2023).
Verma, A. et al. Diversity and scale: Genetic architecture of 2068 traits in the VA Million Veteran Program. Science 385, eadj1182 (2024).
Sullivan, P. F. et al. Psychiatric genomics: an update and an agenda. Am. J. Psychiatry 175, 15–27 (2018).
Lee, J. J. et al. Gene discovery and polygenic prediction from a 1.1-million-person GWAS of educational attainment. Nat. Genet. 50, 1112 (2018).
Bowden, J. et al. Improving the visualization, interpretation and analysis of two-sample summary data Mendelian randomization via the Radial plot and Radial regression. Int. J. Epidemiol. 47, 1264–1278 (2018).
Fairley, S., Lowy-Gallego, E., Perry, E. & Flicek, P. The International Genome Sample Resource (IGSR) collection of open human genomic variation resources. Nucleic Acids Res. 48, D941–D947 (2020).
Hemani, G. et al. The MR-Base platform supports systematic causal inference across the human phenome. eLife 7, e34408 (2018).
Verbanck, M., Chen, C.-Y., Neale, B. & Do, R. Detection of widespread horizontal pleiotropy in causal relationships inferred from Mendelian randomization between complex traits and diseases. Nat. Genet. 50, 693–698 (2018).
Wei, S. Genetic Insights into Head-to-Body Ratios Via Deep Learning-Based Image Segmentation and Implications for Common Diseases. Zenodo. https://doi.org/10.5281/zenodo.17421944 (2025).
Wei, S. Genetic Insights into Head-to-Body Ratios Via Deep Learning-Based Image Segmentation and Implications for Common Diseases. Zenodo. https://doi.org/10.5281/zenodo.17422184 (2025).

Acknowledgements

This research has been conducted using the UK Biobank resource under application number 46387. We acknowledge the participants and investigators of the FinnGen and MVP studies. This study was supported by grants from the National Natural Science Foundation of China (32470639, 82170896, and 82372458), the Science Fund for Distinguished Young Scholars of Shaanxi Province (2025JC-JCQN-054), the Inner Mongolia Autonomous Region “Talents for Inner Mongolia” Project Team (2025TYL12), the Natural Science Foundation Project of Inner Mongolia Autonomous Region (2025ZD013), and the Fundamental Research Funds for the Central Universities. This study was also supported by the High-Performance Computing Platform and Instrument Analysis Center of Xi’an Jiaotong University. Parts of Fig. 1c, d and Fig. 5 were drawn using pictures from Servier Medical Art, with modifications. Servier Medical Art by Servier is licensed under a Creative Commons Attribution 4.0 International License (https://creativecommons.org/licenses/by/4.0/).

Author information

These authors contributed equally: Wei Shi, Shan-Shan Dong.

Authors and Affiliations

Biomedical Informatics & Genomics Center, Key Laboratory of Biomedical Information Engineering of Ministry of Education, School of Life Science and Technology, Xi’an Jiaotong University, Xi’an, Shaanxi, P.R. China
Wei Shi, Shan-Shan Dong, Ren-Jie Zhu, Shi-Hao Tang, Jia-Hao Wang, Feng Jiang, Hao Wu, Yuan-Yuan Duan, Jing Guo, Zheng-Qiang Li, Yan Guo & Tie-Lin Yang
The Second Affiliated Hospital of Inner Mongolia Medical University, Hohhot, Inner Mongolia, P. R. China
Kai Liu & Jianzhong Wang
Department of Orthopedics, The First Affiliated Hospital of Xi’an Jiaotong University, Xi’an, Shaanxi, P. R. China
Meng Li & Tie-Lin Yang

Contributions

T.L.Y. and Y.G. conceptualized the study. W.S., S.S.D., R.J.Z., K.L., M.L., and J.W. developed the methodology. W.S., R.J.Z., S.H.T., and J.H.W. performed the programming. T.L.Y. and J.G. performed UKB data curation and analysis. W.S., F.J., and Z.Q.L. interpreted GWAS results. W.S., H.W., and Y.Y.D. created the visualizations. T.L.Y., Y.G., and J.W. provided the project administration and supervision. W.S. and S.S.D. wrote the original draft of the manuscript, with Y.G. and T.L.Y. providing review and editing.

Corresponding authors

Correspondence to Jianzhong Wang, Yan Guo or Tie-Lin Yang.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature Communications thanks Vagheesh Narasimhan and Gang Peng for their contribution to the peer review of this work. A peer review file is available.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Information

Description of Additional Supplementary Files

Supplementary Data 1-25

Reporting Summary

Transparent Peer Review file

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

About this article

Cite this article

Shi, W., Dong, SS., Zhu, RJ. et al. Genetic Insights into Head-to-Body Ratios Via Deep Learning-Based Image Segmentation and Implications for Common Diseases. Nat Commun 17, 864 (2026). https://doi.org/10.1038/s41467-025-67578-8

Received: 13 March 2025
Accepted: 03 December 2025
Published: 24 December 2025
Version of record: 22 January 2026
DOI: https://doi.org/10.1038/s41467-025-67578-8