Abstract
The chicken is a valuable model for understanding fundamental biology and vertebrate evolution and is a major global source of nutrient-dense and lean protein. Although the chicken was the first non-mammalian amniote to have its genome sequenced, a systematic characterization of functional variation in the chicken genome remains lacking. Here, we integrated bulk RNA sequencing (RNA-seq) data from 7,015 samples, single-cell RNA-seq data from 127,598 cells and 2,869 whole-genome sequences to present a pilot atlas of regulatory variants across 28 chicken tissues. This atlas reveals millions of regulatory effects on primary expression (protein-coding genes, long non-coding RNA and exons) and post-transcriptional modifications (alternative splicing and 3′-untranslated region alternative polyadenylation). We highlight distinct molecular mechanisms underlying these regulatory variants, their context-dependent behavior and their utility in interpreting genome-wide associations for 39 chicken complex traits. Finally, our comparative analyses of gene regulation between chickens and mammals demonstrate how this resource can facilitate cross-species gene mapping of complex traits.
Data availability
All raw data analyzed in this study are publicly available for download without restrictions from the Sequence Read Archive (https://www.ncbi.nlm.nih.gov/sra) and National Genomics Data Center (NGDC) BioProject (https://bigd.big.ac.cn/bioproject) databases. The GRCg6a chicken reference genome (v.102) is available at Ensembl (https://www.ensembl.org). Details of RNA-seq, WGS, chromatin immunoprecipitation sequencing peaks and single-cell RNA-seq can be found in Supplementary Tables. All processed data, including metadata, the genotype imputation reference panel, molecular phenotypes, imputed genotypes and the summary statistics of molQTL and GWAS, are available on Zenodo (https://doi.org/10.5281/zenodo.14902956)102 and http://chicken.farmgtex.org. Source data are provided with this paper.
Code availability
All computational scripts and code for RNA-seq, WGS, single-cell RNA-seq and Hi-C dataset analyses, as well as the respective quality control, molecular phenotype normalization, genotype imputation, molQTL mapping, functional enrichment, colocalization, SMR and TWAS, are available on Zenodo (https://doi.org/10.5281/zenodo.14902956)102 and the FarmGTEx GitHub website (https://github.com/guandailu/ChickenGTEx_pilot_phase).
References
Hillier, L. W. & Miller, W. Sequence and comparative analysis of the chicken genome provide unique perspectives on vertebrate evolution. Nature 432, 695–716 (2004).
Burt, D. W. Emergence of the chicken as a model organism: implications for agriculture and biology. Poult. Sci. 86, 1460–1471 (2007).
Beacon, T. H. & Davie, J. R. The chicken model organism for epigenomic research. Genome 64, 476–489 (2021).
Garcia, P., Wang, Y., Viallet, J. & Macek Jilkova, Z. The chicken embryo model: a novel and relevant model for immune-based studies. Front. Immunol. 12, 791081 (2021).
Wright, D. et al. The genetic architecture of domestication in the chicken: effects of pleiotropy and linkage. Mol. Ecol. 19, 5140–5156 (2010).
Flores-Santin, J. & Burggren, W. W. Beyond the chicken: alternative avian models for developmental physiological research. Front. Physiol. 12, 712633 (2021).
Brown, W. R. A., Hubbard, S. J., Tickle, C. & Wilson, S. A. The chicken as a model for large-scale analysis of vertebrate gene function. Nat. Rev. Genet. 4, 87–98 (2003).
Wu, Z. et al. Heterogeneity of a dwarf phenotype in Dutch traditional chicken breeds revealed by genomic analyses. Evol. Appl. 14, 1095–1108 (2021).
Wang, M.-S. et al. An evolutionary genomic perspective on the breeding of dwarf chickens. Mol. Biol. Evol. 34, 3081–3088 (2017).
van der Eijk, J. A. J. et al. Chicken lines divergently selected on feather pecking differ in immune characteristics. Physiol. Behav. 212, 112680 (2019).
Lillie, M. et al. Genomic signatures of 60 years of bidirectional selection for 8-week body weight in chickens. Poult. Sci. 97, 781–790 (2018).
Smith, J. et al. Fourth report on chicken genes and chromosomes 2022. Cytogenet. Genome Res. 162, 405–528 (2023).
Kern, C. et al. Functional annotations of three domestic animal genomes provide vital resources for comparative and agricultural research. Nat. Commun. 12, 1821 (2021).
Pan, Z. et al. An atlas of regulatory elements in chicken: a resource for chicken genetics and genomics. Sci. Adv. 9, eade1204 (2023).
The GTEx Consortium. The GTEx Consortium atlas of genetic regulatory effects across human tissues. Science 369, 1318–1330 (2020).
Aguet, F. et al. Genetic effects on gene expression across human tissues. Nature 550, 204–213 (2017).
The GTEx Consortium. The Genotype-Tissue Expression (GTEx) pilot analysis: multitissue gene regulation in humans. Science 348, 648–660 (2015).
Liu, S. & Fang, L. The CattleGTEx atlas reveals regulatory mechanisms underlying complex traits. Nat. Genet. 54, 1273–1274 (2022).
Liu, S. et al. A multi-tissue atlas of regulatory variants in cattle. Nat. Genet. 54, 1438–1447 (2022).
Teng, J. et al. A compendium of genetic regulatory effects across pig tissues. Nat. Genet. 56, 112–123 (2024).
Ellegren, H. et al. Faced with inequality: chicken do not have a general dosage compensation of sex-linked genes. BMC Biol. 5, 40 (2007).
Nicholas, F. W. Online Mendelian inheritance in animals (OMIA): a comparative knowledgebase of genetic disorders and other familial traits in non-laboratory animals. Nucleic Acids Res. 31, 275–277 (2003).
Wang, Z. et al. An EAV-HP Insertion in 5′ flanking region of SLCO1B3 causes blue eggshell in the chicken. PLoS Genet. 9, e1003183 (2013).
McKenna, A. et al. The Genome Analysis Toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data. Genome Res. 20, 1297–1303 (2010).
Guan, L., Yang, Q., Gu, M., Chen, L. & Zhang, X. Exon expression QTL (eeQTL) analysis highlights distant genomic variations associated with splicing regulation. Quant. Biol. 2, 71–79 (2014).
Li, Y. I. et al. Annotation-free quantification of RNA splicing using LeafCutter. Nat. Genet. 50, 151–158 (2018).
Cui, R. et al. Improving fine-mapping by modeling infinitesimal effects. Nat. Genet. 56, 162–169 (2024).
Storey, J. D. & Tibshirani, R. Statistical significance for genomewide studies. Proc. Natl Acad. Sci. USA 100, 9440–9445 (2003).
Zhou, J. & Troyanskaya, O. G. Predicting effects of noncoding variants with deep learning-based sequence model. Nat. Methods 12, 931–934 (2015).
Robins, C. et al. Genetic control of the human brain proteome. Am. J. Hum. Genet. 108, 400–410 (2021).
Guan, D. et al. Profiling chromatin contacts at micro-scale in the chicken genome. In Int. Plant Anim. Genome Conf. https://pag.confex.com/data/abstract/pag/30/Paper_49134_abstract_25104_translated.html (2023).
Noda, D. et al. ELAC2, a putative prostate cancer susceptibility gene product, potentiates TGF-β/Smad-induced growth arrest of prostate cells. Oncogene 25, 5591–5600 (2006).
Hu, H. et al. AnimalTFDB 3.0: a comprehensive resource for annotation and prediction of animal transcription factors. Nucleic Acids Res. 47, D33–D38 (2019).
Urbut, S. M., Wang, G., Carbonetto, P. & Stephens, M. Flexible statistical methods for estimating and testing effects in genomic studies with multiple conditions. Nat. Genet. 51, 187–195 (2019).
Wang, Y. et al. Genetic dissection of growth traits in a unique chicken advanced intercross line. Front. Genet. 11, 894 (2020).
Hukku, A., Sampson, M. G., Luca, F., Pique-Regi, R. & Wen, X. Analyzing and reconciling colocalization and transcriptome-wide association studies from the perspective of inferential reproducibility. Am. J. Hum. Genet. 109, 825–837 (2022).
Zhu, Z. et al. Integration of summary data from GWAS and eQTL studies predicts complex trait gene targets. Nat. Genet. 48, 481–487 (2016).
Gamazon, E. R. et al. A gene-based association method for mapping traits using reference transcriptome data. Nat. Genet. 47, 1091–1098 (2015).
Barbeira, A. N. et al. Integrating predicted transcriptome from multiple tissues improves association detection. PLoS Genet. 15, e1007889 (2019).
Barbeira, A. N. et al. Exploring the phenotypic consequences of tissue specific gene expression variation inferred from GWAS summary statistics. Nat. Commun. 9, 1825 (2018).
Barbeira, A. N. et al. Exploiting the GTEx resources to decipher the mechanisms at GWAS loci. Genome Biol. 22, 49 (2021).
Mostafavi, H., Spence, J. P., Naqvi, S. & Pritchard, J. K. Systematic differences in discovery of genetic effects on gene expression and complex traits. Nat. Genet. 55, 1866–1875 (2023).
Connally, N. J. et al. The missing link between genetic association and regulatory function. eLife 11, e74970 (2022).
Sowa, A. S. et al. Karyopherin α-3 is a key protein in the pathogenesis of spinocerebellar ataxia type 3 controlling the nuclear localization of ataxin-3. Proc. Natl Acad. Sci. USA 115, E2624–E2633 (2018).
Cao, J. et al. Effect of combinations of monochromatic lights on growth and productive performance of broilers. Poult. Sci. 91, 3013–3018 (2012).
Pan, J., Yang, Y., Yang, B. & Yu, Y. Artificial polychromatic light affects growth and physiology in chicks. PLoS ONE 9, e113595 (2014).
Schob, C. et al. Dominant KPNA3 mutations cause infantile-onset hereditary spastic paraplegia. Ann. Neurol. 90, 738–750 (2021).
Santhanam, N. et al. RatXcan: a framework for cross-species integration of genome-wide association and gene expression data. Preprint at BioRxiv https://doi.org/10.1101/2022.06.03.494719 (2024).
Naqvi, S. et al. Conservation, acquisition, and functional impact of sex-biased gene expression in mammals. Science 365, eaaw7317 (2019).
Li, J. et al. Genome-wide association studies for egg quality traits in White Leghorn layers using low-pass sequencing and SNP chip data. J. Anim. Breed. Genet. 139, 380–397 (2022).
Qi, T. et al. Genetic control of RNA splicing and its distinct role in complex trait variation. Nat. Genet. 54, 1355–1363 (2022).
Li, L. et al. An atlas of alternative polyadenylation quantitative trait loci contributing to complex trait and disease heritability. Nat. Genet. 53, 994–1005 (2021).
Munro, D. et al. Multimodal analysis of RNA sequencing data powers discovery of complex trait genetics. Nat. Commun. 15, 10387 (2024).
Guan, D. et al. Prediction of transcript isoforms in 19 chicken tissues by Oxford Nanopore long-read sequencing. Front. Genet. 13, 997460 (2022).
Kuo, R. I. et al. Normalized long read RNA sequencing in chicken reveals transcriptome complexity similar to human. BMC Genomics 18, 323 (2017).
Thomas, S., Underwood, J. G., Tseng, E. & Holloway, A. K. Long-read sequencing of chicken transcripts and identification of new transcript isoforms. PLoS ONE 9, e94650 (2014).
Zhang, J. et al. Comprehensive analysis of structural variants in chickens using PacBio sequencing. Front. Genet. 13, 971588 (2022).
Kojima, S. et al. Mobile element variation contributes to population-specific genome diversification, gene regulation and disease risk. Nat. Genet. 55, 939–951 (2023).
Wicker, T. et al. The repetitive landscape of the chicken genome. Genome Res. 15, 126–136 (2005).
Li, M. et al. De novo assembly of 20 chicken genomes reveals the undetectable phenomenon for thousands of core genes on microchromosomes and subtelomeric regions. Mol. Biol. Evol. 39, msac066 (2022).
Morris, J. A. et al. Discovery of target genes and pathways at GWAS loci by pooled single-cell CRISPR screens. Science 380, eadh7699 (2023).
Mai, C. et al. Implications of gene inheritance patterns on the heterosis of abdominal fat deposition in chickens. Genes 10, 824 (2019).
Yi, G. et al. In-depth duodenal transcriptome survey in chickens with divergent feed efficiency using RNA-Seq. PLoS ONE 10, e0136765 (2015).
You, Z. et al. Integrated analysis of lncRNA and mRNA repertoires in Marek’s disease infected spleens identifies genes relevant to resistance. BMC Genomics 20, 245 (2019).
Dobin, A. et al. STAR: ultrafast universal RNA-seq aligner. Bioinformatics 29, 15–21 (2013).
Jehl, F. et al. An integrative atlas of chicken long non-coding genes and their annotations across 25 tissues. Sci. Rep. 10, 20457 (2020).
Liao, Y., Smyth, G. K. & Shi, W. featureCounts: an efficient general purpose program for assigning sequence reads to genomic features. Bioinformatics 30, 923–930 (2014).
Kovaka, S. et al. Transcriptome assembly from long-read RNA-seq alignments with StringTie2. Genome Biol. 20, 278 (2019).
Chen, C. et al. TBtools: an integrative toolkit developed for interactive analyses of big biological data. Mol. Plant 13, 1194–1202 (2020).
Yu, G., Smith, D. K., Zhu, H., Guan, Y. & Lam, T. T.-Y. ggtree: an R package for visualization and annotation of phylogenetic trees with their covariates and other associated data. Methods Ecol. Evol. 8, 28–36 (2017).
van der Maaten, L. & Hinton, G. Visualizing data using t-SNE. J. Mach. Learn. Res. 9, 2579–2605 (2008).
Xia, Z. et al. Dynamic analyses of alternative polyadenylation from RNA-seq reveal a 3′-UTR landscape across seven tumour types. Nat. Commun. 5, 5274 (2014).
Quinlan, A. R. & Hall, I. M. BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics 26, 841–842 (2010).
Zhong, C. et al. Age-dependent genetic architectures of chicken body weight explored by multidimensional GWAS and molQTL analyses. J. Genet. Genomics 51, 1423–1434 (2024).
Li, X. et al. Runs of homozygosity and selection signature analyses reveal putative genomic regions for artificial selection in layer breeding. BMC Genomics 25, 638 (2024).
Jin, J. et al. Calcium deposition in chicken eggshells: role of host genetics and gut microbiota. Poult. Sci. 103, 104073 (2024).
Degalez, F. et al. Enriched atlas of lncRNA and protein-coding genes for the GRCg7b chicken assembly and its functional annotation across 47 tissues. Sci. Rep. 14, 6588 (2024).
Bolger, A. M., Lohse, M. & Usadel, B. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics 30, 2114–2120 (2014).
Li, H. Aligning sequence reads, clone sequences and assembly contigs with BWA-MEM. Preprint at https://doi.org/10.48550/arXiv.1303.3997 (2013).
Li, H. et al. The Sequence Alignment/Map format and SAMtools. Bioinformatics 25, 2078–2079 (2009).
Danecek, P. et al. Twelve years of SAMtools and BCFtools. GigaScience 10, giab008 (2021).
Browning, B. L., Zhou, Y. & Browning, S. R. A one-penny imputed genome from next-generation reference panels. Am. J. Hum. Genet. 103, 338–348 (2018).
Stegle, O., Parts, L., Piipari, M., Winn, J. & Durbin, R. Using probabilistic estimation of expression residuals (PEER) to obtain increased power and interpretability of gene expression analyses. Nat. Protoc. 7, 500–507 (2012).
Purcell, S. et al. PLINK: a tool set for whole-genome association and population-based linkage analyses. Am. J. Hum. Genet. 81, 559–575 (2007).
Price, A. L. et al. Principal components analysis corrects for stratification in genome-wide association studies. Nat. Genet. 38, 904–909 (2006).
Taylor-Weiner, A. et al. Scaling computational genomics to millions of individuals with GPUs. Genome Biol. 20, 228 (2019).
Ongen, H., Buil, A., Brown, A. A., Dermitzakis, E. T. & Delaneau, O. Fast and efficient QTL mapper for thousands of molecular phenotypes. Bioinformatics 32, 1479–1485 (2016).
Benjamini, Y. & Hochberg, Y. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J. R. Stat. Soc. Series B Stat. Methodol. 57, 289–300 (1995).
R: a language and environment for statistical computing (R Foundation for Statistical Computing, 2022).
Yin, L. et al. rMVP: a memory-efficient, visualization-enhanced, and parallel-accelerated tool for genome-wide association study. Genomics Proteomics Bioinformatics 19, 619–628 (2021).
VanRaden, P. M. Efficient methods to compute genomic predictions. J. Dairy Sci. 91, 4414–4423 (2008).
Zou, Y., Carbonetto, P., Wang, G. & Stephens, M. Fine-mapping from summary data with the “sum of single effects” model. PLoS Genet. 18, e1010299 (2022).
Giambartolomei, C. et al. Bayesian test for colocalisation between pairs of genetic association studies using summary statistics. PLoS Genet. 10, e1004383 (2014).
Duong, D. et al. Applying meta-analysis to genotype-tissue expression data from multiple tissues to identify eQTLs and increase the number of eGenes. Bioinformatics 33, i67–i74 (2017).
Speagle, J. S. A conceptual introduction to Markov chain Monte Carlo methods. Preprint at https://doi.org/10.48550/arXiv.1909.12313 (2020).
Storey, J. D., Bass, A. J., Dabney, A. & Robinson, D. qvalue: Q-value estimation for false discovery rate control. GitHub https://github.com/StoreyLab/qvalue (2022).
Wen, X., Pique-Regi, R. & Luca, F. Integrating molecular QTL data into genome-wide genetic association analysis: probabilistic assessment of enrichment and colocalization. PLoS Genet. 13, e1006646 (2017).
Lee, Y., Luca, F., Pique-Regi, R. & Wen, X. Bayesian multi-SNP genetic association analysis: control of FDR and use of summary statistics. Preprint at https://doi.org/10.1101/316471 (2018).
Wen, X., Lee, Y., Luca, F. & Pique-Regi, R. Efficient integrative multi-SNP association analysis via deterministic approximation of posteriors. Am. J. Hum. Genet. 98, 1114–1129 (2016).
Wen, X. Molecular QTL discovery incorporating genomic annotations using Bayesian false discovery rate control. Ann. Appl. Stat. 10, 1619–1638 (2016).
Bhattacharya, A. et al. Best practices for multi-ancestry, meta-analytic transcriptome-wide association studies: lessons from the Global Biobank Meta-analysis Initiative. Cell Genomics 2, 100180 (2022).
Guan, D. Chicken genotype-tissue expression (ChickenGTEx) project. Zenodo https://doi.org/10.5281/zenodo.14902956 (2025).
Acknowledgements
We thank all the researchers who have contributed to the publicly available data used in this research. We extend special acknowledgment to the human GTEx consortium15 for sharing computational code for some data analyses and figure generation. L.F. was supported by Agriculture and Food Research Initiative Competitive grant no. 2022-67015-36215 (H.Z. and L.F.) from the US Department of Agriculture (USDA) National Institute of Food and Agriculture, and by seed funding from CellFood Hub (Aarhus University Foundation, AUFF). D.G. was supported by Agriculture and Food Research Initiative Competitive grant nos. 2020-67015-31175 (H.Z.) and 2022-67015-36215 (H.Z. and L.F.) from the USDA National Institute of Food and Agriculture. H.Z. acknowledges funding from Agriculture and Food Research Initiative Competitive grant nos. 2015-67015-22940 (H.Z.) and 2022-67015-36215 (H.Z. and L.F.) from the USDA National Institute of Food and Agriculture, Multistate Research Project NRSP8 and NC1170 and the California Agricultural Experimental Station. N.Y. acknowledges funding from the National Key Research and Development Program of China (2021YFD1300600 and 2022YFF1000204). X.H. acknowledges funding from the National Natural Science Foundation of China (31961133003) and the support of the high-performance computing platform of the National Research Facility for Phenotypic and Genotypic Analysis of Model Animals (Beijing). Yuzhe Wang was supported by the National Natural Science Foundation of China (32272862). S. Rong acknowledges funding from Jiangsu Agricultural Industry Technology System (JATS[2022]406). Zhe Zhang acknowledges funding from the National Natural Science Foundation of China (32022078 to Zhe Zhang) and the Local Innovative and Research Teams Project of Guangdong Province (2019BT02N630 to Q.N. and Zhe Zhang).
Zhang Zhang acknowledges funding from the National Natural Science Foundation of China (32030021), the National Key Research and Development Program of China (2021YFF0703702) and the Technical Support Talent Program of the Chinese Academy of Sciences (awarded to D.Z.). Y.H. acknowledges funding from the National Key Research and Development Program of China (2024YFF1000100) and the Science and Technology Innovation 2030–Major Project (2022ZD04017). L.W., H.Q. and C.L. were supported by the Science and Technology Planning Project of Guangzhou City (201504010017) and the Natural Scientific Foundation of China (31402067). X. Zhao acknowledges funding from the Natural Sciences and Engineering Research Council of Canada Discovery grant (RGPIN-2022-03884). M.W. acknowledges funding from the Yunnan Fundamental Research Projects (202301AW070012, 202401AV070007). G.E.L. was supported in part by USDA NIFA AFRI grant nos. 2019-67015-29321 and 2021-67015-33409 and the appropriated project 8042-31000-112-00-D of the USDA Agricultural Research Service.
Author information
Contributions
L. Fang, H. Zhou, D.G., X.H. and N.Y. conceived and designed the project. D.G., Y.Y., B.Z. and Z.P. performed bioinformatic analyses of RNA-seq data. D.G., F.L., S.D., Y.G. and H.Y. conducted WGS data analysis. D. Zhu performed the deep-learning analysis. D.G. and Y.H. performed multi-omics and single-cell RNA-seq data analysis. D.G. conducted molQTL mapping. X. Zhu, C.Z. and D.G. performed GWAS integrative analysis. Z.B. and D.G. led the comparison of GTEx between chickens and mammals. L.F., H. Zhou, D.G., X. Zhu, Q.L., C.Z., Y.H., Yuzhe Wang, C.S., J.T., F.D., S.L., Ying Wang, M.W., M.P., D.R., M.C., J.S., K.W., A.J.B., W.W., L. Frantz, G.L., M.S.L., G.S., S.S., D.S., S.J.L., X. Zhao, X.R., S.L., B.L., H. Zhang and H.C. contributed to the critical interpretation of analytical results before and during manuscript preparation. Y.H., D. Zou, R.W., T.X. and Zhang Zhang built the ChickenGTEx web portal. H. Zhou, L. Fang, N.Y., X.H., G.E.L., Zhe Zhang, S.S., D.S., X. Zhao, Q.N., Z.L., W.L., H.Q., W.S. and C.L. contributed to the data and computational resources. D.G., Z.B., X. Zhu, C.Z., Y.W., Y.H. and L. Fang drafted the manuscript. All authors read, edited and approved the final manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Peer review
Peer review information
Nature Genetics thanks the anonymous reviewer(s) for their contribution to the peer review of this work. Peer reviewer reports are available.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Extended data
Extended Data Fig. 1 Characterization of gene expressions.
(a) Gene expression levels categorized by the number of tissues in which genes are expressed (defined as >0.1 TPM in >80% of RNA-Seq samples). TPM: Transcripts per Million. (b) Functional enrichment analysis of tissue-specific genes using the Gene Ontology (GO) database. (c) Number of tissue-specific genes associated with at least one epigenetic regulator in the respective tissue. The annotation of epigenetic regulators was retrieved from Pan et al.14. (d) Proportion of tissue-specific genes linked to at least one regulator (Regulator-specific) across tissues. (e) Differentially expressed liver genes (n = 327, shown on the x-axis) by sex. Genes were identified according to expression differences between males and females. Each bar represents a tested gene. (f) Ideogram depicting the chromosomal locations of 17 genes (triangles) exhibiting sex-biased expression across 18 tissues. The color scale represents gene density within a 1 Mb window. (g) An example illustrating the expression of SLCO1B3 across 27 tissues (n = 5,273 samples) and the chromatin states surrounding it across 23 tissues14. SLCO1B3 expression in the liver (n = 903) is significantly higher than in the retina (Student’s two-sided t-test, Bonferroni-adjusted P = 1.12 × 10⁻²³⁹). In the boxplot, the central band represents the median, the box boundaries represent the 25th to 75th percentiles, and the whiskers extend 1.5 × the interquartile range. The middle panel displays the chromatin interaction map retrieved from the UCSC genome browser (http://genome.ucsc.edu/s/zhypan/galGal6_FAANG_V1)14. Colors ranging from light to dark indicate increasing chromatin interaction strength.
TssA: strongly active promoters/transcripts, TssAHet: flanking active TSS without ATAC, TxFlnk: transcribed at gene, TxFlnkWk: weak transcribed at gene, TxFlnkHet: transcribed region without ATAC, EnhA: strong active enhancer, EnhAMe: medium enhancer with ATAC, EnhAWk: weak active enhancer, EnhAHet: active enhancer no ATAC (hetero), EnhPois: poised enhancer, ATAC_Is: ATAC island, TssBiv: bivalent/poised TSS, Repr: repressed polycomb, ReprWk: weak repressed polycomb.
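The expression rule in panel (a) can be illustrated with a minimal Python sketch. It assumes that "expressed" means the fraction of samples in a tissue with TPM above 0.1 strictly exceeds 80%. The function names and the TPM values are hypothetical and are not taken from the study's pipeline.

```python
def is_expressed(tpm_values, tpm_threshold=0.1, min_fraction=0.8):
    """Return True if more than `min_fraction` of samples exceed `tpm_threshold` TPM."""
    if not tpm_values:
        return False
    n_above = sum(1 for tpm in tpm_values if tpm > tpm_threshold)
    return n_above / len(tpm_values) > min_fraction


def count_expressed_tissues(tpm_by_tissue):
    """Count the tissues in which a gene passes the expression rule."""
    return sum(1 for values in tpm_by_tissue.values() if is_expressed(values))


# Hypothetical TPM values for one gene (made up for illustration only).
example_gene = {
    "liver": [5.2, 3.1, 0.0, 4.4, 2.9, 6.0],    # 5 of 6 samples above 0.1 TPM
    "retina": [0.0, 0.05, 0.2, 0.0, 0.0, 0.0],  # 1 of 6 samples above 0.1 TPM
    "muscle": [1.0, 0.3, 0.5, 0.2, 0.9],        # 5 of 5 samples above 0.1 TPM
}
n_tissues = count_expressed_tissues(example_gene)
```

Under these assumptions the example gene counts as expressed in liver and muscle but not in retina, so it would fall into the "two tissues" category of panel (a).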
Extended Data Fig. 2 Genotype imputation.
(a) Correlation between the number of high-quality SNPs within cis-windows (±1 Mb of the transcriptional start site (TSS)) and the corresponding gene expression level (log2 scaled). (b) Number of high-quality SNPs as a function of tissue specificity of gene expression, measured by the Tau value. P values were calculated via the asymptotic t approximation. (c) Percentage of high-quality SNPs within epigenetic regulators14. (d) Percentage of epigenomic regulators containing at least one high-quality SNP. (e) Proportion of imputed and reference genotypes categorized by variant type. (f) Genotype concordance and Spearman’s correlation between paired whole genome sequences (WGS) and RNA-Seq samples across 6 independent populations. CLB: Chinese local chicken breed (n = 280), Cobb (n = 21), Commercial: Commercial chicken population (n = 13), Indigenous: Chinese indigenous chicken populations (n = 18), WPB: white plumage broiler (n = 112), AIL: Advanced intercross line (n = 30). (g) Genotype concordance and Spearman’s correlation (mean ± s.d.) between paired WGS and RNA-Seq samples. (h) Number of imputed SNPs as a function of the number of SNPs directly called from WGS within cis-windows of 16,779 genes. Significance was assessed using a two-sided Student’s t-test. (i) Relationship between the median expression levels of genes and the ratio of imputed to observed SNPs. Significance was obtained with two-sided Student’s t-test. (j) Number of imputed and observed SNPs as a function of distance to the TSS, stratified by the median expression levels of genes across samples. In the boxplot, the central band represents the median, the box boundaries represent the 25th to 75th percentiles, and the whiskers extend 1.5 × the interquartile range. Significance was assessed using a two-sided Student’s t-test. *: P < 0.05; n.s.: not significant.
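The Tau value in panel (b) is not defined in the caption. As a minimal sketch, assuming the commonly used Tau index, τ = Σ(1 − x_i / x_max) / (n − 1) over n tissues, it could be computed as follows. The function name and inputs are illustrative; any transformation applied to expression values before computing Tau in the study is not specified here.

```python
def tau_index(expression):
    """Tissue-specificity index: 0 for uniform expression, 1 for single-tissue expression.

    `expression` holds one non-negative summary value per tissue
    (for example, the median TPM in each tissue).
    """
    n = len(expression)
    if n < 2:
        raise ValueError("Tau needs expression values from at least two tissues")
    x_max = max(expression)
    if x_max <= 0:
        raise ValueError("Tau is undefined when the gene is not expressed in any tissue")
    return sum(1 - x / x_max for x in expression) / (n - 1)


# Hypothetical per-tissue expression profiles (illustration only).
tau_specific = tau_index([10.0, 0.0, 0.0, 0.0])  # expressed in one tissue only
tau_uniform = tau_index([5.0, 5.0, 5.0, 5.0])    # identical in every tissue
tau_mixed = tau_index([10.0, 5.0, 0.0])
```

Values near 1 indicate expression concentrated in a single tissue, and values near 0 indicate broadly uniform expression.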
Extended Data Fig. 3 Molecular quantitative trait locus (molQTL) mapping across 28 chicken tissues.
(a) cis-heritability (cis-h2) of 219,728 eGene-tissue pairs and 46,303 non-eGene-tissue pairs. The central band in the boxplot represents the median, the box boundaries represent the 25th to 75th percentiles, and the whiskers extend 1.5 × the interquartile range. (b) Gene expression levels, measured as Transcripts per Million (TPM) for eGenes detected in at least one tissue and non-eGenes across 28 tissues. (c) Spearman correlations between lead eQTL effect sizes and gene expression across 28 tissues. NA (low expression.): not tested due to low expression levels; Uncorrelated: tested but not significantly correlated; Uninterpretable: significant but ambiguous correlation direction due to effect sizes crossing zero; Positive corr.: positively correlated; Negative corr.: negatively correlated. (d) Conditionally independent eQTLs across 28 tissues. The proportion of eGenes with different numbers of independent eQTLs is shown as blue stacked bars (left y-axis), and the mean number of independent eQTLs per eGene is represented by red dots (right y-axis). Tissues are sorted by increasing sample size. A tissue color legend is provided at the right. The figure was generated using human GTEx project code15. (e) Linear regression slope as a function of allelic fold change (aFC) at log2 scale of eQTLs in the liver (n = 741). (f) Proportion of detected eQTLs across different effect size categories (from left to right panels) as a function of tissue sample size. Dot colors correspond to the legend in panel (d). (g)-(i) Down-sampling analyses of eGenes and eQTLs. The left panel (g) depicts the number of eGenes detected (left y-axis) and the mean number of eQTLs per eGene (right y-axis) at different sample sizes. The middle panel (h) shows the proportion of detected eQTLs with large effects (absolute log2aFC ≥ 1, left y-axis) and small effects (absolute log2aFC ≤ 0.25, right y-axis). 
The right panel (i) presents the number of eGenes detected when the effect size of lead eQTL is large (absolute log2aFC ≥ 1, left y-axis) or small (absolute log2aFC ≤ 0.25, right y-axis). We carried out down-sampling analyses (10 replications per sample size) in liver and muscle. Error bars represent standard errors across the replications.
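The effect-size categories used in panels (h) and (i) can be sketched in Python as follows. The thresholds (absolute log2aFC ≥ 1 for large, ≤ 0.25 for small) come from the caption; the "intermediate" label for everything in between, the function names and the example values are assumptions made for illustration.

```python
from collections import Counter


def effect_category(log2_afc, large=1.0, small=0.25):
    """Classify an eQTL by the magnitude of its log2 allelic fold change."""
    magnitude = abs(log2_afc)
    if magnitude >= large:
        return "large"
    if magnitude <= small:
        return "small"
    return "intermediate"  # label assumed here; the caption names only large and small


def category_proportions(log2_afc_values):
    """Return the proportion of eQTLs falling in each effect-size category."""
    if not log2_afc_values:
        raise ValueError("need at least one effect size")
    counts = Counter(effect_category(value) for value in log2_afc_values)
    total = len(log2_afc_values)
    return {name: counts[name] / total for name in ("large", "intermediate", "small")}


# Hypothetical log2aFC values (illustration only).
proportions = category_proportions([1.2, -1.0, 0.1, -0.25, 0.5])
```

In a down-sampling analysis like the one described above, these proportions would be recomputed for each random subset at a given sample size and then averaged across replications.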
Extended Data Fig. 4 Comparison of eQTL mapping between RNA-Seq imputed and whole-genome sequence (WGS) called genotypes in 308 chickens.
(a) The overlap of eGenes detected using imputed and observed genotypes (directly called from whole-genome sequences). (b) Comparison of significance (-log10P) between eGenes detected from observed and imputed genotypes. (c) Effect size (that is, slope) of lead variants across different categories: Both (n = 10,047): eGenes detected by both imputed and observed genotypes; Imputed genotypes only (n = 380): eGenes detected only by imputed genotypes; Observed SNPs only (n = 571): eGenes detected only by observed genotypes; Neither (n = 3,120): Non-eGenes in both approaches. Statistical significance was assessed using a two-sided Student’s t-test. (d) The overlap of lead variants identified by imputed and observed genotypes. (e) Comparison of the distribution of lead variants relative to gene transcription start sites (TSS) for imputed and observed genotypes. (f) Correlation of effect sizes for lead variants of eGenes identified using imputed and observed genotypes. Same lead: lead variants of shared eGenes in panel (a) are identical. Diff lead: lead variants of shared eGenes differ. P values were computed via the asymptotic t approximation. (g) Linkage disequilibrium (LD, r2) of different lead variants for the same eGenes (n = 2,307) detected by imputed and observed genotypes. The “Distance-matched random set” includes an equal number of SNP pairs with similar physical distances to lead variants. The central band in the boxplot represents the median, the box boundaries represent the 25th to 75th percentiles, and the whiskers extend 1.5 × the interquartile range. Significance was assessed using a two-sided Student’s t-test. (h) Percentage of eGenes detected as a function of LD (r2) between two different lead variants detected by observed and imputed genotypes. The exact number of eGenes is depicted on each bar. (i) Functional enrichment (log2 fold change, mean ± s.d.) of eQTLs detected using observed versus imputed genotypes across chromatin states14. 
(j) Venn diagram depicting the overlap of eGene-independent SNP pairs identified using imputed and observed genotypes. (k) Number of eGenes stratified by credible set size. (l) Percentage of fine-mapped variants detected using observed and imputed genotypes, shown as a function of different posterior inclusion probability thresholds. Manhattan plots for imputed (bottom) and observed SNPs (top) of three eGenes: EPHB2 (m), UBXN6 (n), and ENSGALG00000006465 (o).
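The LD measure in panels (g) and (h) can be illustrated with a minimal sketch. It assumes r2 is approximated as the squared Pearson correlation between unphased genotype dosages (0, 1 or 2 copies of the alternate allele) at two variants. This is a common approximation and not necessarily the calculation used in the study; the function name and genotype vectors are hypothetical.

```python
def ld_r2(dosages_a, dosages_b):
    """Squared Pearson correlation between genotype dosages at two variants."""
    if len(dosages_a) != len(dosages_b) or len(dosages_a) < 2:
        raise ValueError("need two dosage vectors of equal length (at least 2 samples)")
    n = len(dosages_a)
    mean_a = sum(dosages_a) / n
    mean_b = sum(dosages_b) / n
    # Sums of squares and cross-products of deviations from the mean.
    s_ab = sum((a - mean_a) * (b - mean_b) for a, b in zip(dosages_a, dosages_b))
    s_aa = sum((a - mean_a) ** 2 for a in dosages_a)
    s_bb = sum((b - mean_b) ** 2 for b in dosages_b)
    if s_aa == 0 or s_bb == 0:
        raise ValueError("r2 is undefined for a monomorphic variant")
    return (s_ab * s_ab) / (s_aa * s_bb)


# Hypothetical dosages for six birds at two lead variants (illustration only).
lead_imputed = [0, 1, 2, 1, 0, 2]
lead_observed = [0, 1, 2, 1, 0, 2]
unlinked = [1, 0, 1, 2, 1, 1]

r2_same = ld_r2(lead_imputed, lead_observed)
r2_unlinked = ld_r2(lead_imputed, unlinked)
```

Two different lead variants for the same eGene with r2 close to 1 would tag essentially the same association signal, which is the comparison that panels (g) and (h) summarize.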
Extended Data Fig. 5 eQTL conservation and validation.
(a) Correlations of lead variant effects (upper triangle) and cis-h2 (lower triangle) between chickens and mammals. The species tree was retrieved from the Timetree5 (https://timetree.org/). P values of Pearson’s correlations were computed using the asymptotic t approximation. (b) Internal validation of eQTLs. Bars in light blue represent the Spearman correlation coefficients of eQTL effect sizes between validation and discovery groups (left y-axis), while red dots represent the π1 statistic, estimating the replication rate of eQTL between groups (right y-axis). Each of the 15 tissues with over 100 individuals was randomly and evenly split into discovery and validation groups. The tissue color legend is shown on the right. (c) Correlation between eQTL effect sizes (x-axis, n = 2,396) and effect sizes from allele-specific expression (ASE) analysis (y-axis) in the liver. P values were computed via the asymptotic t approximation. (d) The proportion of regulatory variants predicted by DeepSEA (prediction score > 0.7) based on 310 functional profiles in chickens (Table S8). molQTL_set: conditionally independent molQTL across tissues; Random_set: randomly selected variants matched for minor allele frequency (MAF) with molQTL; Background: all 1.5 million tested variants. *** P < 0.001, based on a two-sided Student’s t-test.
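The π1 statistic in panel (b) estimates the share of discovery eQTLs that carry a true signal in the validation group. The sketch below uses a simplified Storey-type estimator with a single fixed threshold λ: null P values are uniform, so the fraction of validation P values above λ, rescaled by 1 − λ, estimates the null proportion π0, and π1 = 1 − π0. The caption does not say how π1 was computed, so the fixed λ = 0.5 and the function name `pi1_statistic` are assumptions made for illustration.

```python
def pi1_statistic(pvalues, lam=0.5):
    """Estimate pi1 = 1 - pi0 from validation-group P values.

    Assumption: a single fixed lambda is used instead of smoothing
    over a grid of lambda values.
    """
    p = [float(x) for x in pvalues]
    if not p:
        raise ValueError("no P values supplied")
    if not 0.0 <= lam < 1.0:
        raise ValueError("lam must lie in [0, 1)")
    if any(x < 0.0 or x > 1.0 for x in p):
        raise ValueError("P values must lie in [0, 1]")

    # Under the null, P values are uniform, so a fraction (1 - lam)
    # of them is expected to exceed lam.
    frac_above = sum(1 for x in p if x > lam) / len(p)
    pi0 = min(1.0, frac_above / (1.0 - lam))
    return 1.0 - pi0
```

For example, if 20 of 100 discovery eQTLs have validation P values above 0.5, the estimated null proportion is 0.2 / 0.5 = 0.4 and π1 is 0.6.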
Extended Data Fig. 6 External validation of eQTL.
(a) Linear regression model (LRM)-based validation. (b) Linear mixed model (LMM)-based validation. Validation was carried out in three tissues: hypothalamus (upper row), liver (middle row) and pituitary gland (bottom row). ChickenGTEx served as the discovery population, while an independent validation population consisted of commercial White Plymouth Rock chickens. P values were computed via the asymptotic t approximation. (c) Comparison of SNPs between the ChickenGTEx discovery population and validation populations. Only SNPs common (minor allele frequency, MAF > 0.05) in both discovery and validation populations were included. (d) The number of SNPs used in π1 calculation shown in panel (a). (e) Sample sizes of the ChickenGTEx discovery and validation populations. (f) The number of eGenes detected in the discovery and validation populations. (g) Effect size distribution of replicated and not-replicated eQTLs across tissues. Replicated eQTLs: SNPs significant in the discovery population that also meet the significance threshold in the validation population. Boxplot details: The central band represents the median, the box boundaries represent the interquartile range (25% to 75%), and the whiskers extend 1.5 × the interquartile range. Statistical significance was obtained using a two-sided Student’s t-test. (h) The π1 value plotted as a function of eQTL effect size (log2 transformed allelic fold change, log2aFC) in the discovery population (liver). (i) Histogram of eQTL nominal P values in the validation population. The nearest variant in the validation population to the corresponding lead variant in the discovery population was selected for each eGene. (j) Distribution of eQTL nominal P values in the validation population. For each eGene identified in the discovery population, the top lead variant within the same LD block was selected.
Extended Data Fig. 7 Functional enrichment of molecular quantitative trait loci (molQTL).
(a) Fold enrichment (mean ± s.d.) of molQTL in strong enhancers (E6) and super-enhancers. Error bars represent the standard errors of enrichment across 17 tissues common to this study and Pan et al.14. (b) Fraction of eGene-eVariant pairs overlapping with regulatory element-target gene pairs retrieved from Pan et al.14. (c) Percentage of eGene-eVariant pairs located within the same topologically associating domains (TAD) predicted from Hi-C data. (d) Enrichment of ePhenotype-molQTL pairs (log2 transformed odds ratio) within the same TAD, analyzed as a function of different distances to TSS. The error bars indicate standard errors of enrichment across 28 tissues. ePhenotype: molecular phenotypes regulated by at least one genetic variant. Odds ratio was obtained by fitting the linear model: SameTAD = eQTL + |TSS distance| + eQTL * |TSS distance|, where SameTAD represents whether the eGene-eVariant pair is within the same TAD (coded as 1) or not (coded as 0). The symbols “–” and “+” denote upstream and downstream eVariants relative to TSS, respectively. Boxplot details: The central band represents the median, the box boundaries represent the interquartile range (25% to 75%), and the whiskers extend 1.5 × the interquartile range. (e) Manhattan plot displaying SNP associations with TIMM17B gene expression in the brain. The lead SNP (rs317368746) is highlighted with a diamond. The bottom panel depicts regulatory element annotations retrieved from Pan et al.14. TssA: strongly active promoters/transcripts, TssAHet: flanking active TSS without ATAC, TxFlnk: transcribed at gene, TxFlnkWk: weak transcribed at gene, TxFlnkHet: transcribed region without ATAC, EnhA: strong active enhancer, EnhAMe: medium enhancer with ATAC, EnhAWk: weak active enhancer, EnhAHet: active enhancer no ATAC (hetero), EnhPois: poised enhancer, ATAC_Is: ATAC island, TssBiv: bivalent/poised TSS, Repr: repressed polycomb, ReprWk: weak repressed polycomb.
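The model in panel (d) can be written out term by term. The caption calls it a linear model, but SameTAD is binary and the reported quantity is an odds ratio, so one plausible reading is a logistic link. That reading is an assumption, as are the coefficient symbols β0 to β3 and the shorthand d for the signed TSS distance, none of which appear in the caption.

```latex
\operatorname{logit}\,\Pr(\mathrm{SameTAD}=1)
  = \beta_0
  + \beta_1\,\mathrm{eQTL}
  + \beta_2\,\lvert d \rvert
  + \beta_3\,\bigl(\mathrm{eQTL}\times\lvert d \rvert\bigr),
\qquad
\mathrm{OR}(d) = \exp\!\bigl(\beta_1 + \beta_3\,\lvert d \rvert\bigr)
```

Under this reading, OR(d) compares the odds of sharing a TAD for eQTL pairs with those for non-eQTL pairs at the same distance, and its log2 would be the plotted enrichment.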
Extended Data Fig. 8 Tissue-sharing properties of eQTL.
Absolute effect size (allelic fold change, aFC) of eQTLs (a), distance of eQTL to TSS (b), and minor allele frequency (c) as a function of the number of tissues in which the eGene is expressed. The black line represents the median estimate, and the grey shading indicates interquartile ranges. P values were computed via the asymptotic t approximation. (d) Fraction of eQTLs around the transcription start site (TSS) based on the number of tissues in which they are active. (e) Fold enrichment (log2 scaled, mean ± s.d., y-axis) of tissue-specific and -shared eQTLs across 15 chromatin states. Error bars indicate standard errors of enrichment, with E1–E15 representing chromatin states defined previously14. (f) Manhattan plot for eQTL mapping of ALG3 in the liver, muscle, brain, and blood. The color legend represents the degree of linkage disequilibrium (LD, r2) between the top SNP and the rest. (g) and (h) Comparison of blood-specific eQTLs (Specific, n = 11,884) and the rest of blood eQTLs (Common, n = 9,089) in terms of minor allele frequency (MAF) and effect size, respectively. Statistical significance was determined using a two-tailed Student’s t-test. (i) Opposite effects of lead variants on FBXO5 and ELAC2 in different tissues. Top panel: The effect of lead variant rs315639985 on FBXO5 in blood is opposite to that of rs312482960 in spleen. These two SNPs are in high LD (r2 = 0.92 in blood, r2 = 0.83 in spleen). Bottom panel: The effect of lead variant rs313608694 on ELAC2 in embryo is opposite to that in spleen. (j) Replication (π1) of eQTLs in liver across different breeds: Ross (n = 101), Leghorn (n = 74), Cobb (n = 47), and RIR (Rhode Island Red, n = 78). (k) Heatmap depicting eQTL effect sharing between breeds. (l) Expression of ENSGALG00000028174 (PRKCDBP) regulated by rs314795649 genotypes, consistent across all four breeds (n = 286 samples).
Boxplot details: The central band represents the median, the box boundaries represent the 25th to 75th percentiles, and the whiskers extend 1.5 × the interquartile range.
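The boxplot convention used throughout these captions can be made concrete with a short sketch. It assumes the usual Tukey reading, in which each whisker ends at the most extreme data point lying within 1.5 × the interquartile range of the box; the captions do not spell this out, and the quartile interpolation rule (`method="inclusive"`) and the function name `boxplot_summary` are likewise assumptions.

```python
from statistics import quantiles


def boxplot_summary(values):
    """Median, box boundaries and Tukey whiskers for one boxplot."""
    data = sorted(float(v) for v in values)
    if len(data) < 2:
        raise ValueError("need at least two values")

    # Box: 25th percentile, median, 75th percentile.
    q1, median, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1

    # Fences sit 1.5 x IQR beyond the box; whiskers stop at the
    # most extreme observations that are still inside the fences.
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    inside = [v for v in data if low_fence <= v <= high_fence]

    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whisker_low": inside[0],
        "whisker_high": inside[-1],
        "outliers": [v for v in data if v < low_fence or v > high_fence],
    }
```

Points beyond the fences are returned separately as outliers, which is how a few extreme expression values would appear as individual dots rather than stretching the whiskers.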
Extended Data Fig. 9 Context-interaction eQTL.
Examples of the sex-biased eGenes TCLF5 (a), CPS1 (b) and SNAI2 (c) in the liver (n = 137). Boxplot details: The central band represents the median, the box boundaries represent the 25th to 75th percentiles, and the whiskers extend 1.5 × the interquartile range. (d) Dot plots of eGene ATP6V1A expression (y-axis) against the transcription factor TCF25 expression across three genotypes of rs313600592. (e) Number of cell-type interaction QTL (ci-eQTL) detected in each tissue-cell combination (FDR < 5%). (f) Distance of eQTLs and ci-eQTLs from the transcription start site (TSS). eQTLs are shown in red, and ci-eQTLs in blue. (g) Fold enrichment of eQTLs and ci-eQTLs across 15 chromatin states (log2 scaled, mean ± s.d.). Fold enrichment mean (dot) and standard deviation (error bars) were obtained from enrichment tests in five tissues (liver, muscle, heart, bursa and spleen)14. TssA: strongly active promoters/transcripts, TssAHet: flanking active TSS without ATAC, TxFlnk: transcribed at gene, TxFlnkWk: weak transcribed at gene, TxFlnkHet: transcribed region without ATAC, EnhA: strong active enhancer, EnhAMe: medium enhancer with ATAC, EnhAWk: weak active enhancer, EnhAHet: active enhancer no ATAC (hetero), EnhPois: poised enhancer, ATAC_Is: ATAC island, TssBiv: bivalent/poised TSS, Repr: repressed polycomb, ReprWk: weak repressed polycomb. (h) Histogram of linkage disequilibrium (LD) between lead variants of eQTLs and ci-eQTLs targeting the same genes. The vertical red line indicates the median LD (r2 = 0.02). (i) A ci-eQTL regulating MANSC1 in dendritic cells in muscle. The corresponding bulk tissue eQTL is shown on the right (tested in 517 samples). Boxplot details: The central band represents the median, the box boundaries represent the 25th to 75th percentiles, and the whiskers extend 1.5 × the interquartile range. (j) Heatmap of Spearman’s correlation of ci-eQTL effect sizes across tissue-cell type combinations.
Tissues are clustered based on dissimilarities (that is 1-d), where d is Euclidean distance calculated from ci-eQTL effects using the complete linkage method. The color legend of tissues and cell types is shown at the bottom.
Extended Data Fig. 10 Cross-species meta-TWAS analysis.
(a) Associations of gene expression and chicken body weight via single-species TWAS (upper panel) in chickens and cross-species (chicken, pig, and human) meta-TWAS (bottom panel) in muscle tissue. TWAS: transcriptome-wide association study. The cross-species meta-TWAS analysis was conducted by integrating chicken body weight TWAS data with corresponding TWAS results for mammalian growth-related traits in muscle. Physiologically similar traits to chicken body weight were defined arbitrarily as growth-related traits, such as backfat thickness and average daily gain (ADG) in pigs, and body weight and height in humans. Non-physiologically similar traits included growth-unrelated phenotypes, such as the number of mummified pigs and number of weaned piglets in pigs, and type 2 diabetes and heart failure in humans. Quantile–quantile (Q-Q) plots for TWAS and meta-TWAS results are depicted on the right. (b) Nominal TWAS associations of 14 genes identified by cross-species meta-TWAS (panel a) in independent chicken populations. Phenome-wide associations (that is, PheWAS) of GOLGA3 with pig traits (c) and human traits (d). The y-axis shows the negative base-10 logarithm of the P value, and the x-axis indicates the traits tested. Each dot represents a trait, colored by trait category, and the red dashed line indicates the FDR threshold of 0.05.
Supplementary information
Supplementary Information
Supplementary Figs. 1–30, Supplementary Discussion and Supplementary References.
Supplementary Table 1
Supplementary Tables 1–16.
Source data
Source Data Fig. 2
Statistical Source Data.
Source Data Fig. 4
Statistical Source Data.
Source Data Fig. 5
Statistical Source Data.
Source Data Fig. 6
Statistical Source Data.
About this article
Cite this article
Guan, D., Bai, Z., Zhu, X. et al. Genetic regulation of gene expression across multiple tissues in chickens. Nat Genet 57, 1298–1308 (2025). https://doi.org/10.1038/s41588-025-02155-9
DOI: https://doi.org/10.1038/s41588-025-02155-9