Introduction

Fatty acid composition (FA) is crucial for normal daily metabolism, and fatty acids can be obtained from a variety of foods1. FA in beef products can affect meat taste and flavor, which are essential indicators of meat quality2. Considerable emphasis has been placed on the nutritional value of meat products and their implications for human health3. FA is generally known to be lowly or moderately heritable with a complex genetic architecture in cattle4,5; thus, it is feasible to enhance the selection of animals with beneficial FA through the identification of key candidate genes and genomic selection6,7,8,9. Genome-wide association studies (GWAS) using single nucleotide polymorphism (SNP) arrays have explored candidate variants for FA and identified candidate genes in various cattle populations6,10,11,12,13,14,15. In parallel, several studies have evaluated the accuracy of genomic prediction of FA using SNP arrays in different cattle populations16,17,18,19.

GWAS have benefited from the rapid development of next-generation sequencing. Whole genome sequence analysis can help to elucidate the genetic mechanisms of important traits, as it includes either the causal variants that underlie phenotypic variation or polymorphisms in high linkage disequilibrium (LD) with the causal variants20. Association studies using whole genome sequence variants can provide more opportunities to pinpoint causative mutations and improve the efficiency of genomic selection20,21. Based on imputed SNPs, many GWAS have been carried out for important traits in dairy cattle, including mammary gland morphology22, fertility, calving traits23,24,25, and milk-related traits26,27,28.

Recently, growing evidence suggests that SNPs linked with complex traits in farm animals correspond to expression quantitative trait loci (eQTL)29, and integrating eQTL information can help refine putatively causal variants30,31. Wang et al. integrated gene-based GWAS with cis-eQTL data and identified two genes (PON3 and PRIM2) for carcass yield32. Higgins et al. integrated eQTL and GWAS analyses for average daily gain, residual feed intake, and feed intake traits, and identified twenty-four potential functional cis-eQTL associated with these traits33. Furthermore, a recent study performed a colocalization analysis of cis-eQTL and GWAS loci using the Cattle Genotype Tissue Expression (CattleGTEx) atlas and observed that the top GWAS signals were mostly enriched in the regulatory QTLs of their respective tissues34,35.

In our previous study, we identified 42 genomic regions associated with FA in beef cattle, and several candidate genes including ELOVL5, FASN, CASP2, and TG were also implicated in fatty acid biosynthesis19. However, the identification of additional candidate variants using sequence-level variants, and the molecular regulation underlying FA, have not been fully explored in cattle. In this study, we carried out a multiple-strategy association study using imputed sequencing data in cattle and identified three candidate genes (FASN, CCDC57 and ZMAT4) for FAs by integrating GWAS and eQTL analyses. Finally, the functional validation of the candidate gene FASN in preadipocytes suggests its potential role in the regulation of FA metabolism.

Results

Imputation reliability

In this study, approximately ten billion paired-end reads (2952 Gb) were obtained for the 44 animals using Illumina HiSeq 2500 instruments. We observed similar imputation reliability using BEAGLE and Minimac for genotype imputation. However, the imputation reliability of BEAGLE was more stable than that of Minimac across minor allele frequency (MAF) classes (Supplementary Fig. 1); therefore, we selected the BEAGLE imputation results for the subsequent GWAS analysis. We performed imputation based on 14,345,738 sequence variants from the 44 sequenced individuals using BEAGLE v4.1. Imputation from SNP chip to sequence data was performed across autosomes. We evaluated the imputation reliability using allelic R2, which estimates the imputation reliability per SNP. We observed 9,318,254 (~65%) SNPs with average allelic R2 > 0.3, and 5,902,681 SNPs (~41.1% of all variants) with R2 > 0.8 from the Bovine HD to sequence data (Supplementary Table 1). The average imputation reliabilities ranged from 0.87 for MAF (>0.05) to 0.93 for MAF (>0.45) (Supplementary Table 2). The reliability displayed slight differences across bovine autosomes (BTA), suggesting that imputation reliability may be affected by chromosome size and decreased by structural variation36,37 (Supplementary Fig. 2). Next, we evaluated the impact of MAF on the reliability of imputation. Our analysis revealed that the imputation reliability increased with increasing MAF (Supplementary Table 3).

Annotation of whole genome sequence variants

We annotated a total of ~14 million sequence variants into 20 functional classes. The vast majority of variants fell into the intron (42.18%) and intergenic (48.79%) classes, while the missense and synonymous variant classes represented 0.43% and 0.44% of the total variants, respectively. We also investigated potentially causal variants and their functional impacts, focusing on variants with a direct impact on proteins (e.g., missense or frame-shift mutations). We found 61,844 missense variants, while 87,300 synonymous variants were identified using the genomic variant annotation and functional effect prediction tool SnpEff.

Association of fatty acid traits

We carried out GWAS for FA using imputed WGS variants. After SNP quality control and filtering based on imputation reliability (allelic R2 > 0.3), around 9.3 million SNPs remained for subsequent analyses. We used the genomic control step of the GRAMMAR-GC approach to correct for possible population stratification. A total of 7515 significant SNPs with imputation reliabilities >0.78 were identified for 14 FAs in the step I association using the imputed sequence data (Supplementary Data 1). Among them, 1864 SNPs located on BTA1-BTA6, BTA8-BTA22 and BTA24-BTA29 were identified for SFAs (C14:0, C20:0, C22:0 and C24:0), 1750 SNPs (BTA1-BTA9, BTA11-BTA21, BTA24 and BTA26-BTA29) for MFAs (C14:1, C16:1 and C20:1), and 3901 SNPs (BTA1-BTA21 and BTA24-BTA29) for PFAs (C18:3 n-6, C20:2, C20:4, C22:5 n-3, C22:6 n-3, C18:2 t-9c-11 and C18:2 t-12c-10). Manhattan plots for SFAs (C14:0 and C20:0), MFAs (C14:1 and C20:1) and PFAs (C18:3 n-6 and C20:4) are shown in Supplementary Fig. 3, Supplementary Fig. 4 and Supplementary Fig. 5, respectively. In addition, the quantile-quantile (Q-Q) plots of GWAS for these traits showed good agreement between observed and expected significance values after correcting for possible population stratification with the GRAMMAR-GC approach. We then performed gene annotation using a 15 kb window around candidate SNPs based on Ensembl release 110 (genome build ARS-UCD1.2). In total, 2723 SNPs were newly found and overlapped with 122 unique genes. We next conducted the step II GWAS analysis based on the results from step I. Regions of 0.5 Mb on both sides of the genome-wide significant SNPs were analyzed. In total, 462,376 SNPs were used in step II GWAS.

For the step II analysis, we identified 93 candidate SNPs (P < 9.39E-8) for 11 FAs including C14:0, C14:1, C18:3 n-6, C18:2 t-9c-11, C18:2 t-12c-10, C20:0, C20:1, C20:2, C20:4, C24:0 and C22:6 n-3 (Supplementary Table 4). Among the 122 unique genes identified in step I, we observed that 65 SNPs overlapped with five candidate genes (CCDC57, FASN, HDAC11, ALG14 and ZMAT4). Notably, we found two peaks overlapping with FASN and ZMAT4: one significant peak with 6 SNPs located at 50.7 Mb on BTA19 was associated with C14:0, and another peak with 24 SNPs at 35.8 Mb on BTA27 showed significant association with both C20:0 and C20:1 (Supplementary Table 4).

Multiple traits GWAS

We next conducted a multiple-trait GWAS analysis using MTAG for the correlated fatty acid traits and obtained 85 significant SNPs for two groups of fatty acid traits based on the computed summary statistics (Supplementary Table 5). For the group of C14:0, C14:1, C18:1 and C18:2, 16 significant SNPs (including 14 novel loci) were found to be associated with C14:0 (Fig. 1B); these were located within or near the genes CCDC57 and FASN. Among these 16 SNPs, 2 SNPs were also significantly associated with C14:0 in step I GWAS (Fig. 1A). For the group of C20:0, C20:1, C24:0 and C24:0, 55 SNPs and 14 SNPs were significantly associated with C20:0 and C20:1, respectively (Supplementary Fig. 6). Among these significant SNPs, 10 SNPs were significantly associated with both C20:0 and C20:1, and these 10 SNPs were located within the genes ZNF514, PROM2, MFAP5, GSTA5 and ACOX3.

Fig. 1: Genome-wide Manhattan plots for single-trait GWAS and multiple-trait GWAS.

A Genome-wide Manhattan plot showing P-values of association for the saturated fatty acid C14:0 using the GRAMMAR-GC approach, expressed as -log10(P-value). B Manhattan plot of MTAG results for the saturated fatty acid C14:0. The x axis shows chromosomal position, and the y axis shows significance on a -log10 scale. The active line marks the threshold for genome-wide significance (P = 5.37E-9) for GWAS and MTAG results.

Candidate common SNPs and missense variants

Among 65 significant SNPs in step II GWAS, we found that 49 SNPs were associated with five genes (CCDC57, FASN, HDAC11, ALG14 and ZMAT4) for the fatty acids C14:0, C20:0, C20:1, C20:2, C18:2 t-9c-11, C20:4 and C18:2 t-12c-10. Notably, we found one candidate region located at 50.7 Mb on BTA19 within CCDC57 and FASN for C14:0 (Supplementary Fig. 7A). This region includes seven significant SNPs, which together contribute around 6% of the phenotypic variance. In addition, we identified 25 SNPs within ZMAT4 for C20:0 (Supplementary Fig. 7B), which collectively account for about 7% of the phenotypic variance. Moreover, we detected 12 candidate SNPs annotated as missense variants that cause amino acid changes in proteins. These SNPs overlapped with several genes, including BCAS1, MYBL1, TANGO6, MCMDC2, DCT, FSHB, CSF2RB and RRS1. For example, one SNP (chr13:81410289) with the effect p.Ala414Thr/c.1240G>A was located in BCAS1, which was previously found to be related to dietary n-3 polyunsaturated fatty acids38. The BCAS1 gene has also been reported to be associated with fertility and production traits39,40. Another SNP at chr5:75336714 (p.Glu26Asp/c.78G>T) was detected in CSF2RB, a gene previously found to be associated with milk-related traits, environmental adaptation and acclimation41,42.

eQTL mapping

The distribution of cis-eQTL on autosomes and the QQ plot are shown in Fig. 2A, B. In total, we identified 27,644 cis-eQTL (Fig. 2A, B) associated with 3639 Ensembl genes (FDR < 0.05). In the intersection analysis of SNPs between the eQTL mapping and the fatty acid composition associations, we found 8 cis-eQTL shared with the step I association study, 4 shared with the step II association study, and 4 shared with the multiple-trait GWAS analysis (Fig. 2C). In the intersection analysis of the genes whose expression was regulated, we found 34 genes shared with the step I association study and 7 genes shared with the multiple-trait GWAS analysis. Two genes (FASN and CCDC57) were shared among the three gene sets (Fig. 2D).

Fig. 2: Comparison of the results of GWAS and eQTL mapping.

A The CMplot of eQTL; chromosomes are shown in different colors. B The QQ plot of cis-eQTL. C The Venn plot of significant SNPs for step I GWAS, step II GWAS, multiple-trait GWAS using MTAG and cis-eQTL mapping. D The Venn plot of gene sets identified from step I GWAS, multiple-trait GWAS using MTAG and cis-eQTL mapping.

QTL and cis-eQTL analysis of C14:0

Using the multiple-strategy association analyses for C14:0 based on the imputed sequencing data, we detected the top four significant SNPs: chr19:50766239, chr19:50766772, chr19:50767356 and chr19:50779529. The genotype-phenotype relationships for the four SNPs were plotted (Fig. 3A–D). These four SNPs were significantly associated with the corrected phenotype using the Kruskal-Wallis test (P < 0.01). Three SNPs (at 50,766,239 bp, 50,766,772 bp and 50,767,356 bp) were located upstream of FASN, and one SNP (at 50,779,529 bp) was located in an intron of FASN (Fig. 3E). We found three haplotype blocks among these four SNPs, including two haplotype blocks in FASN and one haplotype block upstream of FASN. Notably, we found that three SNPs (50,766,239 bp, 50,766,772 bp and 50,767,356 bp) were in the same haplotype block (Fig. 3F). To investigate the relationship between the four SNPs and the FASN gene expression level, we plotted genotype against gene expression. We found that two SNPs (at 50,766,239 bp and 50,779,529 bp) were significantly correlated with the gene expression level (P = 0.019 and P = 0.018) (Fig. 4A and D).
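As an illustration of the genotype-phenotype test mentioned above, the following is a minimal sketch of a Kruskal-Wallis H statistic across genotype classes. It is not the authors' code: the function names and the example values are hypothetical, each group is assumed to be non-empty, and the P-value helper applies only to the three-group case (two degrees of freedom), where the chi-square survival function reduces to exp(-H/2).

```python
from math import exp

def kruskal_wallis_h(groups):
    """Return (H, degrees_of_freedom) with the standard tie correction."""
    pooled = sorted(v for g in groups for v in g)
    n_total = len(pooled)
    rank_of = {}   # value -> average rank (ties share the mean of their ranks)
    tie_term = 0   # sum of t^3 - t over groups of tied values
    i = 0
    while i < n_total:
        j = i
        while j < n_total and pooled[j] == pooled[i]:
            j += 1
        rank_of[pooled[i]] = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        t = j - i
        tie_term += t**3 - t
        i = j
    rank_part = sum(sum(rank_of[v] for v in g) ** 2 / len(g) for g in groups)
    h = 12 / (n_total * (n_total + 1)) * rank_part - 3 * (n_total + 1)
    correction = 1 - tie_term / (n_total**3 - n_total)
    if correction == 0:  # every observation identical: no evidence of a difference
        return 0.0, len(groups) - 1
    return h / correction, len(groups) - 1

def p_value_two_df(h):
    """Chi-square upper tail for 2 degrees of freedom (three genotype classes)."""
    return exp(-h / 2)

# Hypothetical corrected C14:0 values for three genotype classes at one SNP.
h_stat, dof = kruskal_wallis_h([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(h_stat, dof, p_value_two_df(h_stat))
```

The chi-square approximation is used here for simplicity; with very small genotype classes an exact or permutation test may be preferable.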

Fig. 3: The four most significant SNPs for C14:0.

A The box plot displays the correlation between genotype at chr19:50766239 and the corrected saturated fatty acid (C14:0). B The box plot displays the correlation between genotype at chr19:50766772 and the corrected saturated fatty acid (C14:0). C The box plot displays the correlation between genotype at chr19:50767356 and the corrected saturated fatty acid (C14:0). D The box plot displays the correlation between genotype at chr19:50779529 and the corrected saturated fatty acid (C14:0). The Kruskal-Wallis test was used to test significance. E Allelic information of sequence variants in FASN and its upstream region. F The haplotype block map for the significant SNPs upstream of and within the FASN gene, generated using the Haploview program.

Fig. 4: Correlation between the genotypes of the four most significant SNPs and FASN gene expression.

A The box plot displays the correlation between the FASN gene expression and the genotype of chr19:50766239. B The box plot displays the correlation between the FASN gene expression and the genotype of chr19:50766772. C The box plot displays the correlation between the FASN gene expression and the genotype of chr19:50767356. D The box plot displays the correlation between the FASN gene expression and the genotype of chr19:50779529.

Functional validation of FASN

To investigate the functional attributes of FASN, we discerned a substantial upregulation in the mRNA expression levels of FASN during preadipocyte differentiation, as illustrated in Fig. 5A. In this study, we strategically designed three small interfering RNAs (siRNAs) targeting exons across FASN transcript isoforms. Among these, si-FASN-1, distinguished by its heightened efficiency (Fig. 5B), was selected for subsequent analysis.

Fig. 5: The expression of FASN gene during preadipocyte differentiation and its effects on lipid metabolism-related genes and FAs of preadipocytes.

A The induced differentiation of preadipocytes into maturing adipocytes and the expression level of FASN during this differentiation. B Interference efficiency of different siRNAs for bovine FASN. C The expression level of lipid metabolism genes after transfection with si-NC or si-FASN. D Change in preadipocyte triglyceride content after transfection with si-NC or si-FASN. E Change in preadipocyte saturated fatty acid composition after transfection with si-NC or si-FASN. F Change in preadipocyte unsaturated fatty acid composition after transfection with si-NC or si-FASN. The statistical significance of differences was assessed using Welch’s t-test. *P < 0.05, **P < 0.01, ***P < 0.001.

We examined expression differences of DGAT2, AGPAT6, ATGL and PLIN2, genes related to lipid metabolism according to previous research, between the FASN interference group and the control group. Notably, AGPAT6 was significantly upregulated, whereas ATGL and PLIN2 were markedly downregulated in the interference group (Fig. 5C). Concurrently, triglyceride content within preadipocytes was notably reduced after FASN interference (Fig. 5D).

Focusing on the fatty acid composition after the interference of FASN, our analysis revealed a significant increase in SFAs (C13:0, C14:0, C15:0, C16:0, C17:0, C20:0, and C24:0) and MFAs and PFAs (C14:1, C17:1, and trans-C18:2) in the interference group, while PFAs (C20:2, C20:4, C20:5 and C22:6) were significantly decreased (Fig. 5E, F). Additionally, Oil Red O staining revealed a significant reduction in the number of lipid droplets for the interference group (Supplementary Fig. 8A, B).

To comprehensively elucidate the metabolic pathways in which FASN is implicated, we conducted transcriptome sequencing on preadipocytes after FASN interference. The interference group exhibited distinct separation from the control group, as depicted in Fig. 6A. After interference, notable upregulation was observed in genes such as FEM1A, ESRP2, PARVB, STX1A and RGMA, while CCNE2, FXN, PDCD6, PLIN2, FST and TSPAN12 were significantly downregulated (Fig. 6B). Enriched Gene Ontology (GO) terms encompassed positive regulation of response to stimulus, cell differentiation, and lipid metabolic processes (Fig. 6C). We further generated a pathway based on prior knowledge from the Kyoto Encyclopedia of Genes and Genomes (KEGG); the regulation of candidate genes including FASN, SCD, CCDC57, ZMAT4 and ELOVL5 in fatty acid metabolism is shown in Supplementary Fig. 9.

Fig. 6: The differentially expressed genes and GO enrichment of si-NC and si-FASN in preadipocytes.

A Principal component analysis diagram of the two groups transfected with si-NC and si-FASN. B The volcano plot for si-NC and si-FASN. C GO enrichment of differentially expressed genes.

Discussion

Recent advances in the sequencing technologies have offered more opportunities to identify candidate variants in livestock, especially for cattle43,44,45. Genomic variants can be imputed by leveraging shared haplotype blocks between reference and target individuals, and then these imputed variants can be used in association studies of economically important traits and to predict the genetic potential based on a larger number of candidate variants20.

Several studies have suggested that increasing the number of sequenced populations and individuals in the reference panel can improve imputation accuracy36,42. We observed a higher imputation accuracy compared with reports using different reference or admixed populations, which could be because the selected sequenced individuals were representative of the same population with a similar LD pattern46. Further, several studies revealed that variants occurring in splice site and synonymous classes are some of the most significant47,48. Our study detected only 12 missense variants that may cause amino acid changes in proteins. This finding may be explained by potential imputation errors in the missense class47 and the currently poor annotation state of the bovine genome49.

Of the 7515 significant SNPs associated with 14 fatty acids in step I GWAS, 68 are present on the Bovine HD panel, and 2723 SNPs were newly detected from the imputed SNP dataset. In step II GWAS, a total of 93 candidate SNPs were identified, 77 of which were located within candidate genes. Among them, we identified one SNP within ALG14 at 48.4 Mb on chr3 showing significant association with multiple traits (C20:0, C20:2, C20:4, C18:2 t-9c-11 and C18:2 t-12c-10). We also detected one region from 50.66 Mb to 50.77 Mb for C14:0 overlapping with CCDC57 and FASN. FASN has been reported to be associated with milk fatty acids in Dutch Holstein-Friesian cows, Norwegian Red and Chinese Holstein cattle50,51,52,53. Another region ranging from 35.54 Mb to 35.93 Mb was identified for both C20:0 and C20:1; this region overlapped with ZMAT4. In addition, two SNPs at chr19:50779529 and chr22:58427069 were identified for C14:0 within FASN and HDAC11. Moreover, many studies have detected candidate variants within FASN that were related to fatty acid composition54,55,56,57 and milk fat content in cattle58,59. Using the multiple-trait GWAS, we further detected four significant SNPs (BTA19:50766239, BTA19:50766772, BTA19:50767356 and BTA19:50779529) associated with FASN, which have not been identified before. Notably, we observed one candidate region with strong LD located at 50.7 Mb on BTA19 within FASN for C14:0.

Previous studies also revealed that FASN is an important candidate gene for fatty acids in various cattle breeds, including Japanese Black cattle60, Fleckvieh61 and Qinchuan cattle62. FASN is a multifunctional enzyme that plays a central role in mammalian lipid metabolism and de novo biosynthesis of long-chain fatty acids63,64. Our multiple-strategy association analysis also provided evidence that FASN is an important candidate gene for FAs.

Our subsequent analysis further validated the function of FASN in the regulation of FA in our population. Notably, interference with FASN yielded a significant reduction in intracellular triglyceride content in preadipocytes and impeded lipid droplet accumulation in adipocytes, which was consistent with previous studies in pre-adipocytes of duck65, granulosa cells of geese66, and mammary epithelial cells of cattle67. Moreover, as fatty acids are the building blocks of essential lipids, FASN has emerged as a unique oncologic target in cancer treatment68,69. FASN inhibitors have been studied preclinically and are beginning to transition to human trials70. In our study, 7 of the genes identified by differential expression analysis in preadipocytes have been reported to be associated with lipid metabolism, including ESRP271, PARVB72, CCNE273, PDCD674, PLIN275, FXN76 and FST77. The differentially expressed genes implicated key pathways, particularly those associated with cellular developmental processes, cell differentiation, and lipid metabolism. These findings further suggest an important role of FASN in cellular growth and development, particularly in fatty acid metabolism.

Methods

Animals and genotypes

A total of 723 animals were genotyped using the Bovine HD Beadchip. These cattle (Huaxi cattle) originated from Ulgai, Inner Mongolia, China, and were moved to Jinweifuren Co., Ltd for fattening after weaning. All cattle were raised under the same feeding and management conditions. A more detailed description of breeding and management has been given previously19,78,79. We used liftover software to determine the physical coordinates of the Illumina BovineHD BeadChip markers according to the ARS-UCD1.2 genome assembly. Physical coordinates of 693,441 autosomal SNPs were available for the ARS-UCD1.2 assembly. Quality control of the genotype data was then conducted using PLINK (v1.9)19. SNPs were selected for the analyses based on minor allele frequency > 0.05, proportion of missing genotypes <0.05, and Hardy-Weinberg equilibrium P > 1E-6. Moreover, 38 individuals with >10% missing genotypes were excluded. After quality control, the final data consisted of 685 individuals and 595,715 autosomal SNPs.
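The per-SNP filters described above can be sketched as follows. This is an illustrative sketch rather than the PLINK implementation: the function names are hypothetical, genotypes are assumed to be coded as allele counts (0, 1, 2, with None for missing), and the Hardy-Weinberg check uses a one-degree-of-freedom chi-square test as a simple stand-in for the exact test that PLINK applies. SNPs deviating strongly from Hardy-Weinberg equilibrium (very small P-values) are the ones removed.

```python
import math

def snp_qc_stats(genotypes):
    """Summarise one SNP: minor allele frequency, missing rate, HWE P-value."""
    called = [g for g in genotypes if g is not None]
    n = len(called)
    missing = 1.0 - n / len(genotypes)
    if n == 0:
        return {"maf": 0.0, "missing": 1.0, "hwe_p": 1.0}
    n_ref, n_het, n_alt = called.count(0), called.count(1), called.count(2)
    p = (2 * n_alt + n_het) / (2 * n)  # alternative allele frequency
    q = 1.0 - p
    maf = min(p, q)
    if maf == 0.0:
        hwe_p = 1.0  # monomorphic SNP: the test is undefined
    else:
        expected = (q * q * n, 2 * p * q * n, p * p * n)
        observed = (n_ref, n_het, n_alt)
        chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
        # Upper tail of a chi-square distribution with 1 degree of freedom.
        hwe_p = math.erfc(math.sqrt(chi2 / 2))
    return {"maf": maf, "missing": missing, "hwe_p": hwe_p}

def passes_qc(stats, maf_min=0.05, missing_max=0.05, hwe_min=1e-6):
    """Apply the thresholds quoted in the text to one SNP summary."""
    return (stats["maf"] > maf_min
            and stats["missing"] < missing_max
            and stats["hwe_p"] > hwe_min)

# Hypothetical SNP in perfect Hardy-Weinberg proportions: kept.
print(passes_qc(snp_qc_stats([0] * 25 + [1] * 50 + [2] * 25)))
# Hypothetical SNP where every animal is heterozygous: removed.
print(passes_qc(snp_qc_stats([1] * 100)))
```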

Fatty acid composition

All individuals were processed for market at around 20 months of age. We strictly followed the institutional meat purchase specifications for fresh beef and measured key traits during the slaughter process. The extraction of meat samples and the measurement of FA have been described in our previous publication19. In the current study, we analyzed a total of 21 individual FAs, including six saturated fatty acids (SFA), four monounsaturated fatty acids (MFA), and eleven polyunsaturated fatty acids (PFA), in 685 animals. Each FA was quantified as a weight percentage of total fatty acids according to our previous analysis19.

Sequencing

To avoid related individuals, the 44 individuals with the lowest level of relatedness (according to pedigree and identity-by-descent values estimated using PLINK (v1.9)80) were selected for whole genome sequencing (WGS). DNA was isolated from blood specimens using the TIANamp Blood DNA Kit (Tiangen Biotech Company Limited). DNA samples with an A260/280 ratio between 1.8 and 2.0 were chosen for subsequent analyses. The high-quality DNA was used for library construction. For each individual, two paired-end libraries were created, each with a read length of 2×150 base pairs. WGS was carried out using Illumina HiSeq 2500 instruments (Illumina Inc., San Diego, California, USA), achieving a coverage depth of 25X. The protocols provided by the manufacturer were strictly followed for all procedures. To obtain high-quality data, we applied rigorous filtering to the raw sequence data, excluding reads that met any of the following criteria: (1) containing >10% unknown bases, (2) containing adapter sequences, or (3) containing over 50% low-quality bases.

Mapping short reads and variant calling

Upon completion of data filtering, sequence alignment was performed using BWA (v0.7.8) with the parameters (mem -t 4 -k 32 -M). The ARS-UCD1.2 genome assembly was sourced from the UCSC website (http://genome.ucsc.edu/). To streamline subsequent analysis, SAM files were converted into BAM files, which were then sorted and merged using SAMtools. We proceeded with variant calling using SAMtools81. Variants with an overall quality (QUAL) score below 20, a mapping quality (MQ) score below 30, or a read depth below 10 were excluded. Moreover, we applied a proximity filter, removing the variant with the lower QUAL score when two variants were within 3 base pairs of each other. For subsequent analysis, only biallelic SNPs were included. Additionally, the SNPs were filtered using PLINK with three criteria: --geno 0.05 --hwe 0.00001 --maf 0.005. After quality control, the total number of sequence markers was 14,345,738 across autosomes.
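The threshold and proximity filters described above can be illustrated with a short sketch. This is not the pipeline used in the study: the record layout (dictionaries with chrom, pos, qual, mq and depth keys) and the function name are hypothetical, and applying the thresholds before the proximity filter, as well as the greedy left-to-right comparison of neighbouring variants, are assumptions.

```python
def filter_variants(variants, min_qual=20, min_mq=30, min_depth=10, min_gap=3):
    """Drop low-quality variants, then thin variants lying within min_gap bp."""
    passed = [
        v for v in variants
        if v["qual"] >= min_qual and v["mq"] >= min_mq and v["depth"] >= min_depth
    ]
    passed.sort(key=lambda v: (v["chrom"], v["pos"]))
    kept = []
    for v in passed:
        previous = kept[-1] if kept else None
        close = (
            previous is not None
            and previous["chrom"] == v["chrom"]
            and v["pos"] - previous["pos"] <= min_gap
        )
        if close:
            # Of two nearby variants, keep the one with the higher QUAL score.
            if v["qual"] > previous["qual"]:
                kept[-1] = v
        else:
            kept.append(v)
    return kept

# Hypothetical records for illustration only.
example = [
    {"chrom": "19", "pos": 100, "qual": 50, "mq": 60, "depth": 25},
    {"chrom": "19", "pos": 102, "qual": 30, "mq": 60, "depth": 25},  # near pos 100
    {"chrom": "19", "pos": 200, "qual": 10, "mq": 60, "depth": 25},  # low QUAL
    {"chrom": "19", "pos": 300, "qual": 40, "mq": 20, "depth": 25},  # low MQ
    {"chrom": "19", "pos": 400, "qual": 40, "mq": 60, "depth": 25},
]
print([v["pos"] for v in filter_variants(example)])
```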

Imputation of sequence variants

We performed imputation from SNP array to sequence using BEAGLE (v4.1)82 with default parameter settings. The algorithm implemented in BEAGLE v4.1 uses population-based information to infer haplotypes and missing genotypes. Imputed sequence variant genotypes were assigned numerical codes: 0 for reference homozygous, 1 for heterozygous, and 2 for alternative homozygous animals. The reference data underwent pre-phasing using BEAGLE (v4.1)82. We also estimated imputation reliability using Minimac with default settings83. SNPs with MAF > 0.05 and imputation reliability (R2) > 0.3 were used in the GWAS, leaving 9,318,254 markers.
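The numerical genotype coding described above amounts to counting alternative alleles. A minimal sketch is given below, assuming genotypes are read from VCF-style GT strings (for example "0|1" for a phased heterozygote); the function name is hypothetical and this is not the conversion script used in the study.

```python
def gt_to_count(gt):
    """Convert a biallelic VCF GT string to an alternative-allele count."""
    alleles = gt.replace("|", "/").split("/")
    if len(alleles) != 2 or any(a not in ("0", "1") for a in alleles):
        return None  # missing call or not a biallelic diploid genotype
    return int(alleles[0]) + int(alleles[1])

# 0 = reference homozygous, 1 = heterozygous, 2 = alternative homozygous.
print([gt_to_count(gt) for gt in ("0|0", "0|1", "1|0", "1|1", "./.")])
```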

Evaluation of the imputation reliability

The overall reliability of imputation was measured by the mean correlation between in silico imputed and true (sequenced) genotypes (rIMP,SEQ) across the autosomes analyzed. The allelic R2 measure provided by BEAGLE gives a good measurement of imputation accuracy. Specifically, for each chromosome, we calculated the correlation between the imputed and the true genotypes. These values were averaged to obtain the overall R2. In addition, we grouped the imputed sequence variants into ten classes with respect to their MAF (0.05–0.075, 0.075–0.1, 0.1–0.15, 0.15–0.2, 0.2–0.25, 0.25–0.3, 0.3–0.35, 0.35–0.4, 0.4–0.45 and 0.45–0.5) in the reference population. The R2 for each MAF class was measured as the average value across chromosomes.
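The correlation-based evaluation and the MAF classes described above can be sketched as follows. This is an illustrative reconstruction, not the authors' script, and it does not reproduce BEAGLE's internal allelic R2 estimate: the squared Pearson correlation between true and imputed genotypes is used as the per-SNP reliability, and the function names and input layout are hypothetical.

```python
from bisect import bisect_right
from math import sqrt

# Class boundaries taken from the text (ten classes between 0.05 and 0.5).
MAF_EDGES = [0.05, 0.075, 0.1, 0.15, 0.2, 0.25, 0.3, 0.35, 0.4, 0.45, 0.5]

def pearson(x, y):
    """Pearson correlation, or None when either vector has no variance."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    if sxx == 0 or syy == 0:
        return None
    return sxy / sqrt(sxx * syy)

def maf_class(maf):
    """Index (0-9) of the MAF class, or None outside the 0.05-0.5 range."""
    if maf < MAF_EDGES[0] or maf > MAF_EDGES[-1]:
        return None
    return min(bisect_right(MAF_EDGES, maf) - 1, len(MAF_EDGES) - 2)

def mean_reliability_by_class(snps):
    """snps: iterable of (maf, true_genotypes, imputed_genotypes) tuples."""
    totals = {}
    for maf, true_g, imputed_g in snps:
        cls = maf_class(maf)
        r = pearson(true_g, imputed_g)
        if cls is None or r is None:
            continue  # SNP outside the MAF range or uninformative
        total, count = totals.get(cls, (0.0, 0))
        totals[cls] = (total + r * r, count + 1)
    return {cls: total / count for cls, (total, count) in totals.items()}
```

A typical use would pass one tuple per SNP for the animals that have both sequenced and imputed genotypes, then report the returned dictionary per MAF class.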

Association analyses for FA using GRAMMAR

Step I: Association tests using whole-genome imputed SNP

We first performed step I association tests between 9.3 million imputed sequence variants and FA using the GRAMMAR-GC method implemented in the R package GenABEL84,85. The method accounts for population stratification and the covariance structure among individuals inferred from all SNPs. A Bonferroni-corrected threshold of 5.37E-9 (P = 0.05/9318254) was adopted for genome-wide significance at the 5% level. Imputed variants with P-values <5.37E-9 were considered associated putative causal variants.
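The Bonferroni threshold quoted above follows directly from dividing the significance level by the number of tested variants; a short check is shown below. The function name is hypothetical, and the numbers are those given in the text.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under a Bonferroni correction."""
    return alpha / n_tests

# Step I: 9,318,254 imputed variants tested at a 5% genome-wide level.
step1 = bonferroni_threshold(0.05, 9_318_254)
print(f"{step1:.2e}")
```

The same function applied to the 462,376 SNPs tested in step II gives the threshold used for the targeted regions.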

Step II: Association studies for the targeted genomic regions

We next conducted step II association tests for the candidate genomic regions detected in step I. These regions were defined as 0.5 Mb window sizes at both sides of the genome-wide significant SNP for FA identified in step I, which included 462,376 SNPs. The association between these SNPs and FA was also assessed using GRAMMAR-GC method. The most significant SNPs were identified based on a Bonferroni-corrected p-value threshold of 1.08E-7.
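The definition of the step II candidate regions can be sketched as below. This is an assumption-laden illustration rather than the authors' code: the function name is hypothetical, and merging overlapping windows into a single region is an assumption made so that SNPs are not counted twice.

```python
def candidate_regions(significant_snps, flank=500_000):
    """Build merged (chrom, start, end) windows around significant SNPs.

    significant_snps: iterable of (chromosome, position) tuples.
    """
    regions = []
    for chrom, pos in sorted(significant_snps):
        start, end = max(1, pos - flank), pos + flank
        if regions and regions[-1][0] == chrom and start <= regions[-1][2]:
            # Overlaps the previous window on the same chromosome: extend it.
            last_chrom, last_start, last_end = regions[-1]
            regions[-1] = (last_chrom, last_start, max(last_end, end))
        else:
            regions.append((chrom, start, end))
    return regions

# Positions on BTA19 are taken from the text; the BTA27 position is illustrative.
print(candidate_regions([(19, 50_766_239), (19, 50_779_529), (27, 35_800_000)]))
```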

In the current study, the proportion of phenotypic variance explained by each significant SNP was estimated as follows86:

$$\mathrm{var}\left(\%\right)=\frac{2pq\beta^{2}}{\mathrm{var}(P)}$$

where \(p\) and \(q\) are the allele frequencies, \(\beta\) is the estimated allele substitution effect calculated by the GenABEL package, and \(\mathrm{var}(P)\) is the phenotypic variance.
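The variance formula above translates into a one-line calculation. In the sketch below the function name and the example values are hypothetical, and the result is returned as a proportion (multiply by 100 for a percentage).

```python
def variance_explained(p, beta, phenotypic_variance):
    """Proportion of phenotypic variance explained by one SNP: 2pq*beta^2 / var(P)."""
    q = 1.0 - p
    return 2.0 * p * q * beta ** 2 / phenotypic_variance

# Hypothetical values: allele frequency 0.3, effect 0.5, phenotypic variance 1.75.
print(variance_explained(0.3, 0.5, 1.75))
```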

Multi-trait analysis using summary statistics

To improve the statistical power of GWAS for fatty acid traits with medium to high genetic correlations, multi-trait analysis of GWAS (MTAG87) was used for joint analysis of multiple traits with the imputed WGS data. Single-trait GWAS summary statistics for fatty acids were used as input, and bivariate LD score regression was employed to account for sample overlap between the cohorts described by different summary statistics. MTAG generated trait-specific effect estimates for each SNP and took about half an hour on a TS860M5 8-socket mission-critical server based on Intel Xeon Platinum 8260 processors.

Variant annotation and analyses

We used the Variant Effect Predictor (VEP)88 to annotate all sequence variants, providing detailed insights into their functional implications and potential effects on genes or proteins. Moreover, VEP provided predictions of variant impact through SIFT. We estimated the frequency of the sequence variants of interest within our cattle population. Linkage disequilibrium between variants was assessed using the “--ld” function in PLINK. Additionally, we conducted in silico prediction of the impact of missense variants using SnpEff89.

Cis-eQTL mapping and gene annotation

RNA extraction was carried out using frozen longissimus dorsi muscle samples obtained from 120 individuals. RNA preparation and sequencing followed Wang et al.32. The RNA-seq dataset was used to detect cis-eQTL. We analyzed only those genes that were expressed in >25% of the muscle samples. For each of the 16,472 genes meeting this criterion, we tested the association of expression levels (measured in transcripts per million, TPM) with all 9.3 million imputed sequence variants located on the same chromosome as the respective gene, using the R package MatrixEQTL90. The cis-eQTL were mapped by considering variants located within one megabase (Mb) up- or downstream of the gene locus. To assess whether cis-eQTL were enriched with significant SNPs from GWAS, we examined the overlap of cis-eQTL within a 50 kilobase (kb) interval upstream and downstream of significant SNPs32, with a false discovery rate (FDR) < 0.05.
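Two steps of the cis-eQTL procedure described above, selecting variants in the cis window and controlling the false discovery rate, can be sketched as follows. This is not the MatrixEQTL implementation: the function names and the gene coordinates in the example are hypothetical, and the Benjamini-Hochberg procedure is assumed as the FDR method.

```python
def cis_variants(gene, variants, window=1_000_000):
    """Variants within `window` bp up- or downstream of a gene.

    gene: (chromosome, start, end); variants: iterable of (chromosome, position).
    """
    chrom, start, end = gene
    return [
        (c, pos) for c, pos in variants
        if c == chrom and start - window <= pos <= end + window
    ]

def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted P-values, returned in the input order."""
    n = len(pvalues)
    order = sorted(range(n), key=lambda i: pvalues[i])
    adjusted = [0.0] * n
    running_min = 1.0
    for rank in range(n, 0, -1):  # walk from the largest P-value downwards
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * n / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical gene interval on chromosome 19 and three candidate variants.
gene = (19, 50_770_000, 50_790_000)
variants = [(19, 50_766_239), (19, 52_000_000), (27, 35_800_000)]
print(cis_variants(gene, variants))
print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
```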

Preadipocyte culture and differentiation

Preadipocytes were prepared and cultured according to a previous study91. The cells were plated into cell culture dishes filled with DMEM/F12 medium (Gibco, Waltham, MA, USA) supplemented with 10% FBS (Invitrogen, San Diego, CA, USA) and 1% Penicillin-Streptomycin. Subsequently, the cells were incubated in a humidified atmosphere with 5% CO2 at 37°C. When the cell density reached 80%, preadipocyte differentiation medium I, comprising 5 µg/ml insulin, 0.5 mM IBMX, 1 µM DEX, 1% Penicillin-Streptomycin, 10% FBS, and DMEM/F-12, was used to initiate differentiation. After 24 h, the medium was replaced with induced differentiation II solution, containing 5 µg/ml insulin, 1% Penicillin-Streptomycin, 10% FBS, and DMEM/F-12, to sustain the induction of differentiation. The induced differentiation II solution was changed every 2 days.

siRNA synthesis and transfection

Three pairs of complementary siRNA oligos (si-FASN-1, si-FASN-2 and si-FASN-3), designed to suppress the expression of bovine FASN, along with control siRNA (si-NC) oligos, were synthesized by Guangzhou RiboBio Co., Ltd. The siRNA sequences are listed in Supplementary Table 6. When the cells reached 80–90% confluence, they were transfected with the FASN siRNAs or si-NC (100 nM) using the Lipofectamine 3000 transfection kit (Invitrogen, San Diego, CA, USA) for the FASN interference experiments.

RNA extraction and qRT-PCR analysis

Total RNA was extracted from preadipocytes using TRIzol reagent (Invitrogen, Carlsbad, CA, USA). cDNA was synthesized from 1 μg of total RNA using the PrimeScript RT Reagent Kit with gDNA Eraser (Perfect Real Time) following the manufacturer’s instructions (TaKaRa Biotech Co. Ltd, Tokyo, Japan). The primers for qPCR analysis are listed in Supplementary Table 7 and were synthesized by Sangon Biotech (Shanghai) Co., Ltd. mRNA expression was assessed using the KAPA SYBR® FAST qPCR Master Mix (2X) Kit on the QuantStudio 7 Flex Real-Time PCR system (Life Technologies, Carlsbad, CA, USA). The 2^(-ΔΔCt) method was used to calculate the relative abundance of target mRNAs. Differences between the two groups were tested for statistical significance with a two-tailed t-test assuming unequal variances (Welch’s t-test).
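The relative quantification and the test statistic described above can be sketched as follows. The sketch assumes that each sample has one Ct value for the target gene and one for a reference gene, and that the mean ΔCt of the control group serves as the calibrator; the function names are illustrative. Only the Welch statistic and its degrees of freedom are computed here; obtaining the two-tailed P-value requires the t distribution, which a statistics package would supply.

```python
from math import sqrt
from statistics import mean, variance


def relative_expression(ct_target, ct_reference, ct_target_ctrl, ct_reference_ctrl):
    """Return 2^(-ddCt) for each treated sample, calibrated to the control mean."""
    dct_ctrl = [t - r for t, r in zip(ct_target_ctrl, ct_reference_ctrl)]
    calibrator = mean(dct_ctrl)  # assumption: mean control dCt is the calibrator
    dct = [t - r for t, r in zip(ct_target, ct_reference)]
    return [2 ** -(d - calibrator) for d in dct]


def welch_t(a, b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    va, vb = variance(a) / len(a), variance(b) / len(b)
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df
```

For example, a treated sample whose ΔCt is two cycles higher than the control mean has a relative abundance of 2^(-2) = 0.25.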

RNA sequencing

RNA sequencing was performed by the Beijing Genomics Institute (BGI, Shenzhen, PR China), with three replicates in each group (interference group and control group). Clean RNA reads were aligned to the ARS-UCD1.2 reference genome using TopHat92. Cufflinks was employed to quantify both gene and transcript expression93. Transcript and gene expression levels in each sample were estimated as fragments per kilobase of transcript per million mapped reads (FPKM). Differentially expressed genes (DEGs) were defined as genes with an absolute log2(FPKM_interference/FPKM_control) > 1 and a corresponding P-value < 0.05. Here, FPKM_interference/FPKM_control represents the average ratio, and the P-value was calculated using a t-test. Group differences were visualized through principal component analysis (PCA). Volcano plots, implemented in the ggpubr package, were used to illustrate the distribution of DEGs. Finally, a functional Gene Ontology term enrichment analysis of the DEGs was carried out using g:Profiler94.
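The DEG criteria above can be sketched as a simple filter. Two points are assumptions rather than details given in the text: the "average ratio" is taken as the ratio of group mean FPKM values, and genes with a mean FPKM of zero in either group are skipped because the log2 ratio is undefined. P-values are supplied as input (for example from a t-test run elsewhere), and the function name is illustrative.

```python
from math import log2
from statistics import mean


def select_degs(fpkm_interference, fpkm_control, pvalues,
                lfc_cutoff=1.0, p_cutoff=0.05):
    """Return {gene: log2 fold change} for genes passing both thresholds.

    fpkm_interference, fpkm_control: {gene: [replicate FPKM values]}
    pvalues: {gene: P-value from the group comparison}
    """
    degs = {}
    for gene, treated in fpkm_interference.items():
        mean_treated = mean(treated)
        mean_control = mean(fpkm_control[gene])
        if mean_treated <= 0 or mean_control <= 0:
            continue  # assumption: skip genes with an undefined log2 ratio
        lfc = log2(mean_treated / mean_control)
        if abs(lfc) > lfc_cutoff and pvalues[gene] < p_cutoff:
            degs[gene] = lfc
    return degs
```

A positive value indicates higher expression in the interference group, and a negative value indicates lower expression.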

Triglyceride assay and gas chromatography analysis of fatty acids

The cells were cultured in 60 mm dishes and transfected with si-FASN or si-NC at 80–90% confluence. After 48 h, the cells were harvested, and the triglyceride content was quantified using an enzymatic triglyceride assay kit (Applygen Technologies Inc., China) following the manufacturer’s instructions. Protein was quantified using a BCA protein assay kit (Applygen Technologies Inc., China) following the manufacturer’s protocol. The total triglyceride content was normalized to the protein content and expressed as µg/mg protein. Fatty acid extraction and analysis were performed as described previously95. Relative proportions of fatty acids were determined as percentages of the total peak area. The statistical significance of differences was assessed using Welch’s t-test.
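The two normalizations described above amount to simple ratios, sketched below. The function names, the input units, and the example fatty acid labels are illustrative assumptions; the kit-specific conversion from absorbance to triglyceride and protein amounts is not shown.

```python
def triglyceride_per_protein(triglyceride_ug, protein_mg):
    """Triglyceride content expressed as ug per mg of protein."""
    if protein_mg <= 0:
        raise ValueError("protein amount must be positive")
    return triglyceride_ug / protein_mg


def fatty_acid_percentages(peak_areas):
    """Convert chromatogram peak areas to percentages of the total peak area.

    peak_areas: {fatty acid label: integrated peak area}
    """
    total = sum(peak_areas.values())
    if total <= 0:
        raise ValueError("total peak area must be positive")
    return {fa: 100.0 * area / total for fa, area in peak_areas.items()}
```

Because the percentages are relative to the total peak area, they sum to 100 within each sample and describe composition rather than absolute amounts.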

Oil red O staining

The cells were cultured in 12-well plates. When the cells reached 80% confluence, they were transfected with si-FASN or si-NC and then induced with differentiation medium for 6 days. After treatment, the cells were washed three times with 1×PBS and fixed for 30 min with 4% paraformaldehyde. The cells were then stained using a Modified Oil Red O Staining Kit (Beyotime Institute of Biotechnology, Jiangsu, China) according to the manufacturer’s instructions.

Computing environment

All computational analyses were conducted on the National Center of Beef Cattle Genetic Evaluation (NCBCGE) cluster, located at the Institute of Animal Science, Chinese Academy of Agricultural Sciences. Computation was performed on an Inspur Tiansuo server with 8-core Intel® Xeon® processors rated at 2.2 GHz and 2.3 TB of random-access memory.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.