Introduction

Low reproductive capacity is the primary cause of early culling in cows. The negative relationship between production and reproductive traits has recently strengthened, likely due to the pleiotropic effects of genes1. Strong selective pressure on production traits has consequently reduced cow fertility2. Furthermore, reproduction is a complex trait that is regulated by numerous genetic loci, influenced by environmental factors, and shows low heritability3. Therefore, understanding the genetic architecture of cows with high reproductive potential is beneficial for animal breeding.

Genome-wide association studies (GWAS) have identified thousands of loci associated with complex traits4. However, the majority of GWAS loci are non-coding and may therefore affect complex traits indirectly by regulating gene expression5; GWAS is also limited in its ability to prioritize causal genes. Causal genes are mainly inferred from GWAS results based on the physical distance to fine-mapped associated variants in linkage disequilibrium (LD) with the causal marker6. As a result, studies based on physical distance rarely identify new loci7. Gene expression, on the other hand, serves as a molecular phenotype and enhances the functional interpretation of GWAS discoveries8.

In addition, RNA sequencing data can serve as a molecular phenotype for identifying expression quantitative trait loci (eQTL) and differentially expressed genes (DEGs). eQTLs are genomic regions in which regulatory genotype variants are linked to gene expression levels in various tissues9. Obtaining gene counts requires a multi-step analysis of RNA-seq data that includes quality control, trimming, mapping, and assembly. Each step offers a range of options, making the process complex and the results challenging to interpret10. Nevertheless, in a meta-analysis strategy, increasing the sample size enhances statistical power and leads to improved results11. Additionally, weighted gene co-expression network analysis (WGCNA) is a suitable technique for identifying significantly co-expressed genes, their nodes, and hub genes12. Moreover, transcriptome-wide association studies (TWAS), which combine eQTL investigations with GWAS, help in understanding the genetic impact on transcriptome associations with complex traits13. This approach helps in characterizing new genomic regions and determining key genes linked to complex traits in various tissues14,15.

Therefore, exploring and integrating different genetic data can be helpful for the identification of various effective factors for controlling complex traits. In this study, we aim to interpret the biological mechanisms underlying reproductive traits such as cow conception rate (CCR), daughter pregnancy rate (DPR), heifer conception rate (HCR), and fertility by integrating results from TWAS and GWAS summary analyses, RNA-seq data analysis, meta-analysis, the WGCNA approach, and gene ontology. Through this integration, we seek to identify potential candidate genes and explore their molecular, cellular, and biological functions related to reproductive traits in cattle.

Materials and methods

Data collection

We utilized TWAS summary results from 11,642 Holstein cattle (available at https://cgtex.roslin.ed.ac.uk/downloads/) and GWAS summary statistics from 27,241 Holstein cattle (accessible at https://figshare.com/s/ea726fa95a5bac158ac1) to analyze three traits: cow conception rate (CCR), daughter pregnancy rate (DPR), and heifer conception rate (HCR). HCR is defined as the percentage of inseminated heifers that become pregnant at each service, while CCR represents the percentage of inseminated cows that conceive per service. CCR is typically measured 30 days after artificial insemination or natural mating through ultrasonographic pregnancy diagnosis. DPR reflects the percentage of nonpregnant cows that become pregnant during each 21-day estrous cycle. The primary distinction between cows and heifers lies in their age, with cows generally ranging from 2 to 10 years and heifers being younger, typically under 2 years.

The TWAS summary results encompass 23 tissues: adipose, blood, embryo, hypothalamus, ileum, intramuscular tissue, jejunum, leukocytes, liver, lung, lymph nodes, macrophages, mammary gland, milk cells, monocytes, muscle, ovary, oviduct, pituitary gland, rumen, salivary gland, skin fibroblasts, and uterus. Additionally, gene expression counts were obtained from 40 cattle (20 with high fertility and 20 with low fertility) using samples from endometrial and ovarian tissues, based on mapping to the ARS-UCD1.2 cattle reference genome. This dataset includes 27,607 transcripts and is available at https://cgtex.roslin.ed.ac.uk/downloads/. The details of the pipeline used can be found at https://cgtex.roslin.ed.ac.uk/. For more information, please refer to Table 1.

Table 1 Accession numbers and meta-data of gene counts for the analyses.

Analysis method

We utilized TWAS, GWAS, RNA-seq, meta-analysis, WGCNA, GO, and KEGG pathway analysis to investigate reproduction traits. GWAS identifies associations between genetic variants (SNPs) and traits, but often lacks insight into the biological mechanisms, especially when SNPs are in non-coding regions18. TWAS bridges this gap by integrating gene expression data with GWAS, connecting genetic variants to gene expression and identifying potential causative genes19.

Gene expression levels reflect the behavior of genes associated with complex traits20,21. To enhance statistical power, meta-analysis aggregates data from multiple RNA-seq studies, resulting in more robust and generalizable findings22. WGCNA identifies co-expressed gene modules and correlates them with traits, aiding in the discovery of gene networks23. To further elucidate these findings, GO analysis classifies genes into biological processes, molecular functions, and cellular components24, while KEGG pathway analysis maps genes to established biochemical pathways25, providing insights into their roles in complex traits such as reproduction.

Tissues correlation based on TWAS result

Correlation coefficients were computed among the 23 tissues (adipose, blood, embryo, hypothalamus, ileum, intramuscular, jejunum, leukocyte, liver, lung, lymph node, macrophage, mammary, milk cell, monocyte, muscle, ovary, oviduct, pituitary, rumen, salivary gland, skin fibroblast, and uterus) to examine the relationships between them. The correlations were calculated from the Z-scores available in the TWAS summary statistics for CCR, DPR, and HCR. To minimize the effect of noise, we retained genes with a Z-score in more than 10 tissues and tissues with a Z-score for more than 70% of genes. The metan package26 in R (v.4.0.5) was used to obtain the correlation coefficients. In addition, candidate genes related to CCR, DPR, and HCR were identified from the TWAS and GWAS results (p-value < 0.01).
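The filtering and correlation steps described above can be sketched as follows. This is a minimal illustration in Python rather than the metan-based R workflow used in the study; the function names, the dictionary layout, and the toy Z-scores are hypothetical, and the use of Pearson correlation over genes observed in both tissues is an assumption, since the text does not state the correlation type or how missing values were handled.

```python
from math import sqrt


def filter_z_scores(z, min_tissues=10, min_gene_frac=0.70):
    """z maps gene -> {tissue: Z-score}; a missing Z-score is an absent key."""
    # Step 1: keep genes that have a Z-score in more than `min_tissues` tissues.
    genes = {g: s for g, s in z.items() if len(s) > min_tissues}
    # Step 2: keep tissues that have a Z-score for more than
    # `min_gene_frac` of the retained genes.
    all_tissues = sorted({t for s in genes.values() for t in s})
    tissues = [t for t in all_tissues
               if sum(t in s for s in genes.values()) > min_gene_frac * len(genes)]
    return genes, tissues


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def tissue_correlation(genes, t1, t2):
    """Correlate two tissues over the genes that have a Z-score in both."""
    pairs = [(s[t1], s[t2]) for s in genes.values() if t1 in s and t2 in s]
    xs, ys = zip(*pairs)
    return pearson(xs, ys)


# Toy data with hypothetical values; the gene threshold is lowered to 1
# because the example has only three tissues.
toy = {
    "gene1": {"uterus": 1.0, "ovary": 2.0, "liver": 0.5},
    "gene2": {"uterus": 2.0, "ovary": 4.0, "liver": -1.0},
    "gene3": {"uterus": 3.0, "ovary": 6.0},
    "gene4": {"uterus": 0.2},
}
kept_genes, kept_tissues = filter_z_scores(toy, min_tissues=1, min_gene_frac=0.70)
r_uterus_ovary = tissue_correlation(kept_genes, "uterus", "ovary")
```

In this toy example, gene4 is dropped because it has a Z-score in only one tissue, and liver is dropped because it covers only two of the three retained genes.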

Gene counts analysis of individual datasets and Meta-analysis

The gene counts of 40 cows from endometrial and ovarian tissue were used as input for the DESeq2 package27 in R (v.4.0.5) to identify DEGs between high and low fertility cows in each dataset. The pipeline used by the FarmGTEX consortium to calculate gene counts is available at https://cgtex.roslin.ed.ac.uk/. We used the metaRNASeq package (version 1.0.2)28 in R (v.4.0.5) to find the meta-genes for each tissue. Raw p-values from the ovary and endometrium datasets were gathered into a new file and used separately, by tissue, as input for the metaRNASeq package. We used the “invnorm” function to combine p-values, with the number of replicates set to 6 and 5 for the endometrium and 4 and 5 for the ovary, and the Benjamini–Hochberg method was used for p-value correction. Finally, genes that showed co-directional expression in all datasets, with a p-value ≤ 0.05 in at least one dataset and a meta-analysis p-value ≤ 0.05, were considered meta-genes for each tissue.
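The p-value combination and meta-gene selection described above can be sketched as follows. This is an illustrative Python re-implementation of the inverse normal method, not the metaRNASeq code itself: the weighting by the square root of each study's share of replicates follows the usual definition of the method, while the function names, the clipping constant, and the example p-values are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # mean 0, standard deviation 1
_EPS = 1e-12                # keeps p-values strictly inside (0, 1)


def invnorm_combine(pvals, nreps):
    """Combine one gene's one-sided p-values across studies."""
    total = sum(nreps)
    # Assumed weighting: square root of the study's share of replicates.
    weights = [sqrt(n / total) for n in nreps]
    stat = 0.0
    for w, p in zip(weights, pvals):
        p = min(max(p, _EPS), 1.0 - _EPS)
        stat += w * _STD_NORMAL.inv_cdf(1.0 - p)
    return 1.0 - _STD_NORMAL.cdf(stat)


def benjamini_hochberg(pvals):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest p-value down
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def is_meta_gene(log2_fold_changes, pvals, meta_p, alpha=0.05):
    """Apply the three criteria: same direction, one study hit, meta hit."""
    same_direction = (all(fc > 0 for fc in log2_fold_changes)
                      or all(fc < 0 for fc in log2_fold_changes))
    return same_direction and min(pvals) <= alpha and meta_p <= alpha


# Hypothetical gene observed in two endometrium datasets (6 and 5 replicates).
combined_p = invnorm_combine([0.04, 0.03], nreps=[6, 5])
adjusted = benjamini_hochberg([0.01, 0.04, 0.03])
selected = is_meta_gene([1.2, 0.4], [0.04, 0.20], meta_p=combined_p)
```

The example shows why the combination can detect subtle effects: two moderate p-values pointing in the same direction yield a combined p-value smaller than either one.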

Weighted gene co-expression network analysis

The variance stabilizing transformation of the DESeq2 package was used to normalize the gene counts. The normalized endometrial and ovarian tissue-specific gene counts were used separately as input for the weighted gene co-expression network analysis (WGCNA) package (version 1.71)29 in R (v.4.0.5) to detect co-expressed networks and hub genes. To minimize the impact of noise on the correlation computation, we first filtered the normalized gene counts and retained genes with counts above 10 in more than 90% of cows. Second, we employed the “removeBatchEffect” function from the limma package (version 3.50) to eliminate batch variation among datasets. Then, we checked for missing values and outlier samples using the “goodSamplesGenes” and “hclust” functions, respectively. Furthermore, based on the scale-free topology fit index (R2) and mean connectivity reported by the “pickSoftThreshold” function, powers of 10 and 7 were selected for the ovarian and endometrial samples, respectively, and used to calculate the adjacency matrix. The adjacency matrix is computed from the correlation coefficient between each pair of genes, so only genes with direct connectivity form networks. Consequently, we computed the topological overlap matrix (TOM) and the dissimilarity matrix (1-TOM) to add indirectly connected genes to the weighted network. We used the dynamic tree cut (DTC) algorithm to cluster the genes using the dissimilarity matrix. Ultimately, gene co-expression modules were created using the hybrid method, a cut height of 0.975, a minimum module size of 30 genes, and a deepSplit of 2. After obtaining the gene co-expression modules, we used the “moduleEigengenes” function to determine the eigengene of each module.
These results were then used to (a) identify the modules that exhibit a significant correlation with the fertility trait in cows, (b) examine the relationships among the modules, and (c) identify the hub genes in each module. Hub genes in each module were identified using two different functions, “chooseOneHubInEachModule” and “chooseTopHubInEachModule”.
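The step from pairwise correlations to the dissimilarity matrix used for clustering can be sketched as follows. This is a minimal Python illustration under the assumption of an unsigned network, which the text does not specify; the function names and the toy correlation matrix are hypothetical, and the study itself used the WGCNA R package.

```python
def soft_threshold_adjacency(cor, power):
    """Unsigned adjacency a_ij = |cor_ij| ** power, with a zero diagonal."""
    n = len(cor)
    return [[0.0 if i == j else abs(cor[i][j]) ** power for j in range(n)]
            for i in range(n)]


def topological_overlap(adj):
    """TOM_ij = (sum_u a_iu * a_uj + a_ij) / (min(k_i, k_j) + 1 - a_ij)."""
    n = len(adj)
    connectivity = [sum(row) for row in adj]  # diagonal is zero
    tom = [[1.0] * n for _ in range(n)]       # TOM_ii is set to 1
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            # Connection strength shared through every third gene u.
            shared = sum(adj[i][u] * adj[u][j]
                         for u in range(n) if u != i and u != j)
            denom = min(connectivity[i], connectivity[j]) + 1.0 - adj[i][j]
            tom[i][j] = (shared + adj[i][j]) / denom
    return tom


def tom_dissimilarity(tom):
    """Dissimilarity (1 - TOM) used as the distance for clustering."""
    return [[1.0 - v for v in row] for row in tom]


# Toy correlation matrix: genes 0 and 1 are perfectly correlated,
# gene 2 is uncorrelated with both (hypothetical values).
toy_cor = [
    [1.0, 1.0, 0.0],
    [1.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]
toy_adj = soft_threshold_adjacency(toy_cor, power=7)
toy_tom = topological_overlap(toy_adj)
toy_diss = tom_dissimilarity(toy_tom)
```

Raising the correlations to a power suppresses weak correlations, while the shared-neighbor term in the numerator is what lets indirectly connected genes contribute to the network.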

Functional enrichment analysis of potential candidate genes

The STRING database (https://string-db.org/) was used to build the protein-protein interaction (PPI) network of the meta-genes and to identify hub genes. We then obtained the significant genes shared among the TWAS results for HCR, DPR, and CCR, as well as the hub genes, modules, and meta-genes of the ovary and endometrium tissues, using the Venn web tool (https://bioinformatics.psb.ugent.be/webtools/Venn/). Furthermore, we employed the ClueGO plugin of Cytoscape (version 3.7.2) to conduct KEGG pathway and Gene Ontology (GO) enrichment analyses, applied separately to the meta-genes and to the significant genes shared among the TWAS results, hub genes, modules, and meta-genes. Only pathways with p-values below 0.05 after Bonferroni step-down correction were considered significantly enriched.
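The Bonferroni step-down correction applied to the enrichment p-values can be illustrated with a short sketch. It assumes that the step-down procedure corresponds to Holm's method, which is how this correction is commonly implemented; the function name and the example p-values are hypothetical and do not reproduce ClueGO's internal code.

```python
def holm_step_down(pvals):
    """Holm (Bonferroni step-down) adjusted p-values, in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):
        # The smallest p-value is multiplied by m, the next by m - 1, etc.
        candidate = min(1.0, (m - rank) * pvals[i])
        # Adjusted values must not decrease along the sorted order.
        running_max = max(running_max, candidate)
        adjusted[i] = running_max
    return adjusted


# Hypothetical raw p-values for three enrichment terms.
raw = [0.01, 0.04, 0.03]
adjusted = holm_step_down(raw)
significant = [p < 0.05 for p in adjusted]
```

With these values only the first term remains below 0.05 after correction, whereas all three would pass without adjustment.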

Results

Correlations amongst tissues based on TWAS summary results

An analysis of tissue correlations for each trait (CCR, DPR, and HCR) revealed no negative correlations based on the Z-scores of the TWAS results. The TWAS summary for the CCR trait indicated highly significant positive correlations between the uterus and 14 tissues. Moreover, the TWAS results for DPR showed significant correlations for the liver and lung, which were correlated with 12 and 11 tissues, respectively. Additionally, the TWAS findings for HCR identified significant positive correlations between muscle and 14 tissues. A detailed overview of these results is presented in Fig. 1.

Fig. 1

The correlation plot derived from the three traits’ TWAS summary results between 23 tissues. The traits associated with cow conception rate, daughter pregnancy rate, and heifer conception rate are represented by Plots A, B, and C, respectively.

TWAS and GWAS results to identify candidate genes

Based on TWAS and GWAS results, candidate genes for each trait (CCR, DPR, and HCR) were introduced separately. The potential genes for each trait in different tissues, as determined by TWAS, can be found in Supplementary File 1. Additionally, certain candidate genes based on TWAS summary results are shared across multiple tissues with a p-value less than 0.01, and these are summarized in Fig. 2 for each trait to facilitate understanding. Supplementary File 2 contains candidate genes identified through GWAS results. Furthermore, the analysis of both TWAS and GWAS data revealed common potential genes related to each trait, such as AFF1 and ZNF613 for HCR, UMPS for DPR, and UMPS and ZNF613 for CCR trait.

Fig. 2

Discovery of shared genes across tissues for cow conception rate, daughter pregnancy rate, and heifer conception rate using TWAS summary results with a p-value below 0.01.

Identification of meta-genes for each tissue

Four datasets were examined separately using DESeq2. Genes demonstrating co-directional expression across all datasets, with a p-value ≤ 0.05 in at least one dataset and a meta-analysis p-value ≤ 0.05, were identified as meta-genes for each tissue individually. A list of meta-genes for ovarian and endometrial tissues can be found in Supplementary File 3. The expression levels of the 90 and 168 meta-genes associated with endometrial and ovarian tissues, respectively, are shown in volcano plots for the four datasets (Fig. 3). This visualization places the meta-genes in the context of the individual dataset analyses and illustrates the value of conducting a meta-analysis with multiple comparable datasets instead of merely presenting all DEGs identified in each dataset. Notably, most meta-genes did not show differential expression in any of the individual analyses. This finding underscores the role of meta-analysis in detecting differential transcripts that exhibit subtle variations.

Fig. 3

The meta-gene expressions of the four datasets are represented in a volcano plot. The datasets are arranged from left to right as follows: dataset1, dataset2, dataset3, and dataset4.

Weighted gene co-expression network analysis and identification of the hub-genes

WGCNA was run independently using the expression values of 2,132 and 2,177 genes from the endometrium and ovary tissues, respectively (Supplementary File 5). After checking for outlier samples and excessive missing values in the first step, a soft-threshold power of 8 was selected for the ovary and 10 for the endometrium (Supplementary Fig. 1). Modules were built with the dynamic tree cut algorithm. For the ovarian tissue, 14 modules were identified, with sizes ranging from 52 to 418 genes, alongside 92 genes that were left unassigned because they did not fit into any module (Fig. 4-A). Fig. 4-B illustrates the hierarchical clustering of genes based on the topological overlap matrix (TOM). Among the 14 identified modules, seven were particularly significant: blue (r = −0.74, p-value = 4e-04), turquoise (r = −0.83, p-value = 2e-05), yellow (r = −0.78, p-value = 1e-04), black (r = −0.74, p-value = 5e-04), green (r = −0.82, p-value = 4e-05), pink (r = −0.62, p-value = 0.006), and red (r = −0.66, p-value = 0.003). The gene names corresponding to these seven significant modules can be found in Supplementary File 6-A. Additionally, Fig. 4-C presents the p-values and Pearson correlation coefficients of the identified modules in relation to the fertility trait. For the endometrium tissue, 12 modules with sizes ranging from 33 to 216 genes were identified, in addition to 689 genes that were left unassigned because they could not be placed in any module (Fig. 4-D). Fig. 4-E depicts the hierarchical clustering of genes based on the TOM. Two of the 12 modules, pink (r = 0.46, p-value = 0.03) and purple (r = 0.46, p-value = 0.03), were found to be significant. The gene names for these two modules are listed in Supplementary File 6-B. The p-values and Pearson correlation coefficients of the identified modules are presented in Fig. 4-F.

Fig. 4

Gene co-expression modules clustered based on read counts of the endometrial and ovarian samples. In (A) and (D), the y-axis represents the co-expression distance, the x-axis corresponds to genes, and colors indicate modules, with the gray module representing genes not assigned to any module. (B) and (E) show heatmaps of topological overlap in the gene network. In (C) and (F), the relationship between modules and treatment groups (high fertility and low fertility groups) was examined by calculating the correlation coefficient between traits and module eigengenes. Rows represent module eigengenes, columns represent treatment groups, and each cell contains the correlation value and significance level (p-value) of the relationship.

Subsequently, hub genes for each module were identified utilizing the “chooseTopHubInEachModule” and “chooseOneHubInEachModule” methods, as presented in Table 2.

Table 2 Top five hub genes, top hub, and one hub genes in significant modules for potential fertility in cows.

Functional analysis of the potential candidate genes

Following the identification of 91 and 169 meta-genes for endometrium and ovary tissues, respectively, STRING was used to build their PPI network, as shown in Fig. 5. The resulting network contained many edges between nodes, which suggested that the meta-genes were significantly related. Furthermore, the meta-genes were subjected to GO and KEGG pathway analyses using the ClueGO plugin of Cytoscape (Fig. 6). As a result, we identified 2, 8, 9, and 2 significant terms for molecular function (MF), cellular components (CC), biological processes (BP), and KEGG pathways, respectively. The most significantly enriched BP terms were “humoral immune response,” “protein activation cascade,” “antibacterial humoral response,” “negative regulation of vasculature development,” “heart process,” “regulation of blood circulation,” “heart contraction,” “cardiac cell development,” and “cardiac muscle cell development.” In addition, the significant CC terms were “sarcoplasm,” “sarcoplasmic reticulum,” “contractile fiber part,” “contractile fiber,” “sarcomere,” “myofibril,” “I band,” and “Z disc.” Moreover, “sodium ion transmembrane transporter activity” and “integrin binding” were detected as significant MF terms. The significant KEGG terms were “endoplasmic reticulum chaperone complex” and “iron ion binding” (Supplementary File 4).

Fig. 5

Protein-protein interaction (PPI) network analysis of ovarian and endometrial meta-genes. The meta-genes, which are associated with potential fertility traits in cows, were identified through a meta-analysis of RNA-seq expression data across four datasets.

Fig. 6

The significant ontology terms were identified using 258 meta-genes, with a Bonferroni adjusted p-value threshold of less than 0.05. These meta-genes were derived from the results of a meta-analysis conducted on ovarian and endometrial samples.

Finally, we identified the significant genes shared among the TWAS results for HCR, DPR, and CCR, as well as the hub genes, modules, and meta-genes of the ovary and endometrium tissues, using the Venn web tool (https://bioinformatics.psb.ugent.be/webtools/Venn/). Genes common to at least three of these results were then used for functional analysis with the ClueGO plugin of Cytoscape (Supplementary File 7).
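The rule of retaining genes found in at least three result sets can be expressed compactly. The sketch below is a hypothetical Python equivalent of the overlap computed with the Venn web tool; the function name, the set labels, and the placeholder gene identifiers are illustrative only and do not correspond to the study's gene lists.

```python
from collections import Counter


def genes_in_at_least(result_sets, k=3):
    """Return the genes that occur in at least k of the named result sets."""
    # set() guards against a gene being listed twice within one analysis.
    counts = Counter(g for genes in result_sets.values() for g in set(genes))
    return sorted(g for g, c in counts.items() if c >= k)


# Placeholder gene identifiers, not taken from the study's results.
toy_results = {
    "TWAS": ["geneA", "geneB", "geneC"],
    "meta_genes": ["geneA", "geneC", "geneD"],
    "WGCNA_modules": ["geneA", "geneB", "geneD"],
    "hub_genes": ["geneA", "geneC"],
}
shared = genes_in_at_least(toy_results, k=3)
```

Counting memberships rather than intersecting fixed combinations of sets keeps the rule independent of which three analyses support a given gene.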

To identify candidate genes associated with reproductive traits, we employed a comprehensive approach that integrated multiple methodologies. A Venn diagram was generated to represent the overlap of significant genes identified through the different analyses: TWAS, GWAS, RNA-seq, meta-analysis, WGCNA, and the candidate genes involved in significant terms identified by GO and KEGG pathway analysis. Combining these approaches allowed us to capture a more comprehensive set of candidate genes involved in reproductive traits. The Venn diagram highlights the intersections between the significant genes found across these analyses, and this overlap increases confidence in their role in cow fertility. The diagram is presented as Fig. 7, and further supporting details are provided in Supplementary File 9. This combined approach enhances the reliability of our candidate gene identification, offering a robust foundation for future genomic selection aimed at improving reproductive traits in cows.

Fig. 7

Venn diagram illustrating the overlap of significant genes identified across multiple analyses, including TWAS, GWAS, RNA-seq, meta-analysis, and WGCNA results.

Discussion

Genetic testing can be most beneficial for fertility traits because they are among the most complex, difficult-to-measure, and low-heritability traits30. Fertility traits are nevertheless affected by genetic factors31. Therefore, artificial selection combined with artificial insemination can lead to significant improvement in dairy cow fertility32. Identifying key genes and variants related to fertility is essential for selective breeding. To achieve this, we analyzed TWAS and GWAS summary results, performed a meta-analysis of RNA sequencing data and WGCNA, and conducted GO and KEGG pathway enrichment analyses using data from FarmGTEX. Our aim was to identify potential candidate genes linked to reproduction traits.

RPL12, UMPS, FKBP1B, FZD10, PIK3R1, THBS1, VWF, TNN, MASP1, SGK1, DSTN, GPI, TGM2, NCOR1, AOX1, ITPR1, SMOC2, RPL11, and APOA1 genes were identified as significant by at least three of the following methods: TWAS, meta-analysis, GO, and WGCNA.

Ribosomal proteins like RPL12 are crucial for protein synthesis and ribosome biogenesis, vital processes in oocyte maturation and early embryonic development. Efficient protein synthesis supports rapid cell division during these stages, influencing folliculogenesis and oocyte quality33. Ribosome biogenesis involves the assembly of ribosomal proteins and rRNA, regulated by factors like mTOR signaling and MYC. Disruptions in ribosomal components, such as RPL12 and RPL11, can impair ribosome formation, affecting protein synthesis and cellular development. These proteins are essential for regulating protein production during meiosis, directly impacting oocyte quality and early embryo development34. UMPS (Uridine Monophosphate Synthetase) is a key enzyme in pyrimidine biosynthesis, converting orotate to UMP, which is essential for DNA and RNA synthesis. This process is critical for cell division, particularly in oocyte maturation and early embryonic development. Deficiencies in UMPS can cause orotic aciduria, leading to developmental issues and impaired cell function due to disrupted nucleotide production, which in turn affects oocyte quality and embryo viability35. FKBP1B plays a crucial role in calcium signaling, which is essential for oocyte activation during fertilization and embryo development. Proper regulation of calcium oscillations influences the transition from the germinal vesicle (GV) stage to the mature oocyte stage, primarily mediated by inositol 1,4,5-trisphosphate receptors (IP3Rs) in the endoplasmic reticulum. Disruptions in these calcium signals can lead to fertilization issues and impair early embryo development36. FZD10 participates in Wnt signaling, which is crucial for follicle development and oocyte maturation37. PIK3R1 is involved in the PI3K-Akt pathway, impacting ovarian follicular survival and oocyte quality through growth factor signaling38. 
THBS1 and VWF are associated with angiogenesis and vascular health, which are essential for ovarian blood supply and follicle growth39. SGK1 regulates ion balance and cell survival, affecting ovarian cells’ stress responses40, while TGM2 is involved in apoptosis and cellular differentiation, both critical for embryo development. NCOR1 modulates gene expression via nuclear hormone receptors, influencing hormonal control over ovarian cycles41. ITPR1 is key for calcium signaling, which is vital for oocyte maturation and early embryo development42. Other genes, like APOA1, involved in lipid metabolism, affect hormone synthesis and follicle health43. Collectively, these genes regulate key pathways related to cell survival, growth, angiogenesis, and signaling that are fundamental to ovarian function, oocyte competence, and successful embryo development.

The significant genes associated with CCR, DPR, and HCR traits based on TWAS are as follows: NLRP12 and COX10 for adipose tissue; FBXO40, ENSBTAG00000037537, and ENSBTAG00000004608 for blood; ZNF613 and LGI4 for the hypothalamus; ENSBTAG00000021433 for the ileum; BPHL for intramuscular fat; ENSBTAG00000054322, GAA, LYSMD3, ETFBKMT, ACSM5, ENSBTAG00000050562, ANKS3, MKX, and NUDT9 for the liver; ENSBTAG00000050865 for the lung; ARRDC3 for lymph nodes; ZNF404 and EMC9 for macrophages; ENSBTAG00000047761 for mammary tissue; ENSBTAG00000050562 for milk cells; and ELOVL6, TIGAR, PARP14, ARHGAP44, DKKL1, and ADAMTS8 for muscle. Additionally, EVA1B was identified for the pituitary gland, while WAC and ENSBTAG00000050562 were significant for the uterus.

It is noteworthy that the COX10 gene is significantly involved in “oxidative phosphorylation,” “generation of precursor metabolites and energy,” “electron transfer activity,” “heme-copper terminal oxidase activity,” “oxidoreductase activity, acting on a heme group of donors,” “electron transport chain,” “hydrogen transport,” “oxidoreductase activity, acting on a heme group of donors with oxygen as an acceptor,” “cytochrome c oxidase activity,” “proton transport,” “hydrogen ion transmembrane transporter activity,” and “hydrogen ion transmembrane transport.” Additionally, previous analysis based on GWAS indicated that the COX10 gene had highly significant effects on DPR, CCR, and HCR traits44. Furthermore, it plays a role in the catabolism of carbohydrates and lipids, as well as in the electron transfer to oxygen from reduced cytochrome c45. COX10 is vital for the correct assembly of cytochrome c oxidase (complex IV) in the mitochondrial respiratory chain, which is key to ATP production46. Mitochondrial function is crucial for follicular development, oocyte maturation, and embryo growth47. Disruption of COX10 can impair energy production in oocytes, potentially reducing their quality and fertilization potential. GAA plays a significant role in the “generation of precursor metabolites and energy.” The transcription levels of GAA can be altered in sperm under the influence of DNA methylation in humans48. The GAA gene encodes an enzyme crucial for breaking down glycogen into glucose within lysosomes, specifically known as acid α-glucosidase (GAA). Glycogen catabolism, including in uterine epithelial cells, is essential for maintaining energy balance49. Efficient energy production in ovarian cells supports hormone secretion, follicular development, and oocyte maturation. Disruptions in glycogen metabolism could reduce the energy available to oocytes, compromising their quality and maturation. 
Furthermore, adequate energy metabolism is vital for the early stages of embryo development and successful implantation50. The ETFBKMT and ACSM5 genes have been identified as significant genes related to CCR, DPR, and HCR traits based on TWAS results in the liver. ETFBKMT is essential for biological processes related to the “generation of precursor metabolites and energy.” Previous research has shown that during the early luteal phase in heifers, proteins associated with carbohydrate metabolism were observed at elevated levels51. This increase is attributed to changes in energy utilization and protein synthesis that occur during embryonic development. Additionally, the bovine embryo depends on the uterine environment for vital nutrients, including growth factors, amino acids, and ions, until it reaches the implantation stage52. ETFBKMT is involved in mitochondrial electron transport by methylating the electron transfer flavoprotein53. This activity impacts mitochondrial function and energy metabolism. Mitochondria are essential for ATP production, which supports oocyte maturation and early embryonic development. Altered mitochondrial function due to ETFBKMT could impair oocyte quality by reducing energy production and contributing to oxidative stress. The quality of mitochondria in oocytes is also linked to successful fertilization and embryo development54. ACSM5 is involved in fatty acid metabolism and the conjugation of carboxylic acids55 as well as in fat deposition56, fat synthesis and metabolism57. Fatty acid metabolism is important for energy production during oocyte maturation and early embryo development. Dysfunction of ACSM5 could impair energy metabolism in oocytes, leading to suboptimal oocyte quality and early embryonic development. Ledgard et al. discovered that the endometrium demonstrates heightened metabolic activity during pregnancy58. 
NUDT9 and TIGAR are involved in the biological processes associated with “nucleoside monophosphate metabolic process,” “purine nucleoside monophosphate metabolic process,” “ribonucleoside monophosphate metabolic process,” and “purine ribonucleoside monophosphate metabolic process.” NUDT9 and TIGAR have also been reported to play roles in metabolism59 and bovine congestive heart failure60, respectively. NUDT9 is involved in the hydrolysis of ADP-ribose, a molecule important for cellular signaling and energy balance. Disruption of NUDT9 could lead to abnormal signaling and energy imbalances in oocytes, negatively affecting oocyte quality and embryo viability. TIGAR regulates glycolysis and protects cells from oxidative stress by modulating glucose metabolism. Energy metabolism and oxidative stress regulation are crucial for oocyte health and embryo development. TIGAR could protect oocytes from oxidative damage, promoting higher oocyte quality and successful embryo development61. The ARHGAP44 gene is significantly downregulated in low fertility cattle in both the individual ovary datasets and the meta-analysis results. It is involved in the biological process related to the “regulation of actin filament-based processes.” ARHGAP44 has been proposed as a candidate gene associated with first calving62 and with CCR, DPR, and HCR traits63. ARHGAP44, a member of the Rho GTPase-activating proteins, has been studied for its role in oocyte maturation, ovulation, and early embryo development. Oocyte maturation and embryo development are influenced by genetic and molecular factors, including interactions between granulosa cells and oocytes, as well as the cellular signaling pathways that regulate folliculogenesis and oogenesis. Studies have highlighted the role of genetic factors, including ARHGAP44, in follicle development, oocyte quality, and early embryogenesis.
Disruptions in these pathways can lead to defects in reproductive processes, potentially causing infertility or early embryonic arrest64. DKKL1 was found to be upregulated in low fertility cattle according to both the individual and meta-analysis results for ovarian tissue. This gene is also associated with weak spermatogenesis65 and infertility66. In contrast, LGI4 was significantly downregulated in the ovary and endometrium of low fertility cattle, as observed in one of the individual datasets and in the meta-analysis results. It has been shown to play a role in axon segregation and myelin formation in cattle67,68. ZNF613 and ZNF404 were significantly downregulated in both the individual dataset and the meta-analysis of ovarian tissue from low fertility cattle compared to high fertility cattle. ZNF613 was identified as a significant gene based on TWAS and GWAS summary results. Additionally, based on QTL analysis, ZNF613 has been recognized as an important gene related to calving difficulty69, lengthening of gestation70, body depth, conception rate, sire calving ease71,72,73, fertility74, and longevity75. ZNF613 is thought to play a significant role in gene regulation, specifically impacting processes such as cell growth, differentiation, and apoptosis. These mechanisms are essential for ovarian follicle development, oocyte maturation, and embryo growth. Dysregulation of ZNF613 could potentially impair folliculogenesis or reduce oocyte quality, ultimately influencing fertility74. The FBXO40 gene is downregulated in the ovary and endometrium of low fertility cattle. FBXO40 is a candidate gene related to meat production76 and body weight77.
The NLRP12 gene plays a significant role in the biological process of “negative regulation of interleukin-1 secretion.” It is crucial in regulating inflammation and immune responses, primarily through its involvement in inflammasome activity. NLRP12 is known to activate caspase-1, leading to the cleavage of pro-inflammatory cytokines like IL-1β. This activity is essential for modulating immune responses, which are critical during reproductive processes such as ovulation, corpus luteum formation, and embryo implantation78. Research indicates that NLRP12’s regulation of inflammasome activity may influence local inflammatory responses in ovarian and uterine tissues, impacting folliculogenesis and early embryo development. Its activation has also been associated with neutrophil recruitment and other immune responses during viral infections, further supporting its role in managing inflammation across various biological contexts, including reproduction78.

Based on the differential expression and WGCNA results, we identified additional candidate genes, including RPL12, UMPS, FKBP1B, FZD10, PIK3R1, THBS1, VWF, TNN, MASP1, SGK1, DSTN, GPI, TGM2, NCOR1, AOX1, ITPR1, SMOC2, RPL11, APOA1, PSMB5, ACTR2, CFL1, ATP6V0D1, PAIP2, COX7A2, PTP4A1, ARPC5, and TXNDC5. All of these genes were identified as significant based on TWAS, GWAS, individual datasets, WGCNA, or meta-analysis results.
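
As a minimal sketch of how support from several analyses could be tallied per gene, the following Python snippet counts the analyses in which each gene was reported. The set memberships are illustrative and are taken only from statements in this section (for example, ZNF613 from the TWAS and GWAS summary results, the individual dataset, and the meta-analysis). The function name `tally_support` and the data layout are assumptions for demonstration, not the pipeline used in this study.

```python
from collections import defaultdict

# Illustrative evidence sets. Membership reflects only statements made in
# this section; the dictionary layout is an assumption for demonstration.
evidence = {
    "TWAS_GWAS": {"ZNF613"},
    "individual_dataset": {"ARHGAP44", "DKKL1", "LGI4", "ZNF613", "ZNF404"},
    "meta_analysis": {"ARHGAP44", "DKKL1", "LGI4", "ZNF613", "ZNF404"},
}


def tally_support(evidence_sets):
    """Return {gene: sorted list of analyses in which the gene was reported}."""
    support = defaultdict(list)
    for analysis, genes in evidence_sets.items():
        for gene in genes:
            support[gene].append(analysis)
    return {gene: sorted(sources) for gene, sources in support.items()}


support = tally_support(evidence)

# Rank genes by number of supporting analyses, breaking ties alphabetically.
ranked = sorted(support, key=lambda g: (-len(support[g]), g))

for gene in ranked:
    print(gene, len(support[gene]), ", ".join(support[gene]))
```

A tally of this kind only summarizes concordance across analyses; it does not weight the analyses or replace the significance testing performed within each of them.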

We employed multiple bioinformatics methods to identify these genes, and the results appear to be accurate. However, we strongly recommend further detailed laboratory experiments to clarify the effects of these candidate genes on reproductive traits. It would also be beneficial to explore the regulatory effects of the identified key genes.

Conclusion

In the present study, we introduced candidate genes potentially related to CCR, DPR, and HCR traits based on GWAS, TWAS, RNA-seq data analysis, meta-analysis, WGCNA, GO, and KEGG pathway analysis for each tissue. The study also explored the correlation between different tissues using findings from TWAS, revealing that the uterus, liver, lung, and muscle show connections with 14, 12, 11, and 14 other tissues, respectively.

We found that the RPL12, UMPS, FKBP1B, FZD10, PIK3R1, THBS1, VWF, TNN, MASP1, SGK1, DSTN, GPI, TGM2, NCOR1, AOX1, ITPR1, SMOC2, RPL11, APOA1, COX10, GAA, ETFBKMT, ACSM5, NUDT9, TIGAR, COX7A2, PAIP2, and PSMB5 genes are involved in certain metabolic pathways related to carbohydrates, lipids, and proteins. Irregular metabolism can lead to ovarian dysfunction and thus to decreased or absent ovulation. Ovulation is necessary for fertilization and marks the initial stage of pregnancy. Additionally, the mentioned genes are involved in the immune system, oxidative phosphorylation, and maintaining a functional electron transport chain.