Abstract
The Sahiwal are among the most prominent international transboundary dairy cattle, distributed in large numbers between India and Pakistan. With more than seven decades elapsed since independence and limited cross-border exchange of Sahiwal germplasm, one thought-provoking question is whether natural and artificial selection could have altered the genomic signature patterns in the Sahiwal, reared for different purposes in these two countries. Deciphering the genetic mechanisms that underlie economic traits, which are reflected in the distinct selection signatures the animals carry, is essential for advancement and long-term breeding plans. To identify these genomic signatures, three medium-density SNP datasets of Sahiwal from three geographical locations of India and Pakistan were analyzed using the De-Correlated Composite of Multiple Selection Signals technique to identify the major candidate genes. In the genome of Sahiwal, a total of 70 genomic regions with 261 protein-coding genes were found. Milk production (NEK11, HMGCS1, BTN1A1, KCNH3), reproduction (SH3BGR, PSMG1, BRWD1, B3GALT5) and immune response genes (BPIFB1, MCOLN2) were more closely related to the Indian Sahiwal. Pakistani Sahiwal had genes closely linked with dual-purpose meat (RALGAPA2, RIN2, CFAP61) and milk (SLC24A3, GALNT17, BACH2) traits. Our findings revealed differential patterns of selection signatures in transboundary Sahiwal cattle.
Introduction
The Sahiwal, probably the heaviest milker of all indigenous cattle breeds with a well-developed udder, originated in the dry Punjab region of the India-Pakistan subcontinent and has become one of the dairy breeds of choice in both India and Pakistan owing to its high milk yield, heat tolerance, tick tolerance and high resistance to both internal and external parasites1. The breed is also recognized for its ability to yield sufficient milk in a subsistence agriculture system. Many nations have acknowledged the superior qualities of Sahiwal cattle, which are regarded as a transboundary breed present in more than one country and selected for dairy and meat production worldwide. They have been imported by several nations for purebred raising and for the creation of synthetic breeds, including Australian-Friesian-Sahiwal, Australian Milking Zebu, Frieswal, Jamaica Hope, Karan Swiss, Mafriwal, Mpwapwa, and Taurindicus2.
In India, the breeding tract of Sahiwal lies along the Indo-Pakistan border in Ferozepur, Haryana, Amritsar district of Punjab, Sri Ganganagar district of Rajasthan and the adjoining areas. They are believed to be related to cattle of Afghanistan and to the Red Sindhi and Gir cattle breeds3. After the independence of India in 1947, many of the traditional Sahiwal breeders migrated to India, while the majority remained in Pakistan. While the Indian Sahiwal cattle were mostly reared for milk production, in Pakistan the same cattle were reared as a dual-purpose breed for both dairy and meat. After almost seven decades, which roughly corresponds to 7–8 generations, and with limited cross-border exchange of Sahiwal germplasm, the interesting question arises whether natural and artificial selection could have altered the proportion of variation in the Sahiwal populations reared for different purposes in the two countries, leaving differential signature patterns in their genomes. It is well known that selection for economic traits has left impressive selection footprints in the genome of cattle. Identifying these selection signatures can help genomic breeding efforts to increase dairy cattle production.
Moreover, as long as the nation is focused on choosing the best animals with the highest potential for milk production, genomic regions of Sahiwal cattle will endure severe selective pressures for a considerable amount of time4,5. To fully understand the molecular mechanisms influencing quantitative as well as other important traits, it is therefore essential to examine the genomic signatures of selection in Sahiwal cattle. Selection signatures are distinct DNA variations that result from changes in the genomes of both chosen and neutral loci of a species that have been selected over time6. The SNP genotypic data can help in the detection of selection signatures to uncover genes and advantageous mutations associated with ecologically and economically important traits. Genomic selection signatures show breed specificity and explain putative regions of importance in indigenous and endangered breeds containing variants underlying phenotypic variations7. Identification of beneficial mutations exerts a selective advantage in a particular livestock population and gives knowledge on the evolutionary history of the development of different livestock breeds which in turn provide interesting insights to highlight new targets for selection and genetic improvement of the breed.
Several studies on signatures of selection in Sahiwal have been conducted and have demonstrated their utility in locating polymorphisms and/or potential genes responsible for economically significant traits5,8,9,10. These discoveries are vital to understand the processes that account for the variations in phenotype between breeds. However, in contrast to these earlier studies, which focused mainly on the assessment of selection signatures of Sahiwal cattle of either Indian or Pakistani descent, our study consisted of a comparative assessment of selection signatures in the genome of transboundary Sahiwal cattle using SNP genotype data from three different populations of Indian and Pakistani origin. The majority of the previous research relied on single statistical tests, which have a lower efficiency in identifying selection signatures. Integrating several selection-signature statistics into a multi-point statistic, viz. the De-Correlated Composite of Multiple Selection Signals (DCMS) framework, should therefore be more reliable and robust. By integrating multiple statistics of signatures of selection within a single DCMS framework and accounting for their correlation, the DCMS method yields more efficient results than other statistical tests11, which in turn provides comparative genomic information on Sahiwal of different origins with better statistical power and resolution12. With this background and to address the issue of transboundary Sahiwal populations, the present study was undertaken to analyze three genomic data sets of Sahiwal (two of Indian and one of Pakistani origin) to determine the comparative differences in the genomic signature patterns among these three populations for major economic traits.
Results
Quality control and effective population size (Ne)
The present study employed three Sahiwal populations, two from India (NDRI and Hisar) and one from Pakistan13 with a total of 240 animals. After quality control of genotypes for Hardy–Weinberg equilibrium, minor allele frequency, genotype call rate, and duplicated genotype parameters, a total of 39,040, 41,227, and 20,150 SNPs remained in the final dataset of NDRI, Hisar, and Pakistan Sahiwal genotype data, respectively. QC of genotype data minimizes biases in the data and accordingly, the final dataset was utilized for Ne estimation in three Sahiwal herds. It was found that Ne was 54, 55, and 93 (--effective-size 54, --effective-size 55 and --effective-size 93) for NDRI, Hisar, and Pakistan Sahiwal in the first generation using GONE software (Fig. 1)14. The GONE software is an advancement over SNeP and similar methodologies, in which the value of LD between loci at a given genetic distance is determined by the combined effects of genetic drift and recombination that have accumulated throughout earlier generations. All of the Ne readings in this study are higher than 50, indicating that the population has reached the FAO’s minimum recommended level of 5015.
De-correlated composite of multiple selection signals (DCMS)
The DCMS method combines five statistics, viz. FST, Haplotype Homozygosity Statistics (H1), Modified Haplotype Homozygosity (H12), Tajima’s D index and Nucleotide Diversity (π), and was estimated for each of the Sahiwal populations using the MINOTAUR R package16. The selection analysis discovered 31, 24, and 15 genomic regions for NDRI, Hisar, and Pakistan Sahiwal, respectively. The total number of protein-coding genes was 115 for NDRI, 94 for Hisar, and 52 for Pakistan Sahiwal. Gene annotation of the identified regions (Table 1) revealed several recognized and new candidate genes linked to important economic traits. Genes such as PCP4, MCOLN2, Lpar3, DEK, LPIN2 and DLGAP1 were common to the Indian Sahiwal populations (NDRI and Hisar) and were associated with feed efficiency, immune response, mastitis resistance, resistance to bovine tuberculosis, milk protein and milk yield. Furthermore, in the NDRI and Hisar Sahiwal populations, a number of other candidate genes were found that are mostly related to reproduction and production traits, viz. SH3BGR, PSMG1 and BRWD1, associated with utero-embryonic development, fertility, reproductive functions and cellular response to heat stress; BPIFB1 for innate immune response; NEK11 for milk composition traits; KCNH3 for growth and milk protein content; MITF for coat colour; and PICALM for milk protein content and cheesemaking properties. In contrast, most of the candidate genes found in the Pakistan Sahiwal population were related to growth and carcass traits, viz. RALGAPA2 and CFAP61 for subcutaneous fat thickness, feed intake, conformation and weight; RNF111 for yearling weight; GALNT17 and AUTS2 for milk fat and protein and feed efficiency; and TCEA3 for bovine MDSCs (muscle-derived satellite cells). The distribution of the regions of signals of selection throughout the genomes of the three Sahiwal populations is presented in Fig. 2.
Gene network analysis and hub genes identification
The protein network was constructed based on the information in the STRING v12.0 database (Fig. 3). To identify the most significant modules, the top-rank genes based on MCC values were identified for the three Sahiwal populations. The top genes with the highest rank were SH3BGR, BPIFB1, and RALGAPA2 for NDRI, Hisar, and Pakistan Sahiwal (Fig. 4a, b and c).
Signatures of selection for milk and related traits
Sahiwal is a well-known milch breed with improved milk production qualities, a lean conformation and a brown coat colour. The DCMS analysis in our study identified genes in NDRI Sahiwal, viz. NEK11, HMGCS1, NIM1K and BTN1A1, located on BTA 1, 20 and 23, that are associated with milk traits (Table 1). HMGCS1 is involved in a pathway of lipid metabolism regulated by PPARA (peroxisome proliferator-activated receptor alpha), whereas NIM1K is related to lactation persistency and the synthesis of milk cholesterol and lipid32,80. The BTN1A1 gene was found to be responsible for the secretion of milk fat and to affect fat percentage. Other genes such as FBP2, GRIN2C, DLGAP1, CCNT1 and LPIN2 were further identified as associated with milk protein percentage.
The genes under selection in Hisar Sahiwal related to milk production traits include KCNH3, ATL1, MAP4K5, DLGAP1 and PICALM located in BTA 5, 10, 24 and 29 (Table 1). KCNH3 gene was found to be linked with milk protein content5. In humans, KCNH3 expression is linked to poor overall and disease-free survival with ovarian cancer patients81. The genes MAP4K5 and ATL1 at BTA 10 were also responsible for milk production45,46. Pakistan’s Sahiwal yielded the genes BACH2 and SLC24A3 associated with milk fatty acids and milk production traits. ADAMTS9 and GALNT17 were likewise related to milk protein and fat processing.
Signatures of selection for reproduction traits
The genes responsible for reproduction traits include SH3BGR, PSMG1, and BRWD1 where all are located at BTA 1 having the highest score from hub genes identification in NDRI Sahiwal (Fig. 4a). PSMG1 and BRWD1 were associated with fertility and reproductive functions. Additionally, HMGN1 and B3GALT5 were involved with Bovine Maternal-zygotic transition22 and heifer early calving. USP3 gene at BTA 10 was found to be associated with embryo development in Hisar Sahiwal whereas in Pakistan Sahiwal, SLC24A3 was identified at BTA 13 related to fertility traits in the Rustaqi breed of Iraq72.
Selection signatures for immune response and disease resistance
The study revealed that the MCOLN2 gene at BTA 3 was involved with immune response in ICAR-NDRI and Hisar Sahiwal. MCOLN2 (Mucolipin-2) contributes to the Arf6-associated recycling pathway and is thought to be mostly found in recirculating endosomes. In most tissues, the expression is minimal, but in the thymus and spleen, it is highly expressed which suggests its role in immunity25. BPIFB1 gene at BTA 13 was found to be associated with innate immune response in Hisar Sahiwal having the highest hub gene score (Fig. 4b). Several genes associated with disease resistance against bovine tuberculosis, including DEK and RNF144B were discovered to be under selection in NDRI Sahiwal (Table 1). In Hisar Sahiwal, DEK gene was also found to be involved with resistance to bovine tuberculosis. Lpar3 gene was likewise identified to be involved with mastitis resistance which corresponds with mastitis resistance in cattle of NDRI and Hisar Sahiwal26. In Pakistan Sahiwal, MGAT5 was also identified to be linked with mastitis resistance.
Signatures of selection for body growth and feed efficiency
Several candidate genes were identified to be associated with growth and feed efficiency traits in Sahiwal cattle. NHL repeat containing E3 ubiquitin protein ligase 1 (NHLRC1) was one of the significant genes found on BTA 23 in NDRI Sahiwal cattle. We also identified PCP4 to be related to feed efficiency in NDRI and Hisar Sahiwal. TAL1 and CYP4A11 genes at BTA 3 were similarly found to be associated with growing skeletal muscle during puberty42 and growth and fat deposition43.
The most significant genes with the highest hub gene score in Pakistani Sahiwal cattle were located at BTA 13 namely RALGAPA2, CFAP61, SLC24A3, and RIN2 (Fig. 4c). These genes were associated with subcutaneous fat thickness, feed intake, conformation, and weight70,71. Several genes like TCEA3, TTC21B, UBXN4, and CNTNAP5 located at BTA 2 were identified for Bovine MDSCs (muscle-derived satellite cells), hip width/rump width, carcass traits and conformation and growth61,63,64,65. For feed intake, PTPRZ1 and CADPS2 genes were identified and AUTS2 at BTA 25 was associated with feed efficiency79. Additionally, it was discovered that at BTA 10, RNF111 was involved with yearling weight70, and CCNB2 (Cyclin B2) had a significant impact on the acceleration of the cell cycle and rumen development.
QTL identification and enrichment analysis
According to the QTL identification, 52.3% of the milk-type QTLs in NDRI Sahiwal cattle are located in significant genomic areas, and other QTL types such as exterior, health, meat and carcass, production and reproduction were annotated and accounted for 8.77, 7.63, 10.11, 12.46, 8.73% respectively (Fig. 5a). These QTLs were mapped to BTA 1, 3, 20, 23 and 24. The top most significant QTLs were found to be associated with milk protein percentage, milk fat percentage, milk yield, milk lauric acid content, and lactation persistency (Fig. 5b).
In Hisar Sahiwal cattle, the QTL identification accounted for 55.54% of the milk-type QTLs which are located in significant genomic areas. The other QTL types such as exterior, health, meat and carcass, production, and reproduction were annotated and accounted for 5.56, 7.39, 11.81, 11.11, and 8.58% respectively (Fig. 6a). These QTLs were mapped to BTA 3,5,10, 23, 24 and 25. The top most significant QTLs were found to be associated with milk fat yield, milk kappa-casein percentage, and milk protein yield (Fig. 6b).
In Pakistan Sahiwal cattle, the QTL identification accounted for 40.07% of the milk-type QTLs which are located in significant genomic areas. The other QTL types such as exterior, health, meat and carcass, production, and reproduction were annotated and accounted for 12.13, 8.59, 13.43, 13.91, and 11.87% respectively (Fig. 7a). These QTLs were mapped to BTA 2, 10, 13, 14, 22 and 25. The top most significant QTLs were associated with body weight, fat thickness, structural soundness, retail product yield, and milk-alpha casein percentage (Fig. 7b).
Discussion
Transboundary cattle are cattle breeds that are distributed across borders rather than regional breeds that are unique to a single nation. They have the potential to expand throughout countries and enhance the world’s supply of animal products82. In India, there are sixty indigenous breeds of cattle, eight regional transboundary breeds, and seven international transboundary breeds83. Sahiwal cattle originated from the dry Punjab region which lies along the Indian-Pakistani border, and is one of the transboundary breeds of cattle present in more than one country selected for dairy and meat production purposes worldwide. They are considered one of the best Zebu cattle breeds that can potentially play the same role as Holstein in tropical environments2.
In our study, two Sahiwal populations from India (Karnal and Hisar) and one original herd from Pakistan (Punjab) were analyzed simultaneously, which may be the first comparative study to check the status of signatures of selection among these populations, using a robust multi-point statistical tool DCMS. Studies in the past have shown that composite measurements of signals of selection can offer an objective standard to more accurately identify variations under selection12. Thus, candidate genes can be discovered with more power and greater precision for forthcoming studies in medicine, agriculture, and animal breeding to identify signals of selection. According to our results, several genomic regions and genes related to economic traits, including milk production, growth, feed efficiency, and reproduction were discovered.
Significant genes like HMGCS1 (20:314.31-314.56) and NIM1K (20:314.64-315.42) were associated with lactation persistency and synthesis of milk cholesterol and lipid31,79 in NDRI Sahiwal. Moreover, the production of milk fat and its impact on fat percentage have also been linked to the BTN1A1 (23:315.85-315.91) gene. This gene exhibited polymorphism in the native cattle populations of Tharparkar, Sahiwal, Jhari, and Belahi as well as in crossbreeds between Holstein Friesian, and Jersey whereas found to be monomorphic in water buffalo populations of Murrah, Chilika, Gojri, Chhattisgarhi and Bargur35. Furthermore, the production and composition of fatty acids were found to be influenced by the fatty acid synthesis process carried out by the enzymes ACSF3 (18:142.03-142.08), CREB1 (2:958.46-958.97), and FADS6 (19:565.20-565.36), which all had high fatty acid levels. A role for milk protein was observed in LPIN2 (24:371.40-372.14) in both the populations of NDRI and Hisar Sahiwal. Lipins (LPIN2) operate as transcriptional co-regulators of gene expression in addition to function as phosphatidate phosphatases, giving them dual roles in lipid metabolism84. According to research on mice, the LPIN2 gene was functional during the normal development of adipose tissue and might be involved in the metabolism of triglycerides in humans. This gene is a potential gene for human lipodystrophy, a condition marked by insulin resistance, fatty liver, loss of body fat, and hypertriglyceridemia85. In Valle del Belice sheep, it was discovered to be connected to lipid and milk protein metabolism86.
In Pakistan Sahiwal, ADAMTS9, located at BTA 22 (367.60-369.22), was involved in the processing of milk proteins34 and has further been found to be associated with lipid metabolism in Sanjiang cattle87. This gene can also control the number of mitochondrial complexes in skeletal muscles and insulin sensitivity88. It can therefore serve as a helpful molecular marker for improving goat growth characteristics89,90. GALNT17 (25:291.33-295.63) was also shown to be connected to milk protein and fat. A GWAS analysis in Danish Jersey and Holsteins also found that GALNT17 was associated with milk fat and protein traits78. According to the QTL annotation analysis, 52.3% and 55.54% of QTLs were of the milk type in NDRI and Hisar Sahiwal, respectively, compared with 40.07% in Pakistan Sahiwal (Figs. 5a, 6a and 7a). These results show that the proportion of milk-type QTLs was higher in Indian Sahiwal (NDRI and Hisar) than in Pakistan Sahiwal. A genome-wide assessment of signatures of selection in Pakistan Sahiwal cattle10 likewise identified a lower proportion of milk-type QTLs, 25.08%. Hence, we can deduce that Indian Sahiwal cattle have undergone stronger selection for milk production and related traits, whereas Sahiwal cattle from Pakistan have not undergone extensive selection for these traits.
For reproduction traits, SH3BGR (1:139.52–139.60) was associated with utero-embryonic development18 which is highly significant in NDRI Sahiwal. This gene has a connection to thioredoxin, which has the ability to stimulate growth hormone production in tissue-culture cells91. Reproduction traits contribute 8.73% to QTL annotation in NDRI Sahiwal, making the most contribution according to the QTL annotation analysis (Fig. 5a). Reproductive functioning and fertility have also been linked to PSMG1 and BRWD1 (1:139.22–139.36). In Holstein cattle, these genes were previously shown to be selective50. In mice, BRWD1 epigenetically regulates meiotic chromosomal stability, an essential process for female fertility92. In association with the maternal-zygotic transition in cattle, HMGN1 (1:139.41-139.41), was found to be involved22. HMGNs influence neuronal, ocular, reproductive, and pancreatic cell development in addition to their role in embryogenesis93. USP3 gene, a member of the ubiquitin-specific proteases (USPs) family at BTA 10 (464.13-465.13) was found in Hisar Sahiwal to be associated with embryo development in cattle49, while it was found to be involved with protein degradation in river buffalo94. The significance of reproduction traits in Indian Sahiwal may be due to the fact that prioritization was not only given to milk production traits but also to reproduction traits, which are linked to the selection of superior germplasm and the creation of appropriate breeding programs for long-term genetic improvement.
Sahiwal was renowned for its strong resistance to internal and external parasites, as well as its ability to withstand heat and ticks. These distinctive traits are manifestations of the robust genetic makeup that drives innate immunity and its relationship to acquired/adaptive immunity95. The innate immune response was linked to the BPIFB1 (13:627.62-627.85) gene in Hisar Sahiwal. This gene shares structural similarities with LPS-binding protein and BPI protein, two innate immune molecules known for their functions in detecting and reacting to Gram-negative bacteria, where these proteins support innate immunity96. Additionally, several bovine tuberculosis disease resistance genes, including DEK and RNF144B (23:391.78–393.87), were discovered to be under selection in Hisar and NDRI Sahiwal. These protein-encoding genes regulate NF-κB in human macrophages. Ring finger protein 144B (RNF144B) contribution to bovine tuberculosis has also been documented in Holstein-Friesian cattle97. ABT1 gene (23:315.20-315.23) was identified to be associated with resistance to the Bovine Leukaemia virus in NDRI Sahiwal. In Argentinean dairy cattle, the expression of the ABT1 gene transcription factor was higher in low-pro Viral Load cows than in high-pro Viral Load cows at a 95% significance level33. These results suggested that selection under different tropical environment situations has resulted in the existence of immunity-related genes in NDRI and Hisar Sahiwal cattle populations.
Body conformation and feed efficiency are complex characteristics that are controlled by multiple biological mechanisms. PCP4 gene (1:139.93-104.005) was associated with feed efficiency in NDRI and Hisar Sahiwal. TAL1 and CYP4A11 genes located on BTA 3 (990.05-992.22) were found in Hisar Sahiwal and linked to growing skeletal muscle during puberty38 and growth and fat deposition43. CYP4A11 is a significant omega hydroxylase of lauric acid (medium-chain fatty acids) which plays a role in blood pressure regulation, fatty acid metabolism and the conversion of arachidonic acid to 20-hydroxyeicosatetraenoic acid (20-HETE)98. Several genes like CREBBP (25:305.43-317.33), ADCY9 (25:324.69-334.91), MGRN1 (25:375.51-379.18) and TFAP4 (25:344.38-345.66) were further identified in Hisar Sahiwal to be associated with feed conversion54, lipid and meat characteristics and ear size55, bull fertility (through direct involvement in spermatogenesis) and meat tenderness57, and bovine satellite cells56. In pigs, the gene CREBBP was found to be crucial for growth and feed conversion54. Furthermore, in Pakistan Sahiwal, RALGAPA2 (13:399.01-401.88), which encodes the catalytic alpha subunit 2 (α2) of Ral GTPase-activating protein (RalGAP), was linked to the thickness of subcutaneous fat. Proteins utilised in membrane trafficking or cellular vehicles are encoded by the genes RALGAPA2 and Ras and Rab interactor 2 (RIN2)99. In Pakistan Sahiwal, these genes were found to be highly significant, with the highest contributions to QTL enrichment analysis coming from body weight, fat thickness, and structural soundness (Fig. 7b). The RIN2 (13:393.64-395.73) gene was likewise implicated in muscle growth and production in crossbred small-tailed Han and Dorper x small-tailed Han sheep73. The hatching weight and fat features of chickens were found to be substantially correlated with this gene100.
ASAP1 (14:102.95-106.04) and EYA1 (14:347.86-351.46) were similarly found to be related to scrotal circumference, beef quality, and production traits74,75. An investigation of gene expression revealed that EYA1 might play significant roles in both differentiated and undifferentiated bovine muscle101. Additionally, the QTL annotation for Pakistan Sahiwal showed that production traits contribute 13.91% (Fig. 7a), which was higher than in both NDRI and Hisar Sahiwal in this study. Based on these findings, we can conclude that the Pakistani Sahiwal cattle were under intense selection for production traits and carry important genes associated with them.
Conclusion
To our knowledge, this is the first attempt to study the patterns of comparative selection signatures in the genomes of transboundary Sahiwal cattle using the multi-point composite statistic DCMS. The study enriched our understanding of how different selection utilities left differential genomic footprints for various economic traits in these cattle across geographical boundaries. This study also demonstrated the power and reliability of the DCMS technique in the detection of selection signatures over univariate statistical techniques. The results revealed a number of major genes primarily related to milk production (NEK11, HMGCS1, BTN1A1, KCNH3) and reproductive traits (SH3BGR, PSMG1, BRWD1, B3GALT5) in the Indian Sahiwal, while the Pakistan Sahiwal population was selected mostly for growth and meat traits and carried a different set of candidate genes (RALGAPA2, RIN2, CFAP61). Despite selection for different utilities, the Sahiwal retained their fundamental genomic signature patterns associated with milk, growth and reproduction. Our findings add further insight into the genomic footprint of the Sahiwal, one of the most significant international transboundary cattle breeds, which will benefit the ongoing genetic improvement programmes in these countries.
Materials and methods
Animal resources, SNP genotyping and quality control
This study considered two Sahiwal cattle populations from India, National Dairy Research Institute (NDRI) and Hisar, as well as one Sahiwal cattle population from Pakistan. A total of 240 genotyped samples, comprising NDRI Sahiwal (n = 193), Hisar Sahiwal (n = 30), and Pakistan Sahiwal (n = 17), were obtained for analyses (Table 2). The NDRI Sahiwal and Hisar Sahiwal 50 K SNP data were obtained from NDDB, India, and were generated on a chip customized from the commercially available Illumina Bovine SNP chip (BovineSNP50K v3 Bead Chip). The Pakistan Sahiwal 50 K SNP data were sourced from a public data repository13.
The quality control (QC) of the genotyped data was implemented in the PLINK1.9 program105. Only the SNPs found on autosomes were taken into consideration for analysis, eliminating the unmapped SNPs and SNPs found on the X and Y chromosomes. SNPs with a Hardy–Weinberg equilibrium test p-value below 0.001, a minor allele frequency of less than 0.05, or a genotype call rate of less than 0.95 were excluded. Quality control of genotypes was performed again for the phasing of haplotypes with the SHAPEIT v2.r904 program106 to obtain high-quality SNPs. The versions and URLs of the software and packages used in this study are provided in Supplementary Table S1.
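The three SNP-level filters can be illustrated with a minimal Python sketch. It is not the PLINK implementation: the function name `snp_passes_qc` and the genotype coding (0/1/2 allele counts, `None` for missing) are assumptions made for illustration, and the Hardy–Weinberg test here is a simple 1-df chi-square approximation, whereas PLINK uses an exact test.

```python
import math

def snp_passes_qc(genotypes, maf_min=0.05, call_rate_min=0.95, hwe_p_min=0.001):
    """Return True if one SNP passes the call-rate, MAF and HWE filters.

    genotypes: per-animal allele counts (0, 1 or 2); None marks a missing call.
    """
    called = [g for g in genotypes if g is not None]
    if not called:
        return False
    call_rate = len(called) / len(genotypes)
    n = len(called)
    n_aa, n_ab, n_bb = called.count(0), called.count(1), called.count(2)
    p = (2 * n_bb + n_ab) / (2 * n)          # frequency of the counted allele
    maf = min(p, 1 - p)
    if maf == 0:
        return False                          # monomorphic SNP carries no signal
    # Chi-square goodness of fit to Hardy-Weinberg proportions (approximation).
    expected = [n * (1 - p) ** 2, 2 * n * p * (1 - p), n * p ** 2]
    observed = [n_aa, n_ab, n_bb]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    hwe_p = math.erfc(math.sqrt(chi2 / 2))    # upper tail of chi-square, 1 df
    return call_rate >= call_rate_min and maf >= maf_min and hwe_p >= hwe_p_min
```

A SNP is kept only when all three conditions hold, which mirrors the exclusion rules stated above; the thresholds are the ones given in the text.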
De-correlated composite of multiple selection signals (DCMS)
In this study, De-Correlated Composite of Multiple Selection Signals (DCMS) was used to integrate several statistics for selection signature detection while taking into consideration the correlation between the statistics. It comprises both intra-population and inter-population statistics and combines the following five methods, as described in107:
1. FST (Fixation index).
2. Haplotype Homozygosity Statistics (H1).
3. Modified Haplotype Homozygosity (H12).
4. Tajima’s D index.
5. Nucleotide Diversity (π).
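The DCMS statistic at position l is calculated as

$$\mathrm{DCMS}_l = \sum_{t=1}^{n} \frac{\log\left[\frac{1 - p_{lt}}{p_{lt}}\right]}{\sum_{i=1}^{n} \left| r_{it} \right|}$$

where p_{lt} is the p-value at position l for statistic t, r_{it} is the correlation between statistics i and t (so that the denominator acts as the weighting factor for statistic t), and n is the total number of test statistics combined in the DCMS12,108,109,110,111. To produce the DCMS, all the statistics were transformed into p-values using one-tailed and two-tailed ranks, which are fractional ranks that fall between 1/(n + 1) and n/(n + 1), where n in these bounds denotes the number of ranked values.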
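A minimal Python sketch of the two steps, assuming the usual DCMS form in which each statistic contributes log[(1 - p)/p] divided by the summed absolute correlations of that statistic with all statistics. The function names are hypothetical, the natural logarithm is an assumption, and ties are broken by input order rather than averaged.

```python
import math

def fractional_rank_pvalues(values, right_tailed=True):
    """Convert raw statistics to rank-based p-values in [1/(n+1), n/(n+1)].

    right_tailed=True gives the smallest p-value to the largest statistic;
    right_tailed=False gives it to the smallest statistic.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i], reverse=right_tailed)
    pvals = [0.0] * n
    for rank, idx in enumerate(order, start=1):
        pvals[idx] = rank / (n + 1)
    return pvals

def dcms(pvalue_rows, corr):
    """Combine per-SNP p-values into DCMS scores.

    pvalue_rows[l][t]: p-value of statistic t at SNP l.
    corr[i][t]: correlation between statistics i and t.
    """
    n_stats = len(corr)
    # Weight of statistic t is the inverse of its summed absolute correlations.
    weights = [1.0 / sum(abs(corr[i][t]) for i in range(n_stats))
               for t in range(n_stats)]
    return [sum(w * math.log((1 - p) / p) for w, p in zip(weights, row))
            for row in pvalue_rows]
```

With uncorrelated statistics every weight is 1, so the score is a plain sum of log-odds; when two statistics are perfectly correlated each receives weight 0.5, so the shared signal is counted once.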
The threshold for selection of DCMS value was set to q < 0.1. This was because when q < 0.05 or 0.01 was taken as significance threshold, only few significant markers were found. Therefore, the q value was set to < 0.1 for selecting more significant markers.
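The effect of the threshold choice can be illustrated with a small Python sketch of Benjamini-Hochberg adjusted values. The function name `bh_qvalues` and the example p-values are hypothetical, and this is a plain re-implementation for illustration, not the R routine used in the study.

```python
def bh_qvalues(pvalues):
    """Benjamini-Hochberg adjusted p-values (q-values), in input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    qvals = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so each q-value is the minimum of
    # p * m / rank over its own rank and all larger ranks.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        qvals[idx] = running_min
    return qvals

# Hypothetical p-values for four markers.
example_q = bh_qvalues([0.01, 0.04, 0.03, 0.5])
hits_at_010 = [i for i, q in enumerate(example_q) if q < 0.1]
hits_at_005 = [i for i, q in enumerate(example_q) if q < 0.05]
```

In this toy case three markers pass at q < 0.1 but only one at q < 0.05, which is the kind of trade-off between stringency and the number of retained markers described above.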
Effective population size (Ne)
Effective population size is the size of an idealised population experiencing the same rate of genetic drift as the population under study112. The parameter (--effective-size) was required for phasing, as a number of genetic analyses depend on haplotype phasing, which determines which genetic variants are physically situated on the same chromosome. Ne was therefore estimated using the GONE software14 for each of the three Sahiwal populations under study and subsequently supplied as the phasing parameter in the SHAPEIT v2.r904 program106.
Fixation index (FST)
The fixation index, a population differentiation measurement, was computed using PLINK1.9 --fst and --within functions for every SNP and breed. The R program’s runmed function was used to smooth the FST values of each SNP after FST values less than 0 were transformed to zeros.
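The clamping and smoothing step can be sketched in Python as follows. This is a stand-in for R's runmed, not a port of it: the window size k = 31 is an assumption (the text does not state one for FST), and filling the ends with the nearest full-window median is my reading of a "constant" end rule. The function names are hypothetical.

```python
from statistics import median

def clamp_negative(values):
    """Replace negative FST estimates with zero."""
    return [max(0.0, v) for v in values]

def running_median(values, k=31):
    """Running median with an odd window k and constant ends."""
    if k % 2 == 0:
        raise ValueError("window size k must be odd")
    n = len(values)
    half = k // 2
    if n < k:
        return list(values)   # too few SNPs to smooth; left unchanged here
    core = [median(values[i - half:i + half + 1]) for i in range(half, n - half)]
    # The first and last `half` positions copy the nearest full-window median.
    return [core[0]] * half + core + [core[-1]] * half
```

A typical call would be `running_median(clamp_negative(fst_values))` per chromosome, so that isolated spikes at single SNPs are damped while broad regional signals remain.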
The FST (Fixation Index) is a fundamental concept in population genetics that measures genetic divergence between populations. It measures the proportion of genetic diversity caused by differences between populations rather than within them. As a result, FST is a pairwise comparison statistic and must be compared between populations. In our present study, the three Sahiwal populations i.e., NDRI, Hisar and Pakistan were merged into three groups pair-wise, viz. NDRI and Hisar were merged in Group1, NDRI and Pakistan were merged in Group2, and Pakistan and Hisar were merged in Group3. Subsequently, PLINK1.9 was used to estimate pair-wise FST values.
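The pairwise design can be sketched in Python with a simple heterozygosity-based FST for one biallelic SNP. This is an illustration only: PLINK reports a Weir and Cockerham style estimator that also corrects for sample size, which this sketch does not, and the allele frequencies below are hypothetical.

```python
from itertools import combinations

def fst_biallelic(p1, p2):
    """FST = (HT - HS) / HT for one SNP, weighting both populations equally."""
    p_bar = (p1 + p2) / 2
    h_total = 2 * p_bar * (1 - p_bar)              # pooled heterozygosity
    if h_total == 0:
        return 0.0                                  # SNP fixed in both populations
    h_within = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2
    return (h_total - h_within) / h_total

# Hypothetical allele frequencies of a single SNP in the three populations.
freqs = {"NDRI": 0.2, "Hisar": 0.25, "Pakistan": 0.8}
pairs = list(combinations(freqs, 2))
pairwise_fst = {(a, b): fst_biallelic(freqs[a], freqs[b]) for a, b in pairs}
```

The three pairs produced by `combinations` correspond to the three groups described above; a large value for a pair indicates that the SNP differs strongly in frequency between those two populations.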
Haplotype homozygosity statistics (H1 and H12)
Each chromosome was phased individually using the SHAPEIT v2.r904 programme106. The haplotype homozygosity statistics (H1 and H12) were then obtained from the phased files with LASSIP v1.1.1 software, which computes haplotype frequency spectrum statistics including the LASSI composite likelihood ratio statistic113 and the H12 and H2/H1 statistics111.
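For readers unfamiliar with these statistics, the following sketch computes H1, H12 and H2/H1 from the haplotypes observed in one window, using the standard definitions: H1 is the sum of squared haplotype frequencies, H12 pools the two most frequent haplotypes before squaring, and H2 is H1 without the most frequent haplotype. It is an illustration only, not the LASSIP code path, and the haplotype strings are hypothetical.

```python
from collections import Counter


def haplotype_homozygosity(haplotypes):
    """Return H1, H12 and H2/H1 for the phased haplotypes in one window."""
    if not haplotypes:
        raise ValueError("at least one haplotype is required")
    n = len(haplotypes)
    # Haplotype frequencies, most frequent first.
    freqs = sorted((count / n for count in Counter(haplotypes).values()), reverse=True)
    p1 = freqs[0]
    p2 = freqs[1] if len(freqs) > 1 else 0.0
    h1 = sum(f * f for f in freqs)
    h12 = (p1 + p2) ** 2 + sum(f * f for f in freqs[2:])
    h2 = h1 - p1 * p1
    return {"H1": h1, "H12": h12, "H2/H1": h2 / h1}


if __name__ == "__main__":
    window = ["AAT"] * 5 + ["AGT"] * 3 + ["CGT"] * 2  # toy phased haplotypes
    print(haplotype_homozygosity(window))
```

A high H12 together with a low H2/H1 points towards a single dominant haplotype, whereas a high H12 with a higher H2/H1 is more consistent with several haplotypes at elevated frequency.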
Tajima’s D and nucleotide diversity (π)
The VCFTOOLS v0.1.16 program was used to estimate Tajima’s D and pi statistics114. Using the --TajimaD option, Tajima’s D was determined for each breed and chromosome individually in non-overlapping windows of 300 kb (--TajimaD 300000). SNPs within a 300-kb bin were assigned the D value estimated for that bin, and missing values were replaced with zeros. For nucleotide diversity, the --site-pi option was used to calculate the pi statistic for each breed and chromosome independently. The outputs were then smoothed for each chromosome using R’s runmed function with a window size of 31 SNPs (k = 31, endrule = “constant”) in order to remove noise107.
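The two statistics can be written out explicitly. The sketch below uses the standard Tajima (1989) constants and the per-site form of π, taking the alternate-allele count at each SNP in a window and the number of sampled chromosomes n as input. It is a simplified illustration rather than the VCFTOOLS code: the function names are hypothetical, the window indexing is 0-based and may not match VCFTOOLS bin boundaries, and undefined values are returned as None (the study replaced missing values with zeros).

```python
import math


def site_pi(alt_count, n):
    """Per-site nucleotide diversity for a biallelic SNP among n chromosomes."""
    return 2 * alt_count * (n - alt_count) / (n * (n - 1))


def tajimas_d(alt_counts, n):
    """Tajima's D for one window; None when it is undefined."""
    if n < 4:
        raise ValueError("need at least 4 chromosomes")
    segregating = [c for c in alt_counts if 0 < c < n]
    s = len(segregating)
    if s == 0:
        return None
    pi = sum(site_pi(c, n) for c in segregating)  # mean pairwise differences
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    variance = e1 * s + e2 * s * (s - 1)
    if variance <= 0:
        return None
    return (pi - s / a1) / math.sqrt(variance)


def window_index(position_bp, window_bp=300_000):
    """0-based index of the non-overlapping window containing a position."""
    return position_bp // window_bp


if __name__ == "__main__":
    print(tajimas_d([1, 1, 1, 1], n=10))  # excess of rare alleles
    print(tajimas_d([5, 5, 5, 5], n=10))  # excess of intermediate alleles
```

Negative D reflects an excess of rare variants, as expected after a selective sweep, which is why the left tail of Tajima's D and π is the informative one when the statistics are converted to p-values.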
Calculation of DCMS statistic
To create the composite signal known as DCMS, all five statistics for each SNP (H1, H12, FST, Tajima’s D and nucleotide diversity (π)) were combined. Using the stat_to_pvalue function of the MINOTAUR package in the R environment, the left-tailed test was applied to Tajima’s D and π based on fractional ranks, while the right-tailed test was applied to H1, H12 and FST16. Subsequently, an n × n correlation matrix was calculated using the CovNAMcd function (alpha = 0.75, nsamp = 50,000) from the rrcovNA v.0.5-2 R package115. This matrix was passed to the MINOTAUR R package16 to compute the genome-wide DCMS values. The MASS v.7.3-61 R package116 was then used to fit the DCMS values to a normal distribution by applying a robust linear model (rlm)107. The fitted model’s outputs, i.e. Mu [mean] and SD [standard deviation], were input into the pnorm R function to determine the p-values of the DCMS statistics: dcms_pvalues = pnorm(q = dcms, mean = mu, sd = SD, lower.tail = FALSE). Finally, the resulting p-values were transformed into the corresponding q-values using the q-value R function following the Benjamini and Hochberg method117.
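The last two steps, upper-tail normal p-values followed by the Benjamini and Hochberg adjustment, can be sketched in Python. Two points are assumptions made for this illustration: the robust mean and standard deviation are approximated with the median and the scaled median absolute deviation instead of the rlm fit used in the study, and the adjustment shown is the plain Benjamini and Hochberg step-up procedure rather than the output of any particular R function. All function names and the toy DCMS values are hypothetical.

```python
import math
from statistics import median


def robust_location_scale(values):
    """Median and scaled MAD as stand-ins for a robust mean and SD."""
    mu = median(values)
    mad = median(abs(v - mu) for v in values)
    return mu, 1.4826 * mad  # 1.4826 makes the MAD consistent with a normal SD


def upper_tail_pvalue(x, mu, sd):
    """Equivalent of pnorm(q = x, mean = mu, sd = sd, lower.tail = FALSE)."""
    return 0.5 * math.erfc((x - mu) / (sd * math.sqrt(2)))


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


if __name__ == "__main__":
    dcms_values = [0.1, -0.3, 0.2, 0.0, -0.1, 4.5, 0.3, -0.2]  # toy scores
    mu, sd = robust_location_scale(dcms_values)
    pvals = [upper_tail_pvalue(v, mu, sd) for v in dcms_values]
    qvals = benjamini_hochberg(pvals)
    significant = [i for i, q in enumerate(qvals) if q < 0.1]
    print(significant)
```

Only the upper tail is tested because large DCMS values, not small ones, indicate loci where the component statistics jointly point to selection.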
Gene annotation and functional annotation
Genes located in genomic regions with q-values below 0.1 were considered significant. The R package GALLO v1.4 (Genomic Annotation in Livestock for positional candidate Loci) was used to annotate the genes and QTLs118. To locate the genes and QTLs, the gene and QTL annotation files (.gtf and .gff files) produced from the ARS-UCD1.2 assembly119 and the Animal QTL Database120, respectively, were utilised. The gene and QTL enrichment analysis was conducted using the same GALLO v1.4 programme for all QTLs identified using the chromosome-based technique, and functional annotation was done through PANTHER v18.0121.
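Positional annotation of this kind reduces to an interval-overlap query between significant regions and annotated gene coordinates. The sketch below shows the idea in plain Python; it is not the GALLO interface, it assumes closed intervals with no flanking distance, and the region and gene records are placeholders rather than real coordinates.

```python
def genes_in_regions(regions, genes):
    """Map each region to the names of genes that overlap it.

    regions: list of (chromosome, start, end) tuples.
    genes:   list of (name, chromosome, start, end) tuples.
    Two closed intervals overlap when each starts before the other ends.
    """
    hits = {}
    for region in regions:
        r_chrom, r_start, r_end = region
        hits[region] = [
            name
            for name, g_chrom, g_start, g_end in genes
            if g_chrom == r_chrom and g_start <= r_end and g_end >= r_start
        ]
    return hits


if __name__ == "__main__":
    # Placeholder records for illustration only.
    regions = [("1", 1_000_000, 1_200_000), ("2", 500_000, 600_000)]
    genes = [
        ("GENE_A", "1", 950_000, 1_010_000),    # overlaps the start of region 1
        ("GENE_B", "1", 1_300_000, 1_350_000),  # downstream of region 1
        ("GENE_C", "2", 520_000, 530_000),      # inside region 2
        ("GENE_D", "3", 520_000, 530_000),      # same coordinates, other chromosome
    ]
    for region, names in genes_in_regions(regions, genes).items():
        print(region, names)
```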
Network formation and hub genes identification
Using the STRING v12.0 database122, we integrated the protein-coding genes to derive a biologically relevant interaction network. The results were then visualized using CYTOSCAPE v3.10.1 software123, and the hub genes were identified based on their associations with other genes in the protein network using the widely used Maximal Clique Centrality (MCC) algorithm.
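As an illustration of how MCC ranks nodes, the sketch below scores each node by summing (|C| - 1)! over the maximal cliques C that contain it, following the published definition of the measure. Cliques are enumerated with a basic Bron-Kerbosch recursion, which is adequate for small networks only. This is not the Cytoscape plugin code, isolated nodes are given a score of zero by assumption, and the edge list is a placeholder.

```python
from math import factorial


def build_adjacency(edges):
    """Undirected adjacency sets from a list of (node, node) pairs."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def maximal_cliques(adj):
    """Enumerate maximal cliques with the basic Bron-Kerbosch recursion."""
    cliques = []

    def expand(current, candidates, excluded):
        if not candidates and not excluded:
            cliques.append(current)
            return
        for v in list(candidates):
            expand(current | {v}, candidates & adj[v], excluded & adj[v])
            candidates = candidates - {v}
            excluded = excluded | {v}

    expand(set(), set(adj), set())
    return cliques


def mcc_scores(edges):
    """MCC(v): sum of (|C| - 1)! over maximal cliques C containing v."""
    adj = build_adjacency(edges)
    scores = {node: 0 for node in adj}
    for clique in maximal_cliques(adj):
        if len(clique) < 2:
            continue  # isolated node: score stays zero (assumption)
        for node in clique:
            scores[node] += factorial(len(clique) - 1)
    return scores


if __name__ == "__main__":
    toy_edges = [("A", "B"), ("A", "C"), ("B", "C"), ("A", "D")]  # placeholder network
    ranked = sorted(mcc_scores(toy_edges).items(), key=lambda kv: kv[1], reverse=True)
    print(ranked)
```

For a node whose neighbours are not connected to each other, every maximal clique is a single edge, so the score equals the node degree.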
Data availability
The genotypic data used and analyzed for NDRI and Hisar Sahiwal are available from the corresponding author upon reasonable request. Genotypic data of Pakistan Sahiwal can be found in the WIDDE database (http://widde.toulouse.inra.fr/widde/).
References
Sahiwal Australia. (n.d.). Sahiwal cattle – Pure tropical breed. Retrieved from https://www.sahiwal.com.au.
Rehman, Z., Khan, M. S. & Mirza, M. A. Factors affecting performance of Sahiwal cattle-a review. JAPS: J. Anim. Plant. Sci. 24, 1–12 (2014).
Prakash, B. et al. Cattle genetic resources of India. Sahiwal Cattle-The champion breed. Monograph. 13 (2015).
Haskell, M. J., Simm, G. & Turner, S. P. Genetic selection for temperament traits in dairy and beef cattle. Front. Genet. 5 https://doi.org/10.3389/fgene.2014.00368 (2014).
Illa, S. K., Mukherjee, S., Nath, S. & Mukherjee, A. Genome-Wide scanning for signatures of selection revealed the putative genomic regions and candidate genes controlling milk composition and coat color traits in Sahiwal cattle. Front. Genet. 12 https://doi.org/10.3389/fgene.2021.699422 (2021).
Kreitman, M. Methods to detect selection in populations with applications to the human. Annu. Rev. Genomics Hum. Genet. 1, 539–559. https://doi.org/10.1146/annurev.genom.1.1.539 (2000).
Cesarani, A. et al. Genome-wide variability and selection signatures in Italian Island cattle breeds. Anim. Genet. 49, 371–383. https://doi.org/10.1111/age.1269 (2018).
Dixit, S. P. et al. Genome analyses revealed genetic admixture and selection signatures in Bos indicus. Sci. Rep. 11, 21924. https://doi.org/10.1038/s41598-021-01144-2 (2021).
Dash, S., Singh, A., Dixit, S. P. & Kumar, A. Identification of selection signatures for milk performance traits among Indigenous dairy cattle breeds using high density genomic information. Indian J. Anim. Res. 1 https://doi.org/10.18805/IJAR.B-4908 (2022).
Sesay, A. R., Saif-ur-Rehman, M., Ramzan, F. & Awan, F. S. Genome-Wide assessment of signatures of selection in the Pakistan Sahiwal cattle. Pakistan J. Zool. 56, 1501–2000. https://doi.org/10.17582/journal.pjz/20231211061004 (2024).
Grossman, S. R. et al. A composite of multiple signals distinguishes causal variants in regions of positive selection. Science 327, 883–886. https://doi.org/10.1126/science.1183863 (2010).
Ma, Y. et al. Properties of different selection signature statistics and a new strategy for combining them. Heredity 115, 426–436. https://doi.org/10.1038/hdy.2015.42 (2015).
Sempéré, G. et al. WIDDE: a Web-Interfaced next generation database for genetic diversity exploration, with a first application in cattle. BMC Genomics 16, 940. https://doi.org/10.1186/s12864-015-2181-1 (2015).
Santiago, E. et al. Recent demographic history inferred by high-resolution analysis of linkage disequilibrium. Mol. Biol. Evol. 37, 3642–3653. https://doi.org/10.1093/molbev/msaa169 (2020).
FAO. Secondary guidelines for development of national farm animal genetic resources management plans. Management of small populations at risk. (1998).
Verity, R. et al. Minotaur: a platform for the analysis and visualization of multivariate results from genome scans with R Shiny. Mol. Ecol. Resour. 17, 33–43. https://doi.org/10.1111/1755-0998.12579 (2017).
de Simoni Gouveia, J. J., da Silva, M. V. G. B., Paiva, S. R. & de Oliveira, S. M. P. Identification of selection signatures in livestock species. Genet. Mol. Biol. 37, 330–342. https://doi.org/10.1590/S1415-47572014000300004 (2014).
Verardo, L. L. Gene networks from genome wide association studies for pigs reproductive traits. http://www.locus.ufv.br/handle/123456789/6773 (2015)
Cai, Z., Guldbrandtsen, B., Lund, M. S. & Sahana, G. Prioritizing candidate genes for fertility in dairy cows using gene-based analysis, functional annotation and differential gene expression. BMC Genom. 20, 1–9. https://doi.org/10.1186/s12864-019-5638-9 (2019).
Sigdel, A., Liu, L., Abdollahi-Arpanahi, R., Aguilar, I. & Peñagaricano, F. Genetic dissection of reproductive performance of dairy cows under heat stress. Anim. Genet. 51, 511–520. https://doi.org/10.1111/age.12943 (2020).
Mota, L. F. Integrating genome-wide association study and pathway analysis reveals physiological aspects affecting heifer early calving defined at different ages in Nelore cattle. Genomics 114, 110395. https://doi.org/10.1016/j.ygeno.2022.110395 (2022).
Vigneault, C., McGraw, S., Massicotte, L. & Sirard, M. A. Transcription factor expression patterns in bovine in vitro-derived embryos prior to maternal-zygotic transition. Biol. Reprod. 70, 1701–1709. https://doi.org/10.1095/biolreprod.103.022970 (2004).
Alexandre, P. A. et al. Systems biology reveals NR2F6 and TGFB1 as key regulators of feed efficiency in beef cattle. Front. Genet. 10, 230. https://doi.org/10.3389/fgene.2019.00230 (2019).
Yao, D. et al. cAMP response element binding protein 1 (CREB1) promotes monounsaturated fatty acid synthesis and triacylglycerol accumulation in goat mammary epithelial cells. Animals 10, 1871. https://doi.org/10.3390/ani10101871 (2020).
Cheng, Z. et al. Hepatic global transcriptomic profiles of Holstein cows according to parity reveal age-related changes in early lactation. Int. J. Mol. Sci. 24, 9906. https://doi.org/10.3390/ijms24129906 (2023).
Brajnik, Z. & Ogorevc, J. Candidate genes for mastitis resistance in dairy cattle: a data integration approach. J. Anim. Sci. Biotechnol. 14, 10. https://doi.org/10.1186/s40104-022-00821-0 (2023).
Marina, H. et al. Genome-wide association studies (GWAS) and post-GWAS analyses for technological traits in Assaf and Churra dairy breeds. J. Dairy Sci. 104, 11850–11866. https://doi.org/10.3168/jds.2021-20510 (2021).
Nayeri, S. & Stothard, P. Tissues, metabolic pathways and genes of key importance in lactating dairy cattle. Springer Sci. Reviews. 4, 49–77. https://doi.org/10.1007/s40362-016-0040-3 (2016).
Witkowski, A., Thweatt, J. & Smith, S. Mammalian ACSF3 protein is a Malonyl-CoA synthetase that supplies the chain extender units for mitochondrial fatty acid synthesis. J. Biol. Chem. 286, 33729–33736. https://doi.org/10.1074/jbc.M111.291591 (2011).
Srikanth, K., Lee, E., Kwan, A., Lim, Y. & Chung, H. Genetic variations in the bovine fatty acid desaturase 6 (FADS6) are associated with fatty acid composition in Hanwoo cattle. J. Agric. Sci. 8, 41–49. https://doi.org/10.5539/jas.v8n12p41 (2016).
Ariyarathne, H. B. P. C., Correa-Luna, M., Blair, H. T., Garrick, D. J. & Lopez-Villalobos, N. Identification of genomic regions associated with concentrations of milk fat, protein, Urea and efficiency of crude protein utilization in grazing dairy cows. Genes 12, 456. https://doi.org/10.3390/genes12030456 (2021).
Rikitake, Y. et al. Anti-oxidative properties of Fluvastatin, an HMG-CoA reductase inhibitor, contribute to prevention of atherosclerosis in cholesterol-fed rabbits. Atherosclerosis 154, 87–96. https://doi.org/10.1016/S0021-9150(00)00468-8 (2001).
Petersen, M. I. et al. Expression-based analysis of genes related to single nucleotide polymorphism hits associated with bovine leukemia virus proviral load in Argentinean dairy cattle. J. Dairy Sci. 104, 1993–2007. https://doi.org/10.3168/jds.2020-18924 (2021).
Dai, W. T., Zou, Y. X., White, R. R., Liu, J. X. & Liu, H. Y. Transcriptomic profiles of the bovine mammary gland during lactation and the dry period. Funct. Integr. Genom. 18, 125–140. https://doi.org/10.1007/s10142-017-0580-x (2018).
Kumar, M., Ratwan, P., Das, R., Chopra, A. & Vohra, V. Allelic diversity of Butyrophilin (BTN1A1) gene in Indian bovines. Indonesian J. Biotechnol. 22, 92–97 https://jurnal.ugm.ac.id/ijbiotech/article/view/30332/20022 (2017).
Tizioto, P. C. et al. Detection of quantitative trait loci for mineral content of Nelore longissimus dorsi muscle. Genet. Selection Evol. 47, 1–9. https://doi.org/10.1186/s12711-014-0083-3 (2015).
Yan, Z. et al. Identification of whole-genome significant single nucleotide polymorphisms in candidate genes associated with body conformation traits in Chinese Holstein cattle. Anim. Genet. 51, 141–146. https://doi.org/10.1111/age.12865 (2020).
Chernukha, I., Abdelmanova, A., Kotenkova, E., Kharzinova, V. & Zinovieva, N. A. Assessing Genetic Diversity and Searching for Selection Signatures by Comparison between the Indigenous Livni and Duroc Breeds in Local Livestock of the Central Region of Russia. Diversity 14, 859 https://doi.org/10.3390/d14100859 (2022).
Johnston, D. et al. Genome wide association study of passive immunity and disease traits in beef-suckler and dairy calves on Irish farms. Sci. Rep. 10, 18998. https://doi.org/10.1038/s41598-020-75870-4 (2020).
Chen, Z., Yao, Y., Ma, P., Wang, Q. & Pan, Y. Haplotype-based genome-wide association study identifies loci and candidate genes for milk yield in Holsteins. PLoS One. 13, 0192695. https://doi.org/10.1371/journal.pone.0192695 (2018).
Zhang, Z. et al. TGIF1 and SF1 polymorphisms are associated with litter size in small tail Han sheep. Reprod. Domest. Anim. 55, 1145–1153. https://doi.org/10.1111/rda.13753 (2020).
Lau, L. Y. et al. Gene regulation could be attributed to TCF3 and other key transcription factors in the muscle of pubertal heifers. Veterinary Med. Sci. 6, 695–710. https://doi.org/10.1002/vms3.278 (2020).
Yang, M. et al. Association study and expression analysis of CYP4A11 gene copy number variation in Chinese cattle. Sci. Rep. 7, 46599. https://doi.org/10.1038/srep46599 (2017).
Leal-Gutierrez, J. D., Elzo, M. A. & Mateescu, R. G. Identification of eQTLs and sQTLs associated with meat quality in beef. BMC Genom. 21, 104. https://doi.org/10.1186/s12864-020-6520-5 (2020).
Zhang, S. Function and mechanism of map4k4 genes on milk traits and Scs between dairy cow and Buffalo. Annals Agricultural Sci. Moshtohor. 56(4th ICBAA), 89–90. https://doi.org/10.21608/assjm (2018).
Marete, A. G. et al. A meta-analysis including pre-selected sequence variants associated with 7 traits in 3 French dairy cattle populations. Front. Genet. 9, 522. https://doi.org/10.3389/fgene.2018.00522 (2018).
Atashi, H. et al. Single-step genome-wide association for selected milk fatty acids in Dual-Purpose Belgian blue cows. J. Dairy Sci. 106, 6299–6315. https://doi.org/10.3168/jds.2022-22432 (2023).
Alvarenga, A. B. et al. Unraveling the phenotypic and genomic background of behavioral plasticity and temperament in North American Angus cattle. Genet. Selection Evol. 55, 3. https://doi.org/10.1186/s12711-023-00777-3 (2023).
Srirattana, K. & St. John, J. C. Manipulating the mitochondrial genome to enhance cattle embryo development. G3 Genes Genomes Genet. 7, 2065–2080 https://doi.org/10.1534/g3.117.04265 (2017).
Cai, Z., Guldbrandtsen, B., Lund, M. S. & Sahana, G. Prioritizing candidate genes post-GWAS using multiple sources of data for mastitis resistance in dairy cattle. BMC Genom. 19, 656. https://doi.org/10.1186/s12864-018-5050-x (2018).
Petersen, J. L., Sieck, R. L. & Steffen, D. J. White coat color of a black Angus calf attributed to an occurrence of the delR217 variant of MITF. Anim. Genet. https://doi.org/10.1111/age.13327 (2023).
Venturini, G. C. et al. Association between single-nucleotide polymorphisms and milk production traits in Buffalo. Genet. Mol. Res. 13, 10256–10268. https://doi.org/10.4238/2014.December.4.20 (2014).
Howard, J. T. et al. Beef cattle body temperature during Climatic stress: a genome-wide association study. Int. J. Biometeorol. 58, 1665–1672. https://doi.org/10.1007/s00484-013-0773-5 (2014).
Piórkowska, K. et al. The pituitary transcriptional response related to feed conversion in pigs. Genes (Basel). 10, 712. https://doi.org/10.3390/genes10090712 (2019).
Khan, I. M., Khan, A., Liu, H. & Khan, M. Z. Genetic markers identification for animal production and disease resistance. Front. Genet. 14, 1243793. https://doi.org/10.3389/fgene.2023.1243793 (2023).
Lyu, P., Settlage, R. E. & Jiang, H. Genome-wide identification of enhancers and transcription factors regulating the myogenic differentiation of bovine satellite cells. BMC Genom. 22, 901. https://doi.org/10.1186/s12864-021-08224-7 (2021).
Han, Y. & Peñagaricano, F. Unravelling the genomic architecture of bull fertility in Holstein cattle. BMC Genet. 17, 143. https://doi.org/10.1186/s12863-016-0454-6 (2016).
Carvalho, M. E. et al. Identification of genomic regions related to tenderness in Nellore beef cattle. Adv. Anim. Biosci. 8, 42–44. https://doi.org/10.1017/S2040470017001674 (2017).
Zhang, H. et al. Genetic parameters and genome-wide association studies of eight longevity traits representing either full or partial lifespan in Chinese Holsteins. Front. Genet. 12, 634986. https://doi.org/10.3389/fgene.2021.634986 (2021).
Sanchez, M. P. et al. Confirmation of candidate causative variants on milk composition and cheesemaking properties in Montbéliarde cows. J. Dairy Sci. 101, 10076–10081. https://doi.org/10.3168/jds.2018-14986 (2018).
Zhu, Y., Tong, H. L., Li, S. F. & Yan, Y. Q. Effect of TCEA3 on the differentiation of bovine skeletal muscle satellite cells. Biochem. Biophys. Res. Commun. 484, 827–832. https://doi.org/10.1016/j.bbrc.2017.01.182 (2017).
Verma, P. et al. Transcriptome analysis of Circulating PBMCs to understand mechanism of high altitude adaptation in native cattle of Ladakh region. Sci. Rep. 8, 7681. https://doi.org/10.1038/s41598-018-25736-7 (2018).
Doyle, J. L. et al. Genomic regions associated with skeletal type traits in beef and dairy cattle are common to regions associated with carcass traits, feed intake and calving difficulty. Front. Genet. 11, 20. https://doi.org/10.3389/fgene.2020.0002 (2020).
Karisa, B. K. et al. Candidate genes and biological pathways associated with carcass quality traits in beef cattle. Can. J. Anim. Sci. 93, 295–306. https://doi.org/10.4141/cjas2012-136 (2013).
Vanvanhossou, S. F. U. et al. Multi-breed GWAS for morphometric traits in four Beninese Indigenous cattle breeds reveals loci associated with conformation, carcass and adaptive traits. BMC Genom. 21, 783. https://doi.org/10.1186/s12864-020-07170-0 (2020).
Zhong, L. et al. Methylation levels in the promoter region of FHIT and PIAS1 genes associated with mastitis resistance in Xinjiang brown cattle. Genes 14, 1189. https://doi.org/10.3390/genes14061189 (2023).
Do, D. N. et al. Genome-wide association study reveals genetic architecture of eating behavior in pigs and its implications for humans obesity by comparative mapping. PloS One. 8, 71509. https://doi.org/10.1371/journal.pone.0071509 (2013).
Pegolo, S. et al. SNP co-association and network analyses identify E2F3, KDM5A and BACH2 as key regulators of the bovine milk fatty acid profile. Sci. Rep. 7, 17317. https://doi.org/10.1038/s41598-017-17434-7 (2017).
Zhang, Y. et al. Transcriptome analysis of bovine rumen tissue in three developmental stages. Front. Genet. 13 https://doi.org/10.3389/fgene.2022.821406 (2022).
Naserkheil, M., Bahrami, A., Lee, D. & Mehrban, H. Integrating single-step GWAS and bipartite networks reconstruction provides novel insights into yearling weight and carcass traits in Hanwoo beef cattle. Animals 10, 1836. https://doi.org/10.3390/ani10101836 (2020).
Pitt, D. et al. Demography and rapid local adaptation shape Creole cattle genome diversity in the tropics. Evol. Appl. 12, 105–122. https://doi.org/10.1111/eva.12641 (2019).
Alshawi, A., Essa, A., Al-Bayatti, S. & Hanotte, O. Genome analysis reveals genetic admixture and signature of selection for productivity and environmental traits in Iraqi cattle. Front. Genet. 10, 609. https://doi.org/10.3389/fgene.2019.00609 (2019).
Cao, Y., Jin, H., Ma, H. & Zhao, Z. Comparative analysis on genome-wide DNA methylation in longissimus dorsi muscle between small tailed Han and Dorper × small tailed Han crossbred sheep. Asian-Australas J. Anim. Sci. 30, 1529–1539. https://doi.org/10.5713/ajas.17.0154 (2017).
Tizioto, P. C. et al. A SNP in ASAP1 gene is associated with meat quality and production traits in Nelore breed. Meat Sci. 92, 855–857. https://doi.org/10.1016/j.meatsci.2012.05.018 (2012).
Ramayo-Caldas, Y. et al. A marker-derived gene network reveals the regulatory role of PPARGC1A, HNF4G, and FOXP3 in intramuscular fat deposition of beef cattle. J. Anim. Sci. 92, 2832–2845. https://doi.org/10.2527/jas.2013-7484 (2014).
Muniz, M. M. M. et al. Characterization of novel LncRNA muscle expression profiles associated with meat quality in beef cattle. Evol. Appl. 15, 706–718. https://doi.org/10.1111/eva.13365 (2022).
Srikanth, K., Lee, E., Kwon, A., Jang, G. & Chung, H. Association of a single nucleotide polymorphism in the calneuron 1 gene on meat quality and carcass traits in Hanwoo (Bos Taurus coreanae). JAPS J. Anim. Plant. Sci. 28, 651–655 (2018). http://www.thejaps.org.pk/docs/v-28-02/38.pdf
Poulsen, N. A., Robinson, R. C., Barile, D., Larsen, L. B. & Buitenhuis, B. A genome-wide association study reveals specific transferases as candidate loci for bovine milk oligosaccharides synthesis. BMC Genom. 20, 404. https://doi.org/10.1186/s12864-019-5786-y (2019).
Hardie, L. C. et al. The genetic and biological basis of feed efficiency in mid-lactation Holstein dairy cows. J. Dairy Sci. 100, 9061–9075. https://doi.org/10.3168/jds.2017-12604 (2017).
Soares, R. A. N. et al. Differential gene expression in dairy cows under negative energy balance and ketosis: A systematic review and meta-analysis. J. Dairy Sci. 104, 602–615. https://doi.org/10.3168/jds.2020-18883 (2021).
Li, Z. et al. KCNH3 predicts poor prognosis and promotes progression in ovarian cancer. OncoTargets Therapy. 10323–10333. https://doi.org/10.2147/OTT.S268055 (2020).
FAO. Intergovernmental Technical Working Group on Animal Genetic Resources for Food and Agriculture & Session, S. http://www.fao.org/docrep/meeting/026/me588e.pdf (2012).
FAO. The second report on the state of the World’s Animal Genetic Resources for Food and Agriculture. (eds Scherf, B. D. & Pilling, D.) http://www.fao.org/3/a-i4787e.pdf (FAO, 2015).
Jiao, X. L. et al. Ontogenetic expression of Lpin2 and Lpin3 genes and their associations with traits in two breeds of Chinese fat-tailed sheep. Asian-Australasian J. Anim. Sci. 29, 333. https://doi.org/10.5713/ajas.15.0467 (2016).
Fei, W. et al. A role for phosphatidic acid in the formation of supersized lipid droplets. PLoS Genet. 7, 1002201. https://doi.org/10.1371/journal.pgen.1002201 (2011).
Di Gerlando, R. et al. Genome-wide association study between CNVs and milk production traits in Valle Del Belice sheep. PLoS One. 14, 0215204. https://doi.org/10.1371/journal.pone.0215204 (2019).
Lyu, Y. et al. Local ancestry and selection in admixed Sanjiang cattle. Stress Biology. 3, 30. https://doi.org/10.1007/s44154-023-00101-5 (2023).
Graae, A-S. et al. ADAMTS9 regulates skeletal muscle insulin sensitivity through extracellular matrix alterations. Diabetes 68, 502–514. https://doi.org/10.2337/db18-0418 (2019).
Jungers, K. A. et al. Adamts9 is widely expressed during mouse embryo development. Gene Expr Patterns. 5, 609–617. https://doi.org/10.1016/j.modgep.2005.03.004 (2005).
Tang, Q. et al. Detection of two insertion/deletions (indels) within the ADAMTS9 gene and their associations with growth traits in goat. Small Rumin Res. 180, 9–14. https://doi.org/10.1016/j.smallrumres.2019.09.015 (2019).
Mazzocco, M. et al. The identification of a novel human homologue of the SH3 binding glutamic acid-rich (SH3BGR) gene establishes a new family of highly conserved small proteins related to thioredoxin superfamily. Gene 291, 233–239. https://doi.org/10.1016/S0378-1119(02)00602-9 (2002).
Pattabiraman, S. et al. Mouse BRWD1 is critical for spermatid postmeiotic transcription and female meiotic chromosome stability. J. Cell. Biol. 208, 53–69. https://doi.org/10.1083/jcb.201404109 (2015).
Nanduri, R., Furusawa, T. & Bustin, M. Biological functions of HMGN chromosomal proteins. Int. J. Mol. Sci. 21, 449. https://doi.org/10.3390/ijms21020449 (2020).
Ren, Y. et al. Adaptive selection signatures in river Buffalo with emphasis on immune and major histocompatibility complex genes. Genomics 113, 3599–3609. https://doi.org/10.1016/j.ygeno.2021.08.021 (2021).
Nivsarkar, A. E., Vij, P. K. & Tantia, M. S. Animal Genetic Resources of India Cattle and Buffalo (Vinayak, 2000).
Li, J. et al. Molecular biology of BPIFB1 and its advances in disease. Ann. Transl. Med. 8 https://doi.org/10.21037/atm-20-3462 (2020).
Raphaka, K. et al. Genomic regions underlying susceptibility to bovine tuberculosis in Holstein-Friesian cattle. BMC Genet. 18, 1–10. https://doi.org/10.1186/s12863-017-0493-7 (2017).
Powell, P. K., Wolf, I. & Lasker, J. M. Identification of CYP4A11 as the major lauric acid omega-hydroxylase in human liver microsomes. Arch. Biochem. Biophys. 335, 219–226. https://doi.org/10.1006/abbi.1996.0501 (1996).
Moon, S. et al. A genome-wide scan for selective sweeps in racing horses. Asian-Australasian J. Anim. Sci. 28, 1525–1531. https://doi.org/10.5713/ajas.14.0696 (2015).
Lin, W. et al. Novel 61-bp indel of RIN2 is associated with fat and hatching weight traits in chickens. Front. Genet. 12, 672888. https://doi.org/10.3389/fgene.2021.672888 (2021).
Hudson, N. J., Lyons, R. E., Reverter, A., Greenwood, P. L. & Dalrymple, B. P. Inferring the in vivo cellular program of developing bovine skeletal muscle from expression data. Gene Expr Patterns. 13, 109–125. https://doi.org/10.1016/j.gep.2013.02.001 (2013).
Togla, O. et al. Influence of non-genetic factors on milk yield traits in Sahiwal cows. Pharma Innov. J. 11, 2272–2276 (2021). https://www.thepharmajournal.com/archives/2021/vol10issue11S/PartV/S-10-11-116-734.pdf
Department of Animal Husbandry and Dairying. Haryana. (n.d.). Government Livestock Farm, Hisar. Pashudhan Haryana. Retrieved from https://pashudhanharyana.gov.in/12-government-livestock-farm-hisar
Rehman, Z. & Khan, M. S. Environmental factors affecting performance traits of Sahiwal cattle in Pakistan. Pakistan Veterinary J. 32, 229–233 (2011).
Purcell, S. et al. PLINK: a tool set for whole-genome association and population-based linkage analyses. Am. J. Hum. Genet. 81, 559–575. https://doi.org/10.1086/519795 (2007).
Delaneau, O., Marchini, J. & Zagury, J. F. A linear complexity phasing method for thousands of genomes. Nat. Methods. 9, 179–181. https://doi.org/10.1038/nmeth.1785 (2012).
Yurchenko, A. A. et al. Scans for signatures of selection in Russian cattle breed genomes reveal new candidate genes for environmental adaptation and acclimation. Sci. Rep. 8, 12984. https://doi.org/10.1038/s41598-018-31304-w (2018).
Nei, M. & Li, W. H. Mathematical model for studying genetic variation in terms of restriction endonucleases. Proc. Natl. Acad. Sci. U S A. 76, 5269–5273. https://doi.org/10.1073/pnas.76.10.5269 (1979).
Weir, B. S. & Cockerham, C. C. Estimating F-statistics for the analysis of population structure. Evolution 38, 1358–1370. https://doi.org/10.2307/2408641 (1984).
Tajima, F. Statistical method for testing the neutral mutation hypothesis by DNA polymorphism. Genetics 123, 585–595. https://doi.org/10.1093/genetics/123.3.585 (1989).
Garud, N. R., Messer, P. W., Buzbas, E. O. & Petrov, D. A. Recent selective sweeps in North American Drosophila melanogaster show signatures of soft sweeps. PLoS Genet. 11, 1005004. https://doi.org/10.1371/journal.pgen.1005004 (2015).
Wright, S. Evolution in Mendelian populations. Genetics 16, 97–159. https://doi.org/10.1093/genetics/16.2.97 (1931).
Harris, A. M. & DeGiorgio, M. Identifying and classifying shared selective sweeps from multilocus data. Genetics 215, 143–171 (2020).
Danecek, P. et al. The variant call format and vcftools. Bioinformatics 27, 2156–2158. https://doi.org/10.1093/bioinformatics/btr330 (2011).
Todorov, V., Templ, M. & Filzmoser, P. Detection of multivariate outliers in business survey data with incomplete information. Adv. Data Anal. Classif. 5, 37–56. https://doi.org/10.1007/s11634-010-0075-2 (2011).
Venables, W. N. & Ripley, B. D. Modern applied statistics with S-PLUS. Springer Sci. Bus. Media (2002).
Benjamini, Y. & Hochberg, Y. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J. Roy. Stat. Soc.: Ser. B (Methodol.). 57, 289–300. https://doi.org/10.1111/j.2517-6161.1995.tb02031.x (1995).
Fonseca, P. A. S., Suárez-Vega, A., Marras, G. & Cánovas, Á. GALLO: an R package for genomic annotation and integration of multiple data sources in livestock for positional candidate loci. Gigascience 9, giaa149. https://doi.org/10.1093/gigascience/giaa149 (2020).
Rosen, B. et al. Modernizing the bovine reference genome assembly. Proc. World Congr. Genet. Appl. Livest. Prod 3, 802 (2018).
Hu, Z. L., Park, C. A., Wu, X. L. & Reecy, J. M. Animal QTLdb: an improved database tool for livestock animal QTL/association data dissemination in the post-genome era. Nucleic Acids Res. 41, D871–D879. https://doi.org/10.1093/nar/gks1150 (2013).
Thomas, P. D. et al. PANTHER: A browsable database of gene products organized by biological function, using curated protein family and subfamily classification. Nucleic Acids Res. 31, 334–341. https://doi.org/10.1093/nar/gkg115 (2003).
von Mering, C. et al. STRING: a database of predicted functional associations between proteins. Nucleic Acids Res. 31, 258–261. https://doi.org/10.1093/nar/gkg034 (2003).
Chin, C. H. et al. CytoHubba: identifying hub objects and sub-networks from complex interactome. BMC Syst. Biol. 8, 1–7. https://doi.org/10.1186/1752-0509-8-S4-S11 (2014).
Acknowledgements
We express our gratitude to the Director of the ICAR-National Dairy Research Institute, Karnal, and Head of the Division for providing invaluable support and guidance during the entirety of our research. We also acknowledge the National Fellowship for Scheduled Tribe Students (NFST) for their financial support and assistance to the first author throughout this study.
Author information
Contributions
SM and AM: conceived the idea. LM: performed the experiments and analysis and wrote the original manuscript. JT, GG, AKM: data curation and literature survey. II, IK, JV, AC, SPS, PP: writing-review, word processing, and editing. AM and SM: reviewed and edited the manuscript. All the authors read and approved the final manuscript.
Ethics declarations
Ethical statement
The study involving NDRI and Hisar Sahiwal animals was reviewed and approved by the Institute Animal Ethics Committee, NDRI (NDRI approval 43-IAEC-18-8). Ethical approval for Pakistan Sahiwal was not required because the datasets are publicly available.
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Muansangi, L., Tiwari, J., Ilayaraja, I. et al. DCMS analysis revealed differential selection signatures in the transboundary Sahiwal cattle for major economic traits. Sci Rep 15, 15685 (2025). https://doi.org/10.1038/s41598-025-93021-5