Introduction

Quantitative traits in beef cattle can be improved through two primary genetic methods: selection within breeds and crossbreeding between breeds. Traits critical to beef production, such as calf survivability, reproductive efficiency, and environmental adaptability, often exhibit low heritability1,2. In such cases, crossbreeding is the more effective approach because it leverages heterosis (hybrid vigor) to raise productivity. Crossbreeding programs can enhance productivity traits by up to 26% compared to purebred systems, a benefit attributed to the combination of complementary alleles from divergent parent breeds3. Genetic research has consistently highlighted the advantages of crossbreeding for desirable traits such as growth rate, feed efficiency, and meat quality4,5. For example, Brahman × Angus hybrids exhibit pronounced hybrid vigor, with improved growth, carcass quality, disease resistance, and thermotolerance6. Advances in genomic tools now enable deeper insights: whole-genome resequencing of Heilongjiang Crossbred cattle identified functional genes linked to meat quality and reproduction7, while analyses of the Purunã composite breed revealed genetic ties among founder breeds and loci governing heat tolerance, growth, and meat quality traits8.

However, crossbreeding outcomes are context-dependent, shaped by ecological and economic priorities. In China, crossbreeding indigenous cattle with commercial breeds is a prevalent strategy to enhance productivity while preserving unique adaptive traits9,10. The country’s rich genetic diversity, spanning over 53 native cattle breeds, provides valuable insights into genes associated with adaptation to varying climates11,12,13. Among these, Mongolian cattle stand out for their hardiness in extreme environments14,15, while Chinese Red Steppe (CRS) cattle, developed by crossing Mongolian cows with Shorthorn bulls16, are valued for their disease resistance, stress tolerance, cold climate adaptation, and high-quality meat and milk production9,17. To further improve the meat quality and growth rate of CRS, breeders initiated a program crossing CRS with Red Angus (RAN), a commercial breed well known for marbling and rapid growth, creating the Red Angus × Chinese Red Steppe (RACS) hybrid. This hybrid merges commercial productivity (Red Angus) with indigenous resilience (CRS), making RACS a suitable model to study how divergent genetic lineages interact. By analyzing the genomic architecture of RACS, our study explores how strategic crossbreeding can balance productivity and adaptation, a critical consideration for sustainable livestock systems.

Advances in whole-genome sequencing (WGS) technology and genotyping platforms, the decreasing cost of sequencing, and the emergence of a variety of statistical methods have helped to trace genomic regions and genes that were subject to selection in indigenous18, commercial, and composite19 beef cattle. Different analytical methods have been developed based on the signal or pattern they capture. For instance, Fst and ROH are two widely used statistics that detect genetic signatures through allelic frequencies and haplotypes, respectively20,21,22,23. Fst is effective for identifying selection signatures between populations with unequal sample sizes24. On the other hand, ROH are consecutive homozygous segments across the genome that arise when identical haplotypes are inherited from both parents25. Shared regions of homozygosity (ROH islands) that occur with greater frequency in a population may indicate selection associated with adaptability and economically important traits22,26. Additionally, the characteristics and abundance of ROH across chromosomes have been extensively investigated in farm animals to infer demographic history and estimate genome-wide inbreeding27,28,29.

While crossbreeding in cattle has been extensively researched, the genetic architecture underlying the Red Angus × Chinese Red Steppe (RACS) hybrid remains poorly characterized. Although hybrid vigor and beneficial phenotypic traits in other crossbred populations are well-documented4,5, genomic mechanisms specific to RACS, particularly those driving its unique adaptability and productivity, have yet to be explored. This study investigates the genetic foundations of the RACS crossbreed to elucidate the functional outcomes of hybridization and identify genomic regions under selection. Using population genomics approaches, we map patterns of genetic divergence, assess diversity metrics, and pinpoint loci linked to critical traits such as disease resistance, environmental resilience, growth, and carcass quality. By addressing these knowledge gaps, our findings aim to advance precision breeding frameworks that optimize both productivity and sustainability in livestock systems, offering actionable insights for agricultural adaptation in diverse ecological contexts.

Results

Population genetic structure

Figure 1 illustrates the genetic differentiation between the cattle populations. PC1 accounted for 5.74% of the variation and distinguished the dairy commercial breed (Holstein) from the other cattle breeds. PC2 explained 4.55% of the total variation. The RACS crossbred population formed a cluster in the left corner of the plot that overlapped with ANG and RAN and was situated near SHO. The CRS and MON populations grouped together in the center, while SIM (Simmental), CHL (Charolais), HFD (Hereford), and LMS (Limousin) were positioned in proximate clusters.

The results of the population admixture analysis for different values of K (1 to 11) and the cross-validation (CV) error plot are presented in Fig. 1. The CV error decreased steadily with increasing K, which makes it difficult to determine the appropriate number of assumed ancestral populations. However, joint inspection of the PCA, admixture, and CV error plots suggests that the most likely partition of the populations is at K = 7. At K = 7, the RACS crossbred shared ancestry components with its parents ANG, RAN, CRS, and MON. From K = 5 to K = 10, the SHO population was separated from the others.

Fig. 1

Principal component analysis based on the identified single nucleotide polymorphisms (A). Visualization of the admixture analysis of 11 cattle populations (B). Plot of cross-validation error vs. K values (1–14) (C).

Genomic diversity

Runs of homozygosity

As shown in Fig. 2A, the CRS population, followed by the RACS crossbred population, harbored the lowest average number (36 and 88, respectively) and coverage of ROH per animal (101.098 and 265.813 Mb, respectively). The highest average number (214) and coverage (731.519 Mb) of ROH were found in the SHO breed. ROH segments with a length of 1–2 Mb were more frequent than any other length category in all studied populations. The MON population, followed by the CRS population, exhibited the highest percentage of short ROHs (between 1 and 2 Mb) at 81.5% and 78%, respectively, while the SHO population had the lowest percentage at 43.4%. The RACS crossbred population showed an intermediate proportion of the shortest ROHs. The longest ROH segments (> 16 Mb) had the lowest frequency in all breeds. The highest proportion of ROHs longer than 16 Mb belonged to CRS (2.9%), followed by HFD (2.2%), HOL (1.6%), and RACS (1.5%) (Table 1; Fig. 2A).

Table 1 The statistical description of ROH segments distribution in different length categories per each cattle breed.

Nucleotide diversity and inbreeding coefficients

The RACS and CRS populations had the highest average nucleotide diversity (0.436 ± 0.024 and 0.418 ± 0.017, respectively), while the HFD and SHO breeds had the lowest values (0.367 ± 0.048 and 0.369 ± 0.023, respectively) (Fig. 2B).

Genomic inbreeding values (mean, lowest, and highest) estimated using four different approaches are shown in Table 2. The RACS population exhibited the lowest inbreeding coefficients among the ten beef breeds and one dairy breed using three methods (FGRM, FHOM, and FUNI). However, when using the FROH method, the CRS population exhibited the lowest inbreeding coefficient, followed by RACS. In RACS, FGRM (average of −0.047), FHOM (average of −0.047), and FUNI (average of −0.047) values ranged from 0.019 to 0.232, −0.203 to 0.268, −0.143 to 0.081, and −0.133 to 0.169, respectively. Because the estimators have different properties, the inbreeding coefficient based on FROH lies between 0 and 1, whereas the FUNI, FGRM, and FHOM estimators can take negative values30,31. In the estimates based on FGRM, FHOM, and FUNI, the highest inbreeding value (0.088) was observed in the HFD breed, and for FROH, the SHO breed had the highest value (0.292). While FGRM and FUNI inbreeding coefficients exhibited a strong positive correlation in most breeds (Table S2), FROH generally showed weak, often negative, correlations with the other methods. In RACS, the highest and lowest correlations were found between FGRM and FUNI (0.953) and between FHOM and FROH (0.048), respectively.

Table 2 The genomic inbreeding coefficients calculated from four methodologies in eleven cattle breeds.

Linkage disequilibrium and effective population size

Across all populations, the LD decay analysis revealed a decrease in the average r² values as the distance between SNPs increased (Fig. 2C). On average, the estimated LD values ranged from 0.39 (MON) to 1 (SHO) at a distance of 1000 bp between markers. The RACS crossbred showed a decay pattern similar to that of HOL. Notably, the CRS population exhibited a sharp decline in LD values as the SNP distance increased up to 40 kb (0.188). Overall, CRS and MON displayed the lowest LD values, while RACS showed moderate values, with r² values of 0.75 and 0.13 at 1 and 200 kb, respectively.
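To illustrate the statistic behind these decay curves, the following is a minimal sketch of how average r² could be binned by inter-marker distance. It assumes genotypes coded as 0/1/2 allele dosages and approximates r² by the squared Pearson correlation of dosages; the function names `genotype_r2` and `ld_decay` and the toy data are hypothetical and do not represent the software used in this study.

```python
import numpy as np

def genotype_r2(snp_a, snp_b):
    """Squared Pearson correlation between two dosage vectors (0/1/2, NaN = missing)."""
    a = np.asarray(snp_a, dtype=float)
    b = np.asarray(snp_b, dtype=float)
    keep = ~(np.isnan(a) | np.isnan(b))      # use animals genotyped at both SNPs
    a, b = a[keep], b[keep]
    if a.size < 2 or a.std() == 0 or b.std() == 0:
        return float("nan")                  # r2 is undefined for monomorphic SNPs
    return float(np.corrcoef(a, b)[0, 1] ** 2)

def ld_decay(genotypes, positions, bin_edges):
    """Mean r2 per distance bin for SNPs on one chromosome.

    genotypes: animals x SNPs dosage matrix; positions: sorted bp coordinates;
    bin_edges: increasing distance boundaries in bp, e.g. [0, 1000, 10000].
    """
    g = np.asarray(genotypes, dtype=float)
    pos = np.asarray(positions)
    sums = np.zeros(len(bin_edges) - 1)
    counts = np.zeros(len(bin_edges) - 1, dtype=int)
    for i in range(g.shape[1]):
        for j in range(i + 1, g.shape[1]):
            dist = pos[j] - pos[i]
            if dist >= bin_edges[-1]:
                break                        # positions are sorted: later SNPs are farther
            b = np.searchsorted(bin_edges, dist, side="right") - 1
            if b < 0:
                continue
            r2 = genotype_r2(g[:, i], g[:, j])
            if not np.isnan(r2):
                sums[b] += r2
                counts[b] += 1
    return np.where(counts > 0, sums / np.maximum(counts, 1), np.nan)
```

Plotting the returned means against the bin midpoints for each breed would give curves comparable in form to those in Fig. 2C.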

All breeds exhibited a declining trend in effective population size (Ne) over generations, as depicted in Fig. 2D. Notably, the MON breed, initially displaying the highest Ne, experienced a sharp decline in recent generations. Over the past 15 generations, RACS crossbreds showed higher Ne than RAN and CRS. While CRS initially had a higher Ne than RACS, its Ne has decreased more rapidly.

Fig. 2

Distribution of the average ROH number in different length categories; the colors represent the ROH lengths in Mb (A). Nucleotide diversity per breed; the horizontal line in each box denotes the median (B). Genome-wide average LD decay estimated for each breed (C). Effective size of the eleven populations over past generations (D).

Signatures of selection

We defined an ROH as an island when the SNPs within a run were observed in more than 45% of the population (Fig. 3). As shown in Table S3, the highest and lowest numbers of ROH islands belonged to SHO (189 islands) and MON (one island), respectively. The SHO breed exhibited the longest ROH island, spanning 16 Mb on Bos taurus autosome (BTA) 3, and also contained the highest number of SNPs within ROH islands, with 105 SNPs on this chromosome. There were 8 ROH islands on BTA 2, 5, 9, 10, 13, 15, 20, and 21 in the RACS breed, the largest of which was on BTA5 between 75,585,421 and 76,993,652 bp (39 SNPs). This region contains the SYT10, ELFN2, MFNG, CARD10, USP18, ALG10, C1QTNF6, SSTR3, RAC2, and CYTH4 genes. Two ROH islands were identified in the CRS breed, located on BTA 7 and 21. These islands harbor genes such as NDN, MAGEL2, and MKRN3.
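The island definition above (SNPs lying inside an ROH in more than 45% of animals) can be sketched as follows. This is an illustrative outline under stated assumptions rather than the pipeline used here: the input layout (a list of per-animal ROH segments on a single chromosome) and the name `roh_islands` are hypothetical, and adjacent SNPs above the threshold are simply merged into one island.

```python
import numpy as np

def roh_islands(snp_positions, roh_segments, n_animals, threshold=0.45):
    """Return per-SNP ROH frequency and a list of islands (start_bp, end_bp, n_snps).

    snp_positions: sorted bp positions on one chromosome.
    roh_segments: iterable of (animal_id, start_bp, end_bp); the ROHs of a
    given animal are assumed not to overlap each other.
    """
    pos = np.asarray(snp_positions)
    counts = np.zeros(len(pos), dtype=int)
    for _animal, start, end in roh_segments:
        counts += (pos >= start) & (pos <= end)   # SNPs covered by this run
    freq = counts / n_animals

    islands, current = [], None
    for p, f in zip(pos, freq):
        if f > threshold:
            if current is None:
                current = [int(p), int(p), 1]      # open a new island
            else:
                current[1] = int(p)                # extend the open island
                current[2] += 1
        elif current is not None:
            islands.append(tuple(current))         # close the island
            current = None
    if current is not None:
        islands.append(tuple(current))
    return freq, islands
```

A minimum SNP count per island, such as the 15-marker filter applied before gene annotation, could be imposed by filtering the returned tuples on their third element.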

The candidate genes identified in ROH islands are listed in Table S4. Of the 259 ROH islands identified in all breeds, only the 79 islands spanning at least 15 markers were used to detect candidate genes (Table S4). In summary, 2150 genes were identified in ROH regions, including 1710 protein-coding genes and 441 genes of other types (lncRNA, miRNA, misc_RNA, pseudogene, snoRNA, and snRNA). These genes participated in 52 GO terms (21 biological processes and 31 molecular functions) across all breeds (Table S5).

The 101 genes identified in RACS were enriched for 5 biological process (BP) and 11 molecular function (MF) terms. These included peptidyl-arginine modification (GO:0018195), granulocyte chemotaxis (GO:0071621), and folic acid receptor activity (GO:0061714).

Fig. 3

Manhattan plot of occurrence of SNPs in the runs of homozygosity over all autosomal chromosomes of six cattle breeds.

Tables S6 and S7 list the genes and GO terms identified within the candidate selection regions, as determined by the Fst analysis. A total of 508 protein-coding genes and 16 GO terms (BP = 24, MF = 25, CC = 14) were detected in the Fst analysis of RACS vs. CRS, MON, ANG, RAN, and SHO. The enriched terms included defense response to protozoan (GO:0042832), defense response to Gram-positive bacterium (GO:0050830), chemokine-mediated signaling pathway (GO:0070098), and glucose 6-phosphate metabolic process (GO:0051156). The Fst analysis revealed a strong signal on bovine chromosome 13 (BTA13) in the comparison with the RAN breed (mFst = 0.80), encompassing the ASXL1 gene (Fig. 4). This signal was also observed in the comparisons with the ANG and SHO breeds. Furthermore, the NOL4L and NFIA genes were identified in the analysis comparing RACS to ANG, SHO, and RAN. In the genetic differentiation analysis between RACS and MON, SHO, and CRS, we identified three genes of interest: NPAS3, SEMA3E, and FUT9.
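As a rough illustration of how per-SNP differentiation and a top 1% cutoff could be obtained, the sketch below uses Wright's heterozygosity-based Fst with equal population weights. This estimator is an assumption made for illustration (the estimator and any windowing actually used are not restated here), and the names `wright_fst` and `top_percent_outliers` are hypothetical.

```python
import numpy as np

def wright_fst(p1, p2):
    """Per-SNP Fst = (H_T - H_S) / H_T from allele frequencies in two populations.

    Assumes equal weighting of the two populations (an illustrative choice).
    """
    p1 = np.asarray(p1, dtype=float)
    p2 = np.asarray(p2, dtype=float)
    p_bar = (p1 + p2) / 2
    h_t = 2 * p_bar * (1 - p_bar)                          # total expected heterozygosity
    h_s = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2      # mean within-population value
    with np.errstate(divide="ignore", invalid="ignore"):
        return np.where(h_t > 0, (h_t - h_s) / h_t, np.nan)  # NaN if fixed in both

def top_percent_outliers(fst, top=0.01):
    """Return the empirical cutoff for the top fraction and the indices above it."""
    fst = np.asarray(fst, dtype=float)
    cutoff = np.nanquantile(fst, 1 - top)
    return cutoff, np.flatnonzero(fst >= cutoff)
```

Calling `top_percent_outliers(wright_fst(p_racs, p_other))` for each comparison breed would yield candidate SNPs analogous to those above the red threshold line in a Manhattan plot; `p_racs` and `p_other` stand for hypothetical allele-frequency arrays.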

Fig. 4

Manhattan plots of the Fst values obtained from the comparison of five cattle breeds with the RACS breed. The significance threshold for the top 1% of values is shown as a red line in each plot.

Discussion

Exotic breeds have been utilized in crossbreeding programs to enhance the productive traits of indigenous breeds. This approach not only boosts the productivity of native livestock but also helps preserve their unique characteristics, such as disease resistance and adaptability to harsh environmental conditions10. In other words, high-yielding and well-adapted synthetic breeds can be developed through crossbreeding. The current study is the first to investigate the genomic characteristics of the RACS crossbreds. Admixture analysis revealed substructures among RAN, ANG, CRS, and MON within the RACS cluster at K = 7. Overall, the population structure analysis indicated that the RACS crossbred has a closer genetic relationship with RAN than with CRS as the parent breeds. This aligns with the fact that since 1997, the CRS breed has been continuously crossed with RAN, resulting in a significant genetic contribution from RAN in the crossbreds. Additionally, because of the involvement of the SHO and MON breeds in the crossing program of the CRS, they are genetically closer to the RACS compared to other breeds.

All studied populations exhibited a steep decline in effective population size in recent generations. Over the past 15 generations, however, the RACS crossbred population maintained a moderately higher Ne than its founder breeds, RAN and CRS. Despite this advantage, the Ne observed in RACS remains lower than that of the Purunã composite breed, a population with a longer history of inbreeding and genetic management8. Notably, RACS also displayed a smaller Ne than several of the purebred breeds studied, consistent with trends observed in other composite populations32. While hybridization initially boosted diversity, the RACS population likely originated from a limited number of founding individuals, creating a bottleneck that restricted the effective contribution of parental genomes. Over generations, artificial selection for traits such as growth, marbling, and environmental resilience may have further narrowed the genetic pool by favoring specific alleles, reducing the number of reproductively influential individuals. Additionally, continuous backcrossing with RAN since 1997 skewed genetic contributions toward one parent breed, diminishing the effective input from CRS and ancestral populations like Mongolian cattle. Although nucleotide diversity captures the broad genetic variation inherited from divergent founder breeds, Ne is more sensitive to recent demographic events, such as selection pressures and breeding strategies, which amplify homozygosity and linkage disequilibrium (LD)33,34. Despite this, the Ne of RACS exceeded the FAO threshold (≥ 50) recommended to mitigate inbreeding risks35. Consistent with the Ne trends, RACS exhibited significantly lower LD and higher heterozygosity than its founder breed RAN, indicating enhanced genetic diversity and recombination efficiency. This contrasts with Xia’nan crossbred cattle, where hybrids displayed higher LD than their parental Charolais breed36. Reduced LD in RACS mitigates inbreeding risks and amplifies heterosis potential by diminishing haplotype fixation.

The extent and length of continuous homozygous segments over the genome (i.e., ROH) depend on various factors, including demographic events, selection pressures, effective population size, and inbreeding37,38. In addition, the distribution of ROH across the genome is non-random: ROH are more prevalent in regions with low recombination and higher LD29. Most of the detected ROHs (43–81%) across populations were short segments (1–2 Mb). A similar pattern was observed in previous studies in cattle22,39,40. This likely reflects ancestral relationships and more ancient inbreeding41,42.

In composite populations, a low ROH count is crucial because the level of heterosis diminishes as ROH occurrence rises43. Notably, RACS crossbred cattle exhibited the lowest average ROH quantities per animal compared to purebreds, aligning with trends in composite populations44,45. Admixed populations, due to their divergent ancestry across multiple lineages, inherently accumulate fewer ROH than their parental populations29. This trend was evident in RACS, where we observed a reduced proportion of ROH, especially short ROH segments (1–4 Mb), relative to their founder breeds (CRS and RAN). Short ROHs are less likely to harbor severe recessive deleterious alleles due to the historical purging of highly harmful variants through selection46. However, these segments may retain mildly deleterious or neutral alleles. The low frequency of short ROH in RACS minimizes cumulative genetic load, enhancing overall fitness and productivity, while preserving adaptive potential through retained heterozygosity.

Inbreeding and its detrimental consequences, including inbreeding depression for reproductive traits and fitness and a loss of genetic diversity, have underscored the importance of accurate inbreeding estimation in livestock47,48. Genomic inbreeding coefficients, particularly those based on runs of homozygosity (FROH), provide a more precise measure of autozygosity than pedigree-based methods30,49, which often fail to capture cryptic relatedness50. Unlike FGRM (derived from genomic relationship matrices), FHOM (measuring excess homozygosity), and FUNI (correlation of uniting gametes), FROH directly quantifies contiguous homozygous segments and is unaffected by allele frequency biases or population structure51. Additionally, FGRM, FUNI, and FHOM can yield negative values and function more like correlation coefficients31, while FROH is confined to a range of 0 to 1 and directly reflects autozygosity52. These methodological differences explain the observed discrepancies between FROH and the other estimators. Notably, correlations among inbreeding estimators varied significantly. Consistent with previous studies8,48,51, FGRM and FUNI exhibited the strongest correlation, likely due to their shared emphasis on rare alleles in quantifying inbreeding30. Divergence in population-specific allele frequencies may further explain the variable correlations observed between FGRM, FUNI, and FHOM51. FROH displayed low or negative correlations with the other estimators, aligning with recent reports53,54. This finding contrasts with studies that have documented moderate to high correlations in other populations55,56. These discrepancies highlight methodological dependencies: FROH is unaffected by allele frequencies but sensitive to marker density and ROH detection parameters51. It has been shown that longer ROH segments are linked to stronger correlations between FROH and other inbreeding coefficients56,57, suggesting that methodological considerations significantly influence outcomes.
The observed negative correlations may reflect population-specific dynamics: elevated FROH values coupled with low inbreeding estimates from other metrics could signal localized selection pressures driving homozygosity at trait-associated loci without genome-wide increases in relatedness. In contrast, for example, FGRM’s reliance on allele frequencies enables it to capture shifts in genetic diversity across the entire genome, including variation unrelated to homozygosity (e.g., allele frequency drift)51.

Compared to purebred populations, RACS crossbred cattle showed the lowest inbreeding coefficients, reflecting their recent hybrid origin. This is supported by their higher nucleotide diversity than other populations studied. Moreover, their inbreeding coefficients were lower than those previously reported for Heilongjiang crossbred cattle7. The low inbreeding levels observed in the RACS population suggest a high level of genetic diversity, which is advantageous for maintaining population health and adaptability. This diversity can enhance traits such as fertility and disease resistance, contributing to the overall performance and sustainability of the cattle population. The crossbreeding strategy employed in RACS appears to be effective in introducing new genetic variations and reducing the likelihood of homozygosity for deleterious alleles, thereby minimizing inbreeding depression. Thus, while RACS cattle currently exhibit low inbreeding, continuous genomic monitoring is essential to maintain genetic health and sustainability. Future efforts should prioritize whole-genome sequencing and expanded datasets for crossbred populations. Refining ROH detection parameters and disentangling selection from drift will improve the inbreeding estimation accuracy, enabling nuanced assessments of genetic health in managed populations. Such advancements are essential to balance productivity and sustainability in modern breeding programs.

A key advantage of crossbreeding is breed complementarity, which involves combining desirable traits from founder breeds to create crossbred or composite animals with enhanced characteristics58. In the case of the RACS crossbred population, comparing it to its founder breeds, RAN (known for marbling and growth) and CRS (valued for cold adaptation and disease resistance), offers critical insights. This comparison helps elucidate how crossbreeding reshapes genomic architecture to achieve a balance between productivity and environmental resilience, addressing a fundamental challenge in livestock breeding.

In recent years, scrutiny of the ROH region to identify footprints of selection has extensively been of interest to researchers21,27,59. While a significant correlation exists between contiguous homozygous stretches and candidate regions under selection60, interpreting ROH regions as definitive signatures of selection requires caution. Other evolutionary processes, such as genetic drift, population structure, and recombination rate, can also contribute to ROH patterns37,61,62.

In exploring ROH islands in the RACS breed, a 1.4 Mb island on BTA5, encompassing the highest SNP density, was identified. Within this region, ten candidate genes linked to marbling traits were discovered, including ELFN2, MFNG, CARD10, USP18, ALG10, and SYT1063. This trait may have originated from the RAN parent breed. Genomic association studies indicate that the SYT10 and ALG10 genes are linked to longevity and stability traits across various cattle breeds. SYT10 is particularly important for the release of insulin-like growth factor 1 (IGF1), implying a contribution to longevity through its influence on growth and reproductive efficiency64,65. The ROH island also includes immune-related genes critical for disease resistance. USP18 is crucial for the innate immune response and plays a significant role in defending against viral infections66,67. RAC2, encoding a Rho-family GTPase, enhances B-cell signaling and microbial phagocytosis in Holsteins68 and parasite resistance in sheep69,70. Similarly, the IL2RB gene is known as a key immune factor in cattle71. IL2RB, essential for T-cell-mediated immunity and immune homeostasis, has been associated with disease progression in various species72,73. On BTA15, the IL18BP gene, another gene associated with immune traits, was identified. IL-18 binding protein (IL-18BP) acts as a natural regulator of the pro-inflammatory cytokine IL-18. Its role in immune modulation74 highlights applications in veterinary medicine. The identification of these genes suggests enhanced disease resilience in RACS cattle, likely inherited from the CRS parent9, aligning with their robust herd health and productivity.

Crossbreeding, by combining the favorable genes of exotic and indigenous breeds, leads to adapted cattle with better meat quality and production efficiency3,10. The CAMK1D gene was the only one identified in both the ROH analysis of RACS and the Fst analysis comparing RACS and CRS. This gene plays a crucial role in various physiological processes that are vital for the development and productivity of cattle, including muscle development and growth75, heifer early calving until 30 months, and stayability76. This gene also influences immune system responses, enhancing the animal’s ability to cope with infections and diseases77,78. A recent study has shown that CAMK1D regulates feed consumption and obesity development in mice79, suggesting a potential role in optimizing feed efficiency in cattle, which is crucial for sustainable beef production.

Additionally, the AIP gene, identified on BTA29 through the Fst analysis of RACS vs. RAN, exhibited strong selection signals. It has been reported that this gene could be associated with body height in Holstein cows80. A recent study in cetaceans identified this gene as a contributor to tall stature and overgrowth81. The association of AIP with stature indicates its potential role in growth-related traits, which can affect the overall productivity and adaptability of RACS cattle.

Several genes were commonly observed in the genetic differentiation analyses between the RACS crossbred and its founder breeds. Notably, a strong selection signal on BTA13 (detected via Fst analysis comparing RACS with RAN, ANG, and SHO) encompassed the ASXL1 gene. ASXL1 regulates gene expression through epigenetic mechanisms, potentially influencing immune cell development and function82,83. The same chromosome (BTA13) also harbored the ADA gene, which exhibited differentiation in comparisons of RACS with MON and CRS cattle. ADA is essential for lymphocyte development, particularly T-cell proliferation and differentiation, and contributes to macrophage maturation84. Its critical role in immune function is underscored by studies linking ADA deficiency to lymphopenia and progressive immune dysfunction85. In cattle, ADA activity has been proposed as a biomarker for bovine tuberculosis86, inflammation, and immune activation87, highlighting its diagnostic utility in the RACS crossbred cattle.

SEMA3E, detected in RACS vs. SHO, MON, and CRS comparisons, coordinates immune responses against bacterial infections. SEMA3E, a secreted semaphorin protein, influences cell proliferation, migration, inflammatory responses, and host defense against infections. Research demonstrates that SEMA3E is critical for protective immunity against Chlamydia muridarum lung infection in mice, coordinating the functions of T cells and dendritic cells (DCs)88,89. As MON cattle (adapted to extreme environments) contributed to RACS ancestry, ADA and SEMA3E likely originate from CRS (MON × SHO), enhancing disease resistance and adaptability.

Four genes, SNTG1, KCTD8, ADAMTS2, and NRAP, were commonly detected in the genetic differentiation analyses of RACS vs. ANG and RAN. Genome-wide association studies suggest that SNTG1 influences body length and longevity in cattle65,90. KCTD8 has been reported to be associated with carcass traits in composite beef breeds91 and milk production in dairy cattle92. KCTD8 also emerges as a potential selection signature in Maremmana cattle93. This gene encodes subunits for potassium channels linked to prolactin regulation94. ADAMTS2, a procollagen N-proteinase, processes procollagens into collagen95, impacting fat deposition in muscle96, postnatal skeletal muscle development, and meat quality in cattle97. Its role in post-weaning growth is further supported by GWAS in sheep98. Collectively, ADAMTS2, SNTG1, and KCTD8 may enhance meat quality, growth performance, and carcass traits in RACS cattle, fostering economically favorable outcomes for beef production.

The NRAP gene, encoding a highly conserved actin-binding protein critical for muscle function, is strongly associated with cold adaptation in mammals. Primarily expressed in skeletal and cardiac muscles, NRAP facilitates myofibrillar assembly and force transmission, which is particularly vital for cardiac efficiency during cold stress99. A Yakut cattle-specific mutation in NRAP, shared with 16 other cold-adapted species, exemplifies convergent evolution, where distinct lineages independently evolved the same genetic adaptation to enhance heart function in frigid environments100. This mutation likely supports efficient blood circulation during hibernation or extreme cold, underscoring NRAP’s central role in cold resilience. Northern Chinese cattle breeds, including Mongolian (MON) cattle, descendants of taurine ancestry, are exceptionally well adapted to cold climates100. Given MON’s contribution to the RACS lineage, the NRAP gene likely originated from the CRS parent breed. These findings position NRAP as a key genetic driver of cold adaptation in RACS cattle.

Our study established a genomic baseline for the RACS crossbred population, identifying candidate genes critical for resilience, productivity, and adaptability. These findings provide a foundation to explore their functional roles, enhancing our understanding of the genetic architecture driving these traits. To maximize benefits, we recommend continuing the crossbreeding strategy between Red Angus and Chinese Red Steppe cattle to preserve genetic diversity and heterosis, a proven method to reduce inbreeding and amplify desirable traits like disease resistance, marbling efficiency, and environmental resilience. Simultaneously, regular genomic monitoring of inbreeding levels should be implemented to safeguard diversity and preemptively mitigate inbreeding depression risks.

Integrating genomic selection, particularly through genomic estimated breeding values (GEBVs), will accelerate genetic gains by enabling precise identification and propagation of superior alleles linked to key traits. Collectively, these strategies, strategic crossbreeding, vigilant inbreeding management, and advanced genomic tools, will foster sustainable, high-performing cattle populations capable of thriving in challenging environments while meeting demands for efficient, ethical beef production.

However, while this genomic foundation is vital, we emphasize that phenotypic correlation remains a critical next step. Our study lays the groundwork for future research, in which pairing genomic insights with performance or resilience data will inform targeted breeding strategies. Bridging genomic potential with real-world utility through such integration is essential, and we prioritize this in subsequent investigations to maximize the practical impact of crossbred optimization efforts.

Materials and methods

Data resources and quality control

This study utilized blood samples collected from cattle during routine veterinary procedures on private farms in Xilingol and Ordos, located in Inner Mongolia, China. The sampling process adhered strictly to standard agricultural practices, ensuring no additional interventions were introduced for research purposes. The genotyping analysis conducted on these samples is a common and accepted practice in animal science, often employed for breed improvement and health screening. Importantly, this research did not involve any direct experimentation on live animals, which would typically necessitate a more rigorous ethical review. All sample collection procedures followed established protocols for animal handling and welfare in agricultural settings. The use of these samples for genotyping falls within the regulatory framework for agricultural animal research. While formal ethics committee approval is not required for this type of study, we confirm that all procedures were carried out in accordance with relevant guidelines and regulations of Inner Mongolia Academy of Agricultural & Animal Husbandry Sciences. Additionally, our experimental protocol was reviewed and approved by the appropriate internal review process at Inner Mongolia Academy of Agricultural & Animal Husbandry Sciences to ensure compliance with institutional standards. We also affirm that our methods comply with the ARRIVE guidelines (https://arriveguidelines.org). This approach ensures minimal invasiveness and no additional harm to the animals while contributing valuable data to the field of animal science. The study was conducted in accordance with the ethical standards set forth by the Inner Mongolia Academy of Agricultural and Animal Husbandry Sciences, which oversees animal research practices in this context.
A total of 119 cattle, comprising 104 Red Angus × Chinese Red Steppe (RACS) crosses and 15 Chinese Red Steppe (CRS) cattle (Mongolian × Shorthorn), were genotyped using the GGP Bovine 100k SNP array, with the ARS-UCD1.2 assembly as the reference genome. Genotypes of eight beef cattle breeds (Angus, Hereford, Limousin, Charolais, Mongolian, Shorthorn, Red Angus, and Simmental) and one dairy breed (Holstein) were obtained from the WIDDE database101. Information on the 669 animals used in the current study is given in Table S1. All genotype data were subjected to quality control, and the following were excluded from downstream analyses: (1) individuals and SNPs with a call rate less than 0.95, (2) SNPs with a Hardy-Weinberg equilibrium (HWE) test P-value below \({10}^{-6}\), (3) SNPs with a minor allele frequency lower than 10%, and (4) SNPs not mapped to autosomal chromosomes.
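As an illustration only, the call-rate and minor-allele-frequency criteria can be sketched in a few lines of Python. This is not the pipeline used in the study: the function names and the 0/1/2 genotype coding with `None` for missing calls are assumptions made for the example, and the HWE and autosomal filters are omitted.

```python
def call_rate(genotypes):
    """Fraction of non-missing calls; missing genotypes are encoded as None."""
    return sum(g is not None for g in genotypes) / len(genotypes)


def minor_allele_frequency(genotypes):
    """MAF from genotypes coded as 0/1/2 allele copies, ignoring missing calls."""
    called = [g for g in genotypes if g is not None]
    p = sum(called) / (2 * len(called))
    return min(p, 1 - p)


def passes_snp_qc(genotypes, min_call_rate=0.95, min_maf=0.10):
    """Apply criteria (1) and (3) to one SNP across all individuals."""
    if call_rate(genotypes) < min_call_rate:
        return False
    return minor_allele_frequency(genotypes) >= min_maf
```

The same `call_rate` helper could be applied to one individual across all SNPs to mirror the per-individual part of criterion (1).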

Population genetic structure

To characterize the genetic structure of the eleven cattle breeds, principal component analysis (PCA) was carried out and plotted using the SNPRelate and ggplot2 packages in R102,103, respectively. In addition, ADMIXTURE v1.3.0 was used to estimate individual ancestries104. The optimal value of K (the number of assumed ancestral populations) was inferred with a cross-validation (CV) procedure. Population admixture was analyzed for K = 2 to K = 11 with 2000 bootstrap replicates. Before investigating population structure, the merged SNPs of 10 breeds were pruned for high pairwise LD with PLINK v1.9 105 using the parameter “--indep-pairwise 50 10 0.1”.
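The quantity that LD pruning thresholds on is the squared correlation between the genotype vectors of two SNPs. The sketch below only illustrates that statistic under an assumed 0/1/2 genotype coding with no missing values; the function name is hypothetical and the code does not reproduce PLINK's windowed pruning procedure.

```python
def genotype_r2(a, b):
    """Squared Pearson correlation between two SNPs' genotype vectors (0/1/2)."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # a monomorphic SNP carries no LD information
    return cov * cov / (var_a * var_b)
```

With a threshold of 0.1, a pair of SNPs in the same window whose `genotype_r2` exceeds 0.1 would be treated as being in high LD, and one member of the pair would be a candidate for removal.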

Genetic diversity

Inbreeding coefficients were estimated using four measures, \({\text{F}}_{\text{GRM}}\)106, \({\text{F}}_{\text{HOM}}\), \({\text{F}}_{\text{UNI}}\)106, and \({\text{F}}_{\text{ROH}}\)106, which are based on the variance of additive genotypic values, homozygous genotypes, the correlation between uniting gametes, and runs of homozygosity, respectively. The inbreeding coefficients were derived from the following formulas:

$${\text{F}}_{\text{GRM}}=\frac{{\left({x}_{i}-2{p}_{i}\right)}^{2}}{{h}_{i}}-1$$
(1)
$${\text{F}}_{\text{HOM}}=1-\frac{{x}_{i}\left(2-{x}_{i}\right)}{{h}_{i}}$$
(2)
$${\text{F}}_{\text{UNI}}=\frac{{x}_{i}^{2}-\left(1+2{p}_{i}\right){x}_{i}+2{p}_{i}^{2}}{{h}_{i}}$$
(3)

In the above three equations, \({x}_{i}\) is the number of reference allele copies at the \({i}^{\text{th}}\) SNP, \({p}_{i}\) is the observed fraction of the reference allele at locus \(i\), and \({h}_{i}=2{p}_{i}\left(1-{p}_{i}\right)\).
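Equations (1) to (3) give per-SNP values. A minimal Python sketch of how they could be evaluated for one individual follows; averaging the per-SNP values over all polymorphic SNPs is an assumption of this example (the text does not state how per-SNP values are combined), and the function name is hypothetical.

```python
def inbreeding_estimators(x, p):
    """Return (F_GRM, F_HOM, F_UNI) for one individual.

    x: reference allele copies (0, 1 or 2) at each SNP
    p: reference allele frequency at each SNP
    Per-SNP values from Eqs. (1)-(3) are averaged over polymorphic SNPs
    (an assumption of this sketch).
    """
    f_grm = f_hom = f_uni = 0.0
    m = 0
    for xi, pi in zip(x, p):
        hi = 2 * pi * (1 - pi)
        if hi == 0:
            continue  # monomorphic SNP: h_i = 0, estimators undefined
        f_grm += (xi - 2 * pi) ** 2 / hi - 1
        f_hom += 1 - xi * (2 - xi) / hi
        f_uni += (xi ** 2 - (1 + 2 * pi) * xi + 2 * pi ** 2) / hi
        m += 1
    return f_grm / m, f_hom / m, f_uni / m
```

For a heterozygous genotype at a SNP with \({p}_{i}=0.5\), all three estimators equal -1, whereas either homozygous genotype gives +1, which shows how homozygosity raises each measure.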

$${\text{F}}_{\text{ROH}}=\frac{{\text{L}}_{\text{ROH}}}{{\text{L}}_{\text{auto}}},$$
(4)

where \({\text{L}}_{\text{ROH}}\) is the total length of ROH regions in an individual’s genome, and \({\text{L}}_{\text{auto}}\) denotes the total genome size covered by markers.

Pearson’s correlation coefficients between different estimators of inbreeding coefficients were computed using cor and cor.test functions in the R software (http://www.r-project.org/).

Nucleotide diversity within each breed was calculated using the --het option in VCFtools v0.1.15106, which summarizes diversity in the filtered SNP dataset on the basis of observed heterozygosity. Linkage disequilibrium decay (LDD) and the population recombination history were assessed using PopLDdecay v3.42107. Mean \({r}^{2}\) values were calculated for marker pairs separated by physical distances of less than 200 kb.

The effective population size (Ne) of each breed was estimated using the multithreaded tool SNeP108, which infers population demography from LD using the formula presented by Corbin et al.109:

$${N}_{T\left(t\right)}={\left(4f\left({c}_{t}\right)\right)}^{-1}\left(E{\left[{r}_{adj}^{2}|{c}_{t}\right]}^{-1}-\alpha\right),$$
(5)

where \({N}_{T\left(t\right)}\) is the effective population size \(t\) generations ago, \({c}_{t}\) is the recombination rate between markers at a specific physical distance, \({r}_{adj}^{2}\) is the linkage disequilibrium adjusted for sample size (\({r}_{adj}^{2}={r}^{2}-{\left(\beta n\right)}^{-1}\), where \(n\) is the number of individuals and \(\beta =1\) or \(2\)), and \(\alpha\) is a correction for the occurrence of mutations (\(\alpha =1\), \(2\) or \(2.2\)).
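For readers who want to check the arithmetic of Eq. (5), the following Python sketch evaluates it for a single distance bin. It is not the SNeP implementation: the function names are hypothetical, and taking \(f\left({c}_{t}\right)={c}_{t}\) (no mapping-function correction) is an assumption of the example.

```python
def adjusted_r2(r2, n, beta=2):
    """Sample-size-adjusted LD: r2_adj = r2 - 1 / (beta * n)."""
    return r2 - 1.0 / (beta * n)


def effective_population_size(mean_r2_adj, c, alpha=1.0):
    """Eq. (5) with f(c) = c (an assumption of this sketch).

    mean_r2_adj: mean adjusted r2 for marker pairs in one distance bin
    c: recombination rate (in Morgans) for that bin
    alpha: mutation correction (1, 2 or 2.2)
    """
    if mean_r2_adj <= 0:
        raise ValueError("mean adjusted r2 must be positive")
    return (1.0 / (4.0 * c)) * (1.0 / mean_r2_adj - alpha)
```

For example, a mean adjusted \({r}^{2}\) of 0.2 at \({c}_{t}=0.01\) with \(\alpha =1\) gives \(\left(1/0.04\right)\times \left(5-1\right)=100\).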

To identify ROH across the beef cattle genomes, we employed the “consecutive runs” method in the detectRUNS software110. The minimum number of consecutive SNPs required for a run was set to 15. To mitigate the risk of underestimating the length of long ROH and to account for potential genotyping errors, we permitted up to two missing genotypes and two opposing (heterozygous) genotypes within each run. Additionally, both the maximum allowed gap between consecutive homozygous SNPs and the minimum length of a recognized ROH were set to 1 Mb. Mean ROH length (Mb), percentage of genomic coverage, and mean ROH number were calculated separately for five ROH length categories (1–2, 2–4, 4–8, 8–16, and > 16 Mb) for each breed.
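The logic of a consecutive-runs scan with these settings can be sketched as follows. This is a simplified greedy illustration for one chromosome of one individual, not the detectRUNS implementation; the function names, the 0/1/2 genotype coding with `None` for missing calls, and the choice to end each run at its last homozygous SNP are assumptions of the example.

```python
def find_roh(positions, genotypes, min_snp=15, max_het=2, max_missing=2,
             min_length_bp=1_000_000, max_gap_bp=1_000_000):
    """Return ROH as (start_bp, end_bp, n_snps) tuples for one chromosome.

    positions: sorted SNP positions in bp
    genotypes: 0/1/2 allele copies, or None for a missing call
    """
    runs = []
    start = last_hom = None  # indices of first and last homozygous SNP
    n_het = n_miss = 0

    def close_run():
        # Keep the current run only if it meets the size criteria.
        if start is not None:
            n_snp = last_hom - start + 1
            length = positions[last_hom] - positions[start]
            if n_snp >= min_snp and length >= min_length_bp:
                runs.append((positions[start], positions[last_hom], n_snp))

    for i, g in enumerate(genotypes):
        # A gap larger than max_gap_bp between adjacent SNPs ends the run.
        if start is not None and positions[i] - positions[i - 1] > max_gap_bp:
            close_run()
            start = None
        if start is None:
            if g in (0, 2):  # a run can only start at a homozygous call
                start = last_hom = i
                n_het = n_miss = 0
            continue
        if g in (0, 2):
            last_hom = i
        else:
            if g is None:
                n_miss += 1
            else:
                n_het += 1
            if n_het > max_het or n_miss > max_missing:
                close_run()
                start = None
    close_run()
    return runs


def f_roh(runs, covered_genome_bp):
    """Eq. (4): summed ROH length divided by the genome length covered by markers."""
    return sum(end - begin for begin, end, _ in runs) / covered_genome_bp
```

The detected runs could then be binned by length into the five categories listed above to obtain per-class counts and coverage.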

Signatures of selection

Putative signatures of selection between populations were evaluated using Fst statistics in VCFtools106. Locus-specific Fst values were estimated in sliding windows of 100 kb with a step size of 50 kb. Windows in the top 1% of Fst values were considered candidate regions of selection. To trace selective sweeps in ROH regions, the proportion of individuals in which a given SNP fell within a run was computed, and this proportion was plotted against SNP position across all autosomal chromosomes. A 45% ROH occurrence threshold within each breed was used to define putative ROH islands40. To identify candidate selection regions shared by RACS with its parental breeds and other commercial breeds, ROH analysis was conducted in each breed, along with Fst analyses comparing the RACS crossbred population with CRS (Chinese Red Steppe), RAN (Red Angus), ANG (Angus), SHO (Shorthorn) and MON (Mongolian).
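The ROH-island criterion can be illustrated with a short Python sketch. It assumes that ROH have already been detected for every individual on one chromosome and are stored as (start, end) intervals in bp; the function names and data layout are hypothetical and do not correspond to the software used in the study.

```python
def roh_occurrence(snp_positions, runs_per_individual):
    """Proportion of individuals in which each SNP lies inside a ROH.

    runs_per_individual: one list of (start_bp, end_bp) intervals per animal.
    """
    n = len(runs_per_individual)
    return [
        sum(any(s <= pos <= e for s, e in runs) for runs in runs_per_individual) / n
        for pos in snp_positions
    ]


def roh_island_snps(snp_positions, runs_per_individual, threshold=0.45):
    """SNP positions whose ROH occurrence meets the threshold (45% by default)."""
    freq = roh_occurrence(snp_positions, runs_per_individual)
    return [pos for pos, f in zip(snp_positions, freq) if f >= threshold]
```

Adjacent SNPs returned by `roh_island_snps` would then be merged into contiguous regions to delimit each putative island.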

Gene enrichment analysis

Bioinformatic analyses of the putative regions under selection were carried out in two steps: first, genes located in the candidate regions were identified with the Variant Effect Predictor tool111; second, the gene ontology (GO) terms, molecular functions and biological processes associated with the identified genes were obtained through the Database for Annotation, Visualization and Integrated Discovery (DAVID)112. Enrichment P-values were based on Fisher’s exact statistics, with P < 0.05 considered significant.
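To make the enrichment statistic concrete, a plain one-sided Fisher exact test for a single GO term can be written as a hypergeometric tail probability. The sketch below is a generic illustration with hypothetical argument names; it is not DAVID’s own scoring procedure, which may differ in detail.

```python
from math import comb


def enrichment_p_value(hits, list_size, annotated, background):
    """One-sided Fisher exact P-value for over-representation of one term.

    hits: candidate genes annotated with the term
    list_size: number of candidate genes
    annotated: background genes annotated with the term
    background: total number of background genes
    """
    upper = min(list_size, annotated)
    total = comb(background, list_size)
    tail = sum(
        comb(annotated, i) * comb(background - annotated, list_size - i)
        for i in range(hits, upper + 1)
    )
    return tail / total
```

A term would be reported as enriched when this probability falls below the chosen significance level, here 0.05.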

Conclusions

Our comprehensive genomic analysis of the RACS crossbred reveals valuable insights into the genetic architecture underlying adaptation and productivity. The observed clustering of RACS with Angus and Red Angus in PCA, coupled with its high nucleotide diversity and low inbreeding coefficients, highlights the successful integration of genetic resources from both founder breeds. Our detection of candidate regions associated with immune response, cold adaptation, and carcass traits within the RACS population supports its potential for resilience in challenging environments. Furthermore, the analysis of Runs of Homozygosity (ROH) indicated that the RACS crossbred population harbored the lowest average number and coverage of ROH per animal after CRS, suggesting a broad genetic base. The higher effective population size (Ne) in RACS compared to its parental breeds (RAN and CRS) over the past 15 generations indicates a promising trajectory for maintaining genetic diversity in this crossbred. By providing a detailed characterization of the RACS genome, including population structure, diversity metrics, and signatures of selection, our study contributes a valuable resource for informing future breeding strategies aimed at optimizing beef production in diverse ecological contexts.