Introduction

Records of coffee cultivation date back to the 16th century1. Since then, coffee production has expanded across the intertropical world, covering approximately 11 million hectares and involving around 25 million producers, and coffee has established itself as the second most valuable global commodity and the second most consumed beverage worldwide2. The economic and social benefits provided by the crop face several challenges, driven primarily by climate change and its impacts on production and farmers’ livelihoods. Rising temperatures, prolonged droughts, pest and disease outbreaks, labor shortages, and increasing production costs threaten the sustainability of coffee cultivation3. Additionally, there is an increasing demand for sustainability across the coffee supply chain. Addressing these challenges requires technological advancements to enhance the resilience of coffee production4.

The development of improved coffee cultivars represents one of the most promising and sustainable strategies for addressing the challenges facing global coffee production. Advances in coffee breeding have resulted in cultivars with greater adaptability to climate change, resistance to pests and diseases, and improved yield stability. These cultivars reduce reliance on chemical inputs, enhance resource efficiency, and contribute to the long-term sustainability of coffee farming by maintaining consistent production under variable environmental conditions. Additionally, higher-yielding and high-quality cultivars can improve farmers’ livelihoods by increasing productivity and meeting the evolving demands of global markets5,6,7.

Hybrid vigor, or heterosis, has underpinned the agricultural success of many crops, meeting farmers’ need for more profitable businesses, mainly through increased productivity and efficiency. Hybrid cultivars can combine a high level of productivity with complementary characteristics, such as precocity and resistance to diseases, pests, and nematodes. They also contribute to greater crop stability and productive adaptability, because they bring together the genetic constitution of parents that perform well in different environments, that is, plants that have a high proportion of favorable alleles8.

C. arabica is a self-pollinated plant with 2n = 4x = 44 chromosomes that resulted from a natural hybridization between Coffea canephora (Robusta coffee, subgenome CC (subCC)) and Coffea eugenioides (subgenome EE (subEE)), both with 2n = 2x = 22 chromosomes9. Hybrid vigor in intraspecific crosses of this species has been studied since the beginning of the 1950s. At that time, however, heterosis for grain yield did not gain breeders’ attention because of its low value and the lack of a viable technique for large-scale propagation10,11. With technological evolution and a broadened genetic base, in the 1980s promising results started to be published12,13 and the first hybrid cultivar, Ruiru 11, was developed in Kenya14,15. In the 1990s, hybrid production intensified in Central America5, Ethiopia16, Kenya17, Indonesia18 and Colombia19. In Brazil, 1950 marked the beginning of hybrid vigor studies20. To date, only two out of 124 available cultivars are propagated as clones, with an unknown level of heterozygosity21. According to the World Coffee Research (WCR) cultivar catalog, 10 of the 58 C. arabica cultivars are hybrids22, and in Ethiopia 42 cultivars were released between 1977 and 2018, including 35 pure lines and seven hybrids23.

Currently, most of the Arabica coffee sold worldwide comes from inbred cultivars because of the autogamy of the species and the difficulty of propagating hybrids as clones, although crosses can be made manually, as for the cultivar Ruiru 11. Developing a homozygous cultivar, or pure line, takes approximately 25 years, whereas for a hybrid cultivar the development cycle can be reduced to 10 years5. The magnitude of heterosis, generally expressed as a percentage, varies according to the phenotypic mean and diversity of the parents and the trait under consideration. For example, Bertrand et al.24 reported heterosis values in C. arabica L. ranging from 20 to 50% in relation to the parents.

In 2011, the same author and collaborators noted that, since 1997, coffee growers in Central America have had access to C. arabica L. F1 hybrids propagated vegetatively, either by micropropagation or through somatic embryogenesis. The data they presented cover 2000 to 2006 for 15 locations at different altitudes in three Central American countries. Hybrids outperformed cultivars in all altitude ranges: by a mean of 52% (range 16 to 127%) at low elevations (750 to 880 m) and by a mean of 33% (range 8 to 95%) at higher elevations (> 1,400 m)25.

Important information that can be extracted from a diallel design is the performance of parents in hybrid combination, obtained by estimating their combining ability. This genetic component of the mean can be divided into general (GCA) and specific (SCA) combining ability. The main purpose is to choose parents capable of transmitting their desirable phenotype26. Both estimates are essential for plant breeders to plan future crosses according to their objectives, such as developing a hybrid or a homozygous cultivar, and can generate knowledge about the mode of gene action affecting the traits. Such information is of utmost importance because a coffee plant may perform well and yet fail to transmit its merit when used in crosses. In C. arabica there is some information on the combining ability for traits related to grain yield, morphological characters, beverage quality and root development, obtained with parents of diverse genealogy and origin16.

Generally, for grain yield and its components, GCA and SCA are both statistically significant, with a predominance of SCA. Other important information is the possible interaction of combining abilities with the environment (E) and the possible lack of correlation between the parents’ phenotypic means and the magnitudes of GCA and SCA27. Thus, crossing better parents and evaluating their descendants to estimate combining abilities can drive the improvement of cultivars, since plant performance is expected to keep increasing through plant breeding.

Plant breeding is based on the accumulation of advantages, which must be used for the benefit of society. With this purpose, the Empresa de Pesquisa Agropecuária de Minas Gerais (EPAMIG) began crossing lines in 2017, aiming to develop commercial hybrid combinations and generate information that optimizes the breeding program, thus helping to reduce the lack of information on the subject. Our hypothesis is based on two assumptions. First, heterosis can be achieved by crossing the correct parents. Second, diverse C. arabica parental lines differ in combining ability, and some populations generated by the crosses would be more suitable for developing hybrid cultivars and others for developing pure-line cultivars.

The results for increased productivity are promising and can contribute to the sustainable development of Brazilian and global coffee farming through greater production efficiency, thus generating value for the entire chain. The purposes of this study were therefore to verify the existence of heterosis for coffee grain yield and quantify it, with a focus on exploring hybrid clonal cultivars, and to provide information, through the estimation of the combining abilities of the parents, to guide C. arabica breeding programs in generating variability to obtain cultivars.

Materials and methods

The research was carried out at the Experimental Field of Patrocínio (CEPC), belonging to the Empresa de Pesquisa Agropecuária de Minas Gerais (EPAMIG) and located in the municipality of Patrocínio, Minas Gerais State, Brazil. The geographic coordinates of the experimental field are 18°59′31.9″ S and 46°59′15.2″ W, with an elevation of approximately 993 m above sea level.

The climate of the region is classified as tropical (Aw) according to the Köppen climate classification, with hot and rainy summers, cold and dry winters, an average temperature of 21.7 °C and annual precipitation of 1124 mm28. The highest average monthly temperatures (24 °C) occur in September and October. From May to July, the average monthly temperatures are around 19 °C. Annual precipitation is concentrated from October to March; from June to August, the climatological normals indicate accumulated volumes of less than 21 mm (Fig. 1).

Fig. 1 Climatological normal for average temperature and precipitation for the Experimental Field of Patrocínio, Minas Gerais State, Brazil. J, January; F, February; M, March; A, April; M, May; J, June; J, July; A, August; S, September; O, October; N, November; D, December.

The soil in the experimental area was classified as an Oxisol (typic Hapludox)29. Its chemical and physical characterization was determined by laboratory analysis of collected samples. Simple samples were taken with an auger and combined into a composite sample for each depth: 0–0.20, 0.20–0.40, 0.40–0.60 and 0.60–0.80 m (Supplemental Table S1).

To obtain hybrids of Coffea arabica L., crosses were performed involving 56 parents represented by commercial cultivars and lines from the EPAMIG germplasm bank. These crosses were performed manually in 2017. After the fruits resulting from the crosses were harvested and the seedlings produced, planting was carried out in February 2019 in a full-sun conventional system. A spacing of 3.5 m between planting furrows and 1.0 m between plants was used, resulting in 2,857 plants ha−1.

The soil of the experimental area was prepared before planting and subjected to fertility analysis. Harrowing, limestone incorporation, opening of furrows and mineral and organic fertilization operations were carried out. Liming and fertilization were carried out following the recommendations30 for the nutritional and acidity levels obtained in the soil analysis performed previously. Pest, disease and weed control practices followed a technological standard adopted by the experimental field, aiming to better understand the agronomic performance of Coffea arabica L. hybrids.

The experimental plots were irrigated using a drip irrigation system. The amount of water applied was defined by an irrigation scheduling method based on evapotranspiration and soil water balance. The coffee crop evapotranspiration was estimated according to Eq. 1:

ETc = ETo × Kc (1)

where ETc is the daily actual evapotranspiration (mm), Kc is the crop coefficient (dimensionless), and ETo is the daily reference evapotranspiration (mm) obtained by the Penman-Monteith method. The specific Kc values adopted and the daily reference evapotranspiration (ETo) were obtained following the recommendations of Allen et al.31.
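As an illustration of Eq. 1, the following minimal sketch computes ETc from ETo and Kc. The function names, the simple rain-deficit rule and all numeric values are assumptions made for the example; they are not the scheduling procedure or data used in the experiment.

```python
def crop_evapotranspiration(eto_mm: float, kc: float) -> float:
    """Return daily crop evapotranspiration ETc (mm) as ETo * Kc (Eq. 1)."""
    if eto_mm < 0 or kc < 0:
        raise ValueError("ETo and Kc must be non-negative")
    return eto_mm * kc


def irrigation_depth(etc_mm: float, effective_rain_mm: float) -> float:
    """Hypothetical daily balance: irrigate only the deficit left after rain."""
    return max(etc_mm - effective_rain_mm, 0.0)


# Illustrative values only (not measurements from the experiment).
etc = crop_evapotranspiration(eto_mm=4.0, kc=1.1)
deficit = irrigation_depth(etc, effective_rain_mm=1.5)
print(f"ETc = {etc:.2f} mm/day, irrigation need = {deficit:.2f} mm/day")
```

A full soil water balance would also track soil storage and irrigation efficiency, which this sketch leaves out.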

The experiment followed a randomized complete block design (RCBD) with three replications and comprised 90 hybrids of C. arabica L. and 34 of the 56 lines used in the crosses (genotypes in advanced homozygosity). Yield was evaluated by weighing the fruits harvested from the six plants in each experimental plot.

The first harvest took place in 2021, followed by two additional harvests in 2022 and 2023; each harvest was carried out when most of the plots had at least 50% of their fruits matured. The fruits harvested from the plots were weighed using a digital hand scale. Production data from the experimental plots for each harvest were adjusted to the ideal number of plants per plot using covariance, following the methodology outlined by Botelho et al.32. During harvesting, a four-liter sample from each plot was collected and weighed with a digital hand scale to estimate the conversion index between fruit and peeled coffee beans. The samples were then dried in the sun for approximately 30 days. Once the samples reached a moisture level of around 12%, the beans were peeled and weighed using a digital desk scale, and the weight was adjusted to a moisture content of 12%. Subsequently, the weight of peeled coffee beans per plot was calculated by multiplying the adjusted weight of the harvested fruit by the conversion index (kg of peeled coffee / kg of fruit). This weight was then converted into grain yield, expressed in 60 kg bags per hectare (bags ha−1), using a planting density of 2,857 plants per hectare. The accumulated grain yield (bags ha−1) of the 2021, 2022 and 2023 harvests was then calculated to eliminate the annual effects of crop production fluctuations33. After the assumptions of the analysis of variance were checked, this variable was analyzed according to the RCBD using the GENES program34 (version 1990.2021.132), and the Scott and Knott mean test35 was applied to group similar treatments. GCA and SCA were calculated with the SAS software36 from the accumulated grain yield of the hybrids, following Griffing’s26 method 4 with fixed effects.
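The conversion from plot fruit weight to bags ha−1 described above can be sketched as follows. This is a simplified illustration: the function names and plot records are hypothetical, and the covariance adjustment for missing plants is replaced here by a plain division by the six plants per plot.

```python
BAG_KG = 60.0          # weight of one bag of processed coffee (from the text)
PLANTS_PER_PLOT = 6    # plants harvested per plot (from the text)


def plants_per_hectare(row_spacing_m: float, plant_spacing_m: float) -> float:
    """Planting density from row and in-row spacing (10,000 m2 per hectare)."""
    return 10_000.0 / (row_spacing_m * plant_spacing_m)


def yield_bags_per_ha(fruit_kg_per_plot: float, conversion_index: float,
                      density: float,
                      plants_per_plot: int = PLANTS_PER_PLOT) -> float:
    """Convert plot fruit weight to processed coffee yield in 60 kg bags per ha.

    conversion_index is kg of peeled coffee per kg of harvested fruit.
    """
    processed_kg_per_plant = fruit_kg_per_plot * conversion_index / plants_per_plot
    return processed_kg_per_plant * density / BAG_KG


density = plants_per_hectare(3.5, 1.0)  # about 2,857 plants per hectare

# Hypothetical plot records: year -> (fruit kg per plot, conversion index).
plot_records = {2021: (18.0, 0.20), 2022: (45.0, 0.19), 2023: (30.0, 0.21)}

# Accumulated yield over the three harvests, as used in the analyses.
accumulated = sum(
    yield_bags_per_ha(fruit_kg, index, density)
    for fruit_kg, index in plot_records.values()
)
print(f"density = {density:.0f} plants/ha")
print(f"accumulated yield = {accumulated:.1f} bags/ha")
```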

Results

Table 1 presents a summary of the variance analysis of the accumulated coffee yield of the 124 treatments (90 hybrids and 34 lines) harvested from 2021 to 2023. Significant differences between treatments were observed for the accumulated grain yield in bags ha−1. The experimental coefficient of variation (CV%) observed was relatively high compared to that reported in commercial cultivar competition trials37. This is probably a reflection of the greater variability found between treatments since we are working with initial harvests of the crop and with different types of genotypes (hybrids and lines) in the same experimental trial.

Table 1 Summary of analysis of variance for the yield of processed coffee in bags of 60 kg per hectare in the accumulated harvests carried out in 2021, 2022, and 2023.

The average yield values for treatments, parents and hybrids in the first three harvests are shown in Table 2. The mean productive performance of the 90 hybrids was 66 bags ha−1 of processed coffee in the accumulated harvests, 26.0 bags ha−1 more than the line mean of 40.7 bags ha−1, highlighting the potential of these hybrid genotypes for commercial exploitation. The five best hybrid combinations among the 90 evaluated, EPAMIG 37, EPAMIG 06, EPAMIG 36, EPAMIG 66, and EPAMIG 17, produced a mean of 120 bags ha−1 in the first three harvests. This value is well above the mean for the most cultivated commercial standards, Catuaí and Mundo Novo. The mean grain yield for the hybrid combinations was 118 bags ha−1, ranging from 11 bags ha−1 in the worst combination to 130 bags ha−1 in the best combination. It is worth noting that the mean test used was efficient at differentiating the hybrids and their parents into four distinct groups and that the most productive group contained 23 of the 90 hybrids and no parent. In the second group, 28 of the 37 treatments were hybrids. In the third group formed by the Scott and Knott test, 63% of the entries were hybrids. The last group contained no hybrids and consisted of two cultivars.

Table 2 Accumulated coffee grain yield values of F1 hybrids and their parents of Coffea arabica L. in bags ha−1 during the 2021, 2022, and 2023 harvests.

The same lowercase letters in the columns indicate groups formed by Scott and Knott test (P ≤  0.05).

The mean heterosis (MH%) for the 18 crosses in which both MH% and H% could be calculated was 64.2% (range −26.11 to 184.4). In these same crosses, heterosis in relation to the superior parent, or heterobeltiosis (H%), averaged 40.8%, ranging from −42.9 to 196.7 (Table 3). The highest heterosis was obtained for the hybrid EPAMIG 30, and the highest heterobeltiosis for the hybrid EPAMIG 80. Despite the high mean heterosis, some hybrids presented negative values, indicating that they are less productive than their parents and suggesting crosses between genetically close parents. This is entirely plausible given the narrow genetic base of the species C. arabica L.

Table 3 Heterosis (MH%) and heterobeltiosis (H%) of F1 hybrids of Coffea arabica L. for the processed coffee yield in the accumulated harvests of 2021, 2022, and 2023.

The SH% values of the 90 combinations ranged from −74.4 to 181.2, with a mean of 43.4% (Table 4). It is important to highlight that 15 hybrids at least doubled the mean yield of the four cultivars most used in coffee growing (46.5 bags ha−1 in total), producing more than 97 bags ha−1 in total. Simply using one of the three best hybrids in commercial crops would raise the accumulated yield of the first three years to more than 265% of that of the standards: instead of producing 46.5 bags ha−1 of processed coffee during the accumulation period, the producer would produce 124 bags ha−1.
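The comparison with commercial standards can be checked with a short sketch. It assumes the usual definition of standard heterosis, SH% = (F1 − standard) / standard × 100, which the text does not spell out; the function name is illustrative. With the two figures quoted above (46.5 and 124 bags ha−1), the hybrid yield is about 267% of the standard mean, which corresponds to a gain of about 167% over it.

```python
def standard_heterosis(hybrid_yield: float, standard_yield: float) -> float:
    """Superiority of a hybrid over a commercial standard, in percent."""
    if standard_yield <= 0:
        raise ValueError("standard yield must be positive")
    return (hybrid_yield - standard_yield) / standard_yield * 100.0


standard_mean = 46.5   # bags/ha, mean of the four standard cultivars (from the text)
hybrid_yield = 124.0   # bags/ha, figure quoted for the best hybrids (from the text)

sh = standard_heterosis(hybrid_yield, standard_mean)
relative = hybrid_yield / standard_mean * 100.0
print(f"SH% = {sh:.1f}")                     # gain over the standard
print(f"relative yield = {relative:.1f}%")   # hybrid as a percentage of the standard
```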

Table 4 Grain yield superiority in the accumulated harvests of 2021, 2022, and 2023 of Coffea arabica L. F1 hybrids in relation to the mean performance of the standard cultivars (SH%) and in relation to the best cultivar evaluated in the trial (SHS%) for the processed coffee yield.

Standard cultivars: Mundo Novo IAC 379-19, Catuaí Vermelho IAC 144, Catuaí Vermelho IAC 99, and Catuaí Amarelo IAC 62. Best cultivar: IPR 102.

Table 5 presents a summary of the diallel analysis of the F1 hybrids. Both GCA and SCA effects were significant, indicating that the evaluated parents differ in GCA and the hybrids differ in SCA. Thus, parents differ in their frequencies of favorable alleles, and some hybrids show better parental complementation than others, relative to what is expected from their parents’ average performance in crosses. GCA accounted for 74% of the sum of squares of hybrid grain yield.

Table 5 Summary of the analysis of variance and of the general (GCA) and specific (SCA) combining ability of the parents and hybrids of Coffea arabica L. for cumulative processed coffee grain yield in bags of 60 kg per hectare during the 2021, 2022, and 2023 harvests.

The GCA values ranged from 69.3 for ‘Bourbon Amarelo MG 009’ to −63.1 for ‘Catiguá Amarelo’. However, each of these parents participated in only one cross. Among the parents with more crosses, ‘Acauã Novo’ was the best and ‘MGS Catucaí Pioneira’ the worst. The higher the GCA value, the higher the frequency of favorable alleles in the parent relative to the other parents used in the development of the 90 hybrids. Other parents with high breeding value were ‘IAC 125 RN’, ‘MGS Liberdade’, ‘Catiguá MG2’, ‘Sarchimor MG 8840’, ‘Gueisha’ and ‘IAC Obatã 4739’. Some of these parents with good GCA also stand out in the phenotypic mean test (Table 2), with three of them ranking among the highest-yielding pure-line parents. Furthermore, ‘IPR 103’ demonstrated notable grain yield and general combining ability (GCA), although it was involved in only one cross. On the other hand, ‘IPR 102’ had high grain yield but was one of the poorest parents for GCA, probably indicating the presence of heterozygosis or negative heterosis with the parents it was crossed with (Table 6).

Table 6 Values of the general combining ability (GCA) of Coffea arabica L. parents for the variable processed coffee yield in the accumulated harvests of 2021, 2022, and 2023 and (n) number of crosses.

Of the ten hybrids with the highest SCA, seven were classified in the first group of the mean test, one in the second and three in the third group. Of the ten hybrids with the lowest SCA, two were positioned in the second group of the mean test and the rest in the third group. The correlation between the mean phenotypic grain yield and SCA was estimated at 0.49, a low value, indicating that the choice of parents for making hybrids should not be based on the phenotypic mean of the hybrids. Nevertheless, the hybrid EPAMIG 37 ranked third in SCA and first in phenotypic mean (Table 7).

Table 7 Values of the specific combining ability (SCA) of Coffea arabica L. F1 hybrids for the processed coffee yield at the 2021, 2022, and 2023 cumulative harvests.

Discussion

The use of hybrid cultivars to exploit heterosis is well established in other species38. In C. arabica, this technology is gradually being adopted by coffee growers, especially in Brazil. If field conditions replicate the heterosis observed in experiments over the past seven decades, adopting hybrid cultivars could significantly boost grain yield with a simple and cost-effective change. Our hypothesis, that heterosis can be reached by crossing complementary parents and that diverse C. arabica parental lines differ in combining ability, was confirmed by the results. The findings here encourage further development of hybrid cultivars and provide valuable information for parent selection, which is crucial for progress in plant breeding39.

After the best hybrid combinations were identified, genotypes were selected among the individuals of these combinations. Although C. arabica L. is a segmental allopolyploid species with disomic inheritance and regular meiotic behavior40, phenotypic variability is observed between plants resulting from the same hybrid combination. This allows the best individuals to be explored within the best combinations, further expanding the genetic gain from selection. To quantify the gain from selection within combinations, it was estimated for the best cross, EPAMIG 37: the mean of this combination in the third harvest was 83.4 bags ha−1, and four genotypes with a mean of 100.6 bags ha−1 were selected within it. Considering a heritability of 0.7 for yield, the gain from selection within the best hybrid combination was estimated to be 14.6% (12 bags ha−1). These selected genotypes, along with others, are being cloned and will be evaluated again in the final stages of the EPAMIG breeding program.
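The gain from selection quoted above can be retraced with the standard expression for expected response to selection. The text does not state which formula was used, so the form below is an assumption; the symbols $h^2$ (heritability), $\bar{x}_s$ (mean of the selected genotypes) and $\bar{x}_0$ (mean of the whole combination) are introduced here for illustration.

```latex
% Expected gain from selection within the cross EPAMIG 37 (third harvest).
% Assumed form: GS = h^2 * selection differential.
\begin{align}
GS &= h^2\,(\bar{x}_s - \bar{x}_0)
    = 0.7 \times (100.6 - 83.4)
    \approx 12.0\ \text{bags ha}^{-1} \\
GS\% &= \frac{GS}{\bar{x}_0} \times 100
      = \frac{12.0}{83.4} \times 100
      \approx 14.4\%
\end{align}
```

With the rounded inputs given in the text this yields about 12 bags ha−1 and roughly 14.4%, close to the reported 14.6%; the small difference presumably reflects rounding of the means.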

Among the different ways of verifying the superiority of hybrids, heterosis calculations are widely used in plant breeding. Normally, heterosis is expressed as the superiority of the hybrid combination in relation to the mean performance of the parents involved in the cross, or in relation to the best parent (heterobeltiosis). In this experiment, both were calculated for the small number of crosses in which both parents were present in the trial.
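The two measures described above can be written as a minimal sketch, assuming the usual definitions: mid-parent heterosis compares the F1 with the mean of its two parents, and heterobeltiosis compares it with the better parent. The function names and the yields in the example are hypothetical and are not taken from Table 3.

```python
def mid_parent_heterosis(f1: float, parent1: float, parent2: float) -> float:
    """MH%: superiority of the F1 over the mean of its two parents."""
    mid_parent = (parent1 + parent2) / 2.0
    return (f1 - mid_parent) / mid_parent * 100.0


def heterobeltiosis(f1: float, parent1: float, parent2: float) -> float:
    """H%: superiority of the F1 over the better of its two parents."""
    best_parent = max(parent1, parent2)
    return (f1 - best_parent) / best_parent * 100.0


# Hypothetical accumulated yields in bags/ha (illustration only).
f1_yield, p1_yield, p2_yield = 90.0, 40.0, 60.0
print(mid_parent_heterosis(f1_yield, p1_yield, p2_yield))  # 80.0
print(heterobeltiosis(f1_yield, p1_yield, p2_yield))       # 50.0
```

A hybrid can show positive mid-parent heterosis and negative heterobeltiosis at the same time, for example when it yields more than the parental mean but less than its better parent.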

From a practical point of view, the most important thing is to offer the producer superior cultivars. This superiority may be measured against the best cultivars in commercial use, which could be called heterosis in relation to commercial standards. National coffee production is known to rely on a few cultivars, with Mundo Novo and Catuaí (and their derivations/lines) forming the basis of 80% of national plantations37. The superiority of the hybrid combinations in relation to the standard mean (standard heterosis, SH%) was calculated for the 90 combinations using as standards the cultivars Mundo Novo IAC 379-19, Catuaí Vermelho IAC 144, Catuaí Vermelho IAC 99, and Catuaí Amarelo IAC 62; thus, we could infer the possible yield gain from using C. arabica L. hybrids.

When checking whether combinations can be commercially exploited based on their productive performance, it is essential to calculate their superiority over the most productive commercial cultivar (SHS%). In this case, seven hybrids yielded more than 30% (25 bags ha−1) above the best commercial cultivar present in the trial. Considering this, and given that the superiority over the most widely cultivated coffee cultivars exceeds 75 bags ha−1, we can infer that cultivar renewal must be intensified, which will allow coffee growers to achieve greater profitability on their properties in a short period of time. As demonstrated, the grain yield of hybrids must take priority over the magnitude of heterosis16.

The GCA effect expresses the mean performance of a parent in combination with other parents, and the SCA evaluates the part of the combination not explained by the GCA of each parent involved. In genetic terms, the GCA estimate is associated with genes with predominantly additive effects, in addition to dominance effects and some epistatic interactions of the additive × additive type. The SCA estimate, in turn, basically depends on dominance effects and other epistatic interactions that are usually of small magnitude and, therefore, neglected41. Several studies have documented the significance of both GCA and SCA effects, and most of them argued that SCA effects are proportionally higher than GCA effects16,42. In our study, GCA was proportionally higher than SCA. However, this comparison is not valid in practice because the allelic frequency at each locus is probably different from 0.5, or at least this assumption cannot be made. In this situation the GCA does not contain only additive effects and the SCA contains smaller dominance effects. Thus, the degree of dominance or any ratio between these estimates is uninformative43.
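The GCA and SCA effects described above can be illustrated with the textbook fixed-effects estimators of Griffing's method 4 (F1 hybrids only, no parents or reciprocals) for a complete half diallel. This is a sketch under that assumption: the diallel in this study is incomplete (90 hybrids from 56 parents) and was analyzed in SAS, so the code below is not the procedure used by the authors. The parent labels and cross means are hypothetical.

```python
from itertools import combinations


def griffing_method4(parents, cross_means):
    """Estimate mean, GCA and SCA for a complete half diallel (F1s only).

    parents: list of parent names.
    cross_means: {(a, b): mean yield} for every pair in list order.
    """
    p = len(parents)
    if p < 3:
        raise ValueError("method 4 needs at least three parents")
    pairs = list(combinations(parents, 2))
    missing = [pair for pair in pairs if pair not in cross_means]
    if missing:
        raise ValueError(f"incomplete diallel, missing crosses: {missing}")

    # X_i.: total of all crosses involving parent i; X..: grand total.
    totals = {name: 0.0 for name in parents}
    for a, b in pairs:
        totals[a] += cross_means[(a, b)]
        totals[b] += cross_means[(a, b)]
    grand = sum(cross_means[pair] for pair in pairs)

    mu = 2.0 * grand / (p * (p - 1))
    gca = {n: (p * totals[n] - 2.0 * grand) / (p * (p - 2)) for n in parents}
    sca = {
        (a, b): cross_means[(a, b)]
        - (totals[a] + totals[b]) / (p - 2)
        + 2.0 * grand / ((p - 1) * (p - 2))
        for a, b in pairs
    }
    return mu, gca, sca


# Hypothetical parents and accumulated cross means (bags/ha).
parents = ["A", "B", "C", "D"]
cross_means = {("A", "B"): 60.0, ("A", "C"): 70.0, ("A", "D"): 80.0,
               ("B", "C"): 50.0, ("B", "D"): 60.0, ("C", "D"): 40.0}

mu, gca, sca = griffing_method4(parents, cross_means)
print(mu)                 # 60.0
print(gca)                # {'A': 15.0, 'B': -5.0, 'C': -10.0, 'D': 0.0}
print(sca[("A", "B")])    # -10.0
```

In this model each cross mean is recovered as mu + GCA of the first parent + GCA of the second parent + SCA of the cross, which makes explicit that SCA is the part of the hybrid performance not explained by the two GCA effects.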

In the process of obtaining hybrids, we tried to fully exploit the effects of GCA and SCA. Since the per se performance of parents is extremely important and can be observed in parents that have good GCA, it is expected that, in programs aimed at obtaining hybrids, at least one of the parents has good GCA and that the crosses express maximum SCA39,44,45. Our GCA results identified the cultivars Acauã Novo and Catiguá MG2 as good parents, as also reported by Medeiros et al.39. It is important to evaluate the merit of parents based on their performance in crosses because of the possible lack of correlation between the phenotypic mean and combining abilities27.

Two drawbacks of our work are the early evaluations and the lack of estimates of the interaction of heterosis and combining abilities with the environment. Evaluations based on three accumulated harvests are sufficient to reach an accuracy above 80%46. However, heterosis may decrease as the plants age. Fontes et al.45 reported a 7% reduction in heterosis from the fourth to the sixth harvest. If the decline in heterosis over successive harvests is a general phenomenon when comparing hybrids and their parents, the observed rate of decrease was 3.5% per harvest. Therefore, if this rate remains constant, heterosis will no longer be present after five harvests, beginning with the sixth. Even under this scenario, hybrids would remain economically viable, as they would provide a faster return on investment for producers, thereby reducing financial risks. Another important aspect to consider is the potential for productive heterosis following pruning, which is an essential practice in modern coffee cultivation. Given that heterosis is highest during the early developmental stages of coffee plants, when root and vegetative growth rates are elevated, it is reasonable to expect that maximum heterosis could be restored after pruning. This would be due to the accelerated vegetative and root growth triggered by branch removal and partial root system senescence. Because the experiment was conducted in a single environment, the interaction of heterosis and combining abilities with the environment could not be estimated. Some authors have reported significant interactions of GCA and SCA with the environment16,42. Thus, the outcome will depend on the GCA and SCA of each location and set of parents26.

Despite these limitations, our work has the benefit of having crossed a broad range of the genetic diversity present in C. arabica, including landraces and cultivars derived from “Typica”, “Bourbon”, Timor Hybrid, and introgressions from C. canephora and Coffea liberica. In the reviewed literature on heterosis and combining ability in C. arabica, we found no work dealing with this number of crossed parents. Additionally, the parents represent the elite high-performance cultivars of Brazilian agriculture and therefore carry favorable alleles accumulated over many years of plant breeding. It is evident that parents should combine high grain yield and GCA with genetic divergence, regardless of the molecular concepts of heterosis47.

As already mentioned, knowledge about the genetic potential of C. arabica L. genotypes in crosses is not as abundant in the literature as it is for cereal crops. However, this information is highly important for directing breeding programs. Plant breeding is based on the accumulation of advantages, in short, the accumulation of favorable alleles for different characteristics. Therefore, it is plausible that the genotypes with better GCA are the most suitable for cultivation conditions, given that the final reflection of this adaptive capacity is the high productivity of their hybrid combinations. Thus, the ideal would be a single combination of parents that have high GCA and that also express SCA heterosis. These combinations could be used directly in commercial plantations if propagated vegetatively by any currently known technique, whether minicutting, somatic embryogenesis, or both48,49.

In 1985, Ameha and Belachew emphasized that “heterosis needs to be exploited as quickly as possible, and any delay could jeopardize the coffee industry in a short period of time”13. While the coffee chain could have potentially advanced further, the data indicates significant improvements in most key indicators. Hybrid cultivars have been adopted by farmers, and the benefits of heterosis, along with other advantages, have been successfully transferred into field conditions. According to coffee growers, hybrid cultivars offer some benefits, including higher yields, faster fruiting, rapid plant growth, excellent cup quality, and increased resistance to pests and diseases. Additionally, hybrids demonstrate greater productivity over time, provide faster profitability in the initial harvests, and require simpler farming practices. However, access to hybrid seedlings remains a shortcoming50.

Conclusions

There was heterosis for productivity in the first three accumulated harvests of the studied hybrids.

The mean heterosis of the hybrids was 64.2%, and the mean heterobeltiosis was 40.8%.

General and specific combining abilities are present in the parents and their crossed progenies. The best parents are the cultivars Acauã Novo, IAC 125 RN, MGS Liberdade, Catiguá MG2 and Sarchimor MG 8840, which should be preferentially explored in new hybrid combinations.