Introduction

In the current context of population growth, with estimates of a world population of 9.8 billion people in 2050 and 11.2 billion in 2100 (FAO 2017) requiring more efficient natural resource use, rice production in Latin America and Sub-Saharan Africa has considerable potential for expansion. In this scenario of increasing demand and climate change, upland rice cultivation could play an important role, particularly given the land and water limitations on the expansion of irrigated rice. Embrapa (Brazilian Agricultural Research Corporation) is responsible for the national rice breeding program in Brazil, which aims to increase the competitiveness, resilience, and profitability of rice in tropical ecosystems, including uplands. This breeding program is based on a strategy of achieving genetic gains while preserving genetic variability to meet long-term objectives of improving the traits of interest (Barros et al. 2018; Martínez et al. 2014). A two-phase breeding scheme is used, consisting of a population improvement phase, in which the aim is to improve performance for quantitative traits, and a product development phase focusing on the fixation and selection of elite material (Fig. 1). The upland rice breeding program at Embrapa is based on four populations developed for high yield with medium cycle, high yield with short cycle, drought tolerance and tolerance to low phosphorus. These populations have been systematically improved, with breeding efforts focusing mainly on productivity and grain quality but also on tolerance to water deficiency, disease resistance, lodging resistance, earliness, and initial vigor.

Fig. 1: Schematic representation of the Embrapa rice breeding program, including the population improvement phase and the product development phase.

VS1: plant selection nursery 1; EOF: family observation trial; ERF: multisite family yield trial; VS2: plant selection nursery 2; EOL: line observation trial; EP: multisite preliminary trials; ER: multisite regional trials; VCU: multisite trials assessing value for cultivation and use.

Most traits of economic importance are quantitative, controlled by many genes and greatly influenced by the environment (Walsh and Lynch, 2009). In this context, recurrent selection (RS) is one of the most efficient breeding methods available. It aims to increase the probability of finding superior genotypes by progressively concentrating favorable alleles and increasing the mean value of the population at each cycle (Dudley 2007; Hallauer and Darrah 1985; Morais 2005). RS strategies generally begin with the constitution of a population with a broad genetic base, followed by multiple cycles of selection and recombination to achieve continuous genetic gain, while minimizing the loss of genetic variability (Rutkoski, 2019). In the selection of superior genotypes for quantitative traits governed by a large number of loci with small effects, the probability of finding individuals combining all the favorable alleles is very small. The selection and intercrossing of superior individuals within a genetically diverse population increase the frequency of favorable alleles in the offspring, thereby increasing the chances of finding families with a better performance. As a result, the population mean for target traits gradually shifts in the desired direction (Hallauer et al. 2010; Morais Júnior et al., 2015). This is the basis of RS, which is now considered the best strategy for achieving genetic gains for quantitative traits in plant species (Bernardo, 2020; Cobb et al., 2019; Grenier et al., 2015).

Plant breeding demands long-term investment. It is, therefore, important to monitor the effectiveness of breeding strategies continually, through the use of indicators such as genetic gain (Ceccarelli, 2015). Genetic gain is the increase in mean breeding value of a population over at least one cycle of selection for a particular trait or index of traits (Eberhart, 1964; Rutkoski, 2019). Genetic gain can be estimated by a linear regression analysis of mean genetic performance over time. In crop species, realized genetic gain is typically estimated from two different types of information: (i) studies evaluating lines or varieties from different time periods in the same trials, possibly in multiple locations and/or years (known as era studies) or (ii) historical data from a breeding program or from the variety release system (Laidig et al., 2014; Rutkoski, 2019). Phenotypic data stored over time and available for several cycles of selection can be used to estimate the changes associated with past selection, which can guide the orientation of subsequent breeding cycles (Breseghello et al., 2021). The systematic assessment of genetic gains for key traits is, therefore, essential for assessing the efficiency of the breeding scheme (Rutkoski, 2019).

One of the first rice breeding programs based on population improvement through RS began in 1989, through a partnership between Embrapa and CIRAD (Centre de Coopération Internationale en Recherche Agronomique pour le Développement) (Taillebois and Guimaraes, 1989). That initiative had the objective of developing populations with a broad genetic base and a large potential for the generation of promising families. The RS method is used widely in outcrossing species, but much less so in self-pollinating species, due to the difficulty of making a large number of crosses. However, the discovery of recessive genetic male sterility in rice (Fujimaki, 1979) rendered the RS method feasible for this species. The first upland rice population developed at Embrapa was named CNA-IRAT 5, which had a focus on water deficiency tolerance and blast resistance (Taillebois and Guimaraes, 1989). CNA6 was generated in 1993/94 by incorporating the alleles from 27 parents into the CNA-IRAT 5, with the objective of increasing grain yield and drought tolerance.

In this study, our aim was to assess the effectiveness of population improvement via RS for achieving genetic progress, maintaining genetic variability, and increasing the potential for the selection of superior lines. We analyzed data from the last five cycles of the CNA6 population. Genetic parameters for plant height, days to flowering and grain yield were calculated from the data of multi-environment trials performed in the rice-growing regions of Brazil. The contribution of the CNA6 population to the development of elite lines was assessed. Additionally, we suggest adjustments to the RS methodology, with the aim of increasing the efficiency of family selection in the upland rice breeding program.

Materials and methods

Upland rice breeding program structure

The upland rice breeding program at Embrapa has two phases (Fig. 1). The aim of the first phase is to improve the base population through RS. The second phase focuses on product development through pedigree breeding using elite material selected from the improved population and from other breeding programs.

Population improvement

The CNA6 population was created in 1993/94 by crossing 27 parents with the CNA-IRAT 5 population. The CNA-IRAT 5 population was developed in 1984 by crossing 26 japonica lines with the male-sterile mutant of IR 36 (indica), developed by Singh and Ikehashi (1981). This mutant carries the recessive ms male sterility gene, facilitating the recombination between families (Fujimaki, 1979).

The first two breeding cycles for the CNA6 population were based on mass selection in the S0 generation (Table S2). This selection was based mainly on flowering time and plant height, with the aim of eliminating late-maturing and/or tall genotypes. In each cycle, we selected about 100 plants to limit the effects of genetic drift. In the second cycle, we carried out a field evaluation of yield (YLD) based on S0:1 families. The remaining seeds from the selected families were combined to generate the population for the third cycle. This population was then subjected to five cycles of selection, each consisting of the following steps: (i) selection of individual male-sterile plants in the S0 generation; (ii) evaluation of S0:1 families and selection of the top 30%, which were bulk-harvested, while retaining five panicles from each family (each panicle corresponding to an S1:2 progeny) as recombination units; (iii) evaluation of S0:2 families in multisite yield trials; and (iv) recombination of the S1:2 progenies from the best families. Recombination was performed with genetic male sterility (msms genotype) in cycles 3, 4 and 5, and by manual crosses in cycles 6 and 7. The cycle lasted three years for cycles 3 to 6 and four years for cycle 7. The average cycle duration over the five cycles was therefore 3.2 years.

Individual plants from the S0 generation were selected on the basis of plant architecture, disease resistance and grain type (long-slender), according to the requirement of the Brazilian consumer market. Selection in S0:1 families focused on high-heritability traits, including plant height, days to flowering, lodging resistance, disease resistance and grain quality. Selection of S0:2 families focused on grain yield. These target traits for the upland rice breeding program in Brazil have been described in more detail elsewhere (Breseghello et al. 2011; Colombari Filho et al. 2013).

Product development

Embrapa’s product development phase follows a modified pedigree method, with early-generation testing (EGT) as a strategy for selecting segregating families with potential for extraction of high-yielding inbred lines (Barros et al., 2018). The EGT approach is based on the premise that the performance of segregating families in early generations is a good predictor of the mean performance of the inbred lines derived from them (Bernardo, 2003). EGT improves quantitative traits very effectively in elite populations, through the use of selected families with a high additive genetic value as new parents, as in RS, or the selection of inbred lines. This approach combines the evaluation of yield in segregating families and recombination of the best families with elite inbred lines and commercial cultivars, before the advanced inbreeding stage (Barros et al., 2018; van Ginkel and Ortiz 2018). Once inbred lines are obtained, they are evaluated in multisite trials with the objective of selecting candidates for commercial release (Fig. 1). Phenotypic data of the elite lines developed from CNA6 families from multisite trials were used for assessing the potential of CNA6 for the development of upland rice cultivars.

Field experiments and phenotypic traits

Recurrent selection: progeny testing

The experimental data used in this study were generated in multisite trials conducted on S0:2 families from five cycles of RS in the CNA6 population (Table 1). The following traits were analyzed: grain yield (YLD, kg.ha−1), days to flowering (FLW, days) and plant height (PHT, cm). In total, 682 S0:2 families were evaluated in 20 experiments, conducted in the seasons 2000/01, 2003/04, 2006/07, 2009/10 and 2014/15, corresponding to cycles 3, 4, 5, 6 and 7, respectively. Within each cycle, the same set of families was evaluated at all sites. The characteristics of the 20 field experiments are presented in Tables 1 and S1. The geographic area covered by this study encompasses the states with the highest levels of upland rice production in Brazil (Ramirez-Villegas et al., 2018).

Table 1 Characteristics of the yield trials of S0:2 families for each cycle of recurrent selection on the CNA6 population.

In cycles 3 to 6, trials were conducted with an augmented block design (Federer and Raghavarao, 1975). The trials for the seventh cycle had an alpha-lattice design with two replicates per site. Plots consisted of four 5 m rows with an inter-row spacing of 0.30 m, sown at a density of 60 seeds per meter, corresponding to 1200 plants per plot. All trials were rainfed and conducted in aerobic soil during the wet season. Chemical fertilizers were applied to the soil at sowing, at rates considered intermediate-to-high relative to standard farm management. Nitrogen fertilizer was applied as a top dressing at the tillering stage. Weeds were controlled with herbicides, supplemented by hand-weeding when needed. Insecticides were applied to the furrow in most trials, with complementary canopy spraying if required. No fungicides were applied, as genetic resistance to fungal diseases was one of the features evaluated in the trials.

For YLD evaluations, the two central rows were harvested, discarding 0.50 m on each end of the plot. A total of 480 plants per plot (2 rows of 4 meters, with 60 seeds per meter) was used to evaluate grain yield. For PHT evaluations, a mean of five readings for the distance from the base of the plant to the tip of the panicle was obtained. For FLW, the interval between sowing and heading of 50% of the plants was determined.
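The plot dimensions given above imply a harvested area of 2 rows × 4 m × 0.30 m = 2.4 m². The short sketch below reproduces this arithmetic and shows how a plot grain weight could be scaled to kg.ha−1. The function names and the example grain weight are hypothetical, and the conversion ignores any moisture adjustment, which is not described here.

```python
# Plot geometry taken from the text; helper names are illustrative only.
ROW_SPACING_M = 0.30   # inter-row spacing
SEEDS_PER_M = 60       # sowing density along the row


def plants(rows, row_length_m):
    """Number of plants, assuming every sown seed gives one plant."""
    return rows * row_length_m * SEEDS_PER_M


def harvested_area_m2(rows, row_length_m):
    """Ground area represented by the given rows."""
    return rows * row_length_m * ROW_SPACING_M


def plot_yield_kg_ha(grain_g, area_m2):
    """Scale a plot grain weight (g) to kg/ha.

    g/m2 -> kg/ha is a factor of 10 (10,000 m2 per ha / 1,000 g per kg).
    """
    return grain_g / area_m2 * 10.0


whole_plot_plants = plants(4, 5)        # four rows of 5 m
harvested_plants = plants(2, 4)         # two central rows, 0.5 m trimmed at each end
harvest_area = harvested_area_m2(2, 4)  # 2.4 m2

# Hypothetical plot weight of 600 g, used only to show the scaling
example_yield = plot_yield_kg_ha(600.0, harvest_area)
```

With these assumptions, the whole-plot and harvested plant counts match the 1200 and 480 plants stated in the text.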

Selection of families

The best S0:2 families in each cycle were selected for recombination and the generation of a new breeding cycle. Data for the selected families were used to estimate the gain associated with RS. The selection differential (S) was calculated as the difference between the mean for the selected families and the population mean. The response to selection, or selection gain, was calculated by multiplying the selection differential by the heritability of the specific trait (\(R={h}^{2}S\)).
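As an illustration of the two quantities defined above, the following minimal sketch computes a selection differential and the corresponding response to selection. The family means and the heritability are made-up values chosen for easy arithmetic, not data from this study, and the function names are hypothetical.

```python
from statistics import mean


def selection_differential(population_means, selected_means):
    """S = mean of the selected families minus the population mean."""
    return mean(selected_means) - mean(population_means)


def response_to_selection(s, h2):
    """R = h2 * S, the expected gain from selection."""
    return h2 * s


# Hypothetical adjusted family means for grain yield (kg/ha)
population = [2000.0, 2200.0, 2400.0, 2600.0, 2800.0]
selected = [2600.0, 2800.0]  # the two best families

S = selection_differential(population, selected)  # 2700 - 2400 = 300
R = response_to_selection(S, h2=0.25)             # 0.25 * 300 = 75
```

The example makes the role of heritability explicit: with the same selection differential, halving the heritability halves the expected gain.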

Product development: evaluation of advanced lines

Fifty-six inbred lines were derived from CNA6 S0:2 families selected in cycles 3 to 6 and evaluated in multisite yield trials. Given the time required for inbreeding and evaluation, no data were available for lines derived from cycle 7. The lines derived from the CNA6 population were tested in preliminary trials (EP, F7), regional trials (ER, F8) and multisite VCU (value for cultivation and use, F) trials (Fig. 1) in the 2016/17 to 2020/21 cropping seasons (Table 2). Some of the best performing lines were evaluated at several stages. The experimental design was an alpha-lattice with two or three replicates for the EP and ER trials and randomized complete blocks with four replicates for the VCU trials. The plots and the protocol for the trials were as described for progeny testing above. Grain yield (YLD, kg.ha−1), days to flowering (FLW, days) and plant height (PHT, cm) were evaluated. BRS Sertaneja, a control present in cycles 6 and 7 of CNA6, and BRS Esmeralda, present in cycle 7, were also used as controls in the multisite trials of the product development phase.

Table 2 Characteristics of the advanced yield trials including elite lines derived from recurrent selection on the CNA6 population in the product development pipeline. EP, ER and VCU are the preliminary yield trials, regional yield trials and trials assessing value for cultivation and use, respectively.

Statistical analyses

Genetic trend evaluation on the CNA6 population

Statistical analyses were performed to check the quality of each of the 20 trials. We checked the data for outliers with a model-based method deployed within the SAS Proc Mixed procedure, using the "influence" option of the "model" statement. Proc Mixed computes influence diagnostics from the difference between full-data and reduced-data parameter estimates. Based on restricted likelihood distances, Cook’s distance and CovRatio, 0 to 9 plots per trial were identified as outliers and were removed from subsequent analyses. The outliers accounted for less than 3.5% of the total number of plots. For each cycle and each trait, the Pearson correlation coefficients between trials were estimated using adjusted values at the trial level.

Breeding cycles involved different generations of families, and the same set of families was evaluated across all trials, within a given cycle. We performed a two-step analysis: (1) at plot data level and (2) at genotypic mean level. The first step involved analyzing the plot data to extract the variance components and adjusted means of families. Two models were used, depending on the experimental design:

Augmented block design (cycles 3, 4, 5 and 6):

$${Y}_{{ijkl}}=\mu +{T}_{l}+{C}_{k}+{{TC}}_{{kl}}+{b(T)}_{{jl}}+{g(C)}_{{ik}}+{e}_{{ijkl}}$$

Yijkl: Observed phenotype

μ: Grand mean

Tl: Fixed effect of trial l from a given cycle, l ∈ {1, 2, …, L}

Ck: Fixed effect of group k, k being either the group of controls or the group of CNA6 families

TCkl: Interaction between trial l and group k

b(T)jl: Random block effect j nested in trial l, \({b(T)}_{{jl}} \sim N\left(0,{\sigma }_{b}^{2}\right)\)

g(C)ik: Random family or control effect i, nested in group k, \({g(C)}_{{ik}} \sim N\left(0,{\sigma }_{g}^{2}\right)\)

eijkl: Random residual, structured by trial, \({e}_{{ijkl}} \sim N\left(0,{\sigma }_{e}^{2}\right)\), \({\sigma }_{e}^{2}=\left({\sigma }_{e1}^{2},{\sigma }_{e2}^{2},\ldots ,{\sigma }_{{eL}}^{2}\right)\)

Alpha-lattice design (cycle 7):

$${Y}_{{ijkrl}}=\mu +{T}_{l}+{R(T)}_{{rl}}+{C}_{k}+{{TC}}_{{kl}}+{b({RT})}_{{jrl}}+{g(C)}_{{ik}}+{e}_{{ijkrl}}$$

Yijkrl: Observed phenotype

μ: Grand mean

Tl: Fixed effect of trial l from a given cycle, l ∈ {1, 2, …, L}

R(T)rl: Fixed effect of replication r nested in trial l, r ∈ {1, 2}

Ck: Fixed effect of group k, k being either the group of controls or the group of CNA6 families

TCkl: Interaction between trial l and group k

b(RT)jrl: Random block effect j nested in replication r and trial l, \({b({RT})}_{{jrl}} \sim N\left(0,{\sigma }_{b}^{2}\right)\)

g(C)ik: Random family or control effect i, nested in group k, \({g(C)}_{{ik}} \sim N\left(0,{\sigma }_{g}^{2}\right)\)

eijkrl: Random residual, structured by trial, \({e}_{{ijkrl}} \sim N\left(0,{\sigma }_{e}^{2}\right)\), \({\sigma }_{e}^{2}=\left({\sigma }_{e1}^{2},{\sigma }_{e2}^{2},\ldots ,{\sigma }_{{eL}}^{2}\right)\)

The adjusted means of the progenies by cycle were used to estimate the genotypic correlation between traits. For each cycle, the Pearson correlation coefficients were computed using SAS. The variance components extracted from the models described above were used to compute the heritability for a given cycle. The following formula was used: \({H}^{2}=\frac{{\sigma }_{g}^{2}}{{\sigma }_{g}^{2}+{\sigma }_{e}^{2}}\), \({\sigma }_{g}^{2}\) being the variance associated with the genetic effect and \({\sigma }_{e}^{2}\) the variance associated with the residuals, as explained above.
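The heritability formula above can be written as a small helper, sketched below. Because the residual variance is structured by trial, the sketch assumes that a single residual variance (for example, the one for a given trial or a pooled value) is supplied; how the trial-specific variances were combined is not specified here. The function name and the variance values are hypothetical.

```python
def broad_sense_heritability(var_g, var_e):
    """H2 = var_g / (var_g + var_e), as in the formula given in the text."""
    if var_g < 0 or var_e < 0:
        raise ValueError("variance components must be non-negative")
    total = var_g + var_e
    if total == 0:
        raise ValueError("at least one variance component must be positive")
    return var_g / total


# Hypothetical variance components for grain yield, in (kg/ha)^2
h2_example = broad_sense_heritability(var_g=20_000.0, var_e=60_000.0)  # 0.25
```

A small genetic variance relative to the residual variance, as in this toy example, gives the kind of low heritability that limits the response to selection.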

The second step involved combining the genotypic adjusted means of all cycles together and modeling them as a result of the selection process. We studied the trend in this process, by considering a linear regression model with the two groups (CNA6 families and controls) to estimate the intercept and the slope associated with each group. The model can be written as follows: \({\hat{Y}}_{{it}}={\alpha }_{i}+{\beta }_{i}{\bar{Y}}_{t}+{e}_{{it}}\), where αi is the intercept for the group of controls or the CNA6 population, βi is the corresponding slope, \({\bar{Y}}_{t}\) is the estimated mean for cycle t and eit is the residual term. In the case of the CNA6 population, the slope βi can be interpreted as the genetic trend across cycles. It corresponds to the mean unit gain per breeding cycle. In the case of the controls, βi corresponds to the non-genetic trend, mostly related to environmental variations.
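The trend analysis can be illustrated with an ordinary least-squares fit carried out separately for each group. The sketch below is a simplification that regresses group means directly on the cycle number, so that the slope reads as gain per cycle. It is not the SAS analysis used in this study, and the numbers are toy values invented for the example.

```python
def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


cycles = [3, 4, 5, 6, 7]

# Toy adjusted means per cycle (kg/ha); NOT values from the study
toy_means = {
    "CNA6 families": [2300.0, 2750.0, 3300.0, 3650.0, 4200.0],
    "controls": [3050.0, 2980.0, 3100.0, 2950.0, 3040.0],
}

# One (intercept, slope) pair per group
trend = {group: fit_line(cycles, y) for group, y in toy_means.items()}

for group, (alpha, beta) in trend.items():
    print(f"{group}: intercept={alpha:.1f}, slope={beta:.1f} per cycle")
```

Fitting the two groups separately mirrors the interpretation given above: the slope for the families reflects the genetic trend, whereas the slope for the controls captures changes that are mostly environmental.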

Adjusted performances of inbred lines

The inbred lines were evaluated in multisite trials, in different generations and years (Table 2). Single-trial analyses were performed to evaluate the quality of data. Joint analysis was performed for multisite trials per year, using the model below:

$${Y}_{{ijrl}}=\mu +{T}_{l}+{R(T)}_{{rl}}+{b({RT})}_{{jrl}}+{g}_{i}+{{gT}}_{{il}}+{e}_{{ijrl}}$$

Yijrl: Observed phenotype

μ: Grand mean

Tl: Fixed effect of trial l, l ∈ {1, 2, …, L}

R(T)rl: Fixed effect of replication r nested in trial l, r ∈ {1, 2, 3}

b(RT)jrl: Random block effect j nested in replication r and trial l, \({b({RT})}_{{jrl}} \sim N\left(0,{\sigma }_{b}^{2}\right)\)

gi: Fixed effect of genotype i, i ∈ {1, 2, …, I}

gTil: Random interaction between genotype i and trial l, \({{gT}}_{{il}} \sim N\left(0,{\sigma }_{{gT}}^{2}\right)\)

eijrl: Random residual, \({e}_{{ijrl}} \sim N\left(0,{\sigma }_{e}^{2}\right)\), \({\sigma }_{e}^{2}=\left({\sigma }_{e1}^{2},{\sigma }_{e2}^{2},\ldots ,{\sigma }_{{eL}}^{2}\right)\)

For the joint analysis, we used a mixed model approach based on the MIXED procedure (linear mixed model) in SAS (SAS Institute 1985) to obtain the vector of genotype means, adjusted for the trial effect, for the lines and controls evaluated.

Results

Genetic gain in five cycles of recurrent selection on the CNA6 population

The genetic trends for YLD, PHT and FLW from cycle 3 to 7 are shown in Fig. 2. A significant gain was observed for YLD. The mean gain per cycle was 215 kg.ha−1 (Table 3), corresponding to a gain of 9.85% per cycle. Over the five selection cycles, the average cycle length was 3.2 years. This resulted in a gain per year of 67.8 kg.ha−1 (3.08%). The trend observed for the controls was negative but not significant, due to a large variability between controls (Fig. 2 and Table 3). In the first cycle analyzed (cycle 3), the yield for the group of CNA6 families was 2318 kg.ha−1, 24% lower than the mean of the controls, at 3036 kg.ha−1. This was expected, as cycle 3 was the first in which multisite yield trials were performed. However, the relative performance of the families, driven by the selection pressure for YLD, increased steadily over time, and the families outperformed the controls in cycle 7 (+37%, Table 4). In cycle 7, the S0:2 families had a mean yield of 4167 kg.ha−1, whereas the mean yield for the controls was 3038 kg.ha−1. A medium selection intensity was applied, with 47 (23%), 45 (47%), 46 (21%), 25 (38%) and 38 (41%) families selected in cycles 3, 4, 5, 6 and 7, respectively. This selection intensity was applied with the aim of promoting genetic gain, whilst simultaneously preserving the genetic variability of the population for subsequent cycles. The selection differential ranged from 84.2 kg.ha−1 (cycle 5) to 250.6 kg.ha−1 (cycle 7), with a mean value of 152.1 kg.ha−1 per cycle (Table 4). However, the low heritabilities compromised the gains from selection. The gains ranged from 4.8 kg.ha−1 (cycle 5) to 146.3 kg.ha−1 (cycle 7, Table 4). Cycle 5 had the lowest heritability (0.06) and cycle 7 the highest (0.41). This difference in heritability was related to the level of correlation between trials within each cycle. In cycles 3 to 6, the correlations between trials were mostly not significant, with values ranging from −0.18 to 0.37 (Table S3). In cycle 7, the adoption of an alpha-lattice design increased the accuracy of the evaluation, and the correlations were all significant (0.3 to 0.65). However, for all the cycles, the effect of genotype-by-environment interactions was significant (Table S4).
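The relative performance of the families against the controls follows directly from the group means reported above. The short sketch below recomputes the two percentages from those means; the function name is hypothetical.

```python
def relative_difference_pct(value, reference):
    """Difference between value and reference, as a percentage of reference."""
    return (value - reference) / reference * 100.0


# Mean yields (kg/ha) reported in the text for families and controls
cycle3_vs_controls = relative_difference_pct(2318.0, 3036.0)  # about -23.6%
cycle7_vs_controls = relative_difference_pct(4167.0, 3038.0)  # about +37.2%

print(f"cycle 3: {cycle3_vs_controls:+.1f}%")
print(f"cycle 7: {cycle7_vs_controls:+.1f}%")
```

Both results round to the 24% deficit in cycle 3 and the 37% advantage in cycle 7 quoted in the text.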

Fig. 2: Trend analysis of CNA6 families against the controls in cycles 3 to 7.

Each individual data point corresponds to the least squares mean of the controls or the CNA6 families in one trial (20 trials in total). The blue lines represent the linear regression over the cycles.

Table 3 Parameters (intercept and slope) for the linear regression estimated for the CNA6 population and controls.
Table 4 Evolution of the CNA6 population across cycles and the impact of selection on the three traits: grain yield (YLD, kg.ha−1), days to flowering (FLW, days) and plant height (PHT, cm).

The results for FLW and PHT after five cycles of selection revealed a non-significant trend (Table 3, Fig. 2). The estimates of the slope were similar to those of the controls, indicating that most of the change was associated with non-genetic factors. In the case of FLW, an increase of 1.53 days per breeding cycle, or 0.48 days per year, was observed (Table 3). This can partly be explained by the negative genetic correlation between FLW and YLD (−0.2, Table 5). In addition, in the last breeding cycle, the mean FLW for the selected families (82.2 days) was 3 days greater than that for the controls (Table 4). For FLW, phenotyping accuracy was high, with heritabilities ranging from 0.47 to 0.73 (Table 5) and significant correlations between trials for all cycles except cycle 6 (Table S3). In the case of PHT, a decrease of 1.99 cm per cycle was estimated (Table 3, Fig. 2). Only two of the five cycles of selection had a negative selection differential, with a lower PHT for the selected families (Table 4). This was probably related to the positive genotypic correlations found between PHT and YLD (Table 5). The heritabilities were lower than for FLW and ranged from 0.15 to 0.40.

Table 5 Genotypic (above the diagonal) and phenotypic (below the diagonal) correlations between the traits grain yield (YLD, kg.ha−1), plant height (PHT, cm) and days to flowering (DTF, days), by cycle and over the whole period.

Elite lines derived from the CNA6 population

In total, 56 elite lines (at least at the F5:7 generation) developed in the upland rice breeding program were derived from cycles 3 to 6 of the CNA6 population. In the most advanced trial in the product development pipeline (VCU) for the 2020/21 cropping season, 25% of the lines were derived from CNA6 (Table 2). The YLD performances of the lines derived from the CNA6 population were generally similar to those of other lines evaluated at the same stage (Fig. 3). For PHT and FLW, performances were more variable between stages, due to environmental effects (Fig. 3). However, the CNA6-derived lines had a slightly shorter cycle. Several CNA6-derived lines outperformed the best controls (especially BRS Esmeralda) at the EP and ER stages: 12 at EP 16/17, 5 at EP 17/18, 2 at ER 17/18 and 7 at ER 18/19. However, at the VCU 20/21 stage, none of the CNA6-derived lines outperformed the best controls (BRS Esmeralda and BRS A502).

Fig. 3: Distribution of phenotypic values for advanced lines from the product development pipeline.

The data originate from the EP (preliminary multisite trials), ER (regional multisite trials) and VCU (multisite trials of value for cultivation and use) stages. The lines derived from the CNA6 population are shown in orange, other lines are in blue, and controls are in black. The vertical dashed lines indicate the mean value.

Discussion

Genetic progress enabled by recurrent selection

Grain yield

Regular assessment of the genetic gain in breeding programs is a key component of any breeding strategy (Ceccarelli, 2015). Indeed, such assessments can identify the strengths and weaknesses contributing to the realized rate of genetic gain and can guide future breeding activities. Here, the genetic gain for YLD was estimated at 215 kg.ha−1 per cycle (67.8 kg.ha−1.year−1) or 9.85% per cycle (3.08% per year). In absolute terms (kg.ha−1), these values are in the same range as those reported in other studies of RS in rice, whereas in relative terms (%) they are similar or higher. For example, in three cycles of selection based on the evaluation of S0:2 families, Rangel et al. (2002) estimated a genetic gain for YLD of 259.9 kg.ha−1 per cycle, or 4.67%, in lowland rice. Similar values were reported by Breseghello et al. (2009) after three cycles of RS on an upland population. They estimated an annual genetic gain of 103.4 kg.ha−1 (3.6%), with a better mean performance of the selected population than of the controls in the last cycle. More recently, in a study of the CNA12S population for irrigated rice, Morais Júnior et al. (2017) found a rate of genetic gain for yield of 1.98% per year. Our estimate of genetic gain for YLD was obtained over a longer period, with five cycles of selection. This long-term evaluation better captures the progress of the program. For example, selection intensity was not constant over the five cycles of selection. This highlights changes in decision processes, with selection of the best families on different traits and the maintenance of genetic variability over the cycles. Indeed, the variation of selection intensity was associated with the YLD performance of the families in a given cycle, but also with other traits of interest, such as plant architecture, lodging, earliness, disease resistance and grain quality, all of which are fundamental qualities for both rice growers and the agri-food industry. YLD evaluations were initiated in multisite trials in cycle 3, in which low-yielding families were easily detected and eliminated, raising the population mean. The gains observed for traits with a higher heritability (plant height and flowering time) in the S0:2 families were also partly the result of gains achieved through selection in generations S0 and S0:1, as reported by Breseghello et al. (2009).

Drought tolerance is another objective of the upland rice breeding program. In cycle 7, the first selection criterion applied was a grain yield greater than 1500 kg.ha−1 under drought conditions, evaluated in managed stress trials, followed by grain yield in standard conditions. This choice of selection criteria was designed to exert a high selection pressure for drought tolerance in the CNA6 population, thereby improving the general response of the CNA6 population to stress in the next cycle.

Several other studies have analyzed the rate of genetic gain for YLD. The estimates obtained varied greatly, as a function of the breeding program concerned and its specific features (irrigated or upland rice, type of population, breeding objectives, time scales). In most studies, the rate of genetic gain for YLD was between 0.5% and 2% per year (Atroch and Nunes, 2000; Breseghello et al., 2011; Breseghello et al., 1999; da Silva et al., 2021; de Souza et al., 2007; Juma et al., 2021; Kumar et al., 2021; Moura et al., 2021; Muralidharan et al., 2002; Peng et al., 2000). In upland or rainfed rice breeding programs, the highest estimate was that reported by Soares et al. (1999), who found a genetic gain of 46.4 kg.ha−1 (3.37%) for the intermediate-maturity group of the breeding program developed in Minas Gerais State using advanced lines. Conversely, the lowest rate was that reported by Moura et al. (2021), who found a negative genetic gain of −89.8 kg.ha−1 (−2.0%) per year over four years of a pedigree-based breeding program. According to the authors, this negative gain was due to an intensification of selection for grain quality traits. For irrigated rice, Peng et al. (2000) at the International Rice Research Institute (IRRI) used era trials with released varieties to estimate an annual gain of approximately 75 to 81 kg.ha−1, equivalent to 1% for YLD. Lower values were obtained by Juma et al. (2021) over a longer period (1964 to 2014): 8.8 kg.ha−1 to 17.4 kg.ha−1 per year.

Genetic gains for yield have been low since the Green Revolution, as the IR8 cultivar set a high reference yield, with the introduction of a semi-dwarf architecture. Studies estimating long-term gains tend to report a lower relative gain (in %). Muralidharan et al. (2002), for example, quantified the genetic gain in India’s national rice breeding program from 1995 to 2013 and found that there had been no genetic gain. Similarly, Santos et al. (1999) analyzed data for 22 years of the breeding program in Minas Gerais State and found a non-significant gain. One exception is the study by Breseghello et al. (2011), who evaluated the genetic gain resulting from Embrapa’s upland rice breeding program over a period of 25 years, from 1984 to 2009, and found an annual gain of 45.0 kg.ha−1 (1.44%). Given the diversity of breeding programs and of estimates for the genetic gain in grain yield, it is difficult to draw clear conclusions about the determinants of successful breeding strategies. However, shorter breeding cycles, such as those currently applied to the CNA6 population, can efficiently deliver high rates of genetic gain over long time periods.

Plant height and flowering time

YLD and PHT were considered more important than FLW in the first few cycles of selection applied to the CNA6 population. FLW has become more important in recent years, with the prospect of integrating upland rice into sustainable integrated production systems, in which a short cycle is desirable so that different crops can be grown in the same area in a single year. The upland rice breeding program aims to develop cultivars with a PHT between 90 and 100 cm, considering that taller plants tend to lodge, especially if fertility and sowing density are high. Other studies investigating the genetic trend for PHT have generally reported stability or a significant decrease (Breseghello et al. 2011; Breseghello et al. 2009; de Souza et al. 2007; Morais Júnior et al. 2017).

The objective of the Embrapa upland rice breeding program is to develop cultivars with a time to flowering of 50–60 days. However, instead of maintaining or reducing FLW, five cycles of selection increased it by 1.5 days per cycle, although most of this trend was related to non-genetic factors. Breseghello et al. (2009) also reported an annual increase of 0.6 days (0.8%) for the CG3 population. By contrast, Morais Júnior et al. (2017) observed a genetic gain of −6.4 days per cycle (−1.3% per year) for the CNA12 population. The correlations obtained in this study highlight the complexity of multi-trait selection. The use of a well-defined selection index might make it possible to avoid conflicts between the genetic gains for different traits in future cycles (Baker, 2020; Céron-Rojas and Crossa, 2018).

Importance of breeding decisions

Changes in the crossing strategy

In contrast to previous cycles, recombination between selected families was achieved by manual crossing in cycle 6, to generate the population for cycle 7. The decision to change crossing strategy, from the genetic male sterility approach to manual crosses, was prompted by an observed increase in PHT in the population, despite negative selection for this trait. Pollen dispersal from taller plants is more effective than from shorter plants, favoring taller plants over shorter ones (Morais Júnior et al., 2017). Thus, despite selection for shorter plants, taller plants are more likely to pass their alleles on to the offspring for the next cycle of RS during the recombination steps. Male sterility was therefore abandoned for the generation of the population for cycle 7. However, in a program where the number of manual crosses is a limiting factor, manual removal of the tallest plants from the recombination field each cycle may be an alternative.

This change in methodology had a significant impact on the genetic variability of the CNA6 population. The genetic variance of families for YLD in cycle 7 was almost five times that in cycle 6 (Table S4). The manual crossing of the selected families from each subpopulation in a circular diallel scheme, associated with the use of an appropriate number of parents, improves control over population inbreeding, thus helping to maintain genetic diversity. Manual crosses were planned according to the divergence and complementarity between parents. In a study in which SSR markers were used in the first two cycles of RS, Pinheiro et al. (2012) observed a maintenance of genetic diversity in the CNA12 population when a manual crossing strategy was used, contrasting with the loss of diversity observed when random crossing based on male sterility was used.
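A circular crossing plan of the kind mentioned above can be sketched as follows. This is only an illustration: the ring arrangement, the function name `circular_crosses`, the `neighbours` parameter and the family labels are assumptions, not the crossing plan actually used for CNA6, and the sketch ignores the divergence and complementarity criteria applied by the breeders when ordering the parents.

```python
def circular_crosses(parents, neighbours=1):
    """Return a list of (female, male) pairs for a circular partial diallel.

    Parents are arranged on a ring and each one is crossed with the next
    `neighbours` parents along the ring, so that every parent takes part
    in exactly 2 * neighbours crosses. (Hypothetical helper, for illustration.)
    """
    n = len(parents)
    if n < 3:
        raise ValueError("at least three parents are needed to close the ring")
    if not 1 <= neighbours <= (n - 1) // 2:
        raise ValueError("neighbours must be between 1 and (n - 1) // 2")
    crosses = []
    for i in range(n):
        for k in range(1, neighbours + 1):
            # Wrap around the end of the list to close the circle.
            crosses.append((parents[i], parents[(i + k) % n]))
    return crosses


# Hypothetical family labels, not identifiers from the study.
families = ["F1", "F2", "F3", "F4", "F5", "F6"]
plan = circular_crosses(families, neighbours=1)
for female, male in plan:
    print(f"{female} x {male}")
```

Because every parent contributes to the same number of crosses, no family dominates the next cycle, which is one way in which such a scheme helps to control inbreeding.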

Manual crossing also makes it possible for breeders to trace the pedigree within the population, which can be used in mixed models to increase the prediction accuracy of breeding values (Piepho et al., 2008). A knowledge of pedigrees can also enable breeders to make better decisions throughout the process, from the planning of crosses to the release of new cultivars for a target environment. Thus, connecting the genotypes in historical datasets through pedigrees renders phenotypic data more useful (Breseghello et al., 2021). This is crucial for single-step genomic prediction models and can facilitate the implementation of genomic selection in breeding programs (Legarra et al., 2014; Morais Júnior et al., 2018b).

Change in trial design for progeny testing

The first four cycles of selection were performed with an augmented block design, with no within-site replication. In the seventh cycle, an alpha-lattice design with two replicates was used. The augmented block design resulted in a lower precision for the phenotypic evaluation, consequently decreasing the gain due to selection. The gain in precision achieved with the lattice design resulted in a two- to six-fold increase in heritability for YLD. Augmented block designs are widely used for early-stage evaluations in breeding programs, for which the number of genotypes is high (Clarke and Stefanova, 2011; Zystro et al., 2018). At early stages, breeders usually favor the number of genotypes over the number of replicates for a given number of plots. However, the allocation of resources between genotypes, controls and replicates is always a difficult trade-off, because more controls or replicates generally result in a better estimation of phenotypic performance (Mead, 1997). In this study, the observed benefits of the lattice design with two replicates support the change in experimental design to favor a greater genetic gain. The lattice design is now the standard design for multisite trials with S0:2 families in the Embrapa rice breeding program.
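The effect of replication on precision can be illustrated with the standard entry-mean heritability for a single site, h² = σ²g / (σ²g + σ²e / r), where r is the number of replicates. The sketch below is a simplification under stated assumptions: the variance components are made-up values, not estimates from this study, and the formula ignores genotype-by-site interaction and the additional error control provided by incomplete blocks.

```python
def entry_mean_heritability(var_g, var_e, n_reps):
    """Entry-mean heritability at a single site.

    var_g  : genetic variance among families
    var_e  : residual (plot error) variance
    n_reps : number of replicates per family
    """
    if var_g < 0 or var_e < 0:
        raise ValueError("variance components must be non-negative")
    if n_reps < 1:
        raise ValueError("n_reps must be at least 1")
    return var_g / (var_g + var_e / n_reps)


# Illustrative variance components only (assumed, not from the study).
var_g, var_e = 1.0, 4.0
for reps in (1, 2, 3):
    h2 = entry_mean_heritability(var_g, var_e, reps)
    print(f"replicates = {reps}: h2 = {h2:.2f}")
```

With these assumed values, going from one to two replicates raises h² from 0.20 to about 0.33. Replication alone therefore does not account for a two- to six-fold increase; a larger genetic variance or a smaller residual variance would also be needed to produce a change of that size.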

Connectivity between sites and years is a key aspect in the estimation of realized genetic gain in breeding programs (Rutkoski, 2019). The maintenance of benchmark varieties for the S0:2 trials is therefore essential, to ensure that the analysis is robust, as in this study. The use of varieties as controls in multisite trials provides a link between different cycles, facilitating the evaluation of genetic gain. However, varieties that are grown for many years tend to become more susceptible to disease, which may prevent the meaningful phenotyping of other important traits and increase the magnitude of the nongenetic trend (Fischer and Edmeades, 2010; Mackay et al., 2011; Piepho et al., 2014). It was for this reason that BRS Esmeralda, released in 2013, was introduced as a control in cycle 7, whereas BRS Primavera, which was present in the last five cycles of selection, will not be included as a control in cycle 8. The comparison of population means with relevant controls over several years is essential. For example, in cycle 7, the mean YLD of the CNA6 population exceeded that of most of the controls, including BRS Esmeralda, a high-yielding variety released in 2013 (Colombari Filho et al., 2013). The selected families were 45% more productive than the controls, revealing a considerable potential for the eighth cycle of breeding.

Integration of recurrent selection and product development

The potential of the CNA6 population to provide high-quality variation for the product development pipeline is demonstrated by the presence of lines originally from CNA6 in the VCU trial, the last step in the upland rice breeding program. These elite lines were generated from CNA6 families used as parents in crosses combining traits to generate segregating populations and, subsequently, lines with a high potential. These lines were tested in multisite trials of the upland rice breeding network in Brazil for several generations, with an increasing level of selection pressure. At the last stage, lines are selected not only for YLD, PHT and FLW, but also for grain quality, disease resistance, lodging and water deficiency tolerance. The presence of these lines in the VCU trials, after several generations of increasing selection pressure, demonstrates their potential to become new cultivars for release into the market. The increase in the potential of the population to generate superior lines observed here results from a displacement of the population mean, or genetic gain, in the desired direction. For example, the families selected in cycle 7 were 45.4% more productive than the controls, including BRS Esmeralda, the last cultivar released from the Embrapa upland rice breeding program. Fifteen families from cycle 7 of the CNA6 population were used as new parents in crosses for the product development pipeline.

Next steps in the breeding strategy

In RS programs, the rate of genetic gain can be increased by accelerating the breeding cycles, intensifying the selection pressure, improving the precision of the evaluation, or by any combination of such changes (Bernardo, 2020; Cobb et al., 2019). Off-season nurseries can be used to reduce the cycle time in plant breeding. Nonetheless, in RS schemes based on progeny testing, it is advisable to grow more than one crop per year only for the recombination step and seed multiplication, because plant and progeny selection must be performed during the normal crop season to ensure that the conditions are representative of the target environments (Heinemann et al., 2015). The constraints on the use of off-season nurseries and selection pressure intensification imposed by the evaluation of a larger number of families can be overcome by the use of modern technologies. Population improvement by RS is highly flexible and provides opportunities for the combination of classical genetic improvement tools with molecular genetics or genomics (Bartholomé et al., 2022; Martínez et al., 2014). This integration of innovative technologies into the process has a considerable potential to render rice breeding faster and more effective, with a more intensive exploration of the tremendous genetic diversity present within this species (Wang et al., 2018). One potential alternative for the acceleration of genetic gains would be the use of genomic recurrent selection (GRS), as recently evaluated (Baertschi et al., 2021; Morais Júnior et al., 2018a; Spindel et al., 2015). This approach can accelerate the gain due to selection per unit time by greatly shortening the breeding cycle and through the accurate selection of plants at early stages (e.g., S0 or S0:1 generations) for immediate recombination (Gaynor et al., 2017). 
GRS would also make it possible to apply a high selection intensity, given that large populations can be developed and genotyped in a short time and at reasonable cost. This approach therefore has the potential to promote efficient and cost-effective breeding programs for self-pollinated species, such as rice, as demonstrated in other studies (Ahmadi et al., 2020; Bartholomé et al., 2022). The selection of families with a high potential by GRS will also benefit the product development pipeline (Gaynor et al., 2017). The best families would be identified even earlier, making it possible to use them directly for product development, thereby increasing the chances of developing high-value cultivars for commercial use. The integration of GRS in the Embrapa rice breeding program should make it possible to increase the rate of genetic gain in the future.
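The levers discussed in this section (cycle length, selection intensity and precision) can be related through the classical breeder's equation, in which the expected gain per year is i · r · σa / L, with i the selection intensity, r the selection accuracy, σa the additive genetic standard deviation and L the cycle length in years. The sketch below is a minimal illustration under the usual assumption of truncation selection on a normally distributed criterion; the function names and all numerical values are hypothetical and are not estimates from the CNA6 population.

```python
from statistics import NormalDist


def selection_intensity(selected_fraction):
    """Standardized selection intensity for truncation selection.

    Assumes a normally distributed selection criterion and a large population.
    """
    if not 0.0 < selected_fraction < 1.0:
        raise ValueError("selected_fraction must be strictly between 0 and 1")
    std_normal = NormalDist()
    # Truncation point above which the selected fraction lies.
    threshold = std_normal.inv_cdf(1.0 - selected_fraction)
    return std_normal.pdf(threshold) / selected_fraction


def expected_gain_per_year(selected_fraction, accuracy, sigma_a, cycle_years):
    """Expected genetic gain per year from the breeder's equation."""
    if cycle_years <= 0:
        raise ValueError("cycle_years must be positive")
    intensity = selection_intensity(selected_fraction)
    return intensity * accuracy * sigma_a / cycle_years


# Hypothetical scenarios: same accuracy and variance, different cycle length
# and selected fraction (sigma_a in kg/ha).
slow = expected_gain_per_year(0.20, accuracy=0.6, sigma_a=300.0, cycle_years=4.0)
fast = expected_gain_per_year(0.10, accuracy=0.6, sigma_a=300.0, cycle_years=2.0)
print(f"longer cycle, 20% selected: {slow:.1f} kg/ha per year")
print(f"shorter cycle, 10% selected: {fast:.1f} kg/ha per year")
```

In practice, the accuracy of genomic predictions may differ from that of phenotypic progeny testing, so the benefit of a shorter cycle has to be weighed against any loss of accuracy.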

Conclusion and prospects

The application of five cycles of RS to the CNA6 population resulted in significant genetic gains for grain yield. The integration of this strategy into a two-phase breeding program also led to improvement of quantitative traits while selecting for other traits of interest, such as grain quality, disease resistance and pest resistance.

The efficiency of the RS method can be further improved by the integration of molecular markers to increase selection intensity and to reduce cycle length, to ensure the faster delivery of more diverse material with better performances to the product development pipeline.