Introduction

Heterosis is still the main reason for the success of the commercial maize (Zea mays L.) industry. Therefore, it is of particular interest to identify those genetic factors contributing to hybrid performance (HP) and/or heterosis, and a suitable method that could predict HP and/or heterosis with some accuracy before field evaluation of test hybrids. In maize, the main strategy that has been followed towards new ways of hybrid prediction during the last decade is based on the ‘distance’ model: heterosis, defined and measured as the superiority of the hybrid over the midparent (the average performance of the two parents of the hybrid), is related to the genetic divergence between its parental lines (Lee et al., 1989). The potential of this ‘distance’ model-strategy has been extensively tested in maize, where genetic distances were computed from RFLP data on parental inbreds (Lee et al., 1989; Melchinger et al., 1990a,b, 1992; Smith et al., 1990; Dudley et al., 1991; Boppenmaier et al., 1993; Burstin et al., 1995) and recently from PCR-based AFLP® markers (Ajmone Marsan et al., 1998). The general tendency found was that the prediction efficiency of the ‘distance’ model is high when (i) hybrids between related lines (intraheterotic crosses) and (ii) hybrids between both related and unrelated lines (intra- and interheterotic crosses) are considered. However, correlations between genetic distances of unrelated lines only and their respective interheterotic crosses, were of low practical predictive value. This tendency is in good agreement with quantitative-genetic expectations (Charcosset & Essioux, 1994), ascribing the failure of the ‘distance’ model for interheterotic crosses to the fact that linkage associations between markers and quantitative trait loci (QTL) generally differ randomly from one heterotic group to the other. 
Because only the interheterotic crosses are of commercial importance and of interest to the breeder, the practical value of the ‘distance’ model-approach is limited.

Other, more recent strategies for predicting HP, especially between unrelated lines, were proposed by Bernardo (1994) and Charcosset et al. (1998). The first method, based on best linear unbiased prediction (BLUP), uses covariances between HPs, estimated with marker data on parental inbreds, to predict the performance of an untested hybrid from the performance of related, tested hybrids. The second method is based on the principle that two hybrids whose parents are similar at the marker level should display similar specific combining ability (SCA) values. Markers are used to generate covariates for SCA by means of principal components analysis.

In maize, many studies have been conducted to identify and map QTL for grain yield (GY) and yield components (Edwards et al., 1987; Stuber et al., 1992; Zehr et al., 1992; Beavis et al., 1994; Veldboom et al., 1994; Ajmone Marsan et al., 1995; Austin & Lee, 1996; Cockerham & Zeng, 1996; Eathington et al., 1997; Austin & Lee, 1998). These studies suggested strongly that there are multiple QTL affecting GY throughout the genome. The results are generally in favour of the hypothesis of dominance of favourable alleles to explain the observed heterosis in GY, although overdominance at individual QTL (Stuber et al., 1992) and epistasis cannot be ruled out.

In this paper, we present a novel approach towards the prediction of HP and heterosis. This approach is based on (i) the assessment of associations between AFLP markers and HP or SCA, respectively, across a set of hybrids, and (ii) the assumption that the joint effect of genetic factors determined this way can be obtained by addition. The chromosomal position of the loci involved in HP or heterosis is assumed to be in tight linkage with the marker locus, as loose trait locus–marker associations will have been broken up by accumulated recombination events during the establishment of the inbred lines. At the same time, because the map position of the selected markers is known, putative QTL affecting the trait of interest are identified.

Materials and methods

Plant material

Six inbred lines from Iowa Stiff Stalk Synthetic (BSSS), five from Lancaster Sure Crop (LSC) and two of miscellaneous origin (MO) were chosen as parents for a half-diallel mating design (Fig. 1). The six BSSS and the five LSC inbred lines from the half-diallel were also tested against Lo881 and C103 (LSC testers) and B14A and B73 (BSSS testers), respectively. Another 16 parental testcrosses were obtained by testing eight BSSS inbreds (Lo999, N28, A1, A2, A3, A4, A5 and A8) against Lo881 and C103. The pedigree backgrounds of all inbreds are given in Vuylsteke et al. (2000).

Fig. 1

 Schematic representation of the half-diallel mating design, involving six inbred lines from Iowa Stiff Stalk Synthetic (BSSS), five inbred lines from Lancaster Sure Crop (LSC) and two of miscellaneous origin (MO) chosen as parents. Intraheterotic crosses are marked by ‘×’, interheterotic crosses are marked by ‘’.

Field trials

The 78 single-cross hybrids from the half-diallel and the 38 parental testcrosses were evaluated in 1994 at three different sites (Bergamo, Luignano and Turano) for GY (t ha–1 at 15.5% moisture). The experimental design is described in Ajmone Marsan et al. (1998).

Data handling

The half-diallel and the testcrosses had 22 interheterotic F1 data in common. These duplicate and reciprocal F1 data were averaged. For the application of the ‘distance’ model, all the 78 F1 data were considered. Diallel analysis was performed according to Griffing (1956) Model I of Method 4, excluding parents and reciprocals, partitioning the performance of the hybrid ($Y_{ij}$) between inbreds i and j classically as:

$$Y_{ij} = \mu + gca_i + gca_j + sca_{ij}$$

where μ is the mean of the HPs, $gca_i$ and $gca_j$ are the GCAs of the inbreds i and j, respectively, and $sca_{ij}$ is the SCA between inbreds i and j. GCA and SCA variances were highly significant (P < 0.01) and of a similar order of magnitude.
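
The partition into μ, GCA and SCA can be illustrated with a minimal sketch. It assumes the standard least-squares estimators for a half-diallel without parents or reciprocals (Griffing's Method 4); the function name, the data layout and the toy values are hypothetical and are not taken from the study.

```python
from itertools import combinations


def griffing_method4(y, lines):
    """Estimate mu, GCA and SCA from a half-diallel (no parents, no reciprocals).

    y maps frozenset({i, j}) to the mean performance of the cross i x j.
    """
    p = len(lines)
    # Total of every cross that involves line i.
    totals = {i: sum(y[frozenset((i, j))] for j in lines if j != i) for i in lines}
    # Grand total, each cross counted once.
    grand = sum(y[frozenset(pair)] for pair in combinations(lines, 2))
    mu = 2.0 * grand / (p * (p - 1))
    gca = {i: (p * totals[i] - 2.0 * grand) / (p * (p - 2)) for i in lines}
    sca = {
        frozenset((i, j)): y[frozenset((i, j))]
        - (totals[i] + totals[j]) / (p - 2)
        + 2.0 * grand / ((p - 1) * (p - 2))
        for i, j in combinations(lines, 2)
    }
    return mu, gca, sca


# Toy, purely additive data (hypothetical values): every SCA should vanish.
true_gca = {"A": 1.0, "B": -1.0, "C": 2.0, "D": -2.0}
lines = sorted(true_gca)
y = {
    frozenset((i, j)): 10.0 + true_gca[i] + true_gca[j]
    for i, j in combinations(lines, 2)
}
mu, gca, sca = griffing_method4(y, lines)
```

With purely additive toy data the estimated GCAs reproduce the simulated ones and all SCA estimates are zero, which is a convenient sanity check before applying such a routine to real cross means.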

Because the breeder is interested only in interheterotic crosses, and in order to remove the influence of group effects (intra- vs. interheterotic groups) on further analyses, e.g. marker selection, the intraheterotic group crosses (BSSS × BSSS; LSC × LSC) were excluded, reducing the dataset to 53 hybrids (Fig. 1). A simple ANOVA test revealed a significant difference (P < 0.001) between the intra- and interheterotic HP and SCA for GY (data not shown).

Considering the 53 interheterotic F1 data only, the GCAs of the parental lines were adjusted for the contribution of the other lines to the mean of the line in question, as there are a small number of parental lines (Falconer, 1989):

where $gca_i$ is the GCA of the inbred i, n is the number of lines crossed with inbred i, ī is the mean performance of parental line i and μ is the mean of the HPs of (i) BSSS × LSC and BSSS × MO crosses, when inbred i is a BSSS inbred line, (ii) LSC × BSSS and LSC × MO crosses, when inbred i is an LSC inbred line, and (iii) BSSS × MO and LSC × MO crosses, when inbred i is of miscellaneous origin (Fig. 1). If there is no dominance or epistasis, the performance of the hybrid from a cross between the ith female and the jth male is predicted by

$$\hat{Y}_{ij} = \mu + gca_i + gca_j$$

Any significant deviation of the observed $Y_{ij}$ from this prediction, referred to as SCA, must be caused by dominance or epistatic effects.

AFLP® and methylation AFLP® analysis

The 13 inbred lines were assayed for their respective AFLP and methylation AFLP profiles as described in Vuylsteke et al. (in press). A total of 1385 AFLP markers (592 EcoRI/MseI (E/M), 532 PstI/MseI (P/M) and 261 mPstI/MseI (mP/M) markers) out of 1539 AFLP markers mapped on the B73 × Mo17 Recombinant Inbred (RI) high-density AFLP linkage map (Vuylsteke et al., 1999), were chosen for further analysis.

‘Distance’ model

Genetic distances (GD) between pairs of inbred lines were calculated from AFLP data as the complement to the genetic similarity coefficient originally devised by Jaccard (1908). Analogous to the partitioning of HP into GCA and SCA of parents, the GD values associated with the 78 F1 hybrids were partitioned as:

$$GD_{ij} = \overline{GD} + GGD_i + GGD_j + SGD_{ij}$$

with the analogous interpretation of general and specific genetic distances (Melchinger et al., 1990b). Linear correlations were calculated for various combinations of HP, SCA, GD and SGD for the 78 F1 hybrids.
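
As an illustration of the distance measure, the following minimal sketch computes GD as one minus the Jaccard similarity of two presence/absence profiles. The function name and the two toy profiles are hypothetical; real AFLP profiles would contain one entry per scored marker.

```python
def jaccard_distance(profile_a, profile_b):
    """Return 1 - Jaccard similarity for two equal-length 0/1 marker profiles."""
    # Bands present in both lines.
    shared = sum(1 for a, b in zip(profile_a, profile_b) if a and b)
    # Bands present in at least one line; joint absences are ignored.
    either = sum(1 for a, b in zip(profile_a, profile_b) if a or b)
    if either == 0:
        return 0.0  # No band in either line: treat the profiles as identical.
    return 1.0 - shared / either


# Toy profiles (hypothetical): two shared bands out of four scored in either line.
line_1 = [1, 1, 0, 1, 0]
line_2 = [1, 0, 0, 1, 1]
gd = jaccard_distance(line_1, line_2)
```

Because joint absences are not counted, the coefficient suits dominant markers such as AFLPs, where the shared absence of a fragment is weak evidence of similarity.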

Selection of markers using the Kruskal–Wallis test

To find markers that are significantly associated with HP across the 53 F1s, the Kruskal–Wallis rank sum test, one of the three QTL mapping methods handled by MapQTL™ (van Ooijen & Maliepaard, 1996), was used as a nonparametric statistical method. Given the parental molecular genotypes, the molecular genotypes of the 53 hybrid combinations were inferred and converted to the input file structure of MapQTL (van Ooijen & Maliepaard, 1996): A for homozygous absence, H for heterozygosity and B for homozygous presence of the marker allele. Besides the genotype information at each locus, the map position of the loci and the quantitative data are needed as input for the Kruskal–Wallis test as performed by MapQTL. The output of the Kruskal–Wallis test lists for every locus (sorted according to the map) the name of the locus and its map position, the number of informative individuals, the Kruskal–Wallis test statistic and corresponding P-value and, for each genotypic class, the genotype, the mean rank, the arithmetic mean and the number of individuals in the class. B73 and Mo17 will be referred to as the ‘origin’ of a selected marker allele when the AFLP marker has been identified as a B73 or Mo17 marker, respectively, on the B73 × Mo17 RI linkage map (Vuylsteke et al., 1999).
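
The inference of hybrid genotypes from parental profiles can be sketched as follows. The sketch assumes that every inbred parent is fully homozygous at each marker locus, so that presence of the fragment in a parent means MM and absence means mm; the function name and the toy profiles are hypothetical.

```python
def hybrid_genotype(parent1_present, parent2_present):
    """Code an F1 genotype as A, H or B from two parental presence/absence scores."""
    if parent1_present and parent2_present:
        return "B"  # Both parents carry the fragment: homozygous presence.
    if not parent1_present and not parent2_present:
        return "A"  # Neither parent carries it: homozygous absence.
    return "H"      # Exactly one parent carries it: heterozygous.


# Toy parental profiles over four loci (hypothetical values).
parent_1 = [1, 1, 0, 0]
parent_2 = [1, 0, 1, 0]
f1_codes = [hybrid_genotype(p, q) for p, q in zip(parent_1, parent_2)]
```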

In order to keep the overall false positive rate low, stringent significance levels of 0.001 and 0.005 were used in the selection of markers significantly associated with HP and SCA, respectively, across the 53 F1s. Only those loci significant at the 0.001 and 0.005 level, respectively, for which all individuals were informative and all three genotypic classes were represented by at least one individual, were retained for further analysis.
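
The selection step for a single locus can be sketched as below. This is not MapQTL's implementation: it computes the textbook Kruskal–Wallis statistic with average ranks for ties (no tie correction) and, as an assumption, uses the asymptotic chi-square approximation with two degrees of freedom for the P-value, which applies only when exactly three genotypic classes are present. All names and toy data are hypothetical.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2.0 + 1.0
        for k in order[pos:end + 1]:
            ranks[k] = mean_rank
        pos = end + 1
    return ranks


def kruskal_wallis(genotypes, phenotypes):
    """Return the Kruskal-Wallis statistic H and the class sizes for one locus."""
    ranks = average_ranks(phenotypes)
    n = len(phenotypes)
    rank_sums, counts = {}, {}
    for g, r in zip(genotypes, ranks):
        rank_sums[g] = rank_sums.get(g, 0.0) + r
        counts[g] = counts.get(g, 0) + 1
    spread = sum(rank_sums[g] ** 2 / counts[g] for g in counts)
    h = 12.0 / (n * (n + 1)) * spread - 3.0 * (n + 1)
    return h, counts


def p_value_three_classes(h):
    """Chi-square upper tail with 2 d.f., valid for three classes only."""
    return math.exp(-h / 2.0)


# Toy locus (hypothetical): nine hybrids, three per genotypic class.
codes = list("AAAHHHBBB")
yields = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0]
h_stat, class_sizes = kruskal_wallis(codes, yields)
# Retain the locus only if all three classes are represented.
all_classes_present = all(class_sizes.get(g, 0) >= 1 for g in "AHB")
significant = all_classes_present and p_value_three_classes(h_stat) < 0.001
```

With so few individuals per class the chi-square approximation is rough, so the threshold comparison here only illustrates the filtering logic.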

Model for the prediction of HP

For each selected marker, the additive (a) and dominance (d) effects are estimated from the arithmetic means MM, mm and Mm of the genotypic classes B, A and H, respectively. M is considered as the marker allele represented by an AFLP fragment, while m indicates the absence of that marker allele (i.e. m encodes one or different other alleles at the same marker locus).
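
A minimal sketch of this estimation step follows. It assumes the conventional single-locus definitions, in which a is half the difference between the two homozygote means and d is the deviation of the heterozygote mean from the homozygote midpoint; the function name and the class means are hypothetical.

```python
def additive_dominance(mean_MM, mean_Mm, mean_mm):
    """Return (midpoint, a, d) from the three genotypic class means."""
    midpoint = (mean_MM + mean_mm) / 2.0  # Mean of the two homozygotes.
    a = (mean_MM - mean_mm) / 2.0         # Additive effect of allele M.
    d = mean_Mm - midpoint                # Dominance deviation.
    return midpoint, a, d


# Toy class means in t/ha (hypothetical values).
c_l, a_l, d_l = additive_dominance(mean_MM=12.0, mean_Mm=11.5, mean_mm=10.0)
# Partial dominance in the sense used here means 0 <= |d| < |a|.
partial_dominance = 0.0 <= abs(d_l) < abs(a_l)
```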

If the trait is controlled by $n_l$ loci acting independently (no epistasis), the genotypic value of the F1 from the cross i × j can be written, using the notation of Hayman (1954), as:

$$Y_{ij} = C + \sum_{l=1}^{n_l} \left[ \theta_{il}\, a_l + \left(1 - \theta_{il}^2\right) d_l \right]$$

with $C = \sum_{l=1}^{n_l} c_l$, the mean of all homozygotes over all loci, $a_l$ and $d_l$ the additive and dominance effects, respectively, for each locus l, and $\theta_{il}$ representing the genotype of hybrid ij at locus l, which takes the value −1, 0 and +1 for genotypes mm, Mm and MM, respectively.

Finally, a hybrid value $TCSM_{ij}$ can be calculated for any hybrid as $TCSM_{ij} = Y_{ij} - C$, representing the total contribution of the selected markers (TCSM) in terms of their $a_l$ and $d_l$ estimates. Different TCSMs can be calculated as a function of the significance level used in selecting markers, resulting in a $TCSM_{0.001}$, $TCSM_{0.0005}$ and $TCSM_{0.0001}$. Linear regression of the HP on the TCSM results in a model for the prediction of the HP.
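
The calculation of a TCSM value can be sketched as follows, assuming the usual Hayman-type coding in which a locus with genotype code θ contributes θ·a + (1 − θ²)·d, so that homozygotes contribute ±a and heterozygotes contribute d. The function name, the A/H/B lookup and the two toy loci are hypothetical.

```python
# Genotype codes: A = mm (theta -1), H = Mm (theta 0), B = MM (theta +1).
THETA = {"A": -1, "H": 0, "B": 1}


def tcsm(genotypes, effects):
    """Sum the marker contributions theta*a + (1 - theta**2)*d over loci.

    genotypes is a sequence of A/H/B codes, effects a sequence of (a, d) pairs.
    """
    total = 0.0
    for code, (a, d) in zip(genotypes, effects):
        theta = THETA[code]
        total += theta * a + (1 - theta ** 2) * d
    return total


# Two toy loci (hypothetical effects): M is favourable at the first locus,
# m is favourable at the second.
effects = [(1.0, 0.5), (-2.0, 0.3)]
best_homozygote = tcsm("BA", effects)   # Favourable allele fixed at both loci.
double_heterozygote = tcsm("HH", effects)
upper_bound = sum(abs(a) for a, _ in effects)
```

Under partial dominance (|d| < |a| at every locus) no genotype can exceed the sum of the absolute additive effects, which is why that sum serves as an upper bound for the TCSM.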

Note that, as a parental line is supposed to be either homozygous for the absence or homozygous for the presence of the marker alleles showing significant association with QTL of the trait of interest, its TCSM per se reduces to

$$TCSM_i = \sum_{l=1}^{n_l} \theta_{il}\, a_l$$

Model for the prediction of GCA

Analogous to the partitioning of HP into GCA and SCA of parents, the $TCSM_{ij}$ of the hybrid $Y_{ij}$ between inbreds i and j can be written as:

$$TCSM_{ij} = \mu + GCSM_i + GCSM_j + SCSM_{ij}$$

with the analogous interpretation of general and specific contributions of selected markers. The GCSM of a line i is calculated as the deviation of its mean from the overall mean μ, adjusted for the contribution of the other lines to the mean of the line in question (Falconer, 1989) in an analogous way as for GCA (Fig. 1).

Linear regression of the GCAs of the parental lines calculated for the trait of interest on the GCSMs results in an additive model for the prediction of the GCA.

An ‘expected’ TCSM of the hybrid from a cross between the ith female and the jth male can now be calculated as

$$\widehat{TCSM}_{ij} = \mu + GCSM_i + GCSM_j$$

Models for the prediction of SCA

There are two alternative models for the prediction of SCA based on selected markers.

1 The difference between the calculated and ‘expected’ TCSM for a hybrid results in an estimation of the SCSM of the two parental lines in combination. Linear regression of the SCAs of the hybrids calculated for the trait of interest on the SCSMs results in a first model for the prediction of the SCA.

2 In a way similar to finding markers significantly associated with HP, markers associated with SCA can be selected. The estimates of al and dl of the marker alleles selected as being significantly associated with SCA are used to calculate a TCSM value of any hybrid. Linear regression of the SCAs on the TCSMs results in a second model for the prediction of the SCA.

Allelic divergence among groups

Allelic divergence (ald) among groups of inbreds at the marker loci and the QTL produces linkage disequilibrium between marker loci and QTL involved in SCA (Charcosset & Essioux, 1994). Because specific heterotic groups like BSSS and LSC have been classified on the basis of intra- and interheterotic heterosis, these groups should differ for their allelic frequencies at the QTL that exhibit dominance effects. As we were able to determine group membership of the parental inbreds, allelic divergence among the two major groups for the markers showing significant association with SCA has been calculated as follows:

where f1 and f2 are the allelic frequencies at the marker locus in group 1 (BSSS) and group 2 (LSC), respectively. High ald values must (i) provide evidence for the correlation between SCA and heterozygosity at marker loci, and (ii) support the linkage association between the selected marker loci and QTL exhibiting dominance effects. The allelic frequency at the marker locus in the third group containing the two parental lines of miscellaneous origin, was left out of consideration.
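
A minimal sketch of such a divergence measure is given below. It assumes, purely for illustration, that ald is the absolute difference between the two group frequencies expressed as a percentage; this reading is compatible with values such as 80 and 100 for groups of six and five lines, but it is an assumption. Function and variable names are hypothetical.

```python
def allelic_divergence(group1_scores, group2_scores):
    """Return 100 * |f1 - f2| for two lists of 0/1 marker scores (assumed form)."""
    f1 = sum(group1_scores) / len(group1_scores)  # Frequency in group 1.
    f2 = sum(group2_scores) / len(group2_scores)  # Frequency in group 2.
    return 100.0 * abs(f1 - f2)


# Toy scores (hypothetical): fragment present in all six lines of group 1
# and in one of the five lines of group 2.
bsss_scores = [1, 1, 1, 1, 1, 1]
lsc_scores = [1, 0, 0, 0, 0]
ald = allelic_divergence(bsss_scores, lsc_scores)
```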

Evaluation of the model for the prediction of HP

A first type of cross-validation used to evaluate the additive model for prediction of HP was a jack-knife sampling procedure. The jack-knife sampling procedure requires the partition of the initial set of N hybrids into (i) a set of N − 1 predictor hybrids used for parameter estimation and (ii) one ‘removed’ hybrid used to compare predicted HP with observed HP. At each iteration, the selection of markers associated with HP and the calculation of the corresponding TCSM were repeated. Evaluation of prediction efficiency was made by examining plots of observed vs. predicted values and two synthetic statistics: (i) the standard error (SE) estimated as:

where $s^2$ is the sample variance, n is the number of observations, $x_0$ is the predicted value, $\bar{x}$ is the mean of the observed values and $x_i$ is the observed value i; (ii) the coefficient of determination ($r^2$, squared correlation coefficient) between observed and predicted values. The minimum value SE can reach is equal to σ, because a new hybrid will show some variation around the regression, equal to at least the residual variance $\sigma^2$. A second source of variation to be taken into account is the inaccuracy of the regression line: the estimates of the regression coefficients are stochastic, as they are based on a limited set of observations.
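
The leave-one-out logic can be sketched as follows. The sketch is deliberately simplified: it refits only the regression of HP on a fixed TCSM at each iteration, whereas the procedure described above also repeats marker selection and TCSM calculation. The standard error uses the textbook form for a new observation in simple linear regression, s·sqrt(1 + 1/n + (x0 − x̄)²/Σ(xi − x̄)²), which is an assumption made for this example. All names and data are hypothetical.

```python
import math


def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def leave_one_out(x, y):
    """Predict each y[k] from a line fitted to all other observations."""
    predictions = []
    for k in range(len(x)):
        intercept, slope = fit_line(x[:k] + x[k + 1:], y[:k] + y[k + 1:])
        predictions.append(intercept + slope * x[k])
    return predictions


def r_squared(observed, predicted):
    """Squared Pearson correlation between observed and predicted values."""
    n = len(observed)
    mo, mp = sum(observed) / n, sum(predicted) / n
    cov = sum((o - mo) * (p - mp) for o, p in zip(observed, predicted))
    var_o = sum((o - mo) ** 2 for o in observed)
    var_p = sum((p - mp) ** 2 for p in predicted)
    return cov ** 2 / (var_o * var_p)


def prediction_se(x, y, x0):
    """Assumed textbook SE of a new observation predicted at x0."""
    n = len(x)
    intercept, slope = fit_line(x, y)
    mean_x = sum(x) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    residual_ss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    s2 = residual_ss / (n - 2)  # Residual variance.
    return math.sqrt(s2 * (1.0 + 1.0 / n + (x0 - mean_x) ** 2 / sxx))


# Toy data (hypothetical): HP is an exact linear function of TCSM.
tcsm_values = [1.0, 2.0, 3.0, 4.0, 5.0]
hp_values = [2.0 * t + 1.0 for t in tcsm_values]
loo_predictions = leave_one_out(tcsm_values, hp_values)
loo_r2 = r_squared(hp_values, loo_predictions)
se_example = prediction_se([1.0, 2.0, 3.0], [1.0, 2.0, 4.0], 2.0)
```

The term 1 under the square root is what keeps the SE from falling below the residual standard deviation, and the remaining two terms reflect the uncertainty of the fitted line.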

A second type of cross-validation to evaluate the additive models for prediction of HP is by linear regression of the HP of additional single crosses on their corresponding TCSM. In this study, the 16 parental test crosses were chosen. Evaluation of prediction efficiencies was made by examining plots of observed vs. predicted values and the corresponding r2 values.

All computations in modelling HP, GCA and SCA, and in cross-validating HP were performed using the GENSTAT program (Genstat-5-Committee, 1993).

Results

Relationship of genetic distance to HP and SCA

The estimates of linear correlations (r) of GD and SGD calculated from the total marker dataset with HP and SCA for GY, respectively, are presented in Table 1. It must be emphasized that the results obtained from the BSSS × BSSS and LSC × LSC groups of crosses, although these groups are of minor interest, should be interpreted with caution because of their small number of observations. The r-value of GD with HP for the entire set of 78 hybrids was highly significant (P < 0.001) but of moderate size (0.48). By contrast, a lack of relationship was noted between GD and HP in the three subsets of crosses. The r-value of SGD with the SCA effect was high and positive for the entire set of crosses and for the BSSS × BSSS subset (0.81 and 0.86, respectively) and highly significant (P < 0.001). The correlation found in the subset of unrelated lines (0.64) was also highly significant (P < 0.001) and of high magnitude. The r-values of GD and SGD calculated from the total marker data set with HP and SCA for GY, calculated from the 78 F1 data from the half-diallel only, were similar to those reported by Ajmone Marsan et al. (1998) (data not shown).

Table 1 Linear correlations of genetic distance (GD) and specific genetic distance (SGD) based on the total marker data set, with hybrid performance (HP) and specific combining ability (SCA) for grain yield, for the total set of 78 single crosses and for different subsets of single crosses

Prediction of Hybrid Performance, GCA and SCA for GY

Table 2 gives an output list of the 20 marker alleles selected as being significantly (P < 0.001) associated with QTL alleles contributing to GY, as well as their corresponding map position, ‘origin’, Kruskal–Wallis test statistic and the corresponding a- and d-values. It is clear from the a- and d-values that the marker alleles selected for HP for GY fit single-gene models with additive and partial dominance effects (0 ≤ |d| < |a|).

Table 2 Map position, ‘origin’, Kruskal–Wallis test statistic (K), the means for the three genotypic classes and the a- and d- effects for the 20 marker alleles selected as being significantly (P < 0.001) associated with QTL alleles contributing to the hybrid performance for grain yield, across the 53 interheterotic crosses

The selected markers are clearly confined to particular regions of chromosomes, rather than being evenly distributed across the entire maize genome (residing on eight of the 10 chromosomes). Only one putative QTL of GY was revealed on 1/94 where Mo17 contributed the superior allele. In contrast, B73 contributed the superior allele at the putative QTL on 4/56.2-58.0, 5/20.1, 6/10.3-10.8, 6/64.7-68.4, 8/124.6 and 9/54.1.

The r-values of TCSM with HP for GY (0.79, 0.78 and 0.77, respectively) calculated for different numbers of selected markers (20, 16 and 7, respectively) are very highly significant (P < 0.001) and of a much higher magnitude than the r-values of HP with GD (Fig. 2).

Fig. 2

 Observed hybrid performance (t ha–1 at 15.5% moisture) vs. the total contribution of the markers selected at a significance level of 0.001 (TCSM0.001). The 53 interheterotic crosses are considered. The straight line represents the linear regression of hybrid performance on TCSM0.001.

The cross Lo881 × Lo951, whose GY is amongst the highest (12.94 t ha–1 at 15.5% moisture), has the highest $TCSM_{0.001}$ value (23.65) for GY among the 53 hybrids (data not shown). The maximal $TCSM_{0.001}$ value that can be reached, based on the maximal contribution of each selected marker listed in Table 2, equals 23.65 ($\sum_{l=1}^{20} |a_l|$). This means that no additional gain in GY is possible using the QTL detected in the germplasm under consideration.

The r-values of GCSM with GCA for GY (0.88, 0.88 and 0.87, respectively) for the 13 inbred lines, calculated for different numbers of selected markers (20, 16 and 7, respectively) are very highly significant (P < 0.001) (Fig. 3).

Fig. 3

 General combining ability (GCA) (t ha–1 at 15.5% moisture) vs. the general contribution of the markers selected at a significance level of 0.001 (GCSM0.001). The 13 inbred lines are considered. The straight line represents the linear regression of GCA on GCSM0.001.

The r-values of SCSM with SCA for GY (first model for the prediction of the SCA) calculated for different numbers of selected markers (20, 16 and 7, respectively) are very highly significant (P < 0.001) but of moderate size (0.49, 0.47 and 0.48, respectively) and of a lower magnitude than the r-values of SGD with SCA.

Table 3 gives an output list of the 25 marker alleles selected as being significantly (P < 0.005) associated with QTL alleles contributing to SCA for GY, as well as their corresponding map position, ‘origin’, Kruskal–Wallis test statistic and the corresponding a- and d-values. It is clear from the a- and d-values that the marker alleles selected for SCA for GY fit single-gene models with only overdominance effects (|d| > |a|). Again, the selected markers are confined to particular regions of the chromosomes, rather than being evenly distributed across the entire maize genome (they reside on seven of the 10 chromosomes). The selected markers show positive overdominance (e.g. 1/74.7) as well as negative overdominance (e.g. 1/53.1). Where positive overdominance occurs, the superior allele originates evenly from B73 and Mo17. Simultaneous fit of the 25 selected marker alleles (second model for the prediction of the SCA) accounted for 36.8% of the SCA variance among the 53 hybrids, which is more than is explained by SCSM but less than is explained by SGD.

Table 3 Map position, ‘origin’, Kruskal–Wallis test statistic (K), the means for the three genotypic classes, the a- and d-effects and the allelic divergence (ald) for the 25 marker alleles selected as being significantly (P < 0.005) associated with QTL alleles contributing to the specific combining ability for grain yield across the 53 interheterotic crosses

All of the selected markers but three show a high ald value (either 80 or 100). These high ald values are in good agreement with the hypothesis that heterotic groups should differ for their allelic frequencies at the QTL that exhibit dominance effects, or at the marker loci tightly linked to these QTL (Charcosset & Essioux, 1994).

Evaluation of the additive model for prediction of HP

Cross-validation of the additive models for prediction of HP by a jack-knife sampling procedure showed that the highest efficiency for prediction of HP of GY was reached when considering only the markers selected at a significance level of P < 0.001 (Table 4; Fig. 4). In this situation, HP predicted by the model explained 45.1% of the variation observed. The 13 hybrids with a predicted HP value ≥12.0 t ha–1 at 15.5% moisture had an observed value ≥11.5 t ha–1 at 15.5% moisture. The corresponding mean SE of predicted vs. observed values was 0.88 t ha–1 at 15.5% moisture. The ratio between SE and the total range of variation that was observed (7.5–13.88 t ha–1 at 15.5% moisture), and the fact that most of the best single crosses are identified, suggest that prediction based on the TCSM model is highly efficient for a preliminary screening of test hybrids before field evaluation. Note that cross-validation of the prediction model by a jack-knife procedure involved the 53 interheterotic hybrids of the analysed 13 by 13 half-diallel.

Table 4 Coefficient of determination (r2) between observed and predicted hybrid performance (HP) of (a) the 53 hybrids forming part of the 13 by 13 half-diallel and (b) 16 hybrids partly related to the 13 by 13 half-diallel; the corresponding mean standard error (SE) and the empirical standard deviation of SE (within brackets) over the 53 and 16 cross-validations

Fig. 4

 Observed vs. predicted performance (t ha−1 at 15.5% moisture) of maize hybrids based on 52 predictor hybrids used for parameter estimation (jack-knife sampling procedure), considering the selected markers at a significance level of P < 0.001 (Table 2). The 53 interheterotic hybrids are considered. The straight line represents the predicted=observed equation.

Cross-validating the additive model for prediction of HP by linear regression of the HP of the 16 parental testcrosses on their corresponding TCSM, the highest efficiency (r2=33.0%; Table 4; Fig. 5) was reached by simultaneous fit of the 20 selected marker alleles, given in Table 2. Note that here, in contrast with the cross-validation by jack-knifing, the prediction model is evaluated using hybrids of which only one parent (the LSC tester) is forming part of the 13 by 13 half-diallel. Despite a moderate r2 value, Fig. 5 shows that the best single crosses are identified. This suggests that prediction, based on the al and dl estimates of the 20 markers selected at P < 0.001 (Table 2), is efficient as a preliminary screening of related test hybrids before field evaluation.

Fig. 5

 Observed vs. predicted performance (t ha−1 at 15.5% moisture) of the 16 hybrids partly related to the 13 by 13 half-diallel, based on the selected markers at a significance level of P < 0.001 (Table 2). The straight line represents the predicted=observed equation.

Discussion

In good accordance with published results (Lee et al., 1989; Melchinger et al., 1990a,b, 1992; Smith et al., 1990; Dudley et al., 1991; Boppenmaier et al., 1993; Burstin et al., 1995; Ajmone Marsan et al., 1998), estimates of the GD between parents did not consistently identify the best crosses, particularly not when the two parents were unrelated lines. By contrast, correlations between SGD and SCA found in the subset of unrelated lines were highly significant (P < 0.001) and of a high magnitude, suggesting practical utility in predicting SCA effects. Differences between our results and those reported by Ajmone Marsan et al. (1998) are caused solely by differences in field data.

The segregational resolution of conventional segregating populations (e.g. F2, BC) is too low to distinguish tightly linked from less tightly linked QTL–marker combinations. The approach to QTL identification followed in the present study has the following potential advantages over QTL detection in a segregating population. First, marker–trait associations are only expected to be found when a marker is tightly linked to a QTL. This is because, across a set of lines, associations between QTL and loosely linked markers will be nonexistent because of accumulated recombination events during the establishment of the lines. Basically, the type of associations we have identified are caused by identity by descent of QTL and marker alleles across lines. Secondly, only a limited number of lines representing the gene pool used by the breeder need to be genotyped. And thirdly, it may allow the detection of QTL that vary across a wide spectrum of the germplasm used. The possible advantages are not easily generalized, especially because the joint identity by descent of alleles of linked loci depends on factors that are largely unknown for most germplasm collections, i.e. the number of generations since the descent from a common ancestor and the amount of exchange between lines of descent by crossing in the past.

The a- and d-values of the markers selected for HP for GY indicate that QTL with additive to partial dominance effects are prevalent. Although the magnitude of the genetic effects for any single QTL contributing to GY can vary considerably, the joint added contribution of single QTL involved in GY explains 59.3–62.4% of the HP variance. Cross-validation of the prediction efficiency of the TCSM model for HP showed that the best crosses were identified, suggesting that the TCSM-approach is efficient as a preliminary screening of test hybrids before field evaluation. The higher prediction efficiency of the TCSM model in comparison with the ‘distance’ model can be explained as follows: rather than converting molecular polymorphism between inbred lines, having direct or no direct effect on the trait of interest, into the metric GD, only specific markers were selected that were supposed to be linked to loci that affect the quantitative trait of interest. Hybrids heterozygous for marker loci significantly associated with SCA, often show a higher GY than hybrids homozygous for those marker loci. This pattern may result from either true overdominance (i.e. particular single loci at which the heterozygote phenotype exceeds that of either homozygote), epistasis or pseudo-overdominance (i.e. closely linked loci at which alleles with dominant or partially dominant advantageous effects are in repulsion phase). With more than one QTL linked to the marker, epistatic effects modify the additive and dominance effects or pseudo-overdominance results. Although all QTL were detected at marker loci and deliver (in this way) the maximal genetic information, and an extensive dissociation of alleles at linked loci is most likely represented in the inbred lines, our results still cannot distinguish these possibilities.

If the joint effect of multiple QTL involved in the heterotic response of GY is additive, 36.8% of the SCA variance among the 53 hybrids can be explained, which is more than is explained by SCSM (22.1–24.0%) and almost equal to what is explained by SGD (40.1%). The high ald value of the marker alleles showing significant association with the SCA effects is consistent with the fact that the process of inbreeding and selection by which lines are commonly developed (i) generates allelic divergence among groups at the marker loci and the QTL involved in the trait of interest, and (ii) produces linkage disequilibrium between marker loci and QTL involved in SCA (Charcosset & Essioux, 1994).

The aim of the breeder is to accumulate in the same genotype the maximum number of favourable genes. Because all putative QTL mentioned in Table 2 show additive to partial dominance gene effects, fixing one or more of the favourable QTL alleles in the inbred line is desirable. More than GCSM, the TCSM value per se of an inbred line is suited to monitoring the improvement of an inbred line by fixation of favourable alleles and, subsequently, marker-assisted selection, as the TCSM per se value can be calculated directly from the genotype of the inbred line.

Although direct comparisons of QTL are complicated by differences in parental lines, design of the cross, number of progeny and the environments in which the progeny was assessed, as well as by different marker loci and QTL detection methods, other reports have identified some of the same regions detected in the present study to be associated with GY. Austin & Lee (1996, 1998) detected GY QTL on 5S/umc72, 6L/bnl5.47-npi280 and 8L/umc7 that were also associated with GY in the present study (5S/20.1; 6L/64.7-68.4; 8L/124.6-124.7). Zehr et al. (1992) also reported the GY QTL on 6L, showing marker association with umc38a. Another GY QTL reported by Zehr et al. (1992) was associated with umc44 on 2S, likely to coincide with 2S/53.4 in our study. Ajmone Marsan et al. (1995) reported a major GY QTL associated with umc051 on chromosome 6, which is in the vicinity of the putative QTL on 6C/10.3-10.8 found in our study. Another GY QTL, found on 9C/54.1–58.6, is likely to coincide with the GY QTL found on 9C by Stuber et al. (1992), Zehr et al. (1992) and Ajmone Marsan et al. (1995). Finally, Ajmone Marsan et al. (1995) also detected a GY QTL in the interval 4L/umc42-umc19, associated with GY in the present study (4C/56.2-58.0). Besides the agreement in chromosomal location for a few GY QTL, agreement in origin of the superior and inferior allele is present only for the QTL on 4C, 9C, and on 2S, respectively. Of the chromosomal regions selected as being significantly associated with QTL contributing to the SCA for GY, one (9/58.6:E35/M50-228.1) was also selected as being significantly (P < 0.001) associated with a QTL contributing to the HP for GY in this study, and four (1/53.1, 1/69.0, 2/70.4 and 10/37.7) were associated with heterosis for GY in Stuber et al. (1992).

The efficiency of the TCSM method as a prediction method and/or as a QTL screening method may be increased by fulfilling the following requirements. (i) A higher marker density on the map will allow the detection of more marker alleles tightly linked to specific QTL, consolidating already identified QTL or identifying new putative QTL. Where a higher marker density can be obtained by intensifying the mapping efforts, a higher number of specific alleles per locus can be obtained by integrating linkage maps covering different genomes. (ii) A more reliable and easier evaluation of the effect of a QTL allele will be obtained when the three genotypic classes are represented more equally. This balance can be obtained by enlarging the half-diallel. And (iii) yield data of hybrids, available from multiple trials carried out across different locations and years, are highly desirable in order to reduce the phenotypic variance, representing a gain in accuracy.