 Methodology
 Open access
 Published:
Revisiting superiority and stability metrics of cultivar performances using genomic data: derivations of new estimators
Plant Methods volume 20, Article number: 85 (2024)
Abstract
The selection of highly productive genotypes with stable performance across environments is a major challenge of plant breeding programs due to genotypebyenvironment (GE) interactions. Over the years, different metrics have been proposed that aim at characterizing the superiority and/or stability of genotype performance across environments. However, these metrics are traditionally estimated using phenotypic values only and are not well suited to an unbalanced design in which genotypes are not observed in all environments. The objective of this research was to propose and evaluate new estimators of the following GE metrics: Ecovalence, Environmental Variance, Finlay–Wilkinson regression coefficient, and Lin–Binns superiority measure. Drawing from a multienvironment genomic prediction model, we derived the best linear unbiased prediction for each GE metric. These derivations included both a squared expectation and a variance term. To assess the effectiveness of our new estimators, we conducted simulations that varied in traits and environment parameters. In our results, new estimators consistently outperformed traditional phenotypebased estimators in terms of accuracy. By incorporating a variance term into our new estimators, in addition to the squared expectation term, we were able to improve the precision of our estimates, particularly for Ecovalence in situations where heritability was low and/or sparseness was high. All methods are implemented in a new Rpackage: GEmetrics. These genomicbased estimators enable estimating GE metrics in unbalanced designs and predicting GE metrics for new genotypes, which should help improve the selection efficiency of highperformance and stable genotypes across environments.
Background
The performance of cultivars is dependent on selective breeding or genetic modification of a specific genotype to display distinct agronomic traits. Such performance varies, given different locations and management conditions (e.g. temperature, rainfall, or nitrogen content), a phenomenon referred to as genotypebyenvironment (GE) interaction [1]. As this interaction complicates the selection of suitable genotypes for quantitative traits, it delays the cultivation and distribution of new crop varieties [2]. A good characterization of GE interactions is fundamental to a better understanding of the relationship between genetic and environmental factors that shape complex crop traits [3]. Multienvironment trials (MET), or experimental sets that evaluate a population of genotypes in various conditions, help characterize GE interactions and facilitate the decisionmaking process of breeders [4, 5].
Several metrics have been proposed over the years to characterize the stability of genotypes across multiple environments. Two concepts of stability have been established: static stability, in which a stable genotype maintains a constant performance regardless of the effect of the environment, and dynamic stability, in which the genotype follows environmental response with constant deviation [6]. Accordingly, we can categorize GE metrics as static, such as Environmental Variance defined as the variance in performance of a genotype across environments [7], or dynamic such as Ecovalence defined as the contribution of a genotype to the GE sum of squares of an analysis of variance (ANOVA) [8]. Additionally, the Finlay–Wilkinson regression coefficient of the performance of a genotype on environmental means can be used to assess stability according to the two concepts: static around zero and dynamic around one [9]. A cultivar’s potential is assessed not only by the stability of its performance across environments but also by its superiority to other cultivars. Although mean cultivar performance remains the gold standard in breeding, other measures have been proposed, such as Lin–Binns superiority measure [10]. In the latter, cultivars that perform particularly poorly in one or more environments are penalized in comparison to those with average performance. Stability and superiority measures are traditionally estimated using phenotypic observations only. In practice, the estimates of GE combinations are obtained from a linear model with fixed genotyped and environment effects [6, 11].
Linear mixed models have been employed to estimate breeding values (BVs) considering genotype effects as random [12, 13]. The model’s advantages include shrinkage towards the mean when the reliability of the BV estimates are low, the capability to handle unbalanced data, and the ability to estimate genetic and nongenetic components of variance [14]. They have proven their worth across diverse datasets, estimating multigeneration indices across years and environments [15], dealing with highly unbalanced experimental design as prep [16], and estimating quantitative genetics model parameters using Bayesian inference [17].
Using dense molecular marker panels, genomic prediction has been suggested to predict the breeding value of unobserved genotypes, which revolutionized selection methods for quantitative traits [18]. The incorporation of a genomic relationship kernel into mixed models has amplified genomic selection efficiency [19]. In the context of METs, GE interactions can be viewed as resulting from the differential expression of quantitative trait loci (QTL) in relation to environmental variables. The mixed model’s flexibility with genomic data aids in predicting GE effects and unobserved genotype performance in different environments [20,21,22,23,24].
In the prevailing genomic era, utilizing only adjusted means from a standard linear model to estimate traditional GE metrics does not fully exploit the extensive information available. For instance, [25] proposed a Bayesian approach to estimate the Finlay and Wilkinson regression coefficient while taking genomic information into account. Alternatively, one approach to refine GE metric estimates is to use environmentspecific GEBVs (envGEBVs, i.e. genomic BLUPs of environmentspecific BVs) instead of merely relying on adjusted means. Within the linear mixed model context, the expectation of a random term, given observed data, equates to its BLUP [13, 26]. From this definition, we can compute a genomic BLUP for GE metrics that encompasses both a squared expectation component and a variance component. As far as we are aware, no previous work has presented or evaluated these new GE metric estimators.
In this study, we (i) derived novel estimators for GE stability and superiority metrics, comparing their efficacy in approximating the true GE metric values  those theoretically derived from environmentspecific breeding values (envBVs)—against traditional phenotypebased estimators; (ii) assessed how various traits and environmental parameters influence the performance of these estimators; and (iii) determined the predictability of GE stability and superiority metrics for genotypes not observed in any of the environments. Our analyses utilize three public empirical datasetsmaize, oat, and sorghumand are further supported by simulations.
Methods
Derivation of GE metric estimators
In this study, we focused on five GE metrics used to assess the stability and superiority of genotype performances across environments. To develop new estimators, our approach consisted of (i) presenting formulas to calculate true GE metrics, i.e. the values that one would obtain by calculating them directly using envBVs, (ii) proposing a genomic prediction model adapted to METs data, and (iii) calculate the expectation of GE metrics conditional on the phenotypes based on the genomic prediction model, i.e. BLUPs of the GE metrics. According to this approach, GE metrics are defined as random variables. Note that they could also be defined as fixed or variance model parameters [27,28,29]. The advantage of treating GE metrics as random variables is that it eliminates the need to fit a model that specifically accounts for stability parameters. Instead, GE metric estimates can be obtained from any multienvironment genomic prediction model that considers GE interactions.
GE metrics
Let us assume a METs dataset with N genotypes evaluated in J environments. Each genotype i has an envBV \(G_{ij}\) specific to each environment j, as a result of environmental effects and GE interactions. In what follows, the average over N genotypes or over J environments is indicated by a dot in the BV subscripts, e.g. the BV corresponds to the average envBV of a genotype i over environments is referred to as \(G_{i.}=\frac{1}{J}\sum ^J_{j=1}G_{ij}\), and the average BV over all genotypes is referred to as \(G_{..}=\frac{1}{JN}\sum ^N_{i=1}\sum ^J_{j=1}G_{ij}\).
The first GE metric \(L_{i}\) considered in our study is the Lin–Binns superiority measure [10]:
which aims, for each genotype i, to average over all environments the squared difference between its envBV and that of the bestperforming/reference genotype \(r_{j}\) in environment j. Compared to a simple average of genotype performances over environments, it tends to penalize genotypes that perform particularly poorly in given environments.
The second GE metric \(S_i\) is a measure of static stability named Environmental Variance [7]:
which aims to calculate the variance of the performance of each genotype i over environments.
The third GE metric \(W_i\) is a measure of dynamic stability named Ecovalence [8]:
which aims to quantify the contribution of a genotype to the GE sum of squares of an ANOVA, i.e. to which extent envBVs deviate nonuniformly from the mean of each environment.
The last GE metric \(B_{i}\) presented in our study is the Finlay–Wilkinson regression coefficient [9]:
which aims, for each genotype i, to regress envBVs on the environment means.
Genomic prediction model
We decomposed each envBV \(G_{ij}\) into an environment mean parameter \(\mu _j\) and a centered envBV \(U_{ij}\):
We then modeled the phenotypes as [30]:
where \(\varvec{y}^T=\left( \varvec{y}_1^T,...,\varvec{y}_J^T\right)\) is the concatenated vector of phenotypes in J environments with P plots each, \(\varvec{\beta }=(\mu _1,...,\mu _J)^T\) is the vector of fixed environment means, \(\varvec{X}\) is the design matrix for fixed effects, \(\varvec{u}^T=\left( \varvec{u}_1^T,...,\varvec{u}_J^T\right)\) is the concatenated vector of random centered envBVs with \(\varvec{u}\sim \mathscr {N}(0,\varvec{\Sigma }_{G})\) and \(\varvec{\Sigma }_{G}\) being the covariance matrix of \(\varvec{u}\), \(\varvec{Z}\) is the incidence matrix linking phenotypic observations to environmentspecific breeding values, \(\varvec{e}\) is the vector of errors with \(\varvec{e}\sim \mathscr {N}(0,\varvec{\Sigma }_{E})\) and \(\varvec{\Sigma }_{E}\) being the covariance matrix of \(\varvec{e}\). Independence is assumed between \(\varvec{u}\) and \(\varvec{e}\), and the covariance matrix between phenotypic observations is: \(\varvec{\Sigma }_{Y}=\varvec{Z}\varvec{\Sigma }_{G}\varvec{Z}^T+\varvec{\Sigma }_{E}.\)
Let us assume \(\varvec{\Sigma }_{G}=\varvec{\Omega }_{G}\bigotimes \varvec{A}\) where \(\varvec{A}\) is the genomic relationship matrix between genotypes (or any genetic relationship matrix between them), and \(\varvec{\Omega }_{G}\) is the genetic covariance matrix between environments used to account for GE interactions. Similarly, let us assume \(\varvec{\Sigma }_{E}=\varvec{\Omega }_{E}\bigotimes \varvec{I}_P\) where \(\varvec{I}_P\) is the identity matrix of dimension P and \(\varvec{\Omega }_{E}\) is the error covariance matrix between environments. Note that \(\varvec{\Omega }_{E}\) can be written using a Kronecker product as the number of plots/observations P is considered constant in all environments. We will assume the following form for \(\varvec{\Omega }_{G}\) and \(\varvec{\Omega }_{E}\):
where \(\sigma _{G_{j}}^2\) is the genetic variance in environment j, \(\sigma _{G_{j,j'}}\) is the genetic covariance between environments j and \(j'\), and \(\sigma _{E_{j}}^2\) is the error variance in environment j. Let us also define \(\rho _{j,j'}=\dfrac{\sigma _{G_{j,j'}}}{\sigma _{G_{j}}\sigma _{G_{j'}}}\) as the genetic correlation between environments j and \(j'\).
BLUP of GE metrics
Let us first consider the envGEBVs (i.e. the BLUPs of envBVs), which are the cornerstone of MET genomic prediction. They can be obtained by considering the distribution of \(G_{ij}\) conditional on \(\varvec{y}\), which is Gaussian with expectation:
and variance:
In practice, \(\varvec{\beta }\) is unknown and is thus replaced by its best linear unbiased estimate (BLUE):
which, after replacing fixed effects by their estimates, gives the following formula for envGEBVs:
with:
Note that \(P_{ij,ij}\) in Eq. (8) can be seen as the prediction error variance (PEV) associated with the prediction of \(G_{ij}\) for which \(\varvec{\beta }\) would be known [i.e. not estimated using Eq. (9)].
The BLUP of the average envBVs (Average) over environments can be calculated as:
Let us now consider a squared linear combination of envBVs named \(Q^2\). One can calculate the BLUP of \(Q^2\) and split it into two terms following the König–Huygens theorem:
with a first term corresponding to the squared expectation of Q conditional on phenotypes (i.e. the squared BLUP of Q) and a second term corresponding to the variance of Q conditional on phenotypes. This result can be used to calculate the BLUP of each GE metric, which involves quantities previously presented (see matrix derivations in Appendix and algebraic derivations in Supplementary File S1):
and
and
and
For the BLUP of the first three GE metrics, the first part of their expression corresponds to the squared expectation term and consists of replacing envBVs in Eq. (1), Eqs. (2) and (3) by envGEBVs. The rest of the expression corresponds to the variance term and involves summing elements of the \(\varvec{P}\) matrix from Eq. (8). For the BLUP of Finlay–Wilkinson, which consists of a ratio between two terms, a simplifying assumption was made that the expectation of this ratio was approximately equal to the ratio of the expectation of the two terms. Each term of the ratio was then considered similarly to the other GE metrics. Note that this approximation corresponds to the first order Taylor series approximation of the BLUP of Finlay–Wilkinson [31].
In this study, we evaluated three estimators for each GE metric. First, we considered the traditional “NoGeno” estimator, which consists of replacing \(G_{ij}\) by estimates based solely on phenotypic information, i.e. BLUEs of genotypic effects, in Eqs. 1–4. Based on our theoretical results, we then proposed two novel estimators accounting for genomic information: “GenoExp” which incorporates only the squared expectation term of Eqs. 14–17, and “GenoExpVar” which encompasses all quantities from the abovementioned formulas. Note that only the “GenoExpVar” estimators are unbiased in the sense that the expectation (on the phenotypes) of the BLUPs of GE metrics equals the expectation of the GE metrics [32], e.g. \(\textrm{E}_{\varvec{y}}\left[ \textrm{E}\left( L_{i}\varvec{y}\right) \right] =\textrm{E}\left( L_{i}\right)\).
Materials
To validate the new GE estimators, we utilized both simulated and empirical datasets. We performed simulations based on the model in Eq. (6), varying the following parameters: (i) the standard normal deviation of the environment means (\(\sigma _{\mu }\)), where \(\mu _{j} \sim \mathscr {N}(100, \sigma ^2_{\mu })\) independent and identically distributed (IID), (ii) the genetic correlation between 10 environments pairwise (\(\rho _{j,j'}\)), (iii) the heritability in each environment (\(h^2_j\)), where \(h_{j}^2 = c + d_j\) with a constant c and a deviation \(d_j\sim \mathscr {U}(0.2, 0.2)\hspace{2pt}\) IID, and (iv) the sparsness of the dataset defined as the percentage of missing genotypeenvironment combinations. In this study, to examine the influence of the simulation parameters, we first considered a basic scenario with intermediate parameter levels: \(\sigma _{\mu }\) = 1, \(\rho _{j,j'}\) = 0.5 for all \(j\ne j'\), \(h_{j}^2 = 0.5 + d_j\), and sparseness = 0%. We studied the influence of each parameter by modulating them one by one, considering the following levels: \(\sigma _{\mu } \in \{0.1,1,10\}\), \(\rho _{j,j'} \in \{0.2,0.5,0.8\}\) for all \(j\ne j'\), \(c \in \{0.2,0.5,0.8\}\) for \(h_{j}^2\), and sparseness \(\in \{0\%,25\%,50\%,75\%\}\) (Supplementary Table S1).
For simulated traits, the vector of random centered envBVs was generated using the product between the Cholesky decomposition of \(\varvec{\Sigma }_{G}\) (scaled with \(\sigma ^2_{G_{j}}=1\) for all j and \(\rho _{j,j'}\) chosen according to the simulation scenario) and a vector of independent draws from a standard normal distribution [30]. As genomic data, we used data from a 200genotyped wheat population, which consisted of 1279 SNPs obtained through diversity array technology (DArT) sequencing [33].
Three publicly available datasets were used to validate the reliability of our novel GE metric estimators, as presented in Table 1. These datasets span various species, genotypes, and environments, all genotypedbysequencing. They vary in the number of genotypes (111 to 699), environments (3 to 16), and molecular markers (17,288 to 58,960). For each dataset, yield and plant height traits were evaluated in randomized complete block designs in each environment. Adjusted means were calculated to correct for withinenvironment spatial/design effects and were used directly as response variables in the genomic prediction model. Further details on the experiments and adjusted means are available in Table 1.
Workflow scheme
In this study, the workflow was divided into four stages: input data, parameter inference, GE metrics estimation, and estimator performance (Fig. 1). The specific details are expounded upon as follows:

Input data: Three empirical datasets were considered (Table 1), as well as simulated datasets considering combinations of trait and environment parameter levels: \(\sigma _\mu\), \(\rho _{j,j'}\), \(h_{j}^2\), and sparseness (Supplementary Table S1). A detailed description of the input datasets and the simulation procedure is presented in the previous section.

Parameter inference: The second phase involved adjusting the model from Eq. (6) on the input data. From this model, estimates were obtained for fixed effects \(\varvec{\beta }\) and for the \(\varvec{\Omega }_{G}\) and \(\varvec{\Omega }_{E}\) variance components, which are used to calculate \(\varvec{\Sigma }_{G}\), \(\varvec{\Sigma }_{E}\), and \(\varvec{\Sigma }_{Y}\). A Bayesian multivariate approach based on a Gibbs sampler was used to infer parameters [37].

GE metrics estimations: The third step consisted of estimating the GE metrics. New estimators require the calculation of envGEBVs (Eq. 10) as well as the conditional variance of envBVs (Eq. 8). Two genomicbased estimators were derived: one based only on the squared expectation term (GenoExp) and a second also involving the variance term (GenoExpVar). Note that the phenotypebased estimator (NoGeno) only requires phenotypic data. For the latter, missing genotypeenvironment combinations were imputed with the average value of the environment, unlike genomicbased estimators for which they could always be predicted.

Estimator performance: The fourth step entailed comparing the different estimators (NoGeno, GenoExp, and GenoExpVar) in terms of the precision of their estimates. Their performance was evaluated based on the correlation and the rootmeansquared error (RMSE) between the GE metric estimates and either (i) the true GE metrics based on envBVs for simulated datasets or (ii) the GEmetrics calculated based on adjusted means (i.e. NoGeno estimates) for empirical datasets. Therefore, the NoGeno estimators were not evaluated for empirical datasets, as these estimates were used as reference values for validation. Using simulation, a selection coincidence was determined by calculating the proportion of common genotypes obtained with a given selection intensity (e.g. 10% best) between genotypes selected based on GE metric estimates and GE metric true values. For the genomic prediction of GE metrics of nonobserved genotypes, crossvalidation (CV) was performed by repeatedly splitting the datasets into training and test sets, considering 50 repetitions. The CV approach consisted of discarding all phenotypic data associated with test genotypes, commonly referred to as CV1 in MET experiments [20]. Different training set sizes were defined in terms of percentage of the full dataset (25%, 50%, and 75%) and precision criteria were calculated based on the test set only.
Statistical software
We conducted all analysis using the R programming language version 4.2.2 [38]. We developed a new Rpackage named “GEmetrics” available from the CRAN and from GitHub (https://github.com/TheRocinantelab/GEmetrics), providing functions for simulating data and calculating new GE metric estimators. The inference of model parameters has been performed using the R package “BGLR” [39], with the “Multitrait” function and the following parameters: nIter = 20000, burnIn = 2000, and thin = 2. Data visualization and harmonization were carried out using the tidyverse Rpackage [40]. The plots and figures in this study are colorblindfriendly, following the Safe pallet of color from the colorblindcheck Rpackage [41].
Results
Simulationbased evaluation of estimator performance
Simulation of trait and environment parameters: In this study, to evaluate the influence of trait and environment parameters on the precision of GE metric estimates, we simulated MET data varying the following parameters: standard deviation of the environmental means (\(\sigma _\mu\)), the genetic correlation between environments (\(\rho _{j,j'}\)), heritability (\(h_{j}^2\)), and sparseness of the data. We considered a basic scenario with 0% sparseness and intermediate parameter levels for other parameters (\(\sigma _\mu\) = 1, \(\rho _{j,j'}\) = 0.5, and \(h_{j}^2\) = 0.5). We assessed the impact of each parameter by modulating it while keeping the others fixed (Supplementary Table S1).
As a precision criterion, we first used the correlation between GE metric estimates and GE metrics calculated using simulated envBVs (Fig. 2). Results revealed large differences in precision between GE metrics, regardless of the estimator. The two GE metrics assessing the genotype superiority (i.e. Lin–Binns and Average) had higher correlations than the other three metrics assessing stability. Increasing the values of \(\sigma _\mu\) and \(h_{j}^2\) parameters increased the correlations while the sparseness parameter decreased the correlation for all GE metrics. Regarding the genetic correlation \(\rho _{j,j'}\), increasing the parameter values was associated with a slightly decreasing correlation for Ecovalence, Environmental Variance, and Finlay–Wilkinson and a slightly increasing correlation for Lin–Binns and Average. The phenotypebased estimator (NoGeno) underperformed compared with the estimators taking advantage of the genomic information (GenoExp and GenoExpVar) for all scenarios. The GenoExpVar estimator had a slightly higher correlation for Ecovalence than GenoExp for all simulation parameters. This difference between the two estimators was amplified with increasing sparseness and decreasing heritability. Unlike Ecovalence, the other GE metrics presented similar correlation values for all simulation scenarios when comparing GenoExp and GenoExpVar estimators. We confirmed all these results using the scaled RMSE as an alternative precision criterion (Supplementary Figure S1).
Coincidence of Selection: To evaluate the ability of the new estimators to identify the most superior and/or stable genotypes among a set of individuals, we defined the selection coincidence as the proportion of common genotypes selected using GE metric estimates and GE metric true values, with a given selection intensity. Note that this approach was not applied to Finlay–Wilkinson, as stability cannot be assessed in terms of low (or high) values, unlike Ecovalence, Environmental Variance and Lin–Binns for which a low value is desirable. For Average, the direction towards which the selection is made depends on the trait. A low value was considered desirable here, but the opposite choice could have been considered without changing the results. The basic simulated scenario was used for comparison, but with 25% sparseness to highlight the differences between GenoExp and GenoExpVar, which tend to increase with sparseness. The coincidence of selection was first illustrated based on one simulation run with 10% selection intensity (Fig. 3), displaying coincident genotypes as large bullet points, and further summarized over 50 replicates (Table 2). Overall, the percentages of coincidence of selection presented in Table 2 confirmed the superiority of GenoExp and GenoExpVar estimators over NoGeno in all tested scenarios. When compared, GenoExp and GenoExpVar estimators generally presented a similar coincidence of selection. However, for Ecovalence, GenoExpVar had higher values than GenoExp when considering 10%, 15%, and 20% selection intensity (Table 2). Regardless of the estimator, the highest mean values of coincidence were achieved by Average followed by Lin–Binns, and the lowest values were observed for Ecovalence. Similar conclusions could be made considering 0%, 50%, and 75% sparseness (Supplementary Tables S2, S3, and S4).
Genomic prediction of GE metrics
Simulation: Using simulations, we investigated the ability of the new estimators GenoExp and GenoExpVar to predict the GE metrics of genotypes that were never observed over the whole design. To do so, we performed crossvalidation with increasing training set size (Fig. 4). The different estimators were evaluated for their predictive ability, i.e. the correlation between the GE metrics predictions and the true GE metric values calculated on the test set genotypes. The predictive ability of all GE metrics increased with training set size regardless of the estimator. The highest values were obtained for Average, followed by LinBinns, and the lowest values were obtained for Ecovalence. For Environmental Variance, Finlay–Wilkinson and Lin–Binns, the performance of the new estimators (GenoExp and GenoExpVar) was similar across training set sizes. However, for Ecovalence, GenoExpVar outperformed GenoExp, in particular for a small training test size of 25% of the total number of genotypes. We confirmed all these results using the scaled RMSE as an alternative precision criterion (Supplementary Figure S2).
Empirical Datasets: For all empirical datasets, the adjusted means for yield and plant height were plotted in a scaled boxplot to observe the variability between environments (Supplementary Figure S3). In general, the variability between environments was higher for yield than plant height. Unlike simulations, we did not have access to the true value of the GE metrics and could only use the phenotypebased estimates obtained from the NoGeno estimator as a reference. To avoid overfitting caused by the inclusion of phenotypes in both the training and validation data, we evaluated the performance of the GenoExp and GenoExpVar estimators by crossvalidation only, accessing the predictive ability (Fig. 5). In general, the predictive abilities obtained were moderate to high for superiority measures (Lin–Binns and Average) depending on the dataset and the trait. In comparison, the predictive abilities obtained for the other stability GE metrics were low to moderate. Like with simulations, the predictive abilities generally increased with increasing training set size. The performance of GenoExp and GenoExpVar was very similar across GE metrics, datasets, and traits. However, for the prediction of Ecovalence using the maize dataset, GenoExpVar outperformed GenoExp for yield and plant height. The opposite situation was observed for the prediction of Ecovalence using the Sorghum for the same two traits. We confirmed all these results using the scaled RMSE as an alternative precision criterion (Supplementary Figure S4).
Discussion
New genomicsbased GE metric estimators
In this study, we presented new estimators of the following GE metrics: Environmental Variance [7], Ecovalence [8], Finlay–Wilkinson regression coefficient [9], Lin–Binns superiority measure [10], and the Average performance. Our strategy consisted of calculating the BLUP of each GE metric, defined as random variables, based on any multienvironment genomic prediction model [30]. This approach contrasts with modeling GE metrics as model parameters [27,28,29], which requires fitting dedicated models.
Except for Average, all BLUPs of GE metrics involved a squared expectation and a variance term. From these theoretical results, we have built two types of estimators: GenoExp involving only the squared expectation term and GenoExpVar involving both terms. In practice, applying GenoExp essentially involves calculating GE metrics using envGEBVs. These envGEBVs are common quantities from the application of a multienvironment genomic prediction model. Regarding the variance term, it requires the \(\varvec{P}\) matrix from Eq. (8), which can be easily calculated from the model parameter estimates and design and covariance matrices.
The precision of new genomicsbased estimators was compared to that of phenotypebased estimators using simulation. Their superiority in terms of estimation accuracy was confirmed for all GE metrics (Fig. 2 and Table 2). The substantial gains observed could probably be explained by the ability of the new genomicsbased estimators to borrow information from relatives. Similarly, when focusing on estimating TBVs rather than GE metrics, gains in precision have been observed when combining genomic and phenotypic data [42].
The most interesting feature of our novel estimators is their ability to handle missing values, which is referred to here as sparseness. The gains, compared to the phenotypebased estimator (NoGeno), generally increased with sparseness (Fig. 2). This ability to handle missing values could be extended to the extreme case of predicting the GE metric values of genotypes that have never been observed over the design. Note that this application was only possible using genomicsbased estimators. We demonstrated that GE metrics could be predicted with moderate accuracies using simulated and empirical data (Figs. 4 and 5). The accuracies obtained from the empirical data were often lower than those obtained from simulations. This could be explained by the fact that, unlike simulations, we did not have access to true GE metric values (i.e. values calculated using envBVs), but only to phenotypebased estimates as reference values. Similarly, assessing the accuracy of genomic prediction on empirical data sets is impacted by the heritability of phenotypebased BV estimates used to validate predictions. This phenotypebased accuracy is commonly divided by the square root of the heritability to obtain an estimate of the true accuracy [43]. However, when it comes to the predictive ability of GE metrics, a simple division by the square root of the heritability is not sufficient, and additional research is required to determine the appropriate adjustment.
Regarding the differences between the two genomicsbased estimators, we demonstrated that the use of the complete BLUP (GenoExpVar) was beneficial for Ecovalence but less so for the other GE metrics. From a theoretical perspective, the additional variance term included in GenoExpVar is used to add variance to environmentgenotype combinations that are poorly predicted. Without this term, a genotype without any observation and unrelated to the rest of the population would be considered perfectly stable according to Ecovalence (i.e. estimate close to zero), since it would be predicted at the mean value of each environment. Due to the presence of the variance term, this hypothetical genotype would not be considered perfectly stable as its Ecovalence would be estimated above 0.
In practice, we recommend applying the complete genomic BLUP of GE metrics, which we define here as GenoExpVar estimators, because they are unbiased. However, replacing phenotypes with envGEBVs in traditional phenotypebased estimators has also proven to perform well for most GE metrics, which justifies this approach.
For Finlay–Wilkinson, the genomicsbased estimators were obtained by approximating the expectation of the ratio from in Eq. 4 by the ratio of the expectation of the numerator and denominator, which consists of a first order Taylor series approximation [31]. In the future, genomicsbased estimators of Finlay–Wilkinson may be improved by considering higher order Taylor series approximations.
All GE metric estimators could theoretically be applied in the absence of genomic information, provided that the model from Eq. (6) is identifiable. To achieve this, the genomic relationship matrix should be replaced by the identity matrix, and certain genotypes should have replicates in certain environments to separate genetic covariance from error covariance. Because there would be no information shared between relatives, the accuracies associated with predicting GE metrics would probably be inferior to those obtained with genomic information.
Impact of environment and trait parameters on GE metric estimates
The GE metrics considered in this study serve as tools for assessing both genotype stability and superiority across various environments. These metrics are sensitive to many parameters that characterize the trait and environment. Likewise, the ability to estimate these metrics can be influenced by these same trait and environment parameters, possibly differently depending on the estimator. We have, therefore, investigated the impact of such parameters on the estimation accuracy of our new estimators.
The first parameter considered was the environmentspecific heritability (\(h^2_j\)), which quantifies the proportion of the variance that can be attributed to genetics in a given environment. As expected, the estimation accuracy increased with the heritability, regardless of the GE metric and estimator. Phenotypic data with high heritability are less affected by environmental variations, making their estimation more straightforward. This concept is also crucial in genomic prediction, where high heritability has been shown to improve the accuracy of predictions [44].
The second parameter was the level of sparseness monitoring the amount of information available to estimate the GE metrics. As expected, increasing sparseness was associated with decreasing estimation accuracy, regardless of the GE metric and estimator.
The third parameter considered in this study was the genetic correlation between pairs of environments (\(\rho _{j,j'}\)). This parameter was used to control the level of GE interaction generating differences in genotype ranking between environments, commonly referred to as crossovers and as opposed to the convergence/divergence of envBVs according to the environment [45]. The impact of genetic correlations between environments differed depending on whether the GE metric characterizes stability or superiority. For superiority GE metrics, i.e. Average and Lin–Binns, increasing genetic correlations were associated with increased accuracy (Fig. 2). In the absence of GE interaction, the identification of the bestperforming genotype becomes easier as all environments are consistent with each other. For stability measures, increasing genetic correlations was associated with decreased accuracy (Fig. 2). In the absence of GE interaction, all genotypes tend to become equally stable, with differences in performance due solely to shared environmental effects. As a consequence, assessing which genotype is the most stable becomes very difficult, regardless of the estimator.
Finally, the last parameter was the variance of the environment means \(\sigma _{\mu }^2\), which controls the proportion of overall variance due to interenvironmental variability relative to intraenvironmental variability. This parameter only had an impact on the precision of Finlay–Wilkinson and Environmental Variance (Fig. 2). For Finlay–Wilkinson, a higher variance of the environment means directly impacts the variable on which envBVs are regressed. The greater the variance, the easier it is to adjust the regression and estimate the regression coefficient.
The main drawback of the genomic prediction model in Eq. 6 is the difficulty associated with estimating variance parameters (\(\varvec{\Omega }_G\) and \(\varvec{\Omega }_E\) ) correctly when the number of environments becomes large [20]. In such cases, factor analysis approaches may be considered to help structure the genetic covariance matrix \(\varvec{\Omega }_G\) and reduce the number of parameters to estimate [46, 47]. However, the simplified GE structure resulting from those approaches may limit the ability to identify unstable genotypes according to specific GE patterns not accounted for in the factor analysis model. Alternatively, if many genotypes per environment are connected by pedigree relationships, it may not be necessary to borrow information across environments, as sufficient data from observed relatives in each environment can enable accurate predictions. In this scenario, it would be feasible to apply a simplified model that ignores genetic covariance between environments, using a heterogeneous diagonal \(\varvec{\Omega }_G\) matrix.
Conclusions
Assessing the superiority and/or the stability of genotype performance over a set of target environments is crucial to making the most suitable varieties available to farmers. An accurate evaluation generally involves costly MET experiments where all genotypeenvironment combinations are observed repeatedly in each environment. In practice, the data resulting from these experiments is often incomplete due to various issues at different stages of the experiment (e.g. germination defects, plant establishment, floral abortion, etc.). Our new estimators offer an elegant way to deal with incomplete datasets as missing genotypeenvironment combinations can be predicted in the presence of genomic information. Even when the dataset is complete, we have demonstrated that gains in accuracy can be expected by exploiting information shared between relatives.
METs for which genomic information is available for all genotypes are likely to become the norm in breeding companies. We recommend estimating the proposed GE metrics along with the average genotype performance, which is the most common metric characterizing genotype performance. Two genotypes with similar average performance may exhibit contrasting behavior in terms of stability, and an unstable genotype is susceptible to underperforming in given environmental conditions. Such information is of particular interest from late breeding stages to variety registration. The availability of accurate GE metric estimates provided by our new estimators should facilitate the identification of highperformance and stable genotypes in a range of target environments.
Availability of data and materials
All the code and data used in this study can be found in the repository: https://github.com/TheRocinantelab/Publications/GEmetrics.
Change history
17 July 2024
A Correction to this paper has been published: https://doi.org/10.1186/s13007024012311
References
Cooper M, Messina CD. Can we harness “enviromics’’ to accelerate crop improvement by integrating breeding and agronomy? Front Plant Sci. 2021;12: 735143.
Philipo M, Ndakidemi PA, Mbega ER. Environmentally stable common bean genotypes for production in different agroecological zones of Tanzania. Heliyon. 2021;7: e05973.
Sakai T, et al. Wholegenome analysis of recombinant inbred rice lines reveals a quantitative trait locus on chromosome 3 with genotypebyenvironment interaction effects. G3 Genes Genomes Genetics. 2023;13:jkad082.
BustosKorts D, et al. Genotype by environment interaction and adaptation. In: Meyers RA, editor., et al., Encyclopedia of sustainability science and technology. New York: Springer; 2018. p. 1–44.
Cooper M, et al. Predicting genotype\(\times\) environment\(\times\) management (g\(\times\) e\(\times\) m) interactions for the design of crop improvement strategies: integrating breeder, agronomist, and farmer perspectives. Plant Breed Rev. 2022;46:467–585.
Becker H, Leon J. Stability analysis in plant breeding. Plant Breed. 1988;101:1–23.
Roemer J. Sind die ertragreichen sorten ertagssicherer? Mitt DLG. 1917;32:87–9.
Wricke G. Über eine methode zur erfassung der ökologischen streubreite in feldversuchen. Z pflanzenzuchtg. 1962;47:92–6.
Finlay K, Wilkinson G. The analysis of adaptation in a plantbreeding programme. Aust J Agric Res. 1963;14:742–54.
Lin C, Binns M. A superiority measure of cultivar performance for cultivar \(\times\) location data. Can J Plant Sci. 1988;68:193–8. https://doi.org/10.4141/cjps88018.
Crossa J. Statistical analyses of multilocation trials. Adv Agron. 1990;44:55–85.
Henderson C. Statistical methods in animal improvement: historical overview. In: Advances in statistical methods for genetic improvement of livestock. Springer; 1990. p. 2–14.
Searle SR, Casella G, McCulloch CE. Variance components. John Wiley & Sons; 2006.
Piepho H, Möhring J, Melchinger A, Büchse A. Blup for phenotypic selection in plant breeding and variety testing. Euphytica. 2008;161:209–28.
de Resende MD, Ramalho MA, Guilherme SR, de Abreu FBÂ. Multigeneration index in the withinprogenies bulk method for breeding of selfpollinated plants. Crop Sci. 2015;55:1202–11.
Moehring J, Williams ER, Piepho HP. Efficiency of augmented prep designs in multienvironmental trials. Theor Appl Genet. 2014;127:1049–60.
e Silva FF, Viana JMS, Faria VR, de Resende MDV. Bayesian inference of mixed models in quantitative genetics of crop species. Theor Appl Genet. 2013;126:1749–61.
Meuwissen THE, Hayes BJ, Goddard ME. Prediction of total genetic value using genomewide dense marker maps. Genetics. 2001;157:1819–29. https://doi.org/10.1093/genetics/157.4.1819.
VanRaden PM. Efficient methods to compute genomic predictions. J Dairy Sci. 2008;91:4414–23.
Burgueño J, de los Campos G, Weigel K, Crossa J. Genomic prediction of breeding values when modeling genotype\(\times\) environment interaction using pedigree and dense molecular markers. Crop Sci. 2012;52:707–19.
Jarquín D, et al. A reaction norm model for genomic selection using highdimensional genomic and environmental data. Theor Appl Genet. 2014;127:595–607.
LopezCruz M, Genes Genomes, et al. Increased prediction accuracy in wheat breeding trials using a marker\(\times\) environment interaction genomic selection model. G3 Genet. 2015;5:569–82.
Cuevas J, et al. Genomic prediction of genotype\(\times\) environment interaction kernel regression models. Plant Genome. 2016. https://doi.org/10.3835/plantgenome2016.03.0024.
CostaNeto G, FritscheNeto R, Crossa J. Nonlinear kernels, dominance, and envirotyping data increase the accuracy of genomebased prediction in multienvironment trials. Heredity. 2021;126:92–106.
Lian L, de Los Campos G. Fw: an r package for Finlay–Wilkinson regression that incorporates genomic/pedigree information and covariance structures between environments. G3 Genes Genomes Genet. 2016;6:589–97.
Foulley JL, Hanocq E, Boichard D. A criterion for measuring the degree of connectedness in linear models of genetic evaluation. Genet Select Evol. 1992;24:315–30.
Shukla GK. Some statistical aspects of partitioning genotypeenvironmental components of variability. Heredity. 1972;29:237–45. https://doi.org/10.1038/hdy.1972.87.
Lin CS, Binns MR, Lefkovitch LP. Stability analysis: where do we stand? Crop Sci. 1986;26:894–900. https://doi.org/10.2135/cropsci1986.0011183X002600050012x.
Gogel BJ, Cullis BR, Verbyla AP. REML estimation of multiplicative effects in multienvironment variety trails. Biometrics. 1995;51:744. https://doi.org/10.2307/2532960.
Rio S, Akdemir D, Carvalho T, Isidroy Sánchez J. Assessment of genomic prediction reliability and optimization of experimental designs in multienvironment trials. Theor Appl Genet. 2021. https://doi.org/10.1007/s00122021039722.
ElandtJohnson RC, Johnson NL. Survival distributions. Ch. 3. John Wiley & Sons Ltd; 1999. p. 50–79. https://doi.org/10.1002/9781119011040.ch3.
Searle S, Casella G, McCulloch C. Appendix S: some results in statistics. John Wiley & Sons, Ltd; 1992. p. 461–74. https://doi.org/10.1002/9780470316856.app3.
Crossa J, et al. Prediction of genetic values of quantitative traits in plant breeding using pedigree and molecular markers. Genetics. 2010;186:713–24.
Westhues CC, et al. Prediction of maize phenotypic traits with genomic and environmental predictors using gradient boosting frameworks. Front Plant Sci. 2021;12: 699589.
Rio S, et al. Genomic prediction and training set optimization in a structured Mediterranean oat population. Theor Appl Genet. 2021;134:3595–609.
Fernandes SB, Dias KO, Ferreira DF, Brown PJ. Efficiency of multitrait, indirect, and traitassisted genomic selection for improvement of biomass sorghum. Theor Appl Genet. 2018;131:747–55.
PérezRodríguez P, de los Campos G. Multitrait Bayesian shrinkage and variable selection models with the BGLRR package. Genetics. 2022. https://doi.org/10.1093/genetics/iyac112.
R Core Team. R: a language and environment for statistical computing. Vienna: R Foundation for Statistical Computing; 2022. https://www.Rproject.org/.
Perez P, de los Campos G. Genomewide regression and prediction with the BGLR statistical package. Genetics. 2014;198:483–95.
Wickham H, et al. Welcome to the tidyverse. J Open Source Softw. 2019;4:1686.
Nowosad J. Check color palettes for problems with color vision deficiency; 2019. https://jakubnowosad.com/colorblindcheck/. R package version 0.1.0.
Endelman JB, et al. Optimal design of preliminary yield trials with genomewide markers. Crop Sci. 2014;54:48–59. https://doi.org/10.2135/cropsci2013.03.0154.
Estaghvirou SBO, et al. Evaluation of approaches for estimating the accuracy of genomic prediction in plant breeding. BMC Genomics. 2013. https://doi.org/10.1186/1471216414860.
Daetwyler HD, Villanueva B, Woolliams JA. Accuracy of predicting the genetic risk of disease using a genomewide approach. PLoS ONE. 2008;3:1–8. https://doi.org/10.1371/journal.pone.0003395.
Malosetti M, Ribaut JM, van Eeuwijk FA. The statistical analysis of multienvironment data: modeling genotypebyenvironment interaction and its genetic basis. Front Physiol. 2013. https://doi.org/10.3389/fphys.2013.00044.
Smith A, Cullis B, Thompson R. Analyzing variety by environment data using multiplicative mixed models and adjustments for spatial field trend. Biometrics. 2001;57:1138–47. https://doi.org/10.1111/j.0006341x.2001.01138.x.
de los Campos G, Gianola D. Factor analysis models for structuring covariance matrices of additive genetic effects: a bayesian implementation. Genet Select Evol. 2007;39:481. https://doi.org/10.1186/12979686395481.
Funding
HFC received funding from the European Union’s Horizon 2020 research and innovation program under grant agreement No 818144. JGA is working under a UPM predoctoral grant as part of the program “Programa Propio I+D+i” financed by the Technical University of Madrid. JIyS was supported by the Beatriz Galindo Program (BEAGAL18/00115) from the Ministerio de Educación y Formación Profesional of Spain and the Severo Ochoa Program for Centers of Excellence in R &D from the Agencia Estatal de Investigación of Spain, grant SEV2016 0672 (20172021) to the CBGP.
Author information
Authors and Affiliations
Contributions
HFC was responsible for carrying out all analyses, providing support for the development of the Rpackage, and taking the lead in writing the initial draft of the manuscript. SR played a significant role in the study by contributing to the derivation of the new estimators, the development of the Rpackage and assisting in the drafting of the manuscript. JGA provided valuable support in data harmonization and visualization R codes. JIyS played a crucial role in study conceptualization, coordination, manuscript drafting, and securing study funding. All authors actively participated in the review process, gave their approval, and provided final consent for the publication of the manuscript.
Corresponding author
Ethics declarations
Ethics approval and consent to participate
Not applicable.
Consent for publication
Not applicable.
Competing interests
The authors declare that they have no Conflict of interest.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
The original online version of this article was revised: the ESM has been revised.
Supplementary Information
Appendix
Appendix
Note that Lin–Binns (Eq. 1), Environmental Variance (Eq. 2), Ecovalence (Eq. 3), and the numerator/denominator of Finlay–Wilkinson (Eq. 4) can be expressed as:
where \(Q_i\) is a GE metric with quadratic form, a is a coefficient, \(\varvec{g}\) is the vector of envBVs of dimension NJ, \(\varvec{C}_i\) and \(\varvec{D}_i\) are contrast matrices of dimension \(J\times NJ\) specific to individual i.
The expected value of \(Q_i\) conditional on phenotypes is
where \(\widehat{\varvec{g}}\) is the vectors of BLUPs of environmentspecific values and “\(\textrm{Tr}\)” stands for the trace of the matrix (i.e. the sum of diagonal elements).
For Lin–Binns (Eq. 1), \(\varvec{C}_i=\varvec{D}_i\) has the following form:
For Environmental Variance (Eq. 2), \(\varvec{C}_i=\varvec{D}_i\) has the following form:
For Ecovalence (Eq. 3), \(\varvec{C}_i=\varvec{D}_i\) has the following form:
For the numerator of Finlay–Wilkinson (Eq. 4), \(\varvec{C}_i\) has the same form as Eq. (21) and \(\varvec{D}_i\) has the following form:
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
About this article
Cite this article
Carvalho, H.F., Rio, S., GarcíaAbadillo, J. et al. Revisiting superiority and stability metrics of cultivar performances using genomic data: derivations of new estimators. Plant Methods 20, 85 (2024). https://doi.org/10.1186/s13007024012071
Received:
Accepted:
Published:
DOI: https://doi.org/10.1186/s13007024012071