Points de repère dans l'analyse de la stabilité et de l'interaction génotype-milieu en amélioration des plantes

M. Brancourt-Hulmel; V. Biarnès-Dumoulin; J.B. Denis

doi:10.1051/agro:19970403

Agronomie 17 (4) (1997) 219-246

M. Brancourt-Hulmel^a, V. Biarnès-Dumoulin^a and J.B. Denis^b

^a Laboratoire de génétique et d'amélioration des plantes, Inra, F-80200 Estrées-Mons
^b Laboratoire de biométrie, Inra, route de Saint-Cyr, F-78026 Versailles cedex, France

Abstract - Guiding marks on stability and genotype-environment interaction analyses in plant breeding. In plant breeding studies, differences in statistical stability between genotypes, or genotype-environment interactions (GEI), must often be considered because genotype responses differ from one environment to another. This paper reviews the statistical techniques used in the literature up to 1996 and describes the most recent developments. First, stability concepts are reviewed and genotype-environment interaction is defined according to the following notation:
    E[Y_ge] = μ + α_g + β_e + αβ_ge
where E[Y_ge] is the expectation of a given observation Y_ge for genotype g and environment e, μ is the grand mean, α_g is the genotype main effect, β_e is the environment main effect and αβ_ge is the interaction between genotype and environment, defined as the departure from the additive model (μ + α_g + β_e). Next, the main statistical methods are presented and classified, from an interpretive point of view, into five main approaches:
(1) Uniparametric approaches: stability or GEI is described with a single parameter per genotype. The environmental variance can be allowed to differ for each genotype, an idea first introduced by Roemer (1917, cited from Becker and Léon, 1988) and written as follows:
    E[Y_ge] = μ + α_g,  Var[Y_ge] = σ²_g
where μ and α_g have the same meaning as in the first model and the σ²_g are variance parameters associated with each genotype. The joint regression model, first proposed by Yates and Cochran (1938), which uses the environment main effect as a pseudo-covariate for modelling the interaction term, also belongs to this category:
    E[Y_ge] = μ + α_g + β_e + ρ_g β_e
where ρ_g is the genotype slope, or genotype regression coefficient, which describes the response of the genotype to the potential of the environment as estimated by its main effect β_e. The other terms of the model, E[Y_ge], μ and α_g, are defined as in the first model. This family of models is attractive for the simplicity of its interpretation.
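
The additive decomposition and the two uniparametric descriptions above can be sketched in a few lines. This is a minimal illustration, assuming a complete, balanced table of genotype-by-environment means; the yield values and the function names are hypothetical and do not come from the paper.

```python
import numpy as np

def additive_decomposition(Y):
    """Split a genotype x environment table of means into mu, alpha_g, beta_e
    and the interaction (the departure from mu + alpha_g + beta_e)."""
    mu = Y.mean()
    alpha = Y.mean(axis=1) - mu          # genotype main effects
    beta = Y.mean(axis=0) - mu           # environment main effects
    inter = Y - mu - alpha[:, None] - beta[None, :]
    return mu, alpha, beta, inter

def environmental_variance(Y):
    """One variance per genotype, computed across environments."""
    return Y.var(axis=1, ddof=1)

def joint_regression_slopes(Y):
    """Least-squares estimate of rho_g: regress each genotype's interaction
    row on the environment main effects beta_e."""
    _, _, beta, inter = additive_decomposition(Y)
    return inter @ beta / (beta @ beta)

# Hypothetical yields: 3 genotypes (rows) x 4 environments (columns).
Y = np.array([[4.0, 5.0, 6.0, 7.0],
              [4.5, 5.0, 5.5, 6.0],
              [3.0, 5.0, 7.0, 9.0]])

mu, alpha, beta, inter = additive_decomposition(Y)
sigma2 = environmental_variance(Y)
rho = joint_regression_slopes(Y)
print("environmental variances:", sigma2)
print("joint regression slopes:", rho)
```

In this sketch a positive ρ_g marks a genotype that responds more than average to favourable environments, and the slopes sum to zero because the interaction table is centred over genotypes.
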
Most authors have concluded that these models oversimplify and have added further parameters, such as a measure of goodness of fit. This leads to more sophisticated families of models.
(2) Multiparametric fixed approaches: GEI is modelled by means of several parameters associated with each genotype. There are two basic models: biadditive (or AMMI) models and factorial regression models. They can be extended and combined in several ways; see Gauch (1992) and van Eeuwijk et al (1996). The multiplicative model is written:
    E[Y_ge] = μ + α_g + β_e + λ₁ γ_g1 δ_e1 + λ₂ γ_g2 δ_e2 + ...
where λ₁ is the singular value that accounts for the part of the interaction explained by the first term, γ_g1 is the normalised genotype vector describing genotype differences and δ_e1 similarly describes the environments; λ₂, γ_g2 and δ_e2 belong to the second term, which is subject to orthogonality constraints with the first term, and so on. As before, the other terms of the model, (μ + α_g + β_e), correspond to its additive part. The factorial regression model can be written:
    E[Y_ge] = μ + α_g + β_e + Σ_k Σ_h G_gk θ_kh E_eh + Σ_h α'_gh E_eh + Σ_k G_gk β'_ek
where θ_kh, α'_gh and β'_ek are regression parameters involving H environment covariates E_eh (h = 1, ..., H) and K genotype covariates G_gk (k = 1, ..., K). Again, (μ + α_g + β_e) is the additive part of the model. A common feature of the AMMI model and factorial regression is that both describe the interaction multiplicatively, as a genotype score times an environment score. However, in the AMMI model both scores are unknown parameters (the model is bilinear in its parameters), whereas in factorial regression one of the two is a known covariate and only its coefficient is unknown, which gives a linear model. From a practical point of view, regression is thought to be easier to interpret, but it requires that relevant covariates be available.
(3) Mixed (random and fixed) parametric approaches: starting from the pioneering work of Shukla (1972), factorial regression models can also be used when environments are considered as a random factor and heteroscedastic genotype variances are introduced; see Denis et al (1997) for a recent development.
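
The contrast between the bilinear AMMI terms and the linear factorial regression can be made concrete with a short sketch. It assumes a complete, balanced table of means, uses the singular value decomposition of the interaction table for the multiplicative terms, and fits a factorial regression on a single environment covariate (H = 1, no genotype covariate). The data, the covariate `z` and the function names are hypothetical.

```python
import numpy as np

def interaction_residuals(Y):
    """Departure of the table of means from the additive model."""
    mu = Y.mean()
    alpha = Y.mean(axis=1, keepdims=True) - mu
    beta = Y.mean(axis=0, keepdims=True) - mu
    return Y - mu - alpha - beta

def ammi_terms(Y, n_terms):
    """Singular values lambda_k with normalised genotype scores gamma_gk
    and environment scores delta_ek (both unknown, hence bilinear)."""
    U, s, Vt = np.linalg.svd(interaction_residuals(Y), full_matrices=False)
    return s[:n_terms], U[:, :n_terms], Vt[:n_terms].T

def factorial_regression_env(Y, z):
    """Fit the term alpha'_g * z_e for one known environment covariate z.
    Only the genotype coefficients are estimated, hence a linear model."""
    inter = interaction_residuals(Y)
    zc = z - z.mean()                      # centre the covariate
    coef = inter @ zc / (zc @ zc)          # one coefficient per genotype
    return coef, np.outer(coef, zc)

# Hypothetical yields: 3 genotypes x 4 environments.
Y = np.array([[4.0, 5.2, 6.1, 7.0],
              [4.5, 4.8, 5.9, 6.0],
              [3.0, 5.5, 6.5, 9.0]])
# Hypothetical environment covariate, e.g. a water-stress index.
z = np.array([0.9, 0.6, 0.4, 0.1])

lam, gamma, delta = ammi_terms(Y, n_terms=2)
coef, fitted = factorial_regression_env(Y, z)
print("singular values:", lam)
print("genotype coefficients on z:", coef)
```

With 3 genotypes and 4 environments the interaction table has rank at most 2, so two multiplicative terms reproduce it exactly; the factorial regression fit captures only the part of the interaction aligned with the covariate.
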
(4) Nonparametric approaches: this family includes different methods whose common feature is that they rely on the ranking of genotypes rather than on the estimation or prediction of genotype performance. This is an attractive aim in many breeding programmes, where breeders use rank order to choose the best genotypes. In such cases, relative comparisons are sufficient and there is no need to assess absolute levels.
(5) Clustering approaches: here the idea is not to obtain a continuous function modelling the interaction but to identify clusters of similar genotypes and/or clusters of similar environments such that most of the interactive variability is captured by the groups of genotypes and/or environments (defining 'between' effects). From a statistical as well as an interpretive point of view, a crucial distinction has to be made according to whether the clusters are determined a priori (from additional information) or a posteriori (from the data to be explained).
In the last section, most of the previous methods are compared, mainly by means of tables summarising results obtained from the literature (tables II, IV, V, VIII and IX and fig 2). Among them, figure 2 depicts 52 interaction studies using either joint regression, the multiplicative approach or factorial regression. These interaction studies are characterised by the proportion of parameters used by the model with respect to the complete interaction (the 'cost' or, conversely, the 'parsimony') and the proportion of interaction explained by the model (the 'efficiency'). As illustrated in this figure, the AMMI model and factorial regression are equally efficient and much better than joint regression. Our advice is to use factorial regression when relevant covariates are available, owing to its easier interpretation.
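
The 'efficiency' and 'cost' used for this comparison can be computed as two simple ratios. The sketch below does so for joint regression and for a one-term AMMI model, assuming a complete table of means. It counts G - 1 degrees of freedom for joint regression and G + E - 3 for the first multiplicative term (Gollob's convention, an assumption made here, not a choice confirmed by the paper); the data are simulated and purely hypothetical.

```python
import numpy as np

def interaction_residuals(Y):
    """Departure of the table of means from the additive model."""
    mu = Y.mean()
    alpha = Y.mean(axis=1, keepdims=True) - mu
    beta = Y.mean(axis=0, keepdims=True) - mu
    return Y - mu - alpha - beta

def efficiency_and_cost(Y):
    """Return (efficiency, cost) per model.
    efficiency = share of the interaction sum of squares explained,
    cost       = share of the interaction degrees of freedom used."""
    G, E = Y.shape
    inter = interaction_residuals(Y)
    ss_total = (inter ** 2).sum()
    df_total = (G - 1) * (E - 1)

    # Joint regression: fitted interaction is rho_g * beta_e.
    beta = Y.mean(axis=0) - Y.mean()
    rho = inter @ beta / (beta @ beta)
    ss_jr = (rho ** 2).sum() * (beta @ beta)

    # AMMI with one term: explained sum of squares is lambda_1 squared.
    s = np.linalg.svd(inter, compute_uv=False)
    ss_ammi1 = s[0] ** 2

    return {
        "joint regression": (ss_jr / ss_total, (G - 1) / df_total),
        "AMMI-1": (ss_ammi1 / ss_total, (G + E - 3) / df_total),  # assumed df
    }

# Hypothetical data: 4 genotypes x 5 environments, simulated.
rng = np.random.default_rng(0)
Y = 5.0 + rng.normal(size=(4, 5))

results = efficiency_and_cost(Y)
for model, (eff, cost) in results.items():
    print(f"{model}: efficiency={eff:.2f}, cost={cost:.2f}")
```

Because the joint regression fit is itself a rank-one table, the first AMMI term can never explain less of the interaction than joint regression does; the price is a larger number of parameters.
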

Résumé - In plant breeding, researchers often have to carry out analyses of stability or of genotype-environment interaction. Literature reviews exist on the subject and describe approaches that differ from one author to another. This article proposes a classification of the main methods used up to 1996, with emphasis on the most recent ones, in particular methods that use several parameters to describe the stability of genotypes. To compare joint regression, multiplicative modelling of the interaction (the AMMI model) and factorial regression on the basis of efficiency (measured by the percentage of the interaction sum of squares accounted for by the model) and parsimony (assessed by the number of degrees of freedom used by the model), several summaries were compiled. For each method, they draw on the literature and report characteristics such as the species studied, the variable analysed, the numbers of genotypes and environments, the efficiency, the parsimony and the ratio between the latter two. Overall, this highlights multiplicative modelling of the interaction and factorial regression. The latter also makes it possible to propose a biological explanation for the interaction.

Key words: stability / genotype-environment interaction / plant breeding
