I dedicate this package to my colleague, Professor Dr. Manjit S. Kang, a great biostatistician and quantitative geneticist, who passed away recently and has devoted his life to PLANT BREEDING science and has provided me with invaluable supports!.
Stability analysis is crucial in plant breeding to select superior genotypes that perform consistently across different environments. Several models and methods have been developed for this purpose, including Additive Main Effect and Multiplicative Interaction (AMMI), Weighted Average of Absolute Scores (WAASB), and Genotype plus Genotype-Environment (GGE) interactions in multi-environmental trials (MET) (Mishra et al. 2024; Sakata 2021; Pour-Aboughadareh et al. 2022; Danakumara et al. 2023). These analyses help breeders identify genotypes with stable performance and adaptability under varying conditions, leading to optimal yield stability and successful crop production. Utilizing these stability analysis tools, breeders can navigate the complexities of genotype-environment interactions and select genotypes that consistently excel across different locations and seasons, ensuring the development of resilient and high-performing plant varieties (SWARUP and SINGH 2014).
By combining the strengths of AMMI for assessing stability and BLUP
for prediction accuracy, breeders can effectively select genotypes that
consistently perform well across different environmental conditions.
This is crucial for developing sustainable agricultural systems and
improving food security. To estimate the stability index of genotypes in
multi-environment trials (METs), the WAASB index
(weighted average of the absolute values obtained from the singular
value decomposition of the BLUP matrix for the genotype by environment
interaction effects, generated by a linear mixed-effects model) is
calculated using the formula provided by (Olivoto et al. 2019). They suggest
that the function *waasb()*
computes the Weighted Average
of the Absolute Scores considering all possible IPCA from the Singular
Value Decomposition of the BLUPs for genotype-vs-environment interaction
effects obtained by an Linear Mixed-effect Model (Olivoto et al.
2019) , as follows:
$$ WAASB_i = \sum_{k = 1}^{p} |IPCA_{ik} \times EP_k|/ \sum_{k = 1}^{p}EP_k $$
where WAASBi is the weighted average of absolute scores of the ith genotype; IPCAik is the scores of the ith genotype in the kth IPCA; and EPk is the explained variance of the kth PCA for k = 1, 2, .., p, p = min(g − 1; e − 1).
Further, WAABSY is a superiority or simultaneous selection index allowing weighting between mean performance and stability (Olivoto et al. 2019):
$$ WAASBY_i=\frac{\left({rY}_i\times\theta_Y\right)+\left({rW}_i\times\theta_s\right)}{\theta_Y+\theta_s} $$ where WAASBYi is the superiority index for genotype }_i and {rW}_i are the rescaled values for mean performance $\bar{Y_i}$ and stability Wi, respectively of the genotype i. For the details of calculations, rescalling and mathematics notations see (Olivoto et al. 2019).
In this package, my new index (rYWAASB) gives results which can be compared with WAASBY index provided by Olivoto et al. (Olivoto et al. 2019).
For working with rYWAASB package, firstly, if the metan or rYWAASB packages are not already installed, they should be installed on the system. The analysis requires the following packages to be installed:
if(!require('rYWAASB')){
install.packages('rYWAASB') # call the package
}
#> Loading required package: rYWAASB
library('rYWAASB')
The analyses requires the following R packages:
## For graphical displays
library(metan)
library(ggplot2)
library(graphics)
library(factoextra)
library(FactoMineR)
The codes provided below form the metan package, allow you to access the WAASB index values, rankings, and other information for genotypes (or entries) in general.
waasb_model <-
waasb(data_ge,
env = ENV,
gen = GEN,
rep = REP,
resp = everything(),
random = "gen", #Default
verbose = TRUE) #Default
#> Evaluating trait GY |====================== | 50% 00:00:01 Evaluating trait HM |============================================| 100% 00:00:02
#> Method: REML/BLUP
#> Random effects: GEN, GEN:ENV
#> Fixed effects: ENV, REP(ENV)
#> Denominador DF: Satterthwaite's method
#> ---------------------------------------------------------------------------
#> P-values for Likelihood Ratio Test of the analyzed traits
#> ---------------------------------------------------------------------------
#> model GY HM
#> COMPLETE NA NA
#> GEN 1.11e-05 5.07e-03
#> GEN:ENV 2.15e-11 2.27e-15
#> ---------------------------------------------------------------------------
#> All variables with significant (p < 0.05) genotype-vs-environment interaction
data <- waasb_model$GY$model
print(data)
#> # A tibble: 24 × 22
#> type Code Y PC1 PC2 PC3 PC4 PC5 PC6 PC7
#> <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 GEN G1 2.60 0.246 -0.0482 -0.0314 -0.0513 -0.146 -0.419 0.117
#> 2 GEN G10 2.47 -0.819 -0.421 -0.128 -0.266 -0.116 0.0575 0.0480
#> 3 GEN G2 2.74 0.119 0.154 -0.585 0.376 -0.0810 0.202 0.0975
#> 4 GEN G3 2.96 0.0437 -0.0793 0.187 0.138 -0.175 0.155 -0.370
#> 5 GEN G4 2.64 -0.243 0.393 -0.0723 0.113 -0.113 -0.279 -0.126
#> 6 GEN G5 2.54 -0.256 0.206 0.193 0.145 0.365 0.0546 0.187
#> 7 GEN G6 2.53 -0.0753 0.195 0.445 0.192 0.0685 0.0353 0.0180
#> 8 GEN G7 2.74 0.247 0.458 -0.166 -0.565 0.172 0.103 -0.0930
#> 9 GEN G8 3.00 0.404 -0.166 0.259 -0.136 -0.310 0.175 0.222
#> 10 GEN G9 2.51 0.334 -0.692 -0.102 0.0537 0.335 -0.0845 -0.0991
#> # ℹ 14 more rows
#> # ℹ 12 more variables: PC8 <dbl>, PC9 <dbl>, WAASB <dbl>, PctResp <dbl>,
#> # PctWAASB <dbl>, wRes <dbl>, wWAASB <dbl>, OrResp <dbl>, OrWAASB <dbl>,
#> # OrPC1 <dbl>, WAASBY <dbl>, OrWAASBY <dbl>
The output generated by the waasb() function is very similar to that generated by the waas() function. The main difference is that the singular value decomposition is based on the BLUP for GEI effects matrix. For more information, refer to (Olivoto et al. 2019).
Several indexes have been developed to identify genotypes that exhibit both high performance and stability in plant breeding programs. One example is the kangranksum index, which was developed by Kang (Kang 1988). This index combines yield and stability ranks based on the Shukla stability index. Olivoto (Olivoto et al. 2019), on the other hand, created the WAASBY index by assigning weights to yield and stability. Our rYWAASB index can be compared to these earlier indexes, as it follows a similar computational approach. However, our index (rYWAASB) is a powerful tool, because it incorporates a trait and WAASB rankings in the process.
First, let’s examine the Y*WAASB
biplot generated by the
metan package. This will allow us to compare the results of the
rYWAASB package with the Y*WAASB
biplot. In the
Y*WAASB
or GY*WAASB
biplot proposed by (Olivoto et al.
2019) (Fig. 1), the quadrants illustrate the stability and
trait patterns (specifically, the grain yield of oat genotypes in the
data_ge dataset) as follows:
Fig. 1: Y*WAASB biplot of metan
package built-in oat
data
In this section, we will utilize the built-in data maize to generate ranking scores for different genotypes, along with their corresponding plots. For further details, please refer to the ?maize documentation. It is also possible to use other datasets as long as they contain the following columns: genotype, trait, and WAASB index for genotypes. To understand how the HTML tables were created, please refer to the Rendering engine section.
We recommend that users address (handle/overcome/substitute) any missing data in their inputs before proceeding with analyses. This is because the rank codes do not incorporate a comprehensive algorithm to handle this task.
data(maize)
ranki(maize)
#> GEN Y=Trait WAASB WAASBY rY rWAASB rWAASBY rY+rWAASB rYWAASB
#> 1 Dracma 262.2230 0.8107018 81.6 5 3 2.0 8 1.0
#> 2 DKC6630 284.0391 2.2006718 88.5 1 8 1.0 9 2.0
#> 3 NS770 243.4864 0.3272558 71.4 10 1 7.0 11 3.0
#> 4 89 MAY 70 258.8993 1.9638360 72.2 6 7 6.0 13 5.0
#> 5 BOLSON 252.7882 1.1512639 72.9 8 5 5.0 13 5.0
#> 6 Sy Hydro 243.7789 0.9741668 67.6 9 4 8.0 13 5.0
#> 7 KSC704 234.5755 0.6564340 63.1 13 2 11.0 15 8.0
#> 8 NS6010 277.7849 3.1780198 78.0 2 13 3.0 15 8.0
#> 9 Sy Inove 276.2174 3.0619396 77.6 3 12 4.0 15 8.0
#> 10 ZP606 255.3309 2.6157585 65.7 7 10 9.0 17 10.0
#> 11 ZP600 265.8830 4.2201795 63.2 4 17 10.0 21 11.0
#> 12 BK 74 217.7637 1.2910417 47.3 16 6 13.5 22 12.0
#> 13 Mv Massil 228.5190 2.2487080 49.0 14 9 12.0 23 13.0
#> 14 Barekat2 239.1610 3.7488510 47.3 11 15 13.5 26 14.0
#> 15 KSC705 218.7118 3.2878162 35.7 15 14 16.0 29 15.0
#> 16 DKC6589 234.7383 4.4766285 39.7 12 18 15.0 30 16.5
#> 17 Sy Miami 212.3417 2.7648339 34.4 19 11 17.0 30 16.5
#> 18 Gazda 214.8292 3.8871065 29.3 18 16 18.0 34 18.0
#> 19 DKC7211 215.7097 4.7286087 24.7 17 19 19.0 36 19.0
#> 20 DKC6101 191.8700 6.0110405 0.0 20 20 20.0 40 20.0
In the table above, the genotype with the lowest rank (Dracma) is considered the best due to its high grain yield and low WAASB score.
Fig.2. The first barplot of the rYWASSB
package
#> [,1]
#> [1,] 1.25
#> [2,] 3.00
#> [3,] 4.75
#> [4,] 6.50
#> [5,] 8.25
#> [6,] 10.00
#> [7,] 11.75
#> [8,] 13.50
#> [9,] 15.25
#> [10,] 17.00
#> [11,] 18.75
#> [12,] 20.50
#> [13,] 22.25
#> [14,] 24.00
#> [15,] 25.75
#> [16,] 27.50
#> [17,] 29.25
#> [18,] 31.00
#> [19,] 32.75
#> [20,] 34.50
Fig.3. The second barplot of the rYWASSB
package
data(maize)
PCA_biplot(maize)
#> $`Coordinates of variables making a scatter plot`
#> Dim.1 Dim.2 Dim.3 Dim.4 Dim.5
#> Y=Trait -0.8795116 0.4699999 0.06670390 0.023109283 -0.023999674
#> WAASBY -0.9873211 0.0945990 0.12448256 0.009671875 0.025662431
#> rWAASB 0.7315848 0.6774116 -0.07122906 0.025521488 0.013121693
#> rWAASBY 0.9779744 -0.1480762 0.10777970 0.100077607 -0.002739954
#> rYWAASB 0.9742173 0.1601292 0.13167024 -0.088964078 -0.002762174
#>
#> $`Squaring the variable coordinates: var.cos2 = var.coord^2`
#> Dim.1 Dim.2 Dim.3 Dim.4 Dim.5
#> Y=Trait 0.7735406 0.22089995 0.004449410 5.340389e-04 5.759843e-04
#> WAASBY 0.9748030 0.00894897 0.015495908 9.354516e-05 6.585604e-04
#> rWAASB 0.5352164 0.45888653 0.005073579 6.513464e-04 1.721788e-04
#> rWAASBY 0.9564339 0.02192657 0.011616464 1.001553e-02 7.507345e-06
#> rYWAASB 0.9490993 0.02564137 0.017337053 7.914607e-03 7.629606e-06
#>
#> $`The % contributions of the variables to the PCs`
#> Dim.1 Dim.2 Dim.3 Dim.4 Dim.5
#> Y=Trait 18.46559 30.001213 8.243859 2.7801402 40.5092017
#> WAASBY 23.27002 1.215392 28.710794 0.4869844 46.3168062
#> rWAASB 12.77643 62.323023 9.400319 3.3908280 12.1094044
#> rWAASBY 22.83153 2.977926 21.522965 52.1395883 0.5279945
#> rYWAASB 22.65644 3.482446 32.122063 41.2024591 0.5365932
#>
#> $`eigenvalues data`
#> eigenvalue variance.percent cumulative.variance.percent
#> Dim.1 4.189093275 83.78186550 83.78187
#> Dim.2 0.736303386 14.72606772 98.50793
#> Dim.3 0.053972413 1.07944826 99.58738
#> Dim.4 0.019209065 0.38418130 99.97156
#> Dim.5 0.001421861 0.02843721 100.00000
Figs.4-6. The PCA biplot with loadings for compare genotypes
The Average Silhouette Method
. The
details of this method is given in the description and
details parts of the function h_clust()
. For
running the cluster analysis by menas of shipunov
package
(courtesy of late professor Alexy Shipunov) act as: data(maize)
maize <- as.data.frame(maize)
row.names(maize) <- maize[, 1]
maize[, 1] = NULL
GEN <- row.names(maize)
maize <- scale(maize)
nbclust(maize, verbose = FALSE)
#> [1] 2
#> attr(,"class")
#> [1] "data frame"
# Perform bootstrap or jackknife clustering by shipunov package.
# The examples should be run in the console manually due to
# problems occurs in the ORPHANED package "shipunov".
#
# 1- Bootstrap clustering:
# data.jb <- Jclust(maize,
# method.d = "euclidean",
# method.c = "average", n.cl = 2,
# bootstrap = TRUE)
#
# plot.Jclust(data.jb, top=TRUE, lab.pos=1,
# lab.offset=1, lab.col=2, lab.font=2)
# Fence(data.jb$hclust, GEN)
#
# data.jb <- Jclust(maize,
# method.d = "euclidean",
# method.c = "ward.D", n.cl = 2,
# bootstrap = TRUE)
#
# plot.Jclust(data.jb, top=TRUE, lab.pos=1,
# lab.offset=1, lab.col=2, lab.font=2)
# Fence(data.jb$hclust, GEN)
#
#
# if(verbose = TRUE):
# cat("\nnumber of iterations:\n", data.jb$iter, "\n")
#
# For "bootstrap":
# data.jb$mat <- as.matrix((data.jb$mat))
# cat("\nmatrix of results:\n", data.jb$mat, "\n")
# cat("clustering info, by eucledean distance measure:\n")
# print(data.jb$hclust)
# cat("groups:\n", data.jb$gr, "\n")
# cat("\nsupport values:\n", data.jb$supp, "\n")
# cat("\nnumber of clusters used:\n", data.jb$n.cl, "\n")
#
# 2- Jackknife clustering:
# data.jb <- Bclust(maize,
# method.d = "euclidean", method.c = "average",
# bootstrap = FALSE)
# plot(data.jb)
#
# data.jb <- Bclust(maize,
# method.d = "euclidean", method.c = "ward.D",
# bootstrap = FALSE)
# plot(data.jb)
#
# if(verbose = TRUE):
# For"jackknife":
# cat("Consensus:\n", data.jb$consensus, "\n")
# cat("Vlaues:\n", data.jb$values, "\n")
Figures 7-10. Cluster analysis by 1000 bootstrap iterations and determine optimum N clusters using The Average Silhouette algorithm.