Optimizing PGRs for in vitro shoot proliferation of pomegranate with bayesian-tuned ensemble stacking regression and NSGA-II: a comparative evaluation of machine learning models

Zarbakhsh, Saeedeh; Shahsavar, Ali Reza; Soltani, Mohammad

doi:10.1186/s13007-024-01211-5

Research
Open access
Published: 31 May 2024


Saeedeh Zarbakhsh¹,
Ali Reza Shahsavar¹ &
Mohammad Soltani²

Plant Methods volume 20, Article number: 82 (2024)


Abstract

Background

The process of optimizing in vitro shoot proliferation is a complicated task, as it is influenced by interactions of many factors as well as genotype. This study investigated the role of various concentrations of plant growth regulators (zeatin and gibberellic acid) in the successful in vitro shoot proliferation of three Punica granatum cultivars (‘Faroogh’, ‘Atabaki’ and ‘Shirineshahvar’). Also, the utility of five Machine Learning (ML) algorithms, namely Support Vector Regression (SVR), Random Forest (RF), Extreme Gradient Boosting (XGB), Ensemble Stacking Regression (ESR) and Elastic Net Multivariate Linear Regression (ENMLR), as modeling tools was evaluated on in vitro multiplication of pomegranate. A new automatic hyperparameter optimization method named Adaptive Tree-structured Parzen Estimator (ATPE) was applied to tune the hyperparameters. The performance of the models was evaluated and compared using statistical indicators (MAE, RMSE, RRMSE, MAPE, R and R²), while a specific Global Performance Indicator (GPI) was introduced to rank the models based on a single parameter. Moreover, Non‑dominated Sorting Genetic Algorithm‑II (NSGA‑II) was employed to optimize the selected prediction model.

Results

The results demonstrated that the ESR algorithm exhibited higher predictive accuracy in comparison to the other ML algorithms. The ESR model was subsequently introduced for optimization by NSGA‑II. ESR-NSGA‑II revealed that the highest proliferation rate (3.47, 3.84, and 3.22), shoot length (2.74, 3.32, and 1.86 cm), leaf number (18.18, 19.76, and 18.77), and explant survival (84.21%, 85.49%, and 56.39%) could be achieved with a medium containing 0.750, 0.654, and 0.705 mg/L zeatin, and 0.50, 0.329, and 0.347 mg/L gibberellic acid in the ‘Atabaki’, ‘Faroogh’, and ‘Shirineshahvar’ cultivars, respectively.

Conclusions

This study demonstrates that the 'Shirineshahvar' cultivar exhibited lower shoot proliferation success compared to the other cultivars. The results indicated the good performance of ESR-NSGA-II in modeling and optimizing in vitro propagation. ESR-NSGA-II can be applied as an up-to-date and reliable computational tool for future studies in plant in vitro culture.

Background

Over the past decade, the pomegranate tree (Punica granatum L.) has attracted significant attention as an economically important "super fruit" cultivated throughout the world, particularly in arid and semiarid regions. This is due to its high medicinal value, rich content of bioactive compounds such as antioxidant polyphenols, and numerous health advantages [1, 2]. Traditional methods of propagating pomegranates include sexual propagation through seeds and vegetative methods. However, both conventional propagation methods face several limitations that make pomegranate propagation difficult. Vegetative methods are time-consuming, dependent on seasonal production, and labor-intensive. Moreover, a large number of plants derived from cuttings often fail to survive [3]. On the other hand, sexual methods are challenging due to the high heterozygosity and long juvenile period of the plants. In addition, seedlings propagated by these methods are strongly affected by pest infestation and diseases [4]. Therefore, to achieve large-scale pomegranate cultivation, in vitro cell and organ culture techniques have been developed. Plant tissue culture methods offer a promising approach for the rapid production of true-to-type pomegranate plants and the biotechnological exploitation of pomegranate and other plant species with valuable properties [5]. Previous studies have attempted to apply in vitro culture techniques to propagate different cultivars of pomegranate [6, 7]. However, the findings have clearly emphasized that pomegranate micropropagation is moderately difficult and can vary depending on the cultivar, probably due to genetic variations among them [6, 8]. Nevertheless, the successful propagation of economically important woody plant species like pomegranate still presents challenges, due to the emergence of some problems during the proliferation stage including defoliation of explants, shoot tip necrosis, callusing, and hyperhydricity. 
These plant physiological disorders arise from factors such as undesirable medium composition, unsuitable type and concentration of plant growth regulators (PGRs), microbial contamination, phenolic browning caused by phenol secretion, ethylene accumulation, and tissue recalcitrance to proliferation (Fig. 1) [8,9,10].

The successful in vitro propagation of fruit trees is an intricate process that is influenced by numerous factors, including culture conditions, plant materials, and the composition of culture media, particularly PGRs [11]. Extensive research has emphasized the crucial role of PGRs, such as cytokinins and auxins, and their different combinations with gibberellic acid (GA₃) in promoting shoot regeneration in different pomegranate cultivars [7]. However, certain PGRs have shown varying levels of effectiveness in promoting proliferation. For example, 6-(γ,γ-dimethylallylamino)purine (2-iP) has been reported to have lower proliferative efficiency, while others like 6-benzylaminopurine (BAP), a commonly used cytokinin in tissue culture, can produce short and thin shoots, sometimes accompanied by excessive callus proliferation. Among the cytokinins, zeatin (ZT), a natural cytokinin, has been found to play a vital role in stimulating the maximum number of axillary buds and is applied at various concentrations either alone or in combination with other growth regulators. ZT is considered desirable for its stability in nutrient media, as it does not easily degrade or break down, thus providing sustained benefits for rapid and high rates of proliferation in most plant explants [12, 13]. Although different growth regulators, including BAP, kinetin, thidiazuron (TDZ), GA₃, and IBA, have been used in various combinations with or without ZT to promote the stimulation of axillary buds, GA₃ is particularly known for inducing rapid shoot elongation, which is beneficial for subsequent rooting. Considering the high cost of ZT, researchers are actively exploring the combined use of ZT with other cytokinins while maintaining the proliferative potential of shoot cultures [14]. However, it is important not to overlook the role of ZT in ensuring a good rate of proliferation [12]. 
Nonetheless, it is crucial to acknowledge that the responses of different pomegranate cultivars to in vitro propagation vary significantly depending on the interacting factors during the in vitro process, even in closely related species [15]. Therefore, to achieve optimal results, the specific in vitro culture conditions must be optimized for each cultivar.

In vitro micropropagation is a multifactorial and complex biological process influenced by genotype/cultivar and various interacting factors that are crucial for optimizing this process. Traditional statistical techniques encounter significant challenges in deciphering large datasets of biological interactions, especially when the datasets are nonlinear, complex, noisy, and ambiguous in nature, as observed in in vitro culture processes [16]. To overcome these challenges, advanced computer-based technologies such as Machine Learning (ML) tools have emerged as capable solutions for analyzing and predicting complex and multivariate datasets with high accuracy. ML approaches offer the advantage of autonomous learning and data transformation into useful information without being explicitly programmed [17]. Recent studies have highlighted the superior predictive performance of ML models over traditional statistics in various in vitro culture systems, including optimizing culture conditions for shoot proliferation and rooting [10, 18, 19], androgenesis [20], seed germination [21], somatic embryogenesis [22], gene transformation [23], and enhancing secondary metabolite biosynthesis [24].

Among the various algorithm-based ML tools, ensemble learning methods have gained significant attention due to their simplicity and their ability to create powerful and robust predictions. These methods can be broadly categorized into bagging, boosting, and stacking/blending. Notably, three prominent ensemble learning methods are Extreme Gradient Boosting (XGB), which utilizes the boosting concept, Random Forest (RF), based on the bagging concept, and Ensemble Stacking Regression (ESR), based on the stacking concept [25]. Support Vector Machine (SVM) is a robust ML method that has been widely recognized for its remarkable accuracy in plant in vitro micropropagation, as evidenced by the findings of previous studies [19, 26]. One notable advantage of SVM is its ability to effectively handle high-dimensional data. Researchers have also explored the potential of SVM to cope with small training datasets, highlighting its ability to provide accurate and reliable predictions even with limited training data [27]. The Elastic Net Multivariate Linear Regression (ENMLR) was introduced by Zou and Hastie [28] as a robust approach for analyzing high-dimensional datasets. It was designed to overcome the limitations of the LASSO method. By incorporating regression techniques, ENMLR effectively regularizes and selects important predictor variables, thereby improving the prediction accuracy of sparse modeling. This method has demonstrated its value in addressing the challenges associated with multicollinearity among predictor variables [29]. Selecting the most appropriate ML method depends on the association between input and output variables, as well as the optimization of hyperparameters [19]. 
In addition, the combination of ML techniques with evolutionary optimization algorithms confers significant advantages in predicting the critical factors that influence plant growth parameters in in vitro culture systems. One powerful algorithm in this regard is the non-dominated sorting genetic algorithm-II (NSGA-II), which is widely recognized as a search algorithm for optimizing multi-objective problems. NSGA-II enables efficient solving and prediction of complex processes while simultaneously providing a simplified interpretation of results [30]. In previous studies, the combined approach of ML with NSGA-II (ML-NSGA-II) has been acknowledged as a robust modeling technique for complex datasets, such as in optimizing in vitro tissue culture protocols for the micropropagation phases [21, 31, 32] and in various plant science fields [30, 33].

Based on our current knowledge, the application of ML algorithms as a novel strategy for modeling and predicting the in vitro shoot proliferation of pomegranate plants remains largely unexplored. The overall objectives of this study are (i) to evaluate the effects of ZT at different concentrations and in combination with GA₃ on optimizing the tissue culture protocol of three commercially significant cultivars, namely ‘Faroogh’, ‘Atabaki’ and ‘Shirineshahvar’; (ii) to compare the potential robustness of the most commonly used ML algorithms, including SVR, RF, XGB, ESR, and ENMLR, in terms of their ability to model and optimize the in vitro shoot proliferation process of pomegranate cultivars; and (iii) to employ NSGA-II in order to predict the most effective level of PGRs for enhancing the proliferation of pomegranate. To our knowledge, this study is the first application of ML models for optimizing pomegranate tissue culture media. In addition, despite the potential advantages of ESR and ENMLR, no study has been conducted on applying these procedures in plant science.

Materials and methods

Plant material and explant preparation

The experiments were conducted using single nodal explants from three different pomegranate cultivars: ‘Faroogh’, ‘Atabaki’ and ‘Shirineshahvar’. These explants were obtained from pomegranate plants grown in a greenhouse of College of Agriculture, Shiraz University, Iran. Explants were pre-sterilized using a liquid soap solution and rinsed several times with tap water. Subsequently, the explants were subjected to surface sterilization by immersing them in 70% aqueous ethanol for 30 s, followed by treatment with 5% sodium hypochlorite for 10 min. Afterward, the explants were washed three times with sterilized distilled water under a laminar airflow chamber. Following the sterilization process, the stem explants were cut into 2–3 cm segments with lateral buds (Fig. 2a).

In vitro culture establishment

A preliminary test was carried out using different combinations of culture media: MS (Murashige and Skoog) [34], VS (Van der Salm) [35], WPM (woody plant medium) [36], half-strength MS, and modified MS (mMS), PGRs (BAP and NAA), phenol-controlling compounds (polyvinylpyrrolidone, ascorbic acid, and activated charcoal), and silver nitrate (AgNO₃) as an ethylene inhibitor. The main experiment was set up based on the pre-test results, which indicated that the mMS medium supplemented with activated charcoal and AgNO₃ in combination with either BAP or NAA was the best treatment for stimulating new shoot regeneration. In this experiment, the explants (2–3 cm stem segments with lateral buds) were immediately cultured in capped glass containers containing 25 mL of mMS as a basal medium supplemented with 1 mg/L BAP, 0.5 mg/L NAA, 250 mg/L activated charcoal, 4.5 mg/L AgNO₃, 0.7% agar, and 3% sucrose. To obtain the best hormonal composition for the pomegranate proliferation protocol, the effects of different concentrations of GA₃ (0, 0.1, 0.25, and 0.5 mg/L) and ZT (0, 0.25, 0.5, and 0.75 mg/L) on shoot proliferation were evaluated. Prior to autoclaving at 121 ℃ for 15 min, the pH of the medium was adjusted to 5.7–5.8. To mitigate tissue culture browning, the cultures were incubated in darkness for 7 days in a growth chamber at a temperature of 25 ± 2 ℃, and then transferred to a 16-h photoperiod with a light intensity of 80 µmol m⁻²s⁻¹ and an 8-h dark period. After three subcultures on the same culture medium, various morphological responses of the plants were measured for each cultivar, including the proliferation rate (PR; number of new shoots per explant), shoot length (SL; length of new regenerated shoots per explant in cm), leaf number (LN; the number of leaves per explant), and explant survival (ES; the survival rate of explants in percent) (Fig. 3a).

Experimental design and data analysis

The proliferation experiment was carried out using a Completely Randomized Design (CRD) with a factorial arrangement. Each set of treatments consisted of 20 replicates, and subcultures were conducted over a three-week period. Analysis of variance was performed using SAS software (version 9.4; SAS Institute, Cary, NC).

Description of ML models and optimization algorithm

Model development

In this study, we employed a range of ML algorithms to build computational models using the datasets as training and testing data. Specifically, we selected five widely used ML algorithms (SVR, RF, XGB, ENMLR, and ESR) to analyze the effect of the independent variables on in vitro pomegranate plant growth responses. These five ML algorithms were applied to different pomegranate cultivars (‘Faroogh’, ‘Atabaki’, and ‘Shirineshahvar’), with two independent variables consisting of various concentrations of GA₃ and ZT as inputs, and four plant growth responses (PR, SL, LN, and ES) considered as outputs. Prior to applying ML modeling, data scaling was employed to standardize the training set for each cultivar. The features were transformed to have a mean of zero and a variance of one by standardizing the data using Eq. 1. Additionally, Principal Component Analysis (PCA) was used to identify any outliers; however, no outliers were found in the analysis. To train and test all five models, the experimental data (960 data points) were randomly divided into 80% and 20% for training and testing sets, respectively.

$${X}_{std}=\frac{{X}_{o}-\mu }{\sigma }$$

(1)

where ${X}_{std}$ is the standardized value, ${X}_{o}$ is the original value, and $\mu$ and $\sigma$ are the mean and standard deviation, respectively.
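The 80/20 split and the standardization of Eq. 1 can be sketched as follows. This is a minimal illustration using only the Python standard library, not the authors' code: the function names, the random seed, the use of the population standard deviation, and the toy ZT vector are all assumptions made for the example.

```python
import random
from statistics import mean, pstdev

def train_test_split_indices(n, test_fraction=0.2, seed=0):
    """Shuffle the indices 0..n-1 and split them into train/test lists."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_test = round(n * test_fraction)
    return idx[n_test:], idx[:n_test]

def fit_standardizer(values):
    """Estimate (mu, sigma) from the training values only."""
    return mean(values), pstdev(values)  # population std: an assumption

def standardize(values, mu, sigma):
    """Apply Eq. 1: X_std = (X_o - mu) / sigma."""
    return [(v - mu) / sigma for v in values]

# Hypothetical input column: the four ZT levels (mg/L) repeated to 960 points.
zt = [0.0, 0.25, 0.5, 0.75] * 240
train_idx, test_idx = train_test_split_indices(len(zt))
mu, sigma = fit_standardizer([zt[i] for i in train_idx])
zt_train_std = standardize([zt[i] for i in train_idx], mu, sigma)
# The test set reuses the training mu and sigma to avoid information leakage.
zt_test_std = standardize([zt[i] for i in test_idx], mu, sigma)
```

Fitting the scaler on the training portion only, then reusing it for the test portion, keeps the test data from influencing the preprocessing.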

Hyperparameter optimization in ML models

In ML, the optimization and tuning of hyperparameters in advance play a crucial role in training ML models [37]. These hyperparameters have a significant impact on prediction accuracy and overall performance. Various strategies exist for hyperparameter optimization, including babysitting, grid search, random search, and Bayesian optimization [38]. Among these strategies, Bayesian optimization is widely recognized for its generalizability across different test sets and its ability to achieve optimal hyperparameters with fewer iterations. In this study, a novel automatic hyperparameter tuning algorithm called Adaptive Tree-structured Parzen Estimator (ATPE) was utilized in Bayesian optimization. This algorithm aimed to adjust the initial hyperparameters of the five ML models to achieve optimized performance. It has not yet been applied to the optimization of in vitro PGRs. To improve the generalization performance of these models and avoid overfitting and underfitting, the study combined the ATPE method with K-fold cross-validation (K = 10). The process is illustrated in Fig. 3b. The hyperparameters of the ML models and their search spaces are shown in Table 1. The folds were indexed from 1 to 10, and for each fold the ATPE algorithm was applied for optimal ML model selection and hyperparameter tuning: one fold was selected as the validation set, while the remaining folds were used to train the model. By employing the K-fold cross-validation method, all data points were involved in the training process.
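The fold rotation described above can be sketched in plain Python. This is an illustrative outline only: the helper names are hypothetical, `fit_and_score` stands in for whatever model training and validation loss the tuner evaluates, and the ATPE search itself is not shown.

```python
import random

def k_fold_indices(n, k=10, seed=0):
    """Yield (train, validation) index lists; each fold validates once."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]  # k nearly equal-sized folds
    for i in range(k):
        validation = folds[i]
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, validation

def cross_validated_loss(n, k, fit_and_score):
    """Average a user-supplied validation loss over the k folds.

    fit_and_score(train, validation) is a hypothetical callback that trains
    a model with one hyperparameter setting and returns its validation loss.
    """
    losses = [fit_and_score(tr, va) for tr, va in k_fold_indices(n, k)]
    return sum(losses) / len(losses)

# Example: 768 training points (80% of 960) split into 10 folds.
splits = list(k_fold_indices(768, k=10))
```

A hyperparameter optimizer would then minimize `cross_validated_loss` over the search space, so that every training point is used for both fitting and validation across the rotation.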

Table 1 Hyperparameter tuning of the constructed models using ATPE


Support vector regression (SVR)

SVM is a supervised ML method that was developed by Vapnik [39]. Initially developed for classification problems (Support Vector Classifier or SVC), SVM was later extended to handle regression problems (SVR) [40]. The fundamental concept behind SVR involves the use of a kernel function to map the original input data from its lower-dimensional representation into a higher-dimensional feature space, in which the regression is estimated. Unlike Artificial Neural Network (ANN) models, which often encounter multiple local minima, SVM provides a unique solution that is at the global optimum. The approximated function within the SVR algorithm can be expressed as follows:

$$f\left(x\right)={\omega }^{T}x+b \quad \text{with } \omega \in X,\ b\in \mathbb{R}$$

(2)

where $f\left(x\right)$ represents the estimated output value, $\omega$ denotes weight for the ${\text{i}}^{\text{th}}$ sample point, and $b$ represents the bias. The values of $\omega$ and $b$ are determined by minimizing the regularized risk function, which is expressed as:

$$R(C)=C\frac{1}{n}\sum_{i=1}^{n}L\left({d}_{i},{y}_{i}\right)+\frac{1}{2}{\Vert \omega \Vert }^{2}$$

(3)

where $C$ represents the penalty parameter that balances the trade-off between model complexity and training error, ${d}_{i}$ denotes the desired value, $n$ represents the total number of observations, and $C\frac{1}{n}\sum_{i=1}^{n}L\left({d}_{i},{y}_{i}\right)$ is the empirical error. The following equation is employed to determine the insensitive loss function (${l}_{\varepsilon })$:

$${l}_{\varepsilon }\left(d,y\right)=\begin{cases}\left|d-y\right|-\varepsilon , & \left|d-y\right|\ge \varepsilon \\ 0, & \text{otherwise}\end{cases}$$

(4)

where $\frac{1}{2}{\Vert \omega \Vert }^{2}$ represents the regularization term, while ɛ (epsilon) represents the insensitive tube. The approximated function in Eq. (2) can be explicitly expressed by incorporating Lagrange multipliers and leveraging the optimality constraints. By introducing the Lagrange multipliers $({a}_{i})$, the function is given by:

$$f\left(x,{a}_{i},{a}_{i}^{*}\right)=\sum_{i=1}^{n}\left({a}_{i}-{a}_{i}^{*}\right)K({x}_{i},{x}_{i}^{T})+b$$

(5)

where $K({x}_{i},{x}_{i}^{T})$ represents the kernel function. The Radial Basis Function (RBF) non-linear kernel plays a crucial role in mapping input vectors nonlinearly into a high-dimensional feature space. In this study, the RBF kernel was utilized due to its superior performance compared to other kernel functions.

$${K}_{rbf}({x}_{i},{x}_{i}^{T})=\text{exp}\left[\frac{{-\left({x}_{i}-{x}_{i}^{T}\right)}^{2}}{{2\sigma }^{2}}\right]$$

(6)
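The ε-insensitive loss of Eq. (4) and the RBF kernel of Eq. (6) are simple enough to write out directly. The sketch below is illustrative only; the function names and the example values of ε and σ are assumptions, not settings from the study.

```python
import math

def epsilon_insensitive_loss(d, y, epsilon):
    """Eq. (4): errors inside the epsilon tube cost nothing."""
    err = abs(d - y)
    return err - epsilon if err >= epsilon else 0.0

def rbf_kernel(x, z, sigma):
    """Eq. (6): exp(-||x - z||^2 / (2 * sigma^2)) for two input vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))

# Hypothetical inputs: (GA3, ZT) concentration pairs in mg/L.
k_same = rbf_kernel((0.5, 0.75), (0.5, 0.75), sigma=1.0)  # identical points
k_far = rbf_kernel((0.0, 0.0), (0.5, 0.75), sigma=1.0)    # distinct points
```

The kernel equals 1 for identical inputs and decays towards 0 as the inputs move apart, with σ controlling how quickly that happens.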

Random forest (RF)

RF is a classification and regression algorithm introduced by Breiman [41]. It overcomes the performance limitations of single decision trees and exhibits favorable characteristics such as robustness to noise and outliers, scalability, and parallelism in high-dimensional data classification tasks. RF overcomes the "dimensionality disaster" often encountered in big data scenarios, in which other models often fail to perform effectively. Additionally, RF demonstrates comparable error rates to other methods across various learning tasks and exhibits a reduced tendency to overfit. Notably, RF is a well-known bagging algorithm that excels in regression problems [38]. The RF algorithm combines decision tree-based techniques with ensemble methods, effectively leveraging their synergistic benefits, making it a suitable choice as one of the foundational models in the ensemble model employed in this study. The formula of RF is as follows:

$$\widehat{\text{y}}\left({\text{x}}_{\text{i}}\right)=\frac{1}{{\text{K}}}\sum_{k=1}^{K}{\text{T}}_{{\text{D}}\left({\theta }_{\text{k}}\right)}\left({\text{x}}_{\text{i}}\right), \quad k =\{1, 2, \dots ,K\}$$

(7)

where ${\text{x}}_{\text{i}}$ refers to the $i$-th input sample, ${\text{D}}\left({\theta }_{\text{k}}\right)$ denotes the $k$-th bootstrapped sample, ${\text{T}}_{{\text{D}}\left({\theta }_{\text{k}}\right)}$ is the tree grown on that sample, and ${\text{K}}$ is the number of trees.

eXtreme Gradient Boosting (XGB)

XGB is an advanced supervised learning algorithm proposed by Chen and Guestrin [42]. This method is based on the Gradient-Boosted Decision Tree (GBDT) approach. XGB aims to create a “strong” learner by combining predictions from a collection of “weak” learners using additive training strategies. This algorithm incorporates a second-order Taylor expansion of the loss function and a regularization term, which effectively mitigates overfitting and expedites convergence. The XGB algorithm enhances prediction accuracy by iteratively constructing new decision trees that continuously diminish the residuals between predicted and observed values. XGB stands out as a prominent open-source boosting tree toolkit, offering remarkable speed and performance advantages over other gradient-boosting methods. It is more than 10 times faster than common toolkits, making it the preferred selection for massively parallel boosting tree tasks. The XGB prediction for the $i$-th instance is:

$${f}_{i}^{(d)}=\sum_{k=1}^{d}{f}_{k}({x}_{i})={f}_{i}^{(d-1)}+{f}_{d}({x}_{i})$$

(8)

where ${f}_{d}({x}_{i})$ represents the learner at step $d$, the predictions at steps $d$ and $d-1$ are denoted as ${f}_{i}^{(d)}$ and ${f}_{i}^{(d-1)}$, respectively, and ${x}_{i}$ represents the input variable.

In order to prevent the problem of overfitting without sacrificing the computational speed of the model, XGB employs an analytical expression to evaluate the “goodness” of the model in relation to the original function. This analytical formula, given as Eq. (9), provides an estimate of the model’s “goodness” while also reducing the computational cost of the associated mathematical computations.

$${\text{Objective}}^{(d)}=\sum_{i=1}^{n}l\left({\overline{y} }_{i},{y}_{i}\right)+\sum_{k=1}^{d}\sigma \left({f}_{k}\right)$$

(9)

where $l$ is the loss function, $n$ indicates the number of observations used, and $\sigma$ denotes the regularization term given in Eq. (10).

$$\sigma \left(f\right)=\gamma T+0.5\lambda {\Vert \omega \Vert }^{2}$$

(10)

where $T$ is the number of leaves in the tree, $\omega$ denotes the vector of scores associated with the leaves, $\lambda$ represents the regularization parameter, and $\gamma$ indicates the minimum loss required for further partitioning of a leaf node.
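To make the roles of the terms in Eqs. (8) to (10) concrete, the following sketch evaluates them for a toy case. It is not the XGBoost implementation: the squared-error loss, the function names, and the numeric values are assumptions chosen for illustration, and each tree is represented only by its list of leaf scores.

```python
def additive_update(prev_pred, new_tree_pred):
    """Eq. (8): f^(d) = f^(d-1) + f_d, applied to every instance."""
    return [p + q for p, q in zip(prev_pred, new_tree_pred)]

def regularization(leaf_scores, gamma, lam):
    """Eq. (10): gamma * T + 0.5 * lambda * ||omega||^2 for one tree."""
    n_leaves = len(leaf_scores)  # T, the number of leaves
    return gamma * n_leaves + 0.5 * lam * sum(w * w for w in leaf_scores)

def objective(y_true, y_pred, trees, gamma, lam):
    """Eq. (9) with a squared-error loss (the loss choice is an assumption)."""
    loss = sum((p - t) ** 2 for p, t in zip(y_pred, y_true))
    return loss + sum(regularization(t, gamma, lam) for t in trees)

# Toy example: one two-leaf tree added on top of a constant prediction.
pred = additive_update([1.0, 2.0], [0.5, -0.5])
penalty = regularization([0.5, -0.5], gamma=1.0, lam=2.0)
total = objective([1.5, 1.5], pred, [[0.5, -0.5]], gamma=1.0, lam=2.0)
```

Larger γ penalizes trees with many leaves, while larger λ shrinks the leaf scores, which is how the regularization term limits model complexity.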

Elastic net multivariate linear regression (ENMLR)

ENMLR is a regression technique that combines two effective shrinkage regression methods: Ridge regression (L2 penalty) and LASSO regression (L1 penalty). Ridge regression is employed to address high-multicollinearity problems, while LASSO regression focuses on feature selection in regression coefficients. The elastic net estimator in ENMLR benefits from ridge regularization, which allows for better handling of correlations between predictors compared to LASSO regression. Simultaneously, the L1 regularization in elastic net promotes sparsity, facilitating the identification of essential features. However, similar to LASSO regression, the bias issue is still present in ENMLR. The elastic net estimator minimizes the following expression:

$$EN \left(\beta \right)=\sum_{i=1}^{n}{\left({y}_{i}-{x}_{i}^{T}\beta \right)}^{2}+{\lambda }_{1} \sum_{j=1}^{p}\left|{\beta }_{j}\right|+{\lambda }_{2}\sum_{j=1}^{p}{\left|{\beta }_{j}\right|}^{2}$$

(11)

where $\beta$ is the vector of regression coefficients, ${\beta }_{j}$ is the regression coefficient of the ${j}^{th}$ predictor variable, and ${\lambda }_{1}$ and ${\lambda }_{2}$ are the positive tuning parameters (${\lambda }_{1}$, ${\lambda }_{2}$ > 0) coming from LASSO and Ridge, respectively. Each λ is a penalty parameter that shrinks the coefficients, and its numerical value indicates the severity of the penalty.
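The objective in Eq. (11) can be evaluated directly for a given coefficient vector, which makes the separate contributions of the L1 and L2 penalties visible. The sketch below is illustrative; the function name and the toy data are assumptions, and no fitting procedure is implied.

```python
def elastic_net_objective(X, y, beta, lam1, lam2):
    """Eq. (11): residual sum of squares + L1 penalty + L2 penalty."""
    rss = sum(
        (yi - sum(xij * bj for xij, bj in zip(xi, beta))) ** 2
        for xi, yi in zip(X, y)
    )
    l1 = sum(abs(b) for b in beta)   # LASSO term, promotes sparsity
    l2 = sum(b * b for b in beta)    # Ridge term, handles correlated inputs
    return rss + lam1 * l1 + lam2 * l2

# Toy data: two observations, two predictors.
X = [[1.0, 0.0], [0.0, 1.0]]
y = [1.0, 2.0]
value = elastic_net_objective(X, y, beta=[1.0, 1.0], lam1=0.5, lam2=0.25)
```

Setting `lam2` to zero recovers the LASSO objective and setting `lam1` to zero recovers Ridge regression.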

Ensemble stacking regression (ESR)

The stacking regressor, initially introduced by Wolpert [43], is an effective ensemble learning technique that combines multiple regression models to improve prediction accuracy. In this approach, a meta-regressor is trained to aggregate the predictions of the base regressors, thereby leveraging the collective knowledge of the individual models [44]. Different techniques, such as stacking, weighted averaging, and direct averaging, can be employed to create ensemble regressors by integrating the predictions of the base models [45]. The choice of the specific technique depends on finding an optimal balance for combining the predictions, and the meta-regressor can be any type of regression model [46]. To implement stacking regression, the new meta feature sets generated by each base regressor are merged to form the meta training set, and the new target sets produced by each base regressor are combined to create the meta testing set. The final predictions are then generated by the meta-regressor, which is trained using the new meta training set [25]. The stacking regression methodology has gained popularity in various domains, including molecular quantum characteristics [44], daily reference evapotranspiration estimation [25], genome prediction [47], and stock portfolio prediction [48]. In this particular study, XGB, SVR, and ENMLR models were utilized as the base regressors, while RF was employed as the meta-regressor.

Performance evaluation

In order to evaluate and compare the accuracy and performance of the developed ML algorithms in predicting the proliferation of pomegranate, six popular quantitative statistical indicators, namely the correlation coefficient (R), Coefficient of Determination (R²), Root Mean Square Error (RMSE), Relative Root Mean Squared Error (RRMSE), Mean Absolute Error (MAE), and Mean Absolute Percentage Error (MAPE), were utilized. These quantitative indicators can be found in Table 2.

Table 2 Description of statistical indicators for the constructed models evaluation
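For orientation, the six indicators can be computed as in the sketch below. The formulas follow common textbook definitions and are assumptions here, since the table body is not reproduced in this text: in particular, RRMSE is taken as RMSE divided by the mean observed value (in percent) and R² as one minus the ratio of residual to total sum of squares. The observed and predicted values are hypothetical.

```python
import math

def regression_metrics(obs, pred):
    """Return MAE, RMSE, RRMSE (%), MAPE (%), R and R2 for two sequences."""
    n = len(obs)
    mean_obs = sum(obs) / n
    mean_pred = sum(pred) / n
    ss_res = sum((o - p) ** 2 for o, p in zip(obs, pred))
    ss_tot = sum((o - mean_obs) ** 2 for o in obs)
    ss_pred = sum((p - mean_pred) ** 2 for p in pred)
    cov = sum((o - mean_obs) * (p - mean_pred) for o, p in zip(obs, pred))
    rmse = math.sqrt(ss_res / n)
    return {
        "MAE": sum(abs(o - p) for o, p in zip(obs, pred)) / n,
        "RMSE": rmse,
        "RRMSE": 100.0 * rmse / mean_obs,          # assumed definition
        "MAPE": 100.0 / n * sum(abs((o - p) / o) for o, p in zip(obs, pred)),
        "R": cov / math.sqrt(ss_tot * ss_pred),
        "R2": 1.0 - ss_res / ss_tot,               # assumed definition
    }

# Hypothetical observed vs. predicted proliferation rates.
metrics = regression_metrics([2.0, 4.0], [1.0, 5.0])
```

Note that MAPE is undefined when an observed value is zero, so a guard would be needed for responses that can be zero.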


Global performance indicator (GPI)

In order to enhance the accuracy and reliability of the statistical analysis and to mitigate any potential discrepancies between indicators, we employed the GPI method, which was first introduced by Despotovic et al. [49]. GPI combines the effects of multiple statistical indicators into a single score. During the process, all statistical indicators are scaled to a range between 0 and 1. Subsequently, each scaled value of a statistical indicator is subtracted from the median of that indicator's scaled values across all models. These differences are then aggregated using appropriate weighting factors (a weight of -1 for $R$ and ${R}^{2}$ and a weight of 1 for all other statistical indicators). The model with the highest GPI value is considered the best. The following equation represents the GPI model:

$${GPI}_{i}=\sum_{j=1}^{5}{\alpha }_{j}\left({\text{\rm M}}_{j}^{S}-{\text{\rm I}}_{ij}^{S}\right)$$

(12)

where ${GPI}_{i}$ represents global performance indicator for model $i$, ${\text{\rm M}}_{j}^{S}$ is median of scaled values of indicator $j$, ${\text{\rm I}}_{ij}^{S}$ is the scaled value of indicator $j$ for model $i$, ${\alpha }_{j}$ equals -1 for both $R$ and ${R}^{2}$ and 1 for other performance criteria.
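A compact way to read Eq. (12) is as a per-indicator comparison against the median model. The sketch below assumes min-max scaling across models for the "0 to 1" step, which the text does not spell out; the function name, model labels, and indicator values are hypothetical.

```python
from statistics import median

def global_performance_indicator(table, negative_weight=("R", "R2")):
    """Compute GPI per model from {model: {indicator: value}} (Eq. 12)."""
    models = list(table)
    indicators = list(table[models[0]])
    gpi = {m: 0.0 for m in models}
    for ind in indicators:
        values = [table[m][ind] for m in models]
        lo, hi = min(values), max(values)
        span = (hi - lo) or 1.0  # guard against identical values
        scaled = {m: (table[m][ind] - lo) / span for m in models}
        med = median(scaled.values())
        alpha = -1.0 if ind in negative_weight else 1.0
        for m in models:
            gpi[m] += alpha * (med - scaled[m])
    return gpi

# Hypothetical indicator values for three models.
scores = global_performance_indicator({
    "A": {"RMSE": 1.0, "R2": 0.9},
    "B": {"RMSE": 2.0, "R2": 0.8},
    "C": {"RMSE": 3.0, "R2": 0.7},
})
best = max(scores, key=scores.get)
```

With these weights, a model is rewarded for error indicators below the median and for correlation indicators above it, so a single ranking emerges from all indicators.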

Optimization of ML model via non‑dominated sorting genetic algorithm‑II (NSGA-II)

The best ML algorithm was supplied as the fitness function to the Non-dominated Sorting Genetic Algorithm-II (NSGA-II), which served as the optimization algorithm for finding the optimal combination of inputs (GA₃ and ZT) that achieves maximal growth responses in the three cultivars (Fig. 3c). NSGA-II is based on natural selection, and several parameters were set to ensure the effectiveness of the optimization process. The first step in the NSGA-II process involved the creation of an initial population, where all the chromosomes were constructed. Then the tournament selection method was adopted to select an elite population for crossover. A binary crossover function, a well-known crossover technique, was used to generate the next generation of chromosomes. To introduce diversity into the population and prevent convergence to local optima, a mutation operator was applied. It introduced random variations into the chromosomes, reducing the possibility of having similar chromosomes within the population [50]. The non-dominated sorting concept was utilized to derive non-dominated solutions, with each non-dominated front assigned a rank or level. The non-dominated front with the highest rank was removed, and the remaining solutions were used to generate the parent population for the next generation. Crowding distance was employed to estimate the density of solutions around each point in objective space, and solutions were sorted by crowding distance in descending order so that those in the least crowded regions received priority. In order to achieve an improved fitness function during the optimization process, the optimal values for crucial operators such as the crossover rate, maximum generation, initial population, and mutation rate were set through trial and error. 
In the current study, the crossover rate was set at 90% with a distribution index of 15, the maximum generation was set to 200, the initial population size was 100, and a distribution index of 20 was used for the mutation operator, which was real-valued polynomial mutation (real_pm) (Fig. 3c).
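The two ranking ideas at the core of NSGA-II, non-dominated sorting and crowding distance, can be sketched in plain Python. This is an illustration of the general technique rather than the Pymoo implementation used in the study: the function names and the two-objective toy points are assumptions, and all objectives are treated as quantities to maximize, as with the growth responses.

```python
def dominates(a, b):
    """True if a is no worse than b in every objective and better in one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

def non_dominated_sort(points):
    """Return fronts as lists of indices; the first is the Pareto front."""
    remaining = set(range(len(points)))
    fronts = []
    while remaining:
        front = [i for i in remaining
                 if not any(dominates(points[j], points[i])
                            for j in remaining if j != i)]
        fronts.append(sorted(front))
        remaining -= set(front)
    return fronts

def crowding_distance(points, front):
    """Larger values mean a solution sits in a less crowded region."""
    dist = {i: 0.0 for i in front}
    for m in range(len(points[front[0]])):
        ordered = sorted(front, key=lambda i: points[i][m])
        lo, hi = points[ordered[0]][m], points[ordered[-1]][m]
        dist[ordered[0]] = dist[ordered[-1]] = float("inf")  # keep extremes
        if hi == lo:
            continue
        for prev, cur, nxt in zip(ordered, ordered[1:], ordered[2:]):
            dist[cur] += (points[nxt][m] - points[prev][m]) / (hi - lo)
    return dist

# Hypothetical (PR, SL) predictions for five candidate PGR combinations.
candidates = [(3.0, 1.0), (2.0, 2.0), (1.0, 3.0), (1.0, 1.0), (0.5, 0.5)]
fronts = non_dominated_sort(candidates)
distances = crowding_distance(candidates, fronts[0])
```

In a full NSGA-II run, selection prefers solutions from lower-ranked fronts first and, within a front, those with larger crowding distance, which keeps the Pareto set well spread.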

All code for implementing and evaluating the ESR, RF, SVR, and ENMLR models was written in Python using the Scikit-learn library version 1.3.2 [51]. XGB was implemented with the XGBoost library version 2.0.3 [42]. Hyperparameters for each of the five models (SVR, RF, XGB, ESR, and ENMLR) were tuned with the Hyperopt library version 0.2.7 [52], and multi-objective optimization (NSGA-II algorithm) was carried out with the Pymoo library version 0.6.1.1 [53].

Results

The effect of PGRs on in vitro shoot proliferation and development of pomegranate

According to the factorial ANOVA, the growth responses of pomegranate (LN, PR, ES, and SL) were significantly influenced by the concentrations and combinations of PGRs (GA₃ and ZT), as well as by the cultivar. The detailed results are given in Table 3.

Table 3 Effect of different concentrations of PGRs on in vitro growth parameters of pomegranate cultivars

The addition of ZT to the growth medium, particularly at a concentration of 0.75 mg/L, resulted in improved shoot regeneration and favorable vegetative growth characteristics per explant when compared to the control medium. According to Table 3, the positive changes in the growth parameters were primarily attributed to increasing PGR concentrations and the interaction between them, and the combination of the highest concentrations of ZT and GA₃ was the most effective treatment in promoting the overall growth response. Specifically, when the medium was supplemented with 0.50 mg/L GA₃ and 0.75 mg/L ZT, the average growth response was significantly enhanced (Table 3). The observed changes in the growth parameters differed among cultivars. Among the three cultivars studied, ‘Faroogh’ exhibited the maximum values of LN (23.62) and PR (4). Similarly, ‘Atabaki’ showed the highest SL (6.75 cm) when treated with 0.50 mg/L GA₃ and 0.75 mg/L ZT. Regarding ES, both ‘Faroogh’ and ‘Atabaki’ reached a maximum of 100% under three treatments combining 0.25, 0.50, or 0.75 mg/L ZT with 0.50 mg/L GA₃. In contrast, ‘Shirineshahvar’ exhibited lower ES rates than the other cultivars. For this cultivar, the same treatment interaction mentioned above led to the highest values of LN (18.94), PR (3.56), ES (61.87%), and SL (1.95 cm). Overall, the highest and lowest growth responses were obtained in ‘Faroogh’ and ‘Shirineshahvar’, respectively (Table 3).

Comparison of ML performance

In the present study, five ML algorithms (RF, XGB, SVR, ESR, and ENMLR) were used to build the mathematical models. The scatter plots in Figs. 5, 6 and 7 illustrate the prediction results of these models, and the corresponding evaluation indices are shown in Tables 4, 5, and 6. Violin plots of the performance metrics are presented in Fig. 4. For all outputs, ENMLR gave lower R-values, which measure the correlation between observed (experimental) and predicted values, than the other ML algorithms in both the training and test subsets. Even so, all five ML models showed good performance and predictability. However, the ESR, with higher R and R² and smaller RRMSE, RMSE, MAE, and MAPE values in both training and testing sets, was the best algorithm in comparison with the four other models for all growth parameters (Tables 4, 5 and 6). The comparison of statistical indicators also showed that the values for ESR were very close to those of the other ML algorithms in all three cultivars. Moreover, the statistical indicators did not separate the models clearly, because different indicators favored different models; to resolve this ambiguity, the GPI was calculated on the test dataset for all ML algorithms and is presented in Table 7. The GPI ranked the ESR model as the top performer among all models. The GPI values for ESR vs. XGB, RF, SVR, and ENMLR were: 1.829 vs. −1.674, 0.647, 0, and −4.171 for LN of ‘Atabaki’; 1.312 vs. −2.562, 0, 0.525, and −4.688 for LN of ‘Faroogh’; 0.089 vs. −3.040, 0.032, 0.004, and −5.911 for LN of ‘Shirineshahvar’; 1.383 vs. 0.980, 0.738, −2.326, and −3.801 for PR of ‘Atabaki’; 1.182 vs. −1.199, 0.567, −2.121, and −2.104 for PR of ‘Faroogh’; 1.911 vs. 0.574, −2.616, 0.255, and −3.807 for PR of ‘Shirineshahvar’; 0.933 vs. −4.870, 0.573, 0, and −4.814 for ES of ‘Atabaki’; 0.748 vs. −3.813, 0.483, 0.085, and −5.240 for ES of ‘Faroogh’; 0.973 vs. 0.818, −1.501, −2.966, and −2.507 for ES of ‘Shirineshahvar’; 0.180 vs. −5.158, 0.108, 0.035, and −5.782 for SL of ‘Atabaki’; 0.619 vs. −4.058, 0.092, 0.405, and −5.380 for SL of ‘Faroogh’; and 0.513 vs. −0.913, 0.193, 0.150, and −5.487 for SL of ‘Shirineshahvar’ (Table 7). Additionally, the regression lines showed a good fit between the observed and predicted data for all growth parameters during both the training and testing phases of the ML models (Figs. 5, 6, and 7).
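To make the ranking step concrete, the sketch below computes a GPI from a small table of metrics. It assumes the common definition attributed to Despotovic et al. (listed in the references): each indicator is min-max scaled across models, and the GPI of a model is the sum over indicators of α times (median of the scaled values minus the model's scaled value), with α = −1 for R and R² and α = +1 for the error indicators. That definition, the function names, and the three-model table are assumptions for illustration; the numbers are not taken from Tables 4 to 7.

```python
from statistics import median

# Assumption: indicators where larger is better get weight -1; error indicators get +1.
HIGHER_IS_BETTER = {"R", "R2"}

def min_max_scale(values):
    """Scale a list of numbers to [0, 1]; a constant column maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def global_performance_indicator(metrics):
    """Return {model: GPI} from {model: {indicator: value}}."""
    models = list(metrics)
    indicators = list(metrics[models[0]])
    gpi = {model: 0.0 for model in models}
    for name in indicators:
        scaled = min_max_scale([metrics[m][name] for m in models])
        mid = median(scaled)
        alpha = -1.0 if name in HIGHER_IS_BETTER else 1.0
        for model, y in zip(models, scaled):
            gpi[model] += alpha * (mid - y)
    return gpi

# Hypothetical test-set metrics for three models (not values from the paper).
example = {
    "model_a": {"RMSE": 0.20, "MAE": 0.15, "R2": 0.95},
    "model_b": {"RMSE": 0.30, "MAE": 0.25, "R2": 0.90},
    "model_c": {"RMSE": 0.60, "MAE": 0.45, "R2": 0.70},
}
scores = global_performance_indicator(example)
ranking = sorted(scores, key=scores.get, reverse=True)  # best model first
```

Under this definition a higher GPI means better overall accuracy, and a model that sits at the median of every indicator scores exactly 0, which would be consistent with the zero entries reported for some models above.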

Table 4 Statistical evaluation of the constructed models for the micropropagation of the pomegranate cultivar ‘Atabaki’

Table 5 Statistical evaluation of the constructed models for the micropropagation of the pomegranate cultivar ‘Faroogh’

Table 6 Statistical evaluation of the constructed models for the micropropagation of the pomegranate cultivar ‘Shirineshahvar’

Table 7 Ranking of the best-performing ML models for growth parameters of pomegranate

Optimization process via non-dominated sorting genetic algorithm-II

The NSGA-II algorithm, a multi-objective evolutionary optimization method, was linked to the ESR model, which was identified as the most accurate algorithm. The ESR-NSGA-II algorithm determined the optimal values for the four growth parameters (LN, PR, ES, and SL) in response to different concentrations of PGRs. The results are summarized in Table 8. In the ‘Atabaki’ cultivar, ESR-NSGA-II identified a culture medium supplemented with 0.750 mg/L ZT and 0.50 mg/L GA₃ as giving the greatest improvement in growth parameters, with predicted outputs of 18.18 LN, 3.47 PR, 84.21% ES, and 2.74 cm SL. For the ‘Faroogh’ cultivar, 0.654 mg/L ZT with 0.329 mg/L GA₃ were the optimal inputs, giving 19.76 LN, 3.84 PR, 85.49% ES, and 3.32 cm SL. In the ‘Shirineshahvar’ cultivar, 0.705 mg/L ZT combined with 0.347 mg/L GA₃ were the optimal inputs, giving 18.77 LN, 3.22 PR, 56.39% ES, and 1.86 cm SL (Table 8).

Table 8 Optimization of pomegranate cultivars and different concentrations of ZT, and GA₃ according to the ESR-NSGA-II algorithm to obtain the best plant growth parameters

Discussion

The success of in vitro plant tissue culture strongly depends on several external and internal factors, including environmental conditions, PGR types, culture medium composition, gelling agents, and genotype [18]. PGRs, particularly cytokinin and auxin, are commonly applied to optimize protocols for in vitro tissue culture and shoot regeneration [17, 54, 55]. Auxin increases the susceptibility of less mitotically active apical meristem cells to cytokinin [56], while cytokinin promotes cell proliferation, including cell division and shoot elongation [10]. In the case of pomegranate, which is a recalcitrant woody plant for in vitro culture, optimizing the type and concentration of PGRs, as well as their interactions, plays a crucial role [8, 57, 58, 59].

Previous studies on the efficient multiplication of various pomegranate species have reported that BAP, with or without NAA, is effective at concentrations ranging from 0.4 to 2 mg/L for BAP and 0.5 to 1 mg/L for NAA [57]. However, the results of these studies are often specific to particular cultivars and cannot be universally applied. PGR concentrations need to be optimized because of genetic factors and complexities associated with the oxidation of phenols in explants and culture media, which can lead to tissue death. Furthermore, pomegranate tissue culture protocols are highly dependent on the cultivar and may differ due to variations in uptake rates, translocation rates, or metabolic processes within the meristematic regions of the plant. Additionally, cytokinin metabolism plays a crucial role, as cytokinins may undergo degradation or conjugation with sugars or amino acids, leading to the formation of biologically inert compounds, as reported by Desai et al. [60].

Although ZT has been recognized as highly effective in promoting shoot proliferation in various plant species [61, 62, 63], its use in pomegranate tissue culture has been limited compared to other cytokinins. Similarly, the use of GA₃ in shoot proliferation, particularly in recalcitrant woody trees like pomegranate, has received limited attention. However, several studies have demonstrated that the interaction between cytokinins and GA₃ can improve the development of shoot/root apical meristems [8, 64, 65]. This study introduces a new shoot proliferation protocol for pomegranate cultivars that uses a combination of ZT and GA₃. The results demonstrate the efficacy of this combination in stimulating shoot development compared to using BAP alone. Notably, the treatment with the highest concentrations of both ZT and GA₃ produced the strongest growth response. Additionally, GA₃ enhanced shoot regeneration and increased the ES% of all three tested pomegranate cultivars when combined with cytokinins and auxins. Although reports on the effect of ZT on shoot proliferation in pomegranate are limited, Naik et al. [66] reported significant improvements in regeneration frequency and shoot growth after adding zeatin riboside (ZR) to the culture medium. The current study also showed that different pomegranate cultivars reacted differently to the same culture medium, despite their close genetic relationship. ‘Faroogh’ exhibited the highest growth responses among the three cultivars investigated, whereas ‘Shirineshahvar’ was more recalcitrant to shoot proliferation than the other cultivars. This could be attributed to variations in the concentration of endogenous phytohormones within the plants and their interaction with the exogenous PGRs applied to the explant cultures [67].

Developing and optimizing tissue culture protocols is a complex task that poses significant challenges to the field as a whole. The multifactorial nature of in vitro culture processes makes them difficult to understand and interpret using traditional statistical approaches such as ANOVA, t-tests, correlation, and regression, especially when the variables investigated are nonlinear, noisy, complex, and vague in nature [68]. ML methods, as complex mathematical tools, offer promise in understanding and interpreting the intricate, nonlinear relationships within datasets. ML models have demonstrated superior predictive power over traditional statistical methodologies when analyzing unpredictable variables and large datasets. Despite these advantages, uncertainty in ML outcomes remains a major constraint in its application [69]. Uncertainty in ML studies arises from three primary sources: data quality, the sample of data collected from the domain, and model fitting [70]. To reduce such uncertainty, researchers have recommended applying several different ML algorithms [69, 70]. In this study, five ML approaches (XGB, RF, SVR, ESR, and ENMLR) were employed to model the effects of PGRs on in vitro shoot proliferation of pomegranate. While the ML models performed similarly in predicting pomegranate shoot multiplication, the GPI analysis indicated that the ESR model was the best performer. It exhibited robustness and superior predictive accuracy in both the training and testing subsets. Specific investigations of the ESR algorithm in plant tissue culture are lacking. Nonetheless, numerous studies in other scientific disciplines have demonstrated the robust performance of the ESR model in various prediction tasks [71, 72].
Recent research has shown that integrating optimization algorithms, particularly NSGA-II, with ML models can provide valuable insights and make effective use of the models. Applying NSGA-II in conjunction with ML answers "How to get" questions by identifying the culture medium that simultaneously improves multiple desired parameters [18, 73]. In the current research, the ESR was linked to the NSGA-II algorithm as a computational forecasting approach for predicting and identifying critical factors affecting the in vitro proliferation stage of pomegranate cultivars. Optimization algorithms, especially NSGA-II, have already been applied successfully in plant tissue culture [31]. Additionally, various ML algorithms combined with different optimization algorithms have shown promising results in modeling and predicting optimal plant tissue culture media for other fruit tree species such as kiwi berry [18], pear [74], Prunus [15], pistachio rootstocks [74], and Persian walnut [10]. The ESR-NSGA-II method predicted that the highest plant growth responses would be achieved by supplementing the culture medium with 0.750 mg/L ZT and 0.500 mg/L GA₃ for the ‘Atabaki’ cultivar, 0.654 mg/L ZT and 0.329 mg/L GA₃ for the ‘Faroogh’ cultivar, and 0.705 mg/L ZT and 0.347 mg/L GA₃ for the ‘Shirineshahvar’ cultivar. Overall, the ESR-NSGA-II algorithm revealed that the interaction between genotype and PGR concentration had the most significant influence on pomegranate shoot proliferation. These findings are consistent with a study by Sadat-Hosseini et al. [10], which employed ML approaches to model growth parameters of in vitro Persian walnut using different concentrations of BAP, thidiazuron (TDZ), and indole butyric acid (IBA), and reported that the genotype-PGR interaction plays a crucial role in the proliferation of Persian walnut.

To the best of the authors’ knowledge, this study is the first to examine the specific effects of ZT and GA₃, as well as their interactions, on the efficiency of pomegranate tissue culture protocols, particularly for the cultivars studied here. While previous studies have reported successful in vitro shoot proliferation of different pomegranate cultivars, the focus on the specific combination of ZT and GA₃ and their interaction effects is a novel aspect of this research. By evaluating the influence of these growth regulators on growth parameters, this study contributes to the advancement of pomegranate tissue culture techniques.

Conclusion

In vitro shoot proliferation is a multifactorial and complex process influenced by various interacting factors. To evaluate the extensive datasets and optimize the pomegranate protocol, ML techniques (RF, SVR, XGB, ESR, and ENMLR) were therefore employed as promising alternatives to traditional statistical methods. Based on our results, ESR-NSGA-II exhibited superior accuracy and efficacy in studying pomegranate growth responses to multivariable stimuli in vitro and in optimizing the pomegranate protocol. Furthermore, the in vitro responses of pomegranate were positively influenced by the concentrations of PGRs (ZT and GA₃) and their interaction. Moreover, the optimization of in vitro conditions for pomegranate depended strongly on the specific cultivar. Specifically, ‘Shirineshahvar’ proved recalcitrant to in vitro shoot proliferation compared with the other cultivars, while ‘Faroogh’ exhibited the highest growth and shoot development. The main objective of the current research was to provide a reliable and robust technology, ESR-NSGA-II based on soft computing methodology, that offers new insight into the crucial factors affecting the growth parameters of pomegranate cultivars cultured in vitro.

Data availability

The authors confirm that the datasets analyzed during the current study are available from the corresponding author on reasonable request.

References

Zarbakhsh S, Kazemzadeh-Beneh H, Rastegar S. Quality preservation of minimally processed pomegranate cv. Jahrom arils based on chitosan and organic acid edible coatings. J Food Saf. 2019. https://doi.org/10.1111/jfs.12752.
Zarbakhsh S, Shahsavar AR. Exogenous γ-aminobutyric acid improves the photosynthesis efficiency, soluble sugar contents, and mineral nutrients in pomegranate plants exposed to drought, salinity, and drought-salinity stresses. BMC Plant Biol. 2023;23:543. https://doi.org/10.1186/s12870-023-04568-2.
Dinesh RM, Patel AK, Vibha JB, Shekhawat S, Shekhawat NS. Cloning of mature pomegranate (Punica granatum) cv. Jalore seedless via in vitro shoot production and ex vitro rooting. Vegetos. 2019;32(2):181–9. https://doi.org/10.1007/S42535-019-00021-8.
Pathania M, Arora PK, Pathania S, Kumar A. Studies on population dynamics and management of pomegranate aphid, Aphis punicae Passerini (Hemiptera: Aphididae) on pomegranate under semi-arid conditions of South-western Punjab. Sci Hortic. 2019;243:300–6. https://doi.org/10.1016/j.scienta.2018.07.027.
Guney M. Development of an in vitro micropropagation protocol for Myrobalan 29C rootstock. Turk J Agric For. 2019;43:569–75. https://doi.org/10.3906/tar-1903-4.
Mulaei S, Jafari A, Shirmardi M, Kamali K. Micropropagation of Arid Zone Fruit Tree, Pomegranate, cvs. ‘Malase Yazdi’ and ‘Shirine Shahvar.’ Int J Fruit Sci. 2020;20(4):825–36. https://doi.org/10.1080/15538362.2019.1680334.
Zareian B, Abadi Z, Kamali A, Abad K, Tabandeh SA. Comparison of different culture media and hormonal concentrations for In-Vitro propagation of pomegranate. Int J Fruit Sci. 2020;20:1721–8. https://doi.org/10.1080/15538362.2020.1830916.
Kanwar K, Joseph J, Deepika R. Comparison of in vitro regeneration pathways in Punica granatum L. Plant Cell Tissue Organ Cult. 2010;100(2):199–207. https://doi.org/10.1007/s11240-009-9637-4.
Nezami-Alanagh E, Garoosi GA, Landín M, Gallego PP. Combining DOE with neurofuzzy logic for healthy mineral nutrition of pistachio rootstocks in vitro culture. Front Plant Sci. 2018;9:1474. https://doi.org/10.3389/fpls.2018.01474.
Sadat Hosseini M, Arab MM, Soltani M, Eftekhari M, Soleimani A, Vahdati K. Predictive modeling of Persian walnut (Juglans regia L.) in vitro proliferation media using machine learning approaches: a comparative study of ANN, KNN and GEP models. Plant Methods. 2022;18:48. https://doi.org/10.1186/s13007-022-00871-5.
Hassan SAM, Zayed NS. Factor controlling micropropagation of fruit trees: a review. Sci Int. 2018;6(1):1–10. https://doi.org/10.17311/sciintl.2018.1.10.
Benelli C, De Carlo A. In vitro multiplication and growth improvement of Olea europaea L. cv Canino with temporary immersion system (Plantform™). 3 Biotech. 2018;8(7):317. https://doi.org/10.1007/s13205-018-1346-4.
Haq IU, Umar H, Akhtar N, Iqbal MA, Ijaz M. Techniques for micropropagation of olive (Olea europaea L.): a systematic review. Pak J Agric Res. 2021;34(1):184–92. https://doi.org/10.17582/journal.pjar/2021/34.1.184.192.
Lambardi M, Rugini E. Micropropagation of olive (Olea europaea L.). In: Micropropagation of woody trees and fruits. Dordrecht: Springer Netherlands; 2003. p. 621–46. https://doi.org/10.1007/978-94-010-0125-0_21.
Nezami-Alanagh E, Garoosi GA, Haddad R, Maleki S, Landín M, Gallego PP. Design of tissue culture media for efficient Prunus rootstock micropropagation using artificial intelligence models. Plant Cell Tissue Organ Cult. 2014;117:349–59. https://doi.org/10.1007/s11240-014-0444-1.
García-Pérez P, Zhang L, Miras-Moreno B, Lozano-Milo E, Landin M, Lucini L, et al. The combination of untargeted metabolomics and machine learning predicts the biosynthesis of phenolic compounds in Bryophyllum medicinal plants (Genus Kalanchoe). Plants. 2021;10(11):2430. https://doi.org/10.3390/plants10112430.
García-Pérez P, Lozano-Milo E, Landin M, Gallego PP. Machine Learning unmasked nutritional imbalances on the medicinal plant Bryophyllum sp. cultured in vitro. Front Plant Sci. 2020;11:576177. https://doi.org/10.3389/fpls.2020.576177.
Hameg R, Arteta TA, Landin M, Gallego PP, Barreal ME. Modeling and optimizing culture medium mineral composition for in vitro propagation of Actinidia arguta. Front Plant Sci. 2020;11:2088. https://doi.org/10.3389/fpls.2020.554905.
Mirza K, Aasim M, Katırcı R, Karataş M, Ali SA. Machine learning and artificial neural networks-based approach to model and optimize ethyl methanesulfonate and sodium azide induced in vitro regeneration and morphogenic traits of water hyssops (Bacopa monnieri L.). J Plant Growth Regul. 2023;42(6):3471–85. https://doi.org/10.1007/s00344-022-10808-w.
Niazian M, Shariatpanahi ME, Abdipour M, Oroojloo M. Modeling callus induction and regeneration in an anther culture of tomato (Lycopersicon esculentum L.) using image processing and artificial neural network method. Protoplasma. 2019;256:1317–32. https://doi.org/10.1007/s00709-019-01379-x.
Rezaei H, Mirzaie-Asl A, Abdollahi MR, Tohidfar M. Comparative analysis of different artificial neural networks for predicting and optimizing in vitro seed germination and sterilization of petunia. PLoS ONE. 2023;18(5): e0285657. https://doi.org/10.1371/journal.pone.0285657.
Türkoğlu A, Bolouri P, Haliloğlu K, Eren B, Demirel F, Işık Mİ, et al. Modeling callus induction and regeneration in hypocotyl explant of fodder pea (Pisum sativum var. arvense L.) using machine learning algorithm method. Agronomy. 2023;13:2835. https://doi.org/10.3390/agronomy13112835.
Niazian M, Niedbała G, Sabbatini P. Modeling Agrobacterium –mediated gene transformation of tobacco (Nicotiana tabacum)—a model plant for gene transformation studies. Front Plant Sci. 2021;11:695110. https://doi.org/10.3389/fpls.2021.695110.
Salehi M, Farhadi S, Moieni A, Safaie N, Hesami M. A hybrid model based on general regression neural network and fruit fly optimization algorithm for forecasting and optimizing paclitaxel biosynthesis in Corylus avellana cell culture. Plant Methods. 2021;17(1):1–13. https://doi.org/10.1186/s13007-021-00714-9.
Wu T, Zhang W, Jiao X, Guo W, Hamoud YA. Evaluation of stacking and blending ensemble learning methods for estimating daily reference evapotranspiration. Comput Electron Agric. 2021;184:106039. https://doi.org/10.1016/j.compag.2021.106039.
Sadat-Hosseini M, Arab MM, Soltani M, Eftekhari M, Soleimani A. Applicability of soft computing techniques for in vitro micropropagation media simulation and optimization: a comparative study on Salvia macrosiphon Boiss. Ind Crops Prod. 2023;199:116750. https://doi.org/10.1016/j.indcrop.2023.116750.
Lee S, Park J, Kim N, Lee T, Quagliato L. Extreme gradient boosting-inspired process optimization algorithm for manufacturing engineering applications. Mater Des. 2023;226:111625. https://doi.org/10.1016/j.matdes.2023.111625.
Zou H, Hastie T. Regularization and variable selection via the elastic net. J R Stat Soc Series B Stat Methodol. 2005;67(2):301–20. https://doi.org/10.1111/j.1467-9868.2005.00527.x.
Al-Jawarneh AS, Ismail MT, Awajan AM, Alsayed AR. Improving accuracy models using elastic net regression approach based on empirical mode decomposition. Comm Statist Simul Comput. 2022;51(7):4006–25. https://doi.org/10.1080/03610918.2020.1728319.
Zarbakhsh S, Shahsavar AR. Artificial neural network-based model to predict the effect of γ-aminobutyric acid on salinity and drought responsive morphological traits in pomegranate. Sci Rep. 2022;12(1):16662. https://doi.org/10.1038/s41598-022-04507-5.
Fakhrzad F, Jowkar A, Hosseinzadeh J. Mathematical modeling and optimizing the in vitro shoot proliferation of wallflower using multilayer perceptron non-dominated sorting genetic algorithm-II (MLP-NSGAII). PLoS ONE. 2022;17: e0273009. https://doi.org/10.1371/journal.pone.0273009.
Aasim M, Ayhan A, Katırcı R, Acar AŞ, Ali SA. Computing artificial neural network and genetic algorithm for the feature optimization of basal salts and cytokinin-auxin for in vitro organogenesis of royal purple (Cotinus coggygria Scop). Ind Crops Prod. 2023;199: 116718. https://doi.org/10.1016/j.indcrop.2023.116718.
Chen Y, Xu M, Shen X, Zhang G, Lu Z, Xu J. A multi-objective modeling method of multi-satellite imaging task planning for large regional mapping. Remote Sens. 2020;12:344. https://doi.org/10.3390/rs12030344.
Murashige T, Skoog F. A revised medium for rapid growth and bio assays with tobacco tissue cultures. Physiol Plant. 1962;15:473–97. https://doi.org/10.1111/j.1399-3054.1962.tb08052.x.
Van der Salm TPM, Van der Toorn CJG, Hanisch ten Cate CH, Dubois LAM, De Vries DP, Dons HJM. Importance of the iron chelate formula for micropropagation of Rosa hybrida L. Moneyway. Plant Cell Tissue Organ Cult. 1994;37:73–7. https://doi.org/10.1007/BF00048152.
McCown BH. Woody plant medium (WPM)-a mineral nutrient formulation for microculture for woody plant species. HortSci. 1981;16:453.
Rong G, Li K, Su Y, Tong Z, Liu X, Zhang J, et al. Comparison of tree-structured parzen estimator optimization in three typical neural network models for landslide susceptibility assessment. Remote Sens. 2021;13(22):4694. https://doi.org/10.3390/rs13224694.
Lu M, Hou Q, Qin S, Zhou L, Hua D, Wang X, et al. A stacking ensemble model of various machine learning models for daily runoff forecasting. Water. 2023;15(7):1265. https://doi.org/10.3390/w15071265.
Vapnik V. The nature of statistical learning theory. Dordrecht: Springer Science & Business Media; 2013. https://doi.org/10.1007/978-1-4757-3264-1.
Wang L, Zhou X, Zhu X, Dong Z, Guo W. Estimation of biomass in wheat using random forest regression algorithm and remote sensing data. The Crop J. 2016;4:212–9. https://doi.org/10.1016/j.cj.2016.01.008.
Breiman L. Random forests. Mach Learn. 2001;45:5–32.
Chen T, Guestrin C. XGBoost: A Scalable Tree Boosting System. In: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining ACM. 2016;785–794. https://doi.org/10.1145/2939672.2939785
Wolpert DH. Stacked generalization. Neural Netw. 1992;5(2):241–59. https://doi.org/10.1016/S0893-6080(05)80023-1.
Li Y, Tang Z, Yang S. Deep regressor stacking to learn molecular quantum properties. In: Second International Conference on Electronic Information Engineering, Big Data, and Computer Technology (EIBDCT 2023). 2023. https://doi.org/10.1117/12.2674796
Lu X, Zhou W, Ding X, Shi X, Luan B, Li M. Ensemble learning regression for estimating unconfined compressive strength of cemented paste backfill. IEEE Access. 2019;7:72125–33. https://doi.org/10.1109/access.2019.2918177.
Zhu X, Hu J, Xiao T, Huang S, Wen Y, Shang D. An interpretable stacking ensemble learning framework based on multi-dimensional data for real-time prediction of drug concentration: the example of olanzapine. Front Pharmacol. 2022. https://doi.org/10.3389/fphar.2022.975855.
Sapkota S, Boatwright J, Jordan K, Boyles R, Kresovich S. Multi-trait regressor stacking increased genomic prediction accuracy of sorghum grain composition. Agronomy. 2020;10(9):1221. https://doi.org/10.1101/2020.04.03.023531.
Santana E, Silva J, Mastelini S, Barbon S. Stock portfolio prediction by multi-target decision support. Isys Braz J Inf Syst. 2019;12(1):05–27. https://doi.org/10.5753/isys.2019.381.
Despotovic M, Nedic V, Despotovic D, Cvetanovic S. Review and statistical analysis of different global solar radiation sunshine models. Renew Sust Energ Rev. 2015;52:1869–80. https://doi.org/10.1016/j.rser.2015.08.035.
Yoosefzadeh-Najafabadi M, Tulpan D, Eskandari M. Application of machine learning and genetic optimization algorithms for modeling and optimizing soybean yield using its component traits. PLoS ONE. 2021;16: e0250665. https://doi.org/10.1371/journal.pone.0250665.
Pedregosa F, Varoquaux G, Gramfort A, Michel V, Thirion B, Grisel O, et al. Scikit-learn: machine learning in python. J Mach Learn Res. 2011;12:2825–30.
Bergstra J, Yamins D, Cox DD. Hyperopt: a python library for optimizing the hyperparameters of machine learning algorithms. Proc 12th Python Sci Conf. 2013;13:20. https://doi.org/10.25080/Majora-8b375195-003.
Blank J, Deb K. Pymoo: multi-objective optimization in python. IEEE Access. 2020;8:89497–509. https://doi.org/10.1109/ACCESS.2020.2990567.
Krasteva G, Georgiev V, Pavlov A. Recent applications of plant cell culture technology in cosmetics and foods. Eng Life Sci. 2020;21:68–76. https://doi.org/10.1002/elsc.202000078.
Amiri S, Mohammadi R. Establishment of an efficient in vitro propagation protocol for Sumac (Rhus coriaria L.) and confirmation of the genetic homogeneity. Sci Rep. 2021;11(1):1–9. https://doi.org/10.1038/s41598-020-80550-4.
Schaller GE, Bishopp A, Kieber JJ. The yin yang of hormones: cytokinin and auxin interactions in plant development. Plant Cell. 2015;27(1):44–63. https://doi.org/10.1105/tpc.114.133595.
Singh SK, Singh A, Singh NV, Ramajayam D. Pomegranate tissue culture and biotechnology. Fruit Veg Cereal Sci Biotechnol. 2010;4:120–5.
Patil VM, Dhande GA, Thigale DM, Rajput JC. Micropropagation of pomegranate (Punica granatum L.) ‘Bhagava’ cultivar from nodal explant. Afr J Biotechnol. 2011;10:18130–6. https://doi.org/10.5897/AJB11.1437.
da Silva JAT, Rana TS, Narzary D, Verma N, Meshram DT, Ranade SA. Pomegranate biology and biotechnology: a review. Sci Hortic. 2013;160:85–107. https://doi.org/10.1016/j.scienta.2013.05.017.
Desai P, Patil G, Dholiya B, Desai S, Patel F, Narayanan S. Development of an efficient micropropagation protocol through axillary shoot proliferation for pomegranate variety ‘Bhagwa’. Ann Agric Sci. 2018;16(4):444–50. https://doi.org/10.1016/j.aasci.2018.06.002.
Schuchovski CS, Biasi LA. In vitro establishment of ‘Delite’ rabbiteye blueberry microshoots. Hortic. 2019;5(1):24. https://doi.org/10.3390/horticulturae5010024.
Debnath SC, Arigundam U. In vitro propagation strategies of medicinally important berry crop, lingonberry (Vaccinium vitis-idaea L.). Agronomy. 2020;10(5):744. https://doi.org/10.3390/agronomy10050744.
Arigundam U, Variyath AM, Yaw LS, Marshall D, Debnath SC. Liquid culture for efficient in vitro propagation of adventitious shoots in wild Vaccinium vitis-idaea ssp. minus (lingonberry) using temporary immersion and stationary bioreactors. Sci Hortic. 2020;264:109199. https://doi.org/10.1016/j.scienta.2020.109199.
Devidas T, Tiwari Sharad T, Nagesh D. Multiple shoot induction of pomegranate (Punica granatum L.) through different juvenile explants. Bull Env Pharmacol Life Sci. 2017;7:29–33.
Ahmad A, Ahmad N, Anis M, Alatar AA, Abdel-Salam EM, Qahtan AA, et al. Gibberellic acid and thidiazuron promote micropropagation of an endangered woody tree (Pterocarpus marsupium Roxb.) using in vitro seedlings. Plant Cell Tissue Organ Cult. 2021;144(2):449–62. https://doi.org/10.1007/s11240-020-01969-1.
Naik SK, Pattnaik S, Chand PK. In vitro propagation of pomegranate (Punica granatum L. cv. Ganesh) through axillary shoot proliferation from nodal segments of mature tree. Sci Hortic. 1999;79:175–83. https://doi.org/10.1016/S0304-4238(98)00218-0.
Ikeuchi M, Sugimoto K, Iwase A. Plant callus: mechanisms of induction and repression. Plant Cell. 2013;25:3159–73. https://doi.org/10.1105/tpc.113.116053.
Niedbała G, Niazian M, Sabbatini P. Modeling agrobacterium-mediated gene transformation of Tobacco (Nicotiana tabacum)—a model plant for gene transformation studies. Front Plant Sci. 2021;12:695110. https://doi.org/10.3389/fpls.2021.695110.
Yang F, Wanik DW, Cerrai D, Bhuiyan MA, Anagnostou EN. Quantifying uncertainty in machine learning-based power outage prediction model training: a tool for sustainable storm restoration. Sustainability. 2020;12:1525. https://doi.org/10.3390/su12041525.
Saltzman B, Yung J. A machine learning approach to identifying different types of uncertainty. Econ Lett. 2018;171:58–62. https://doi.org/10.1016/j.econlet.2018.07.003.
Khairalla MA, Ning X, Al-Jallad NT, El-Faroug MO. Short-term forecasting for energy consumption through stacking heterogeneous ensemble learning model. Energies. 2018;11(6):1605. https://doi.org/10.3390/en11061605.
Kandel I, Castelli M, Popovič A. Comparing stacking ensemble techniques to improve musculoskeletal fracture image classification. J Imaging. 2021;7(6):100. https://doi.org/10.3390/jimaging7060100.
Nezami-Alanagh E, Garoosi GA, Maleki S, Landín M, Gallego PP. Predicting optimal in vitro culture medium for Pistacia vera micropropagation using neural networks models. Plant Cell Tissue Organ Cult. 2017;129(1):19–33. https://doi.org/10.1007/s11240-016-1152-9.
Jamshidi S, Yadollahi A, Arab MM, Soltani M, Eftekhari M, Sabzalipoor H, et al. Combining gene expression programming and genetic algorithm as a powerful hybrid modeling approach for pear rootstocks tissue culture media formulation. Plant Methods. 2019;15:136. https://doi.org/10.1186/s13007-019-0520-y.


Acknowledgements

We acknowledge the College of Agriculture, Shiraz University, Iran, for providing experimental facilities. We also thank the reviewers for their valuable comments and suggestions.

Funding

No funding was received.

Author information

Authors and Affiliations

Department of Horticultural Science, College of Agriculture, Faculty of Agriculture, Shiraz University, Shiraz, Iran
Saeedeh Zarbakhsh & Ali Reza Shahsavar
Independent Researcher, Tehran, Iran
Mohammad Soltani

Contributions

S.Z. and A.R.S. conceived and designed the experiment. S.Z. and M.S. implemented the machine learning algorithms and analyzed the results. S.Z. and M.S. wrote the original draft. A.R.S. revised the manuscript. S.Z. finalized the manuscript. All authors contributed to the manuscript and approved the submitted version for publication.

Corresponding author

Correspondence to Ali Reza Shahsavar.

Ethics declarations

Ethics approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.


About this article

Cite this article

Zarbakhsh, S., Shahsavar, A.R. & Soltani, M. Optimizing PGRs for in vitro shoot proliferation of pomegranate with bayesian-tuned ensemble stacking regression and NSGA-II: a comparative evaluation of machine learning models. Plant Methods 20, 82 (2024). https://doi.org/10.1186/s13007-024-01211-5


Received: 28 February 2024
Accepted: 17 May 2024
Published: 31 May 2024
DOI: https://doi.org/10.1186/s13007-024-01211-5
