Optimizing PGRs for in vitro shoot proliferation of pomegranate with bayesian-tuned ensemble stacking regression and NSGA-II: a comparative evaluation of machine learning models

Abstract

Background

Optimizing in vitro shoot proliferation is a complicated task, as it is influenced by genotype and by the interactions of many factors. This study investigated the role of various concentrations of plant growth regulators (zeatin and gibberellic acid) in the successful in vitro shoot proliferation of three Punica granatum cultivars (‘Faroogh’, ‘Atabaki’ and ‘Shirineshahvar’). In addition, five Machine Learning (ML) algorithms, namely Support Vector Regression (SVR), Random Forest (RF), Extreme Gradient Boosting (XGB), Ensemble Stacking Regression (ESR) and Elastic Net Multivariate Linear Regression (ENMLR), were evaluated as modeling tools for in vitro multiplication of pomegranate. A new automatic hyperparameter optimization method named Adaptive Tree-structured Parzen Estimator (ATPE) was applied to tune the hyperparameters. The performance of the models was evaluated and compared using statistical indicators (MAE, RMSE, RRMSE, MAPE, R and R2), while a Global Performance Indicator (GPI) was introduced to rank the models based on a single parameter. Moreover, Non‑dominated Sorting Genetic Algorithm‑II (NSGA‑II) was employed to optimize the selected prediction model.

Results

The results demonstrated that the ESR algorithm exhibited higher predictive accuracy than the other ML algorithms. The ESR model was therefore selected for optimization by NSGA‑II. ESR-NSGA‑II revealed that the highest proliferation rate (3.47, 3.84, and 3.22), shoot length (2.74, 3.32, and 1.86 cm), leaf number (18.18, 19.76, and 18.77), and explant survival (84.21%, 85.49%, and 56.39%) could be achieved with a medium containing 0.750, 0.654, and 0.705 mg/L zeatin, and 0.50, 0.329, and 0.347 mg/L gibberellic acid in the ‘Atabaki’, ‘Faroogh’, and ‘Shirineshahvar’ cultivars, respectively.

Conclusions

This study demonstrates that the 'Shirineshahvar' cultivar exhibited lower shoot proliferation success compared to the other cultivars. The results indicated the good performance of ESR-NSGA-II in modeling and optimizing in vitro propagation. ESR-NSGA-II can be applied as an up-to-date and reliable computational tool for future studies in plant in vitro culture.

Background

Over the past decade, the pomegranate tree (Punica granatum L.) has attracted significant attention as an economically important super fruit cultivated throughout the world, particularly in arid and semiarid regions. This is due to its medicinal effects, rich content of bioactive compounds such as antioxidant polyphenols, and numerous health advantages [1, 2]. Traditional methods of propagating pomegranates include sexual propagation through seeds and vegetative methods. However, both conventional methods face several limitations that make pomegranate propagation difficult. Vegetative methods are time-consuming, dependent on seasonal production, and labor-intensive. Moreover, a large number of plants derived from cuttings fail to survive [3]. Sexual methods, on the other hand, are challenging due to high heterozygosity and a long juvenile period. In addition, seedlings propagated by these methods are strongly affected by pest infestation and diseases [4]. To achieve large-scale pomegranate cultivation, in vitro cell and organ culture techniques have therefore been developed. Plant tissue culture methods offer a promising approach for the rapid production of true-to-type pomegranate plants and the biotechnological exploitation of pomegranate and other plant species with valuable properties [5]. Previous studies have attempted to apply in vitro culture techniques to propagate different cultivars of pomegranate [6, 7]. However, the findings have emphasized that pomegranate micropropagation is moderately difficult and can vary depending on the cultivar, probably due to genetic variation among cultivars [6, 8]. The successful propagation of economically important woody plant species like pomegranate therefore still presents challenges, because problems emerge during the proliferation stage, including defoliation of explants, shoot tip necrosis, callusing, and hyperhydricity. These plant physiological disorders arise from factors such as undesirable medium composition, unsuitable type and concentration of plant growth regulators (PGRs), microbial contamination, phenolic browning caused by phenol secretion, ethylene accumulation, and tissue recalcitrance to proliferation (Fig. 1) [8,9,10].

Fig. 1

A schematic view of different factors that influence physiological disorders of in vitro plants

The successful in vitro propagation of fruit trees is an intricate process that is influenced by numerous factors, including culture conditions, plant materials, and the composition of culture media, particularly PGRs [11]. Extensive research has emphasized the crucial role of PGRs, such as cytokinins and auxins, and their different combinations with gibberellic acid (GA3) in promoting shoot regeneration in different pomegranate cultivars [7]. However, PGRs differ in how effectively they promote proliferation. For example, 6-γ,γ-dimethylallylaminopurine (2-iP) has been reported to have lower proliferative efficiency, while others like 6-Benzylaminopurine (BAP), a commonly used cytokinin in tissue culture, can produce short and thin shoots, sometimes accompanied by excessive callus proliferation. Among the cytokinins, zeatin (ZT), a natural cytokinin, has been found to play a vital role in stimulating the maximum number of axillary buds and is applied at various concentrations either alone or in combination with other growth regulators. ZT is considered desirable for its stability in nutrient media, as it does not easily degrade or break down, thus providing sustained benefits for rapid and high rates of proliferation in most plant explants [12, 13]. Although different growth regulators, including BAP, kinetin, thidiazuron (TDZ), GA3, and IBA, have been used in various combinations with or without ZT to promote the stimulation of axillary buds, GA3 is particularly known for inducing rapid shoot elongation, which is beneficial for subsequent rooting. Considering the high cost of ZT, researchers are actively exploring the combined use of ZT with other cytokinins while maintaining the proliferative potential of shoot cultures [14]. However, it is important not to overlook the role of ZT in ensuring a good rate of proliferation [12]. Nonetheless, the responses of different pomegranate cultivars to in vitro propagation vary significantly depending on the factors interacting during the in vitro process, even in closely related species [15]. Therefore, to achieve optimal results, the specific in vitro culture conditions must be optimized for each cultivar.

In vitro micropropagation is a complex, multifactorial biological process influenced by genotype/cultivar and various interacting factors, all of which must be considered when optimizing it. Traditional statistical techniques encounter significant challenges in deciphering large datasets of biological interactions, especially when the datasets are nonlinear, complex, noisy, and ambiguous in nature, as observed in in vitro culture processes [16]. To overcome these challenges, advanced computer-based technologies such as Machine Learning (ML) tools have emerged as capable solutions for analyzing and predicting complex and multivariate datasets with high accuracy. ML approaches offer the advantage of learning autonomously and transforming data into useful information without being explicitly programmed [17]. Recent studies have highlighted the superior predictive performance of ML methods over traditional statistics in various in vitro culture systems, including optimizing culture conditions for shoot proliferation and rooting [10, 18, 19], androgenesis [20], seed germination [21], somatic embryogenesis [22], gene transformation [23], and enhancing secondary metabolite biosynthesis [24].

Among the various algorithm-based ML tools, ensemble learning methods have gained significant attention due to their simplicity and their ability to create powerful and robust predictions. These methods can be broadly categorized into bagging, boosting, and stacking/blending. Notably, three prominent ensemble learning methods are Extreme Gradient Boosting (XGB), which utilizes the boosting concept, Random Forest (RF), based on bagging concept, and Ensemble Stacking Regression (ESR), based on stacking concept [25]. Support Vector Machine (SVM) is a robust ML method that has been widely recognized for its remarkable accuracy in plant in vitro micropropagation, as evidenced by the findings of previous studies [19, 26]. One notable advantage of SVM is its ability to effectively handle high-dimensional data without encountering difficulties. Researchers have explored the potential of SVM to address the challenges by utilizing a small training dataset, further highlighting the versatility and effectiveness of SVM in providing accurate and reliable predictions even with limited training data [27]. The Elastic Net Multivariate Linear Regression (ENMLR) was introduced by Zou and Hastie [28] as a robust approach for analyzing high-dimensional datasets. It was designed to overcome the limitations of the LASSO method. By incorporating regression techniques, ENMLR effectively regularizes and selects important predictor variables, thereby improving prediction accuracy of sparse modeling. This method has demonstrated its value in addressing the challenges associated with multicollinearity among predictor variables [29]. Selecting the most appropriate ML method depends on the association between input and output variables, as well as the optimization of hyperparameters [19]. 
In addition, the combination of ML techniques with evolutionary optimization algorithms confers significant advantages in predicting the critical factors that influence plant growth parameters in in vitro culture systems. One powerful algorithm in this regard is the non-dominated sorting genetic algorithm-II (NSGA-II), which is widely recognized as a search algorithm for optimizing multi-objective problems. NSGA-II enables efficient solving and prediction of complex processes while providing a simplified interpretation of results, simultaneously [30]. In previous studies, the combining approach of ML with NSGA-II (ML-NSGA-II) has been acknowledged as a robust modeling technique for complex datasets, such as in optimizing the protocol of in vitro tissue culture on micropropagation phases [21, 31, 32] and in various plant science fields [30, 33].

Based on our current knowledge, the application of ML algorithms as a novel strategy for modeling and predicting the in vitro shoot proliferation of pomegranate plants remains largely unexplored. The objectives of this study are (i) to evaluate the effects of ZT at different concentrations and in combination with GA3 on optimizing the tissue culture protocol of three commercially significant cultivars, namely ‘Faroogh’, ‘Atabaki’ and ‘Shirineshahvar’; (ii) to compare the robustness of commonly used ML algorithms, including SVR, RF, XGB, ESR, and ENMLR, in terms of their ability to model and optimize the in vitro shoot proliferation process of pomegranate cultivars; and (iii) to employ NSGA-II to predict the most effective level of PGRs for enhancing the proliferation of pomegranate. To our knowledge, this study is the first application of ML models for optimizing pomegranate tissue culture media. In addition, despite the potential advantages of ESR and ENMLR, no study has been conducted on applying these procedures in plant science.

Materials and methods

Plant material and explant preparation

The experiments were conducted using single nodal explants from three different pomegranate cultivars: ‘Faroogh’, ‘Atabaki’ and ‘Shirineshahvar’. These explants were obtained from pomegranate plants grown in a greenhouse of College of Agriculture, Shiraz University, Iran. Explants were pre-sterilized using a liquid soap solution and rinsed several times with tap water. Subsequently, the explants were subjected to surface sterilization by immersing them in 70% aqueous ethanol for 30 s, followed by treatment with 5% sodium hypochlorite for 10 min. Afterward, the explants were washed three times with sterilized distilled water under a laminar airflow chamber. Following the sterilization process, the stem explants were cut into 2–3 cm segments with lateral buds (Fig. 2a).

Fig. 2

In vitro propagation of pomegranate cultivar ‘Faroogh’. (a) Single-node explants, (b) shoot proliferation in mMS medium supplemented with 0.750 mg/L zeatin and 0.500 mg/L gibberellic acid, (c) shoot proliferation in control medium, and (d) shoots propagated in mMS medium supplemented with 0.750 mg/L zeatin and 0.500 mg/L gibberellic acid

In vitro culture establishment

A preliminary test was carried out using different combinations of culture media (MS (Murashige and Skoog) [34], VS (Van der Salm) [35], WPM (woody plant medium) [36], half-strength MS, and modified MS (mMS)), PGRs (BAP and NAA), phenol-controlling compounds (polyvinylpyrrolidone, ascorbic acid, and activated charcoal), and silver nitrate (AgNO3) as an ethylene inhibitor. The main experiment was set up based on the pre-test results, which indicated that the mMS medium supplemented with activated charcoal and AgNO3 in combination with either BAP or NAA was the best treatment for stimulating new shoot regeneration. In this experiment, the explants (2–3 cm stem segments with lateral buds) were immediately cultured in capped glass containers containing 25 mL of mMS basal medium supplemented with 1 mg/L BAP, 0.5 mg/L NAA, 250 mg/L activated charcoal, 4.5 mg/L AgNO3, 0.7% agar, and 3% sucrose. To identify the best hormonal composition for the pomegranate proliferation protocol, the effects of different concentrations of GA3 (0, 0.1, 0.25, and 0.5 mg/L) and ZT (0, 0.25, 0.5, and 0.75 mg/L) on shoot proliferation were evaluated. Prior to autoclaving at 121 ℃ for 15 min, the pH of the medium was adjusted to 5.7–5.8. To mitigate tissue culture browning, the cultures were incubated in darkness for 7 days in a growth chamber at a temperature of 25 ± 2 ℃, and then transferred to a 16-h photoperiod with a light intensity of 80 µmol m−2 s−1 and an 8-h dark period. After three subcultures on the same culture medium, the following morphological responses were measured for each cultivar: proliferation rate (PR; number of new shoots per explant), shoot length (SL; length of newly regenerated shoots per explant in cm), leaf number (LN; the number of leaves per explant), and explant survival (ES; the survival rate of explants in percent) (Fig. 3a).

Fig. 3

Schematic diagram of the step-by-step procedure of the present research, including (a) pomegranate micropropagation, (b) modeling growth parameters based on K-fold cross-validation and the ATPE algorithm using ML models, and (c) optimization of growth parameters via non-dominated sorting genetic algorithm-II (NSGA-II)

Experimental design and data analysis

The proliferation experiment was carried out using a Completely Randomized Design (CRD) with a factorial arrangement. Each treatment consisted of 20 replicates, and subcultures were conducted over a three-week period. Analysis of variance was performed using SAS software (version 9.4; SAS Institute, Cary, NC).

Description of ML models and optimization algorithm

Model development

In this study, we employed a range of ML algorithms to build computational models, using the experimental datasets as training and testing data. Specifically, we selected five widely used ML algorithms (SVR, RF, XGB, ENMLR, and ESR) to analyze the effect of the independent variables on in vitro pomegranate plant growth responses. These five ML algorithms were applied to the different pomegranate cultivars (‘Faroogh’, ‘Atabaki’, and ‘Shirineshahvar’), with two independent variables (the concentrations of GA3 and ZT) as inputs, and four plant growth responses (PR, SL, LN, and ES) as outputs. Prior to ML modeling, the training set for each cultivar was standardized: each feature was transformed to a mean of zero and a variance of one using Eq. (1). Additionally, Principal Component Analysis (PCA) was used to identify outliers; however, no outliers were found in the analysis. To train and test all five models, the experimental data (960 data points) were randomly divided into training (80%) and testing (20%) sets.

$${X}_{std}=\frac{{X}_{o}-\mu }{\sigma }$$
(1)

where \({X}_{std}\) is standardized value, \({X}_{o}\) is original value, \(\mu\) and \(\sigma\) are mean and standard deviation, respectively.
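The standardization in Eq. (1) and the 80/20 split can be sketched in a few lines of plain Python. This is an illustration only, not the authors' code: the rows below are a hypothetical stand-in built from the GA3 and ZT levels of the factorial design, the fixed seed is an assumption, and the population standard deviation is used for \(\sigma\).

```python
import random
import statistics

def standardize(train, test):
    """Apply Eq. (1) column-wise, using the training-set mean and std only."""
    cols = list(zip(*train))
    mu = [statistics.fmean(c) for c in cols]
    sigma = [statistics.pstdev(c) for c in cols]   # population std (assumption)

    def scale(rows):
        return [[(x - m) / s for x, m, s in zip(row, mu, sigma)] for row in rows]

    return scale(train), scale(test)

# Hypothetical inputs: every GA3 x ZT combination, repeated to mimic replicates.
ga3 = [0.0, 0.1, 0.25, 0.5]
zt = [0.0, 0.25, 0.5, 0.75]
rows = [[g, z] for g in ga3 for z in zt] * 5       # 80 rows

rng = random.Random(0)                             # fixed seed for repeatability
rng.shuffle(rows)
cut = int(0.8 * len(rows))                         # 80% train, 20% test
train, test = rows[:cut], rows[cut:]
train_std, test_std = standardize(train, test)
```

Computing \(\mu\) and \(\sigma\) on the training rows and reusing them for the test rows keeps information from the test set out of the scaling step.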

Hyperparameter optimization in ML models

In ML, tuning the hyperparameters in advance plays a crucial role in training models [37]. Hyperparameters have a significant impact on prediction accuracy and overall performance. Various strategies exist for hyperparameter optimization, including babysitting, grid search, random search, and Bayesian optimization [38]. Among these strategies, Bayesian optimization is widely recognized for its generalizability across different test sets and its ability to find optimal hyperparameters in fewer iterations. In this study, a novel automatic hyperparameter tuning algorithm called Adaptive Tree-structured Parzen Estimator (ATPE) was used for Bayesian optimization. This algorithm adjusted the initial hyperparameters of the five ML models to achieve optimized performance. It has not previously been applied to the optimization of in vitro PGRs. To improve the generalization performance of the models and avoid overfitting and underfitting, the ATPE method was combined with K-fold cross-validation (K = 10). The process is illustrated in Fig. 3b. The hyperparameters of the ML models and their search spaces are shown in Table 1. Cross-validation was run over folds 1 to 10, and in each round ATPE was used for model selection and hyperparameter tuning, with one fold held out as the validation set and the remaining folds used to train the model. By employing the K-fold cross-validation method, all data points were involved in the training process.

Table 1 Hyperparameter tuning of the constructed models using ATPE
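The interplay between cross-validation and tuning can be illustrated with a small, dependency-free sketch. It is not the authors' pipeline: a plain loop over a short candidate list stands in for the ATPE search, a one-parameter ridge model on toy 1-D data stands in for the five ML models, and all names and values are hypothetical. What it shows is the part the text describes: every candidate is scored by its mean validation error over K = 10 folds, so each data point is used for both training and validation.

```python
import random

def kfold_indices(n, k, seed=0):
    """Shuffle 0..n-1 and deal the indices into k nearly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def fit_ridge_1d(xs, ys, alpha):
    """Closed-form ridge slope through the origin: sum(xy) / (sum(x^2) + alpha)."""
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + alpha)

def cv_rmse(xs, ys, alpha, k=10):
    """Mean validation RMSE over k folds; each fold is held out exactly once."""
    scores = []
    for fold in kfold_indices(len(xs), k):
        held = set(fold)
        tr = [i for i in range(len(xs)) if i not in held]
        w = fit_ridge_1d([xs[i] for i in tr], [ys[i] for i in tr], alpha)
        sq = [(ys[i] - w * xs[i]) ** 2 for i in fold]
        scores.append((sum(sq) / len(sq)) ** 0.5)
    return sum(scores) / k

# Toy data (hypothetical): y is roughly 2x with a small deterministic wobble.
xs = [i / 10 for i in range(1, 41)]
ys = [2 * x + 0.1 * (-1) ** i for i, x in enumerate(xs)]

candidates = [0.0, 0.1, 1.0, 10.0]     # stand-in for a tuner's search space
best_alpha = min(candidates, key=lambda a: cv_rmse(xs, ys, a))
```

In a Bayesian setup the candidate list would be replaced by a tuner that proposes the next hyperparameter value from the scores seen so far, with `cv_rmse` as the objective it minimizes.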

Support vector regression (SVR)

SVM is a supervised ML method that was developed by Vapnik [39]. Initially developed for classification problems (Support Vector Classifier or SVC), SVM was later extended to handle regression problems (SVR) [40]. The fundamental concept behind SVR is the use of kernel functions to map the original input data from its lower-dimensional representation into a higher-dimensional feature space, in which the regression is estimated. Unlike Artificial Neural Network (ANN) models, which often encounter multiple local minima, SVM provides a unique solution at the global optimum. The approximated function within the SVR algorithm can be expressed as follows:

$$f\left(x\right)={\omega }^{T}x+b,\quad \omega \in X,\; b\in \mathbb{R}$$
(2)

where \(f\left(x\right)\) represents the estimated output value, \(\omega\) denotes the weight vector, and \(b\) represents the bias. The values of \(\omega\) and \(b\) are determined by minimizing the regularized risk function, which is expressed as:

$$R(C)=C\frac{1}{n}\sum_{i=1}^{n}L\left({d}_{i},{y}_{i}\right)+\frac{1}{2}{\Vert \omega \Vert }^{2}$$
(3)

where \(C\) represents the penalty parameter that balances the trade-off between model complexity and training error, \({d}_{i}\) denotes the desired value, \({y}_{i}\) the corresponding predicted value, \(n\) represents the total number of observations, and \(C\frac{1}{n}\sum_{i=1}^{n}L\left({d}_{i},{y}_{i}\right)\) is the empirical error. The loss \(L\) is the \(\varepsilon\)-insensitive loss function (\({l}_{\varepsilon })\), defined as:

$${l}_{\varepsilon }\left(d,y\right)=\begin{cases}\left|d-y\right|-\varepsilon , & \left|d-y\right|\ge \varepsilon \\ 0, & \text{otherwise}\end{cases}$$
(4)

In Eq. (3), \(\frac{1}{2}{\Vert \omega \Vert }^{2}\) represents the regularization term, and in Eq. (4), ɛ (epsilon) represents the width of the insensitive tube. The approximated function in Eq. (2) can be explicitly expressed by incorporating Lagrange multipliers and leveraging the optimality constraints. By introducing the Lagrange multipliers \({a}_{i}\) and \({a}_{i}^{*}\), the function is given by:

$$f\left(x,{a}_{i},{a}_{i}^{*}\right)=\sum_{i=1}^{n}\left({a}_{i}-{a}_{i}^{*}\right)K\left(x,{x}_{i}\right)+b$$
(5)

where \(K\left(x,{x}_{i}\right)\) represents the kernel function evaluated between the input \(x\) and the \({\text{i}}^{\text{th}}\) training sample \({x}_{i}\). The Radial Basis Function (RBF) is a non-linear kernel that maps input vectors into a high-dimensional feature space. In this study, the RBF kernel was used because of its superior estimation performance compared to other kernel functions.

$${K}_{rbf}\left(x,{x}_{i}\right)=\text{exp}\left[-\frac{{\Vert x-{x}_{i}\Vert }^{2}}{2{\sigma }^{2}}\right]$$
(6)
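Eqs. (4) to (6) translate directly into code. The sketch below is illustrative only: it evaluates the loss, the kernel, and the prediction rule for given values, and does not solve the SVR training problem. The support vectors, coefficients, and bias in the usage lines are hypothetical.

```python
import math

def eps_insensitive(d, y, eps):
    """Eq. (4): errors inside the eps-tube cost nothing; outside, |d - y| - eps."""
    return max(0.0, abs(d - y) - eps)

def rbf_kernel(x, z, sigma):
    """Eq. (6): exp(-||x - z||^2 / (2 sigma^2)) for two equal-length vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))

def svr_predict(x, support, coeffs, b, sigma):
    """Eq. (5): kernel-weighted sum plus bias; coeffs[i] holds a_i - a_i*."""
    return sum(c * rbf_kernel(x, s, sigma) for c, s in zip(coeffs, support)) + b

# Hypothetical values, e.g. inputs given as (GA3, ZT) in mg/L.
support = [(0.5, 0.75), (0.0, 0.0)]
coeffs = [1.2, -0.4]
pred = svr_predict((0.25, 0.5), support, coeffs, b=2.0, sigma=0.5)
```

The kernel equals 1 when the two inputs coincide and decays toward 0 as they move apart, so each training sample mainly influences predictions in its own neighborhood, with \(\sigma\) setting the size of that neighborhood.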

Random forest (RF)

RF is an algorithm for classification and regression prediction introduced by Breiman [41]. It addresses the performance limitations of single decision trees and exhibits favorable characteristics such as robustness to noise and outliers, scalability, and parallelism in high-dimensional data classification tasks. RF overcomes the "dimensionality disaster" often encountered in big data scenarios, where other models often fail to perform effectively. Additionally, RF demonstrates error rates comparable to other methods across various learning tasks and a reduced tendency to overfit. Notably, RF is a well-known bagging algorithm that excels in regression problems [38]. The RF algorithm combines decision tree-based techniques with ensemble methods, effectively leveraging their synergistic benefits, which makes it a suitable choice as one of the foundational models in the ensemble model employed in this study. The formula of RF is as follows:

$$\widehat{\text{y}}\left({\text{x}}_{\text{i}}\right)=\frac{1}{\text{K}}\sum_{k=1}^{K}{\text{T}}_{{\text{D}}\left({\theta }_{\text{k}}\right)}\left({\text{x}}_{\text{i}}\right),\quad k=1,2,\dots ,K$$
(7)

where \({\text{x}}_{\text{i}}\) is the input sample, \({\text{D}}\left({\theta }_{\text{k}}\right)\) denotes the \({k}^{th}\) bootstrapped sample, \({\text{T}}_{{\text{D}}\left({\theta }_{\text{k}}\right)}\) is the tree grown on that sample, and \({\text{K}}\) is the number of trees.
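The averaging in Eq. (7) can be illustrated with a deliberately tiny forest. In this sketch each "tree" is a one-split regression stump on a single feature, fitted to its own bootstrap sample; real random forests grow deeper trees and also subsample features, so this is a simplified stand-in. The toy data are hypothetical.

```python
import random

def fit_stump(xs, ys):
    """One-split regression tree on a single feature (a deliberately tiny tree)."""
    best = None
    for t in sorted(set(xs))[1:]:                  # candidate split thresholds
        left = [y for x, y in zip(xs, ys) if x < t]
        right = [y for x, y in zip(xs, ys) if x >= t]
        ml, mr = sum(left) / len(left), sum(right) / len(right)
        sse = sum((y - ml) ** 2 for y in left) + sum((y - mr) ** 2 for y in right)
        if best is None or sse < best[0]:
            best = (sse, t, ml, mr)
    if best is None:                               # sample had one unique x value
        m = sum(ys) / len(ys)
        return lambda x: m
    _, t, ml, mr = best
    return lambda x: ml if x < t else mr

def fit_forest(xs, ys, n_trees=25, seed=0):
    rng = random.Random(seed)
    trees = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(xs)) for _ in xs]     # bootstrap sample D(theta_k)
        trees.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return trees

def predict_forest(trees, x):
    """Eq. (7): average of the K individual tree predictions."""
    return sum(t(x) for t in trees) / len(trees)

# Hypothetical toy data: the response steps up once the input reaches 0.5.
xs = [0.0, 0.25, 0.5, 0.75] * 5
ys = [1.0 if x < 0.5 else 3.0 for x in xs]
trees = fit_forest(xs, ys)
```

Because every tree sees a different resample of the data, their individual errors partly cancel in the average, which is the source of the reduced overfitting mentioned above.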

eXtreme Gradient Boosting (XGB)

XGB is an advanced supervised learning algorithm proposed by Chen and Guestrin [42]. This method is based on the Gradient-Boosted Decision Tree (GBDT) approach. XGB aims to create a “strong” learner by combining predictions from a collection of “weak” learners using additive training strategies. The algorithm incorporates a second-order Taylor expansion of the loss function and a regularization term, which effectively mitigates overfitting and expedites convergence. XGB enhances prediction accuracy by iteratively constructing new decision trees that continuously diminish the residuals between predicted and observed values. XGB stands out as a prominent open-source boosting tree toolkit, offering remarkable speed and performance advantages over other gradient-boosting methods. It is more than 10 times faster than common toolkits, making it the preferred choice for massively parallel boosting tree tasks. The XGB prediction for the \({\text{i}}^{\text{th}}\) instance is:

$${f}_{i}^{(d)}=\sum_{k=1}^{d}{f}_{k}({x}_{i})={f}_{i}^{(d-1)}+{f}_{d}({x}_{i})$$
(8)

where \({f}_{d}({x}_{i})\) represents the learner at step \(d\), the predictions at steps \(d\) and \(d-1\) are denoted as \({f}_{i}^{(d)}\) and \({f}_{i}^{(d-1)}\), respectively, and \({x}_{i}\) represents the input variable.

In order to prevent overfitting without sacrificing the computational speed of the model, XGB employs an analytical expression to evaluate the “goodness” of the model in relation to the original function. This expression, given in Eq. (9), provides an estimate of the model’s “goodness” while also reducing the computational cost.

$${\text{Objective}}^{(d)}=\sum_{i=1}^{n}l\left({\overline{y} }_{i},{y}_{i}\right)+\sum_{k=1}^{d}\sigma \left({f}_{k}\right)$$
(9)

where \(l\) is the loss function, \(n\) indicates the number of observations, and \(\sigma\) denotes the regularization term given in Eq. (10).

$$\sigma \left(f\right)=\gamma T+0.5\lambda {\Vert \omega \Vert }^{2}$$
(10)

where \(T\) is the number of leaves in the tree, \(\omega\) denotes the vector of scores associated with the leaves, \(\lambda\) represents the regularization parameter, and \(\gamma\) indicates the minimum loss reduction required for further partitioning of a leaf node.
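A minimal sketch of Eqs. (8) to (10) follows. It only evaluates the quantities for given learners and leaf scores and does not grow trees; the squared-error loss is an assumed choice for \(l\), and the function names are hypothetical rather than part of the XGBoost API.

```python
def additive_prediction(learners, x):
    """Eq. (8): the step-d prediction is the sum of all learners fitted so far."""
    return sum(f(x) for f in learners)

def tree_penalty(leaf_scores, gamma, lam):
    """Eq. (10): gamma * T + 0.5 * lambda * ||w||^2, with T the number of leaves."""
    return gamma * len(leaf_scores) + 0.5 * lam * sum(w * w for w in leaf_scores)

def objective(y_true, y_pred, trees_leaf_scores, gamma, lam):
    """Eq. (9) with squared-error loss (an assumption for this sketch)."""
    loss = sum((p - y) ** 2 for p, y in zip(y_pred, y_true))
    return loss + sum(tree_penalty(w, gamma, lam) for w in trees_leaf_scores)

# Hypothetical two-step ensemble: a constant learner plus a linear correction.
learners = [lambda x: 1.0, lambda x: 0.5 * x]
step2 = additive_prediction(learners, 2.0)
```

The penalty grows with both the number of leaves and the size of the leaf scores, so a tree is only worth adding when its reduction in loss outweighs the added complexity.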

Elastic net multivariate linear regression (ENMLR)

ENMLR is a regression technique that combines two effective shrinkage regression methods: Ridge regression (L2 penalty) and LASSO regression (L1 penalty). Ridge regression is employed to address high-multicollinearity problems, while LASSO regression focuses on feature selection in regression coefficients. The elastic net estimator in ENMLR benefits from ridge regularization, which allows for better handling of correlations between predictors compared to LASSO regression. Simultaneously, the L1 regularization in elastic net promotes sparsity, facilitating the identification of essential features. However, similar to LASSO regression, the bias issue is still present in ENMLR. The elastic net estimator minimizes the following expression:

$$EN \left(\beta \right)=\sum_{i=1}^{n}{\left({y}_{i}-{x}_{i}^{T}\beta \right)}^{2}+{\lambda }_{1} \sum_{j=1}^{p}\left|{\beta }_{j}\right|+{\lambda }_{2}\sum_{j=1}^{p}{\left|{\beta }_{j}\right|}^{2}$$
(11)

where \(\beta\) is the vector of regression coefficients, \({\beta }_{j}\) is the regression coefficient of the \({j}^{th}\) predictor variable, and \({\lambda }_{1}\) and \({\lambda }_{2}\) are the positive tuning parameters (\({\lambda }_{1}\), \({\lambda }_{2}\)> 0) of the LASSO and Ridge penalties, respectively. Each λ is a penalty parameter that shrinks the coefficients, and its numerical value indicates the severity of the penalty.
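For concreteness, the objective in Eq. (11) can be evaluated for a given coefficient vector as below. This is a sketch of the formula only, not a solver, and the function name and toy numbers are hypothetical.

```python
def elastic_net_objective(X, y, beta, lam1, lam2):
    """Eq. (11): residual sum of squares + lam1 * L1 norm + lam2 * squared L2 norm."""
    rss = sum((yi - sum(xj * bj for xj, bj in zip(xi, beta))) ** 2
              for xi, yi in zip(X, y))
    l1 = sum(abs(b) for b in beta)
    l2 = sum(b * b for b in beta)
    return rss + lam1 * l1 + lam2 * l2

# Toy example: two observations, two predictors.
value = elastic_net_objective([[1.0, 0.0], [0.0, 1.0]], [1.0, 2.0],
                              beta=[1.0, 1.0], lam1=0.5, lam2=0.25)
```

Scikit-learn's `ElasticNet` uses a different parameterization, minimizing \(\frac{1}{2n}\Vert y-X\beta \Vert^{2}+\alpha \rho \Vert \beta \Vert_{1}+\frac{1}{2}\alpha (1-\rho )\Vert \beta \Vert^{2}\) with `alpha` as \(\alpha\) and `l1_ratio` as \(\rho\). Multiplying through by \(2n\) gives the correspondence \({\lambda }_{1}=2n\alpha \rho\) and \({\lambda }_{2}=n\alpha (1-\rho )\), which may help when comparing tuned values against Eq. (11).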

Ensemble stacking regression (ESR)

The stacking regressor, initially introduced by Wolpert [43], is an effective ensemble learning technique that combines multiple regression models to improve prediction accuracy. In this approach, a meta-regressor is trained to aggregate the predictions of the base regressors, thereby leveraging the collective knowledge of the individual models [44]. Different techniques, such as stacking, weighted averaging, and direct averaging, can be employed to create ensemble regressors by integrating the predictions of the base models [45]. The choice of technique depends on finding an optimal balance for combining the predictions, and the meta-regressor can be any type of regression model [46]. To implement stacking regression, the new meta feature sets generated by each base regressor are merged to form the meta training set, and the new target sets produced by each base regressor are combined to create the meta testing set. The final predictions are then generated by the meta-regressor, which is trained using the new meta training set [25]. The stacking regression methodology has gained popularity in various domains, including molecular quantum characteristics [44], daily reference evapotranspiration estimation [25], genome prediction [47], and stock portfolio prediction [48]. In this study, the XGB, SVR, and ENMLR models were used as the base regressors, while RF was employed as the meta-regressor.

Performance evaluation

To evaluate and compare the accuracy and performance of the developed ML algorithms in predicting the proliferation of pomegranate, six popular statistical indicators were used: the correlation coefficient (R), Coefficient of Determination (R2), Root Mean Square Error (RMSE), Relative Root Mean Squared Error (RRMSE), Mean Absolute Error (MAE), and Mean Absolute Percentage Error (MAPE). These indicators are described in Table 2.

Table 2 Description of statistical indicators for the constructed models evaluation
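As a reference point, the six indicators can be computed as in the sketch below. The definitions are the commonly used ones and are assumptions here: in particular, RRMSE is taken as RMSE in percent of the observed mean, MAPE requires non-zero observations, and R2 is computed as one minus the ratio of residual to total sum of squares. The observed and predicted values are made up for illustration.

```python
import math

def regression_metrics(obs, pred):
    n = len(obs)
    err = [p - o for o, p in zip(obs, pred)]
    mean_o, mean_p = sum(obs) / n, sum(pred) / n
    sse = sum(e * e for e in err)
    var_o = sum((o - mean_o) ** 2 for o in obs)
    var_p = sum((p - mean_p) ** 2 for p in pred)
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(obs, pred))
    rmse = math.sqrt(sse / n)
    return {
        "MAE": sum(abs(e) for e in err) / n,
        "RMSE": rmse,
        "RRMSE": 100.0 * rmse / mean_o,                 # percent of observed mean
        "MAPE": 100.0 * sum(abs(e / o) for e, o in zip(err, obs)) / n,
        "R": cov / math.sqrt(var_o * var_p),            # Pearson correlation
        "R2": 1.0 - sse / var_o,                        # 1 - SSE / SST
    }

# Made-up observed and predicted values.
m = regression_metrics([1.0, 2.0, 3.0, 4.0], [1.1, 1.9, 3.2, 3.8])
```

The error-type indicators (MAE, RMSE, RRMSE, MAPE) are better when smaller, while R and R2 are better when closer to 1, which is why a combined score has to treat the two groups with opposite signs.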

Global performance indicator (GPI)

To make the model comparison more reliable and to resolve potential discrepancies among the individual indicators, we employed the GPI method, first introduced by Despotovic et al. [49]. GPI combines the effects of multiple statistical indicators. First, all statistical indicators are scaled to a range between 0 and 1. Subsequently, each scaled value of a statistical indicator is subtracted from the median of the scaled values of that indicator across all models. These differences are then summed using appropriate weighting factors (a weight of -1 for \(R\) and \({R}^{2}\) and a weight of 1 for all other statistical indicators). The model with the highest GPI value is considered the best. The following equation represents the GPI model:

$${GPI}_{i}=\sum_{j=1}^{5}{\alpha }_{j}\left({\text{\rm M}}_{j}^{S}-{\text{\rm I}}_{ij}^{S}\right)$$
(12)

where \({GPI}_{i}\) represents global performance indicator for model \(i\), \({\text{\rm M}}_{j}^{S}\) is median of scaled values of indicator \(j\), \({\text{\rm I}}_{ij}^{S}\) is the scaled value of indicator \(j\) for model \(i\), \({\alpha }_{j}\) equals -1 for both \(R\) and \({R}^{2}\) and 1 for other performance criteria.
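Eq. (12) can be sketched as follows. Min-max scaling is assumed here, since the text only states that indicators are scaled to the range 0 to 1, and the scores for the three models are made-up numbers rather than results from this study.

```python
import statistics

def gpi(scores, higher_is_better=("R", "R2")):
    """scores: {model: {indicator: value}}; returns {model: GPI} following Eq. (12)."""
    models = list(scores)
    indicators = list(next(iter(scores.values())))
    total = {m: 0.0 for m in models}
    for j in indicators:
        vals = [scores[m][j] for m in models]
        lo, hi = min(vals), max(vals)
        span = (hi - lo) or 1.0                     # guard against identical values
        scaled = {m: (scores[m][j] - lo) / span for m in models}   # min-max to [0, 1]
        med = statistics.median(scaled.values())
        alpha = -1.0 if j in higher_is_better else 1.0
        for m in models:
            total[m] += alpha * (med - scaled[m])
    return total

# Made-up scores for three hypothetical models.
example = {
    "A": {"RMSE": 0.1, "R2": 0.95},
    "B": {"RMSE": 0.2, "R2": 0.90},
    "C": {"RMSE": 0.3, "R2": 0.85},
}
ranking = gpi(example)
```

With these numbers, model A has the lowest error and the highest R2, so both terms push its GPI up, while the median model B scores zero.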

Optimization of ML model via non‑dominated sorting genetic algorithm‑II (NSGA-II)

The best-performing ML algorithm was supplied as the fitness function to the Non-dominated Sorting Genetic Algorithm-II (NSGA-II), which served as the optimization algorithm for finding the combination of inputs (GA3 and ZT) that maximizes the growth responses of the three cultivars (Fig. 3c). NSGA-II is based on the principle of natural selection, and several parameters were set to ensure the effectiveness of the optimization process. The first step was the creation of an initial population, in which all chromosomes were constructed. Tournament selection was then used to choose an elite population for crossover. A binary crossover function, a well-known crossover technique, was used to generate the next generation of chromosomes. To introduce diversity into the population and prevent convergence to local optima, a mutation operator was applied. It introduced random variations into the chromosomes, reducing the likelihood of similar chromosomes within the population [50]. Non-dominated sorting was used to derive the non-dominated solutions, with each non-dominated front assigned a rank (level). The non-dominated front with the highest rank was removed, and the remaining solutions were used to generate the parent population for the next generation. Crowding distance was used to estimate the density of solutions around each solution in objective space, and solutions were sorted by crowding distance in descending order so that those in the least crowded regions were given priority. To improve the fitness function during optimization, the values of key operators such as the crossover rate, maximum generation, initial population, and mutation rate were tuned by trial and error. In the current study, the crossover rate was set to 90% with a distribution index of 15, the maximum number of generations was set to 200, the initial population size was 100, and real-valued polynomial mutation (real_pm) with a distribution index of 20 was used as the mutation operator (Fig. 3c).
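The ranking and diversity steps described above can be sketched in a few lines. The example below is a simplified illustration, not the Pymoo implementation used in the study: it uses a straightforward repeated-filtering version of non-dominated sorting, assumes that all objectives are minimized (growth responses to be maximized would be negated first), and uses hypothetical function names.

```python
import numpy as np

def non_dominated_fronts(F):
    """Split the rows of F (objectives to minimize) into fronts; front 0 is best."""
    F = np.asarray(F, dtype=float)
    remaining = set(range(len(F)))
    fronts = []
    while remaining:
        # A solution stays in the current front if no remaining solution dominates it.
        front = [
            i for i in remaining
            if not any(
                np.all(F[j] <= F[i]) and np.any(F[j] < F[i])
                for j in remaining if j != i
            )
        ]
        fronts.append(sorted(front))
        remaining -= set(front)
    return fronts

def crowding_distance(F):
    """Crowding distance of each row of F; boundary solutions get infinity."""
    F = np.asarray(F, dtype=float)
    n, m = F.shape
    dist = np.zeros(n)
    for k in range(m):
        order = np.argsort(F[:, k])
        dist[order[0]] = dist[order[-1]] = np.inf
        span = F[order[-1], k] - F[order[0], k]
        if span == 0 or n < 3:
            continue
        # Normalized gap between the two neighbours of each interior solution.
        dist[order[1:-1]] += (F[order[2:], k] - F[order[:-2], k]) / span
    return dist

# Hypothetical two-objective values for five candidate media.
objectives = np.array([[1, 4], [2, 2], [4, 1], [3, 3], [5, 5]], dtype=float)
fronts = non_dominated_fronts(objectives)
first_front_crowding = crowding_distance(objectives[fronts[0]])
```

Within a front, solutions with a larger crowding distance lie in less crowded regions and are preferred when the next parent population is filled.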

All code for implementing and evaluating the ESR, RF, SVR, and ENMLR models was written with the Python library Scikit-learn version 1.3.2 [51]. XGB was implemented with the XGBoost library version 2.0.3 [42]. The hyperparameters of each of the five models (SVR, RF, XGB, ESR, and ENMLR) were tuned with the Hyperopt library version 0.2.7 [52], and the Pymoo library version 0.6.1.1 [53] was used for multi-objective optimization (NSGA-II algorithm).

Results

The effect of PGRs on in vitro shoot proliferation and development of pomegranate

According to data analysis using factorial ANOVA, the growth responses of pomegranate, including LN, PR, ES, and SL were found to be significantly influenced by different concentrations and combinations of PGRs (GA3 and ZT), as well as the cultivar type. The detailed results can be found in Table 3.

Table 3 Effect of different concentrations of PGRs on in vitro growth parameters of pomegranate cultivars

The addition of ZT to the growth medium, particularly at a concentration of 0.75 mg/L, improved shoot regeneration and produced more favorable vegetative growth characteristics per explant than the control medium. According to Table 3, the positive changes in the growth parameters were primarily attributable to increasing PGR concentrations and to the interaction between them, and the combination of the highest concentrations of ZT and GA3 was the most effective treatment in promoting the overall growth response. Specifically, when the medium was supplemented with 0.50 mg/L GA3 and 0.75 mg/L ZT, the average growth response was significantly enhanced (Table 3). The observed changes in the growth parameters differed among cultivars. Among the three cultivars studied, ‘Faroogh’ exhibited the maximum values of LN (23.62) and PR (4). Similarly, ‘Atabaki’ showed the highest SL (6.75 cm) when treated with 0.50 mg/L GA3 and 0.75 mg/L ZT. Regarding ES, both ‘Faroogh’ and ‘Atabaki’ reached a maximum of 100% under the three treatments combining 0.25, 0.50, or 0.75 mg/L ZT with 0.50 mg/L GA3. In contrast, ‘Shirineshahvar’ exhibited lower ES rates than the other cultivars. For this cultivar, the same treatment interaction led to the highest values of LN (18.94), PR (3.56), ES (61.87%), and SL (1.95 cm). Overall, the highest and lowest growth responses were obtained in ‘Faroogh’ and ‘Shirineshahvar’, respectively (Table 3).

Comparison of ML performance

In the present study, five ML algorithms (RF, XGB, SVR, ESR, and ENMLR) were used to build the mathematical models. The scatter plots in Figs. 5, 6 and 7 illustrate the prediction results of these models, and the corresponding evaluation indexes are shown in Tables 4, 5, and 6. Violin plots of the performance metrics are presented in Fig. 4. For all parameters (outputs), the R values of ENMLR, which measure the correlation between observed (experimental) and predicted values, were lower than those of the other ML algorithms in both the training and test subsets. Overall, all five ML models showed good performance and predictability. ESR, with higher R and R2 and smaller RRMSE, RMSE, MAE, and MAPE values in both the training and testing sets, was the best of the five algorithms for all growth parameters (Tables 4, 5 and 6). However, comparison of the statistical indicators for the measured growth parameters showed that the values for ESR were very close to those of the other ML algorithms in all three cultivars. Moreover, the individual statistical indicators did not clearly discriminate between the models, and different indicators favored different models; to resolve this ambiguity, the GPI of all ML algorithms was calculated for the test dataset and is presented in Table 7. The GPI ranked the ESR model as the top performer. The GPI values of ESR vs. XGB, RF, SVR, and ENMLR, respectively, were: 1.829 vs. −1.674, 0.647, 0, and −4.171 for LN of ‘Atabaki’; 1.312 vs. −2.562, 0, 0.525, and −4.688 for LN of ‘Faroogh’; 0.089 vs. −3.040, 0.032, 0.004, and −5.911 for LN of ‘Shirineshahvar’; 1.383 vs. 0.980, 0.738, −2.326, and −3.801 for PR of ‘Atabaki’; 1.182 vs. −1.199, 0.567, −2.121, and −2.104 for PR of ‘Faroogh’; 1.911 vs. 0.574, −2.616, 0.255, and −3.807 for PR of ‘Shirineshahvar’; 0.933 vs. −4.870, 0.573, 0, and −4.814 for ES of ‘Atabaki’; 0.748 vs. −3.813, 0.483, 0.085, and −5.240 for ES of ‘Faroogh’; 0.973 vs. 0.818, −1.501, −2.966, and −2.507 for ES of ‘Shirineshahvar’; 0.180 vs. −5.158, 0.108, 0.035, and −5.782 for SL of ‘Atabaki’; 0.619 vs. −4.058, 0.092, 0.405, and −5.380 for SL of ‘Faroogh’; and 0.513 vs. −0.913, 0.193, 0.150, and −5.487 for SL of ‘Shirineshahvar’ (Table 7). Additionally, the regression lines showed a good fit between the observed and predicted data for all growth parameters during both the training and testing phases of the ML models (Figs. 5, 6, and 7).
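For readers who wish to reproduce this kind of comparison, the six indicators can be computed from observed and predicted values roughly as follows. This is a minimal sketch under stated assumptions: RRMSE is taken as RMSE divided by the mean of the observed values (in percent), MAPE as the mean absolute percentage error, R as the Pearson correlation coefficient, and R2 as the coefficient of determination. The function name `regression_metrics` and the sample values are hypothetical.

```python
import numpy as np

def regression_metrics(observed, predicted):
    """Return MAE, RMSE, RRMSE (%), MAPE (%), R and R2 for one model output."""
    y = np.asarray(observed, dtype=float)
    p = np.asarray(predicted, dtype=float)
    err = y - p
    rmse = np.sqrt(np.mean(err ** 2))
    return {
        "MAE": np.mean(np.abs(err)),
        "RMSE": rmse,
        "RRMSE": 100.0 * rmse / np.mean(y),          # assumed definition
        "MAPE": 100.0 * np.mean(np.abs(err / y)),    # requires non-zero observations
        "R": np.corrcoef(y, p)[0, 1],
        "R2": 1.0 - np.sum(err ** 2) / np.sum((y - y.mean()) ** 2),
    }

# Hypothetical observed and predicted values for a single growth parameter.
metrics = regression_metrics([1.0, 2.0, 3.0, 4.0], [2.0, 3.0, 4.0, 5.0])
```

The example also shows why several indicators are reported together: a constant offset between predictions and observations leaves R at 1 while the error-based indicators and R2 deteriorate.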

Table 4 Statistical evaluation of the constructed models for the micropropagation of the pomegranate cultivar ‘Atabaki’
Table 5 Statistical evaluation of the constructed models for the micropropagation of the pomegranate cultivar ‘Faroogh’
Table 6 Statistical evaluation of the constructed models for the micropropagation of the pomegranate cultivar ‘Shirineshahvar’
Fig. 4 Violin plots of the performance metrics of the analyzed models, based on observed vs. predicted values of the in vitro pomegranate growth parameters: A leaf number, B proliferation rate, C explant survival, D shoot length

Table 7 Ranking of the best-performing ML models for growth parameters of pomegranate
Fig. 5 Comparison between observed and predicted values obtained with the RF, XGB, SVR, ESR, and ENMLR models for A leaf number, B proliferation rate, C explant survival, and D shoot length of the pomegranate cultivar ‘Atabaki’

Fig. 6 Comparison between observed and predicted values obtained with the RF, XGB, SVR, ESR, and ENMLR models for A leaf number, B proliferation rate, C explant survival, and D shoot length of the pomegranate cultivar ‘Faroogh’

Fig. 7 Comparison between observed and predicted values obtained with the RF, XGB, SVR, ESR, and ENMLR models for A leaf number, B proliferation rate, C explant survival, and D shoot length of the pomegranate cultivar ‘Shirineshahvar’

Optimization process via non-dominated sorting genetic algorithm-II

The NSGA-II algorithm, a multi-objective evolutionary optimization method, was linked to the ESR model, which had been identified as the most accurate algorithm. The ESR-NSGA-II algorithm determined the optimal values of the four growth parameters (LN, PR, ES, and SL) in response to different concentrations of PGRs. The results are summarized in Table 8. For the ‘Atabaki’ cultivar, the ESR-NSGA-II algorithm identified a culture medium supplemented with 0.750 mg/L ZT and 0.50 mg/L GA3 as giving the greatest improvements in growth parameters, with predicted outputs of 18.18 LN, 3.47 PR, 84.21% ES, and 2.74 cm SL. For the ‘Faroogh’ cultivar, the optimal inputs were 0.654 mg/L ZT and 0.329 mg/L GA3, giving 19.76 LN, 3.84 PR, 85.49% ES, and 3.32 cm SL. For the ‘Shirineshahvar’ cultivar, the optimal inputs were 0.705 mg/L ZT and 0.347 mg/L GA3, giving 18.77 LN, 3.22 PR, 56.39% ES, and 1.86 cm SL (Table 8).

Table 8 Optimal concentrations of ZT and GA3 for each pomegranate cultivar according to the ESR-NSGA-II algorithm to obtain the best plant growth parameters

Discussion

The success of in vitro plant tissue culture strongly depends on several external and internal factors, including environmental conditions, PGR types, culture medium composition, gelling agents, and genotype [18]. PGRs, particularly cytokinins and auxins, are commonly used to optimize protocols for in vitro tissue culture and shoot regeneration [17, 54, 55]. Auxin increases the susceptibility to cytokinin of apical meristem cells that are less mitotically active [56], while cytokinin promotes cell proliferation, including cell division and shoot elongation [10]. In the case of pomegranate, which is a recalcitrant woody plant for in vitro culture, optimizing the type and concentration of PGRs, as well as their interactions, plays a crucial role [8, 57, 58, 59].

Previous studies on the efficient multiplication of various pomegranate species have reported that BAP, with or without NAA, is effective at specific concentrations ranging from 0.4 to 2 mg/L for BAP and 0.5 to 1 mg/L for NAA [57]. However, the results of these studies are often specific to particular cultivars and cannot be universally applied. Optimization of PGR concentrations is necessary because of genetic factors and the complexities associated with the oxidation of phenols in explants and culture media, which can lead to tissue death. Furthermore, pomegranate tissue culture protocols are highly cultivar-dependent and may differ because of variations in uptake rates, translocation rates, or metabolic processes within the meristematic regions of the plant. Cytokinin metabolism also plays a crucial role, as cytokinins may undergo degradation or conjugation with sugars or amino acids, leading to the formation of biologically inert compounds, as reported by Desai et al. [60].

Although ZT has been recognized as highly effective in promoting shoot proliferation in various plant species [61,62,63], its use in pomegranate tissue culture has been limited compared to other cytokinins. Similarly, the use of GA3 in shoot proliferation, particularly in recalcitrant woody trees like pomegranate, has received limited attention. However, several studies have demonstrated that the interaction between cytokinins and GA3 can improve the development of shoot/root apical meristems [8, 64, 65]. This study introduces a new shoot proliferation protocol for pomegranate cultivars that uses a combination of ZT and GA3. The results demonstrate the remarkable efficacy of this combination in stimulating shoot development compared to using BAP alone. Notably, the treatment with the highest concentrations of both ZT and GA3 produced the greatest growth response. Additionally, GA3 enhanced shoot regeneration and increased the ES% of all three tested pomegranate cultivars when combined with cytokinins and auxins. Although few reports exist on the effect of ZT on shoot proliferation in pomegranate, Naik et al. [66] reported significant improvements in regeneration frequency and shoot growth when zeatin riboside (ZR) was added to the culture medium. The current study also showed that different pomegranate cultivars reacted differently to the same culture medium, despite their close genetic relationship. The ‘Faroogh’ cultivar exhibited the highest growth responses among the three cultivars investigated, whereas the ‘Shirineshahvar’ cultivar was more recalcitrant to shoot proliferation than the other cultivars. This could be attributed to variations in the concentration of endogenous phytohormones within the plants and their interaction with the exogenous PGRs applied to the explant culture [67].

Developing and optimizing tissue culture protocols is a complex task that poses significant challenges to the field as a whole. The multifactorial nature of in vitro culture processes makes them difficult to understand and interpret using traditional statistical approaches such as ANOVA, t-tests, correlation, and regression, especially when the variables investigated are nonlinear, noisy, complex, and vague in nature [68]. ML methods, as complex mathematical tools, offer promise for understanding and interpreting the intricate, nonlinear relationships within datasets. ML models have demonstrated superior predictive power over traditional statistical methodologies when analyzing unpredictable variables and large datasets. Despite these advantages, uncertainty in ML outcomes remains a major constraint on its application [69]. Uncertainty in ML studies arises from three primary sources: data quality, the sample of data collected from the domain, and model fitting [70]. To reduce uncertainty, researchers have recommended applying several different ML algorithms [69, 70]. In this study, five ML approaches (XGB, RF, SVR, ESR, and ENMLR) were employed to model the effects of various parameters (PGRs) on in vitro shoot proliferation of pomegranate. While similar performance was observed across the ML models in predicting pomegranate shoot multiplication, the GPI analysis indicated that the ESR model was the best performer. It exhibited robustness and superior predictive accuracy in both the training and testing subsets. There is a lack of specific investigations on the use of the ESR algorithm in plant tissue culture. Nonetheless, numerous studies in other scientific disciplines have demonstrated the robust performance of the ESR model in various prediction tasks [71, 72].

Recent research has shown that integrating optimization algorithms, particularly NSGA-II, with ML models can provide valuable insights and make effective use of the models. Applying NSGA-II in conjunction with ML makes it possible to answer "How to get" questions by identifying the culture medium that simultaneously improves multiple desired parameters [18, 73]. In the current research, the ESR was linked to the NSGA-II algorithm as a computational forecasting approach for predicting and identifying critical factors affecting the in vitro proliferation stage of pomegranate cultivars. Optimization algorithms, especially NSGA-II, have already been applied successfully in plant tissue culture [31]. Additionally, various ML algorithms combined with different optimization algorithms have shown promising results in modeling and predicting optimal plant tissue culture media for other fruit tree species such as kiwi berry [18], pear [74], Prunus [15], pistachio rootstocks [74], and Persian walnut [10]. The ESR-NSGA-II method predicted that the highest plant growth responses would be achieved by supplementing the culture medium with 0.750 mg/L ZT and 0.500 mg/L GA3 for the ‘Atabaki’ cultivar, 0.654 mg/L ZT and 0.329 mg/L GA3 for the ‘Faroogh’ cultivar, and 0.705 mg/L ZT and 0.347 mg/L GA3 for the ‘Shirineshahvar’ cultivar. Overall, the ESR-NSGA-II algorithm revealed that the interaction between genotype and PGR concentration had the most significant influence on pomegranate shoot proliferation. These findings are consistent with a study by Sadat-Hosseini et al. [10], which employed ML approaches to model growth parameters of in vitro Persian walnut using different concentrations of BAP, thidiazuron (TDZ), and indole butyric acid (IBA), and reported that the genotype-PGR interaction plays a crucial role in the proliferation of Persian walnut.

To the best of the authors’ knowledge, this study is the first to examine the specific effects of ZT and GA3, as well as their interactions, on the efficiency of the pomegranate tissue culture protocol, particularly for improving the in vitro growth parameters of the studied cultivars. While previous studies have reported successful in vitro shoot proliferation of different pomegranate cultivars, the focus on the specific combination of ZT and GA3 and on their interaction effects is a novel aspect of this research. By evaluating the influence of these growth regulators on growth parameters, this study contributes to the advancement of pomegranate tissue culture techniques.

Conclusion

In vitro shoot proliferation is a complex, multifactorial process influenced by various interacting factors. Therefore, to evaluate the extensive datasets and optimize the pomegranate protocol, ML techniques such as RF, SVR, XGB, ESR, and ENMLR were employed as promising alternatives to traditional statistical methods. Based on our results, ESR-NSGA-II exhibited superior accuracy and efficacy in studying pomegranate growth responses to multivariable stimuli in vitro and in optimizing the pomegranate protocol. Furthermore, the in vitro responses of pomegranate were positively influenced by the concentrations of PGRs (ZT and GA3) and their interaction. Moreover, the optimization of in vitro conditions for pomegranate depended strongly on the specific cultivar. Specifically, the ‘Shirineshahvar’ cultivar proved recalcitrant to in vitro shoot proliferation compared to the other cultivars, while the ‘Faroogh’ cultivar exhibited the highest growth and shoot development. The main objective of the current research was to provide a reliable and robust technology, ESR-NSGA-II, based on soft computing methodology, that offers new insight into the crucial factors affecting the growth parameters of pomegranate cultivars cultured in vitro.

Data availability

The authors confirm that the datasets analyzed during the current study are available from the corresponding author on reasonable request.

References

  1. Zarbakhsh S, Kazemzadeh-Beneh H, Rastegar S. Quality preservation of minimally processed pomegranate cv. Jahrom arils based on chitosan and organic acid edible coatings. J Food Saf. 2019. https://doi.org/10.1111/jfs.12752.

  2. Zarbakhsh S, Shahsavar AR. Exogenous γ-aminobutyric acid improves the photosynthesis efficiency, soluble sugar contents, and mineral nutrients in pomegranate plants exposed to drought, salinity, and drought-salinity stresses. BMC Plant Biol. 2023;23:543. https://doi.org/10.1186/s12870-023-04568-2.

  3. Dinesh RM, Patel AK, Vibha JB, Shekhawat S, Shekhawat NS. Cloning of mature pomegranate (Punica granatum) cv. Jalore seedless via in vitro shoot production and ex vitro rooting. Vegetos. 2019;32(2):181–9. https://doi.org/10.1007/S42535-019-00021-8.

  4. Pathania M, Arora PK, Pathania S, Kumar A. Studies on population dynamics and management of pomegranate aphid, Aphis punicae Passerini (Hemiptera: Aphididae) on pomegranate under semi-arid conditions of South-western Punjab. Sci Hortic. 2019;243:300–6. https://doi.org/10.1016/j.scienta.2018.07.027.

  5. Guney M. Development of an in vitro micropropagation protocol for Myrobalan 29C rootstock. Turk J Agric For. 2019;43:569–75. https://doi.org/10.3906/tar-1903-4.

  6. Mulaei S, Jafari A, Shirmardi M, Kamali K. Micropropagation of Arid Zone Fruit Tree, Pomegranate, cvs. ‘Malase Yazdi’ and ‘Shirine Shahvar.’ Int J Fruit Sci. 2020;20(4):825–36. https://doi.org/10.1080/15538362.2019.1680334.

  7. Zareian B, Abadi Z, Kamali A, Abad K, Tabandeh SA. Comparison of different culture media and hormonal concentrations for In-Vitro propagation of pomegranate. Int J Fruit Sci. 2020;20:1721–8. https://doi.org/10.1080/15538362.2020.1830916.

  8. Kanwar K, Joseph J, Deepika R. Comparison of in vitro regeneration pathways in Punica granatum L. Plant Cell Tissue Organ Cult. 2010;100(2):199–207. https://doi.org/10.1007/s11240-009-9637-4.

  9. Nezami-Alanagh E, Garoosi GA, Landín M, Gallego PP. Combining DOE with neurofuzzy logic for healthy mineral nutrition of pistachio rootstocks in vitro culture. Front Plant Sci. 2018;9:1474. https://doi.org/10.3389/fpls.2018.01474.

  10. Sadat-Hosseini M, Arab MM, Soltani M, Eftekhari M, Soleimani A, Vahdati K. Predictive modeling of Persian walnut (Juglans regia L.) in vitro proliferation media using machine learning approaches: a comparative study of ANN, KNN and GEP models. Plant Methods. 2022;18:48. https://doi.org/10.1186/s13007-022-00871-5.

  11. Hassan SAM, Zayed NS. Factor controlling micropropagation of fruit trees: a review. Sci Int. 2018;6(1):1–10. https://doi.org/10.17311/sciintl.2018.1.10.

  12. Benelli C, De Carlo A. In vitro multiplication and growth improvement of Olea europaea L. cv Canino with temporary immersion system (Plantform™). 3 Biotech. 2018;8(7):317. https://doi.org/10.1007/s13205-018-1346-4.

  13. Haq IU, Umar H, Akhtar N, Iqbal MA, Ijaz M. Techniques for micropropagation of olive (Olea europaea L.): a systematic review. Pak J Agric Res. 2021;34(1):184–92. https://doi.org/10.17582/journal.pjar/2021/34.1.184.192.

  14. Lambardi M, Rugini E. Micropropagation of olive (Olea europaea L.). In: Micropropagation of woody trees and fruits. Dordrecht: Springer Netherlands; 2003. p. 621–46. https://doi.org/10.1007/978-94-010-0125-0_21.

  15. Nezami-Alanagh E, Garoosi GA, Haddad R, Maleki S, Landín M, Gallego PP. Design of tissue culture media for efficient Prunus rootstock micropropagation using artificial intelligence models. Plant Cell Tissue Organ Cult. 2014;117:349–59. https://doi.org/10.1007/s11240-014-0444-1.

  16. García-Pérez P, Zhang L, Miras-Moreno B, Lozano-Milo E, Landin M, Lucini L, et al. The combination of untargeted metabolomics and machine learning predicts the biosynthesis of phenolic compounds in Bryophyllum medicinal plants (Genus Kalanchoe). Plants. 2021;10(11):2430. https://doi.org/10.3390/plants10112430.

  17. García-Pérez P, Lozano-Milo E, Landin M, Gallego PP. Machine Learning unmasked nutritional imbalances on the medicinal plant Bryophyllum sp. cultured in vitro. Front Plant Sci. 2020;11:576177. https://doi.org/10.3389/fpls.2020.576177.

  18. Hameg R, Arteta TA, Landin M, Gallego PP, Barreal ME. Modeling and optimizing culture medium mineral composition for in vitro propagation of Actinidia arguta. Front Plant Sci. 2020;11:2088. https://doi.org/10.3389/fpls.2020.554905.

  19. Mirza K, Aasim M, Katırcı R, Karataş M, Ali SA. Machine learning and artificial neural networks-based approach to model and optimize ethyl methanesulfonate and sodium azide induced in vitro regeneration and morphogenic traits of water hyssops (Bacopa monnieri L.). J Plant Growth Regul. 2023;42(6):3471–85. https://doi.org/10.1007/s00344-022-10808-w.

  20. Niazian M, Shariatpanahi ME, Abdipour M, Oroojloo M. Modeling callus induction and regeneration in an anther culture of tomato (Lycopersicon esculentum L.) using image processing and artificial neural network method. Protoplasma. 2019;256:1317–32. https://doi.org/10.1007/s00709-019-01379-x.

  21. Rezaei H, Mirzaie-Asl A, Abdollahi MR, Tohidfar M. Comparative analysis of different artificial neural networks for predicting and optimizing in vitro seed germination and sterilization of petunia. PLoS ONE. 2023;18(5):e0285657. https://doi.org/10.1371/journal.pone.0285657.

  22. Türkoğlu A, Bolouri P, Haliloğlu K, Eren B, Demirel F, Işık Mİ, et al. Modeling callus induction and regeneration in hypocotyl explant of fodder pea (Pisum sativum var. arvense L.) using machine learning algorithm method. Agronomy. 2023;13:2835. https://doi.org/10.3390/agronomy13112835.

  23. Niazian M, Niedbała G, Sabbatini P. Modeling Agrobacterium-mediated gene transformation of tobacco (Nicotiana tabacum)—a model plant for gene transformation studies. Front Plant Sci. 2021;11:695110. https://doi.org/10.3389/fpls.2021.695110.

  24. Salehi M, Farhadi S, Moieni A, Safaie N, Hesami M. A hybrid model based on general regression neural network and fruit fly optimization algorithm for forecasting and optimizing paclitaxel biosynthesis in Corylus avellana cell culture. Plant Methods. 2021;17(1):1–13. https://doi.org/10.1186/s13007-021-00714-9.

  25. Wu T, Zhang W, Jiao X, Guo W, Hamoud YA. Evaluation of stacking and blending ensemble learning methods for estimating daily reference evapotranspiration. Comput Electron Agric. 2021;184:106039. https://doi.org/10.1016/j.compag.2021.106039.

  26. Sadat-Hosseini M, Arab MM, Soltani M, Eftekhari M, Soleimani A. Applicability of soft computing techniques for in vitro micropropagation media simulation and optimization: a comparative study on Salvia macrosiphon Boiss. Ind Crops Prod. 2023;199:116750. https://doi.org/10.1016/j.indcrop.2023.116750.

  27. Lee S, Park J, Kim N, Lee T, Quagliato L. Extreme gradient boosting-inspired process optimization algorithm for manufacturing engineering applications. Mater Des. 2023;226:111625. https://doi.org/10.1016/j.matdes.2023.111625.

  28. Zou H, Hastie T. Regularization and variable selection via the elastic net. J R Stat Soc Series B Stat Methodol. 2005;67(2):301–20. https://doi.org/10.1111/j.1467-9868.2005.00527.x.

  29. Al-Jawarneh AS, Ismail MT, Awajan AM, Alsayed AR. Improving accuracy models using elastic net regression approach based on empirical mode decomposition. Comm Statist Simul Comput. 2022;51(7):4006–25. https://doi.org/10.1080/03610918.2020.1728319.

  30. Zarbakhsh S, Shahsavar AR. Artificial neural network-based model to predict the effect of γ-aminobutyric acid on salinity and drought responsive morphological traits in pomegranate. Sci Rep. 2022;12(1):16662. https://doi.org/10.1038/s41598-022-04507-5.

  31. Fakhrzad F, Jowkar A, Hosseinzadeh J. Mathematical modeling and optimizing the in vitro shoot proliferation of wallflower using multilayer perceptron non-dominated sorting genetic algorithm-II (MLP-NSGAII). PLoS ONE. 2022;17:e0273009. https://doi.org/10.1371/journal.pone.0273009.

  32. Aasim M, Ayhan A, Katırcı R, Acar AŞ, Ali SA. Computing artificial neural network and genetic algorithm for the feature optimization of basal salts and cytokinin-auxin for in vitro organogenesis of royal purple (Cotinus coggygria Scop). Ind Crops Prod. 2023;199:116718. https://doi.org/10.1016/j.indcrop.2023.116718.

  33. Chen Y, Xu M, Shen X, Zhang G, Lu Z, Xu J. A multi-objective modeling method of multi-satellite imaging task planning for large regional mapping. Remote Sens. 2020;12:344. https://doi.org/10.3390/rs12030344.

  34. Murashige T, Skoog F. A revised medium for rapid growth and bio assays with tobacco tissue cultures. Physiol Plant. 1962;15:473–97. https://doi.org/10.1111/j.1399-3054.1962.tb08052.x.

  35. Van der Salm TPM, Van der Toorn CJG, Hanisch ten Cate CH, Dubois LAM, De Vries DP, Dons HJM. Importance of the iron chelate formula for micropropagation of Rosa hybrida L. Moneyway. Plant Cell Tissue Organ Cult. 1994;37:73–7. https://doi.org/10.1007/BF00048152.

  36. McCown BH. Woody plant medium (WPM)-a mineral nutrient formulation for microculture for woody plant species. HortSci. 1981;16:453.

  37. Rong G, Li K, Su Y, Tong Z, Liu X, Zhang J, et al. Comparison of tree-structured parzen estimator optimization in three typical neural network models for landslide susceptibility assessment. Remote Sens. 2021;13(22):4694. https://doi.org/10.3390/rs13224694.

  38. Lu M, Hou Q, Qin S, Zhou L, Hua D, Wang X, et al. A stacking ensemble model of various machine learning models for daily runoff forecasting. Water. 2023;15(7):1265. https://doi.org/10.3390/w15071265.

  39. Vapnik V. The nature of statistical learning theory. Dordrecht: Springer Science & Business Media; 2013. https://doi.org/10.1007/978-1-4757-3264-1.

  40. Wang L, Zhou X, Zhu X, Dong Z, Guo W. Estimation of biomass in wheat using random forest regression algorithm and remote sensing data. The Crop J. 2016;4:212–9. https://doi.org/10.1016/j.cj.2016.01.008.

  41. Breiman L. Random forests. Mach Learn. 2001;45:5–32.

  42. Chen T, Guestrin C. XGBoost: a scalable tree boosting system. In: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM; 2016. p. 785–94. https://doi.org/10.1145/2939672.2939785.

  43. Wolpert DH. Stacked generalization. Neural Netw. 1992;5(2):241–59. https://doi.org/10.1016/S0893-6080(05)80023-1.

  44. Li Y, Tang Z, Yang S. Deep regressor stacking to learn molecular quantum properties. In: Second International Conference on Electronic Information Engineering, Big Data, and Computer Technology (EIBDCT 2023). 2023. https://doi.org/10.1117/12.2674796

  45. Lu X, Zhou W, Ding X, Shi X, Luan B, Li M. Ensemble learning regression for estimating unconfined compressive strength of cemented paste backfill. IEEE ACCESS. 2019;7:72125–33. https://doi.org/10.1109/access.2019.2918177.

  46. Zhu X, Hu J, Xiao T, Huang S, Wen Y, Shang D. An interpretable stacking ensemble learning framework based on multi-dimensional data for real-time prediction of drug concentration: the example of olanzapine. Front Pharmacol. 2022. https://doi.org/10.3389/fphar.2022.975855.

  47. Sapkota S, Boatwright J, Jordan K, Boyles R, Kresovich S. Multi-trait regressor stacking increased genomic prediction accuracy of sorghum grain composition. Agronomy. 2020;10(9):1221. https://doi.org/10.1101/2020.04.03.023531.

  48. Santana E, Silva J, Mastelini S, Barbon S. Stock portfolio prediction by multi-target decision support. Isys Braz J Inf Syst. 2019;12(1):05–27. https://doi.org/10.5753/isys.2019.381.

  49. Despotovic M, Nedic V, Despotovic D, Cvetanovic S. Review and statistical analysis of different global solar radiation sunshine models. Renew Sust Energ Rev. 2015;52:1869–80. https://doi.org/10.1016/j.rser.2015.08.035.

  50. Yoosefzadeh-Najafabadi M, Tulpan D, Eskandari M. Application of machine learning and genetic optimization algorithms for modeling and optimizing soybean yield using its component traits. PLoS ONE. 2021;16:e0250665. https://doi.org/10.1371/journal.pone.0250665.

  51. Pedregosa F, Varoquaux G, Gramfort A, Michel V, Thirion B, Grisel O, et al. Scikit-learn: machine learning in python. J Mach Learn Res. 2011;12:2825–30.

  52. Bergstra J, Yamins D, Cox DD. Hyperopt: a python library for optimizing the hyperparameters of machine learning algorithms. Proc 12th Python Sci Conf. 2013;13:20. https://doi.org/10.25080/Majora-8b375195-003.

  53. Blank J, Deb K. Pymoo: multi-objective optimization in python. IEEE Access. 2020;8:89497–509. https://doi.org/10.1109/ACCESS.2020.2990567.

  54. Krasteva G, Georgiev V, Pavlov A. Recent applications of plant cell culture technology in cosmetics and foods. Eng Life Sci. 2020;21:68–76. https://doi.org/10.1002/elsc.202000078.

  55. Amiri S, Mohammadi R. Establishment of an efficient in vitro propagation protocol for Sumac (Rhus coriaria L.) and confirmation of the genetic homogeneity. Sci Rep. 2021;11(1):1–9. https://doi.org/10.1038/s41598-020-80550-4.

  56. Schaller GE, Bishopp A, Kieber JJ. The yin yang of hormones: cytokinin and auxin interactions in plant development. Plant Cell. 2015;27(1):44–63. https://doi.org/10.1105/tpc.114.133595.

  57. Singh SK, Singh A, Singh NV, Ramajayam D. Pomegranate tissue culture and biotechnology. Fruit Veg Cereal Sci Biotechnol. 2010;4:120–5.

  58. Patil VM, Dhande GA, Thigale DM, Rajput JC. Micropropagation of pomegranate (Punica granatum L.) ‘Bhagava’ cultivar from nodal explant. Afr J Biotechnol. 2011;10:18130–6. https://doi.org/10.5897/AJB11.1437.

  59. da Silva JAT, Rana TS, Narzary D, Verma N, Meshram DT, Ranade SA. Pomegranate biology and biotechnology: a review. Sci Hortic. 2013;160:85–107. https://doi.org/10.1016/j.scienta.2013.05.017.

  60. Desai P, Patil G, Dholiya B, Desai S, Patel F, Narayanan S. Development of an efficient micropropagation protocol through axillary shoot proliferation for pomegranate variety ‘Bhagwa’. Ann Agric Sci. 2018;16(4):444–50. https://doi.org/10.1016/j.aasci.2018.06.002.

  61. Schuchovski CS, Biasi LA. In vitro establishment of ‘Delite’ rabbiteye blueberry microshoots. Hortic. 2019;5(1):24. https://doi.org/10.3390/horticulturae5010024.

  62. Debnath SC, Arigundam U. In vitro propagation strategies of medicinally important berry crop, lingonberry (Vaccinium vitis-idaea L.). Agronomy. 2020;10(5):744. https://doi.org/10.3390/agronomy10050744.

  63. Arigundam U, Variyath AM, Yaw LS, Marshall D, Debnath SC. Liquid culture for efficient in vitro propagation of adventitious shoots in wild Vaccinium vitis-idaea ssp. minus (lingonberry) using temporary immersion and stationary bioreactors. Sci Hortic. 2020;264:109199. https://doi.org/10.1016/j.scienta.2020.109199.

  64. Devidas T, Tiwari Sharad T, Nagesh D. Multiple shoot induction of pomegranate (Punica granatum L.) through different juvenile explants. Bull Env Pharmacol Life Sci. 2017;7:29–33.

  65. Ahmad A, Ahmad N, Anis M, Alatar AA, Abdel-Salam EM, Qahtan AA, et al. Gibberellic acid and thidiazuron promote micropropagation of an endangered woody tree (Pterocarpus marsupium Roxb.) using in vitro seedlings. Plant Cell Tissue Organ Cult. 2021;144(2):449–62. https://doi.org/10.1007/s11240-020-01969-1.

  66. Naik SK, Pattnaik S, Chand PK. In vitro propagation of pomegranate (Punica granatum L. cv. Ganesh) through axillary shoot proliferation from nodal segments of mature tree. Sci Hortic. 1999;79:175–83. https://doi.org/10.1016/S0304-4238(98)00218-0.

  67. Ikeuchi M, Sugimoto K, Iwase A. Plant callus: mechanisms of induction and repression. Plant Cell. 2013;25:3159–73. https://doi.org/10.1105/tpc.113.116053.

  68. Niedbała G, Niazian M, Sabbatini P. Modeling Agrobacterium-mediated gene transformation of tobacco (Nicotiana tabacum)—a model plant for gene transformation studies. Front Plant Sci. 2021;12:695110. https://doi.org/10.3389/fpls.2021.695110.

  69. Yang F, Wanik DW, Cerrai D, Bhuiyan MA, Anagnostou EN. Quantifying uncertainty in machine learning-based power outage prediction model training: a tool for sustainable storm restoration. Sustainability. 2020;12:1525. https://doi.org/10.3390/su12041525.

  70. Saltzman B, Yung J. A machine learning approach to identifying different types of uncertainty. Econ Lett. 2018;171:58–62. https://doi.org/10.1016/j.econlet.2018.07.003.

  71. Khairalla MA, Ning X, Al-Jallad NT, El-Faroug MO. Short-term forecasting for energy consumption through stacking heterogeneous ensemble learning model. Energies. 2018;11(6):1605. https://doi.org/10.3390/en11061605.

  72. Kandel I, Castelli M, Popovič A. Comparing stacking ensemble techniques to improve musculoskeletal fracture image classification. J Imaging. 2021;7(6):100. https://doi.org/10.3390/jimaging7060100.

  73. Nezami-Alanagh E, Garoosi GA, Maleki S, Landín M, Gallego PP. Predicting optimal in vitro culture medium for Pistacia vera micropropagation using neural networks models. Plant Cell Tissue Organ Cult. 2017;129(1):19–33. https://doi.org/10.1007/s11240-016-1152-9.

  74. Jamshidi S, Yadollahi A, Arab MM, Soltani M, Eftekhari M, Sabzalipoor H, et al. Combining gene expression programming and genetic algorithm as a powerful hybrid modeling approach for pear rootstocks tissue culture media formulation. Plant Methods. 2019;15:136. https://doi.org/10.1186/s13007-019-0520-y.

Acknowledgements

We thank the College of Agriculture, Shiraz University, Iran, for providing experimental facilities. We also thank the reviewers for their valuable comments and suggestions.

Funding

No funding was received.

Author information

Contributions

S.Z. and A.R.S. conceptualized the study and designed the experiment. S.Z. and M.S. implemented the machine learning algorithms and analyzed the results. S.Z. and M.S. wrote the original draft. A.R.S. revised the manuscript. S.Z. finalized the manuscript. All authors contributed to the manuscript and approved the submitted version for publication.

Corresponding author

Correspondence to Ali Reza Shahsavar.

Ethics declarations

Ethics approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article

Zarbakhsh, S., Shahsavar, A.R. & Soltani, M. Optimizing PGRs for in vitro shoot proliferation of pomegranate with bayesian-tuned ensemble stacking regression and NSGA-II: a comparative evaluation of machine learning models. Plant Methods 20, 82 (2024). https://doi.org/10.1186/s13007-024-01211-5

  • DOI: https://doi.org/10.1186/s13007-024-01211-5

Keywords