Early detection of verticillium wilt in eggplant leaves by fusing five image channels: a deep learning approach
Plant Methods volume 20, Article number: 173 (2024)
Abstract
Background
Eggplant is one of the world’s most important vegetable crops, and its production is often severely affected by Verticillium wilt, leading to significant declines in yield and quality. Traditional multispectral disease-imaging equipment is expensive and complicated to operate, whereas low-cost multispectral devices cannot capture images and provide far less information. The traditional approach to early disease diagnosis combines multispectral disease-imaging equipment with machine learning technology; however, this approach has significant limitations, including high cost, complex operation, and low model performance.
Results
The aim of this study was to combine a low-cost multispectral camera with deep learning technology to effectively detect early-stage Verticillium wilt in eggplant. A Manual FS-3200T-10GE-NNC multispectral camera was used to image the leaves of eggplant seedlings at the early infection stage; the collected multispectral images were fused, and a five-channel image information fusion model was established. This image information fusion technology was combined with deep learning, among which the VGG16-triplet attention model performed best, achieving a precision of 86.73% on the test set. Model validation on 48- and 72-hour data reached accuracies of 75% and 82%, respectively, achieving an early diagnosis of Verticillium wilt. This highlights the potential of multispectral cameras for early disease detection.
Conclusions
In this study, we successfully developed a method for the non-destructive detection of the early stages of eggplant wilt disease by combining multispectral imaging technology with deep learning algorithms. While ensuring high accuracy, this method significantly reduces the cost of experimental equipment. The application of this method can reduce the cost of agricultural equipment and provide a scientific basis for agricultural production practices, helping to reduce losses caused by diseases.
Introduction
Eggplant (Solanum melongena L.) is an important vegetable crop widely cultivated and consumed in China because of its rich nutritional value and unique flavor [1]. A major issue encountered during eggplant production is the spread of various diseases, among which Verticillium wilt is the most severe [2]. Verticillium wilt is caused by soil-borne pathogens that can rapidly spread through a plant’s water transport system, leading to wilting, stunted growth, and, in severe cases, death of the entire plant [3]. This disease reduces the yield and quality of eggplant and significantly affects farmers’ economic income. Traditional image processing techniques have difficulty in meeting recognition requirements [4]. More importantly, by the time the symptoms can be detected, the disease is often in the mid-to-late stage, making control much more difficult. By contrast, spectral imaging technology can extract rich spectral information and identify similar diseases [5].
Information from a color camera is limited to three channels in the visible range (red, green, and blue), and spectroscopy has limitations in spatial coverage [6]. Spectral imaging systems overcome this problem by capturing images while collecting large amounts of spectral information from multiple spectral bands [7]. In a broader context, spectral imaging includes two main techniques: hyperspectral imaging [8] and multispectral imaging [9]. Hyperspectral imaging typically provides images with hundreds of continuous bands using spectrometers and sensitive area detectors [10]. By collecting a large amount of narrowband spectral data, hyperspectral imaging provides richer spectral information, enabling more accurate identification and classification of different objects or targets and making it more suitable for detailed remote sensing [11] and geological studies. In contrast, multispectral imaging typically pairs sensitive area detectors with a set or series of specific band filters or tunable light sources [12]. Multispectral imaging acquires images by segmenting and collecting spectral data from several broad bands, each typically corresponding to a specific color (i.e., red, green, or blue), often matching the RGB bands of visible light. Compared with hyperspectral imaging, multispectral imaging is less expensive, involves simpler data processing, and is suitable for various applications, such as agriculture [13], environmental monitoring, and geological exploration. The Manual FS-3200T-10GE-NNC multispectral camera, with its low cost, simplicity, and portability, shows potential for the early diagnosis of plant diseases and was therefore chosen for this experiment.
Advances in computer vision, artificial intelligence, and machine learning technologies have promoted the development and implementation of early crop disease diagnostic techniques. After crops are infected with fungal pathogens, their metabolism and tissue structure change. Early disease diagnosis can be performed using enzyme-linked immunosorbent assay (ELISA) and polymerase chain reaction (PCR); although these techniques achieve satisfactory accuracy, their high cost is a major drawback. The combination of multispectral technology and machine learning can also be used for the early diagnosis of crop diseases. The application of multispectral technology to early disease detection exploits the fact that plants reflect, absorb, and emit light differently across spectral bands [14]. When crops are attacked by diseases, their spectral characteristics change, and by detecting these changes, diseases can be diagnosed before symptoms become visible to the naked eye. Many studies have validated the effectiveness of multispectral technology in detecting various plant diseases, such as bacterial and fungal diseases, laying a solid foundation for the promotion of this technology. For example, Xuan et al. collected images of wheat powdery mildew infection 2–5 days post-inoculation and achieved a classification accuracy of 91.4% on the validation set using a PLS-DA model with fused datasets [15]. Thomas et al. introduced a new hyperspectral phenotyping system that combines high-throughput canopy-scale measurements with high spatial resolution and controlled measurement environments, enabling accurate assessment of disease severity [16]. Chen et al. used sun-induced chlorophyll fluorescence for the early diagnosis of Citrus huanglongbing; the classification results showed that PFD was the best stage for HLB diagnosis, and the SIFY741 index was recommended, with an overall accuracy greater than 90% [17].
Multispectral technology is non-destructive, rapid, and efficient. Compared with traditional detection methods, it can detect diseases earlier so that timely prevention and control measures can be taken. Despite its great potential in disease detection, multispectral technology still faces some challenges. For non-imaging multispectral devices, the extracted crop information is limited, and most machine learning models lack the capacity to learn from such restricted information and achieve high accuracy; this limited information serves specific purposes and lacks general applicability. For example, Kitić et al. developed a new low-cost portable multispectral optical device for precise plant status assessment, comprising light sources at the four most indicative wavelengths (850, 630, 535, and 465 nm) [18]. The performance of machine learning models on such data therefore needs improvement. On the other hand, imaging multispectral devices are expensive; although they provide sufficient information, extensive research is required to extract the corresponding feature information from images for machine learning [19, 20]. This process is time-consuming and labor-intensive, with high environmental requirements, making it unsuitable for most agricultural settings. Additionally, blindly extracting multiple features may result in complex characteristics that machine learning cannot process effectively. Moreover, most equipment and experiments capable of multispectral imaging and early disease diagnosis remain relatively expensive.
This study aims to use a low-cost multispectral camera combined with a well-performing deep learning model for the early diagnosis of Verticillium wilt in eggplants, achieving a low-cost, non-destructive, rapid detection method with a relatively high accuracy.
Leaves are important indicators of plant health. Lu et al. used a high-resolution portable spectral sensor to detect tomato leaves at different disease stages, including the early asymptomatic stage [21]. Principal component analysis (PCA) was used to evaluate the spectral vegetation indices (SVIs), and K-nearest neighbour (KNN) was applied for classification. Among the detected leaves, healthy leaves achieved the highest accuracy (100%) owing to their uniform tissue, with the lowest error rate of 0. Conrad et al. (2020) used near-infrared (NIR) spectroscopy combined with machine learning for the early detection of rice wilt disease and constructed disease prediction models using SVM and random forest algorithms, achieving a detection accuracy of 86.1% [22]. This demonstrates the broad application prospects of machine learning models for early disease diagnosis.
The aim of this study was to develop a non-destructive early detection method for eggplant wilt disease using low-cost multispectral cameras and deep learning technology. Images were taken before (day 0) and 2–7 days after infection, and various deep learning algorithms were used to learn the spectral and imaging features of eggplants at different infection stages. An effective early disease detection model was established to classify eggplants. Additionally, model validation was conducted on data from the second and third days to provide a scientific basis and technical support for the early prevention and control of eggplant wilt disease. This study offers a new perspective and tool for disease management in various crops, potentially reducing agricultural facility costs and improving agricultural productivity and sustainability. It also provides new case studies and data support for the application of multispectral imaging technology to agricultural disease detection.
Materials and methods
Study area and experimental design
The experimental research area is located at the National Key Laboratory on the western campus of Hebei Agricultural University in Baoding City, Hebei Province (38.83°N, 115.45°E), where the laboratory temperature is controlled at 24 °C. We selected the eggplant variety known as “Dalong Zhigen” in China as the material for this study. Seeds were planted in 32-cell trays with light loam soil with a pH of 6.5–7.0. The eggplant seeds were sown on February 26, 2024, and the seedlings were divided into two groups: Class B for disease inoculation and Class G representing healthy plants. The study period covered the seedling stage of eggplant growth, spanning 44 days from planting to image capture.
The pathogen used in this experiment, Verticillium dahliae Kleb., which causes Verticillium wilt, was provided by the College of Horticulture, Hebei Agricultural University. The strain was cultured and preserved using standard methods on March 12, 2024. Eggplant seedlings were inoculated with the pathogen at a concentration of 1 × 10⁷ spores/mL on April 3, 2024. First, the roots of all eggplant seedlings were cleaned. Class B seedlings were inoculated by soaking their roots in the prepared spore solution for 25 min before replanting them into the soil; the roots of Class G seedlings were soaked in clean water for 25 min before replanting. Thereafter, the seedlings were transported to the laboratory and allowed to adapt at 24 °C for 2 days. We successfully infected 103 Class B plants and obtained 101 healthy Class G plants. Healthy leaves were imaged one day before the disease treatment; imaging then ran from the second day after inoculation through the seventh day, giving six days of image capture. One labeled leaf image of each eggplant seedling was taken every day to form the dataset used to establish an early detection model for eggplant Verticillium wilt and to classify leaves effectively. If a seedling’s leaves fell off prematurely or withered completely because of the disease, imaging of that plant was terminated early. Images were collected for a total of 1174 samples. The specific experimental procedure is shown in Fig. 1.
Acquisition of multispectral images
We utilized a multispectral camera with a manually adjustable focal length and aperture (FS-3200T-10GE-NNC; JAI A/S, Copenhagen, Denmark) to capture images of eggplant seedling leaves. Multispectral data collection was conducted on April 3, 2024, and from April 5 to April 10, 2024. Eggplant leaf images were captured in a laboratory dark box, with the internal box lighting as the light source. The images were captured vertically, with the camera positioned approximately 50 cm from the eggplant plants, at a resolution of 2048 × 1536 pixels. Each capture generated three images: RGB, NIR1, and NIR2, as shown in Fig. 2. The spectral band range of the camera is illustrated in Fig. 3, and its specific parameters are listed in Table 1. A total of 1174 five-channel images were synthesized in the experiment, with data augmentation doubling the count to 2348 images. Of these, 200 images composed the test set used for model testing; the test set images were sampled evenly from each day’s data without overlapping the training data, and the remaining images composed the training set. The dataset for model validation consists of 40 images from the second day (20 per category) and 40 images from the third day, with each group increased to 200 images after data augmentation. The data from the second and third days were evenly divided into five parts, and model validation was conducted sequentially as a five-fold cross-validation. The early stage of eggplant Verticillium wilt is characterized by wilting leaves, and the change stage by the appearance of yellow leaves [23]. This study focuses mainly on model validation in the early stage of the disease.
Plant disease confirmation
The leaf surface was disinfected with 75% ethanol, followed by a secondary disinfection with 5% sodium hypochlorite for one minute. The leaf surface was then rinsed thrice with sterile water to remove any residual disinfectant. The treated samples were subsequently placed on potato dextrose agar plates and incubated at 25 °C for 5 days. Microbial growth on the leaf surfaces was observed under an optical microscope (Zeiss Image Z2, Oberkochen, Germany), focusing on the growth of mycelia and the presence of Verticillium dahliae spores.
Multispectral image information fusion model and image preprocessing
This experiment involved a series of data preprocessing steps to enhance the accuracy and efficiency of the diagnosis. First, a multispectral image information fusion model was established. The three images generated after shooting were fused to form a single five-channel image: the first three channels from the RGB image, fourth channel from the NIR1 single-channel image, and the fifth channel from the NIR2 single-channel image. For the deep-learning model, the synthesized five-channel image retained the color information of the RGB image while adding spectral information from the NIR channels, aiding the model in better extracting and learning features related to Verticillium wilt. All fused five-channel multispectral images were then resized to a uniform resolution using bilinear interpolation [24], aiming to minimize potential image distortion during resizing and ensure consistency in the data input to meet the requirements of the deep learning model. Finally, data augmentation techniques were applied to expand the dataset, doubling its size to improve the model’s generalization ability, making it more robust and capable of handling images captured from different perspectives. During data augmentation, horizontal flipping was used to simulate conditions that might be encountered in real-world scenarios.
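The fusion and preprocessing steps above can be sketched as follows. This is a minimal NumPy illustration, not the authors' code; the array shapes and the 224 × 224 target size are assumptions for demonstration.

```python
import numpy as np

def bilinear_resize(img, out_h, out_w):
    """Resize an H x W x C array with bilinear interpolation."""
    h, w = img.shape[:2]
    ys = np.linspace(0, h - 1, out_h)
    xs = np.linspace(0, w - 1, out_w)
    y0 = np.floor(ys).astype(int)
    y1 = np.minimum(y0 + 1, h - 1)
    x0 = np.floor(xs).astype(int)
    x1 = np.minimum(x0 + 1, w - 1)
    wy = (ys - y0)[:, None, None]  # fractional row weights
    wx = (xs - x0)[None, :, None]  # fractional column weights
    top = img[y0][:, x0] * (1 - wx) + img[y0][:, x1] * wx
    bot = img[y1][:, x0] * (1 - wx) + img[y1][:, x1] * wx
    return top * (1 - wy) + bot * wy

def fuse_five_channel(rgb, nir1, nir2, size=(224, 224)):
    """Stack an RGB image (H x W x 3) with the NIR1 and NIR2 single-channel
    images (H x W) into one five-channel image, then resize it."""
    fused = np.dstack([rgb, nir1, nir2]).astype(np.float32)
    return bilinear_resize(fused, *size)

def augment_flip(img):
    """Horizontal flip used to double the dataset."""
    return img[:, ::-1, :]
```

The five-channel array keeps the RGB color information in channels 1–3 and the NIR spectral information in channels 4–5, matching the fusion model described above.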
Model construction
In reference to other studies, CNN [25], ResNet50 [26], VGG16, and VGG19 demonstrated superior performance in different aspects of deep-learning classification tasks. Therefore, to study early diagnosis of eggplant Verticillium wilt based on multispectral images, this paper used various deep learning algorithms, such as CNN, ResNet50, VGG16, and VGG19, to construct and compare different diagnostic models.
CNNs can effectively extract features through local connections and weight sharing and are widely used in tasks such as image classification and object detection. ResNet50 builds on the CNN by introducing shortcut connections, which alleviate the vanishing-gradient problem in deep networks and make it easier for the network to train and learn complex features. Both algorithms have important application value in the field of deep learning.
This study primarily used VGG16 for classification. VGG16 [27] is a deep-learning model developed by the Visual Geometry Group at Oxford University and is primarily used for image recognition and classification. The name VGG16 refers to its 16 weight layers (13 convolutional layers and 3 fully connected layers, excluding the pooling layers). It uses a series of small 3 × 3 convolutional kernels and pooling layers, making the network structure simple, clear, and easy to understand and implement. VGG16 gradually extracts image features by stacking multiple convolutional and pooling layers and finally performs classification through the fully connected layers. In comparison, VGG19 adds three convolutional layers to the structure of VGG16, making the network deeper and better at capturing high-level image features.
Feature extraction backbone network improvement
Owing to the lack of distinct characteristics in the early stage of Verticillium wilt, its detection is challenging. This study further enhanced model performance by incorporating four different attention mechanisms into VGG16 and VGG19: Squeeze-and-Excitation Networks (SENet), the Convolutional Block Attention Module (CBAM), the Efficient Channel Attention Network (ECANet), and triplet attention.
As illustrated in Fig. 4, the SENet structure recalibrates channel feature responses adaptively by explicitly modeling the interdependencies among channels [28]. SENet enables the model to focus on the channels carrying the most information while suppressing unimportant features. SENet consists of Squeeze and Excitation operations, where Fsq and Fex in Fig. 4 represent the Squeeze and Excitation operations, respectively, h denotes the height of the feature maps, w the width, and c the number of channels. Squeeze performs global pooling on the input feature maps in the spatial dimensions, compressing the feature map of each channel into a single value. The Excitation operation then learns a weight for each channel through a fully connected network, enhancing useful features and suppressing irrelevant ones, and captures the nonlinear relationships between channels with \(W_{1}\in \mathbb{R}^{\frac{C}{r}\times C}\) and \(W_{2}\in \mathbb{R}^{C\times \frac{C}{r}}\), where r is a reduction-ratio hyperparameter. W1 is the first fully connected operation, which reduces the dimensionality, and W2 is the second, which restores it to the input dimension. After the two fully connected operations, sigmoid activation is applied to normalize the weight of each feature map. Finally, the weighted output feature map is obtained by multiplying the weights produced by the sigmoid activation by the original feature map, as shown in Eq. (3).
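The Squeeze, Excitation, and recalibration steps described above can be sketched in a few lines of NumPy. This is an illustrative sketch, not the authors' implementation; the weights are stored transposed for row-vector multiplication, and their shapes are assumptions.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def se_block(feature_map, w1, w2):
    """Squeeze-and-Excitation over an H x W x C feature map.
    w1: (C, C // r) dimensionality-reduction weights; w2: (C // r, C)
    dimensionality-restoring weights (r is the reduction-ratio hyperparameter)."""
    # Squeeze: global average pooling compresses each channel map to one value
    z = feature_map.mean(axis=(0, 1))            # shape (C,)
    # Excitation: two fully connected layers with a ReLU bottleneck,
    # then sigmoid normalization of the per-channel weights
    s = sigmoid(np.maximum(z @ w1, 0.0) @ w2)    # channel weights in (0, 1)
    # Recalibration: scale each channel of the input by its learned weight
    return feature_map * s[None, None, :]
```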
The CBAM mechanism enhances the representational capability of the network, as shown in Fig. 5. By incorporating the CBAM, the network can better capture spatial information from images [29]. The CBAM comprises two key modules: the channel attention module (CAM) and the spatial attention module (SAM). The CAM weighs the feature maps of each channel to enhance the useful channels and suppress irrelevant channels, and the SAM weighs the spatial information of the feature maps to improve their representational capability. As shown in Eq. (3), the channel feature map calculation includes the sigmoid activation function (σ), the input feature maps (F), and the global average pooling and global maximum pooling feature maps, denoted by \(F_{avg}^{c}\) and \(F_{max}^{c}\), respectively, where W0 and W1 represent the parameters of the two layers of the shared perceptron. As indicated in Eq. (4), the spatial feature map calculation involves a 7 × 7 convolution operation (\(f^{7\times 7}\)). Initially, the input feature map is subjected to global average pooling and global maximum pooling. The two resulting maps are then concatenated, followed by the 7 × 7 convolution and a sigmoid activation to generate the spatial feature map (Ms). Finally, the original input feature map is multiplied by the spatial feature map (Ms) to generate the final feature map.
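The two CBAM modules can be sketched as follows. This is a minimal NumPy illustration of the formulation above, not the authors' implementation; the perceptron weight shapes and the (7, 7, 2) kernel layout are assumptions.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(f, w0, w1):
    """CAM: a shared two-layer perceptron applied to the global average- and
    max-pooled channel descriptors, combined and passed through a sigmoid."""
    avg = f.mean(axis=(0, 1))    # (C,) average-pooled descriptor
    mx = f.max(axis=(0, 1))      # (C,) max-pooled descriptor
    mc = sigmoid(np.maximum(avg @ w0, 0.0) @ w1 + np.maximum(mx @ w0, 0.0) @ w1)
    return f * mc[None, None, :]

def spatial_attention(f, kernel):
    """SAM: concatenate channel-wise average and max maps, apply a 7 x 7
    convolution (kernel shape (7, 7, 2), 'same' padding) and a sigmoid."""
    desc = np.stack([f.mean(axis=2), f.max(axis=2)], axis=2)  # (H, W, 2)
    p = 3
    padded = np.pad(desc, ((p, p), (p, p), (0, 0)))
    h, w = desc.shape[:2]
    ms = np.empty((h, w))
    for i in range(h):
        for j in range(w):
            ms[i, j] = np.sum(padded[i:i + 7, j:j + 7] * kernel)
    return f * sigmoid(ms)[:, :, None]

def cbam(f, w0, w1, kernel):
    """CBAM applies channel attention first, then spatial attention."""
    return spatial_attention(channel_attention(f, w0, w1), kernel)
```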
ECANet is an improved version of SENet, as shown in Fig. 6, and is a lightweight attention mechanism [30]. Instead of the dimensionality-reducing fully connected layers used in SENet, ECANet captures local cross-channel interaction with a fast one-dimensional convolution whose kernel size is determined adaptively from the channel dimension. This weighs the feature map of each channel to enhance useful channels and suppress irrelevant ones while adding very few parameters.
Triplet attention is a lightweight, three-branch attention mechanism that captures cross-dimension interactions between the channel and spatial dimensions [31], as shown in Fig. 7. Its structure comprises three parallel branches. Two of these branches are responsible for capturing cross-dimensional interactions between the channel dimension (C) and one of the spatial dimensions (H or W). The final branch, similar to the spatial attention in CBAM, constructs attention over the spatial dimensions. The outputs of the three branches are aggregated by averaging. This novel method thus calculates attention weights by capturing cross-dimensional interactions through a three-branch structure.
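A simplified sketch of the three branches follows, assuming the standard triplet attention formulation in which each branch concatenates max- and average-pooled descriptors (Z-pool) along one dimension and applies a 7 × 7 convolution; the channel-last layout and kernel shapes are assumptions, not the authors' code.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def z_pool(x, axis):
    """Concatenate max- and average-pooling along one dimension (2 maps)."""
    return np.stack([x.max(axis=axis), x.mean(axis=axis)], axis=-1)

def conv_sigmoid(desc, kernel):
    """7 x 7 'same' convolution over an (A, B, 2) descriptor, then sigmoid."""
    p = 3
    padded = np.pad(desc, ((p, p), (p, p), (0, 0)))
    a, b = desc.shape[:2]
    out = np.empty((a, b))
    for i in range(a):
        for j in range(b):
            out[i, j] = np.sum(padded[i:i + 7, j:j + 7] * kernel)
    return sigmoid(out)

def triplet_attention(f, k1, k2, k3):
    """f: (H, W, C). Branch 1 captures W-C interaction (pool over H),
    branch 2 captures H-C interaction (pool over W), and branch 3 builds
    spatial H-W attention (pool over C); the outputs are averaged."""
    b1 = f * conv_sigmoid(z_pool(f, axis=0), k1)[None, :, :]
    b2 = f * conv_sigmoid(z_pool(f, axis=1), k2)[:, None, :]
    b3 = f * conv_sigmoid(z_pool(f, axis=2), k3)[:, :, None]
    return (b1 + b2 + b3) / 3.0
```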
In the network architectures of VGG16 and VGG19, adding the attention-mechanism modules before the max-pooling layers helps the model select and enhance important features before downsampling. This approach not only preserves more spatial detail but also effectively enhances the expression of local features and reduces the interference of irrelevant information. Taking the VGG16 model as an example, the improved VGG16 network structure is shown in Fig. 8.
Experimental environment and model evaluation
The experiments in this study were conducted on the following platform:
- Operating System: Windows 10.
- Hardware: Intel(R) Core(TM) i3-800 CPU; 64 GB RAM.
- Development Environment: all algorithms were developed and written in PyCharm.
- Data Preprocessing: multispectral data preprocessing was completed using MATLAB R2020b.
This experiment used the Adam optimizer to train for 100 epochs. The backbone was frozen for the first 50 epochs, with a learning rate of 0.001 for the first 50 epochs and 0.0001 for the last 50. Accuracy (%), precision (%), recall (%), F1-score (%), and time (s) were used as metrics for evaluating the models, where TP denotes true positives, FP false positives, and FN false negatives. Precision is the ratio of correctly predicted positive observations to the total number of predicted positives; recall is the ratio of correctly predicted positive observations to all actual positives; and the F1-score is the harmonic mean of precision and recall. Each trained algorithm was run on the test set ten times, and the average of the ten test times was taken as the measure of algorithm time, in seconds.
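The metric definitions above can be computed directly from confusion-matrix counts. The sketch below is illustrative (the count values in the usage example are hypothetical):

```python
def evaluate(tp, fp, fn, tn):
    """Compute the study's evaluation metrics (as percentages) from
    confusion-matrix counts, with the diseased class as positive."""
    accuracy = (tp + tn) / (tp + fp + fn + tn) * 100
    precision = tp / (tp + fp) * 100
    recall = tp / (tp + fn) * 100
    # F1 is the harmonic mean of precision and recall
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, precision, recall, f1
```

For example, hypothetical counts of 85 true positives, 13 false positives, 15 false negatives, and 87 true negatives yield a precision of about 86.73% and an F1-score of about 85.86%.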
Results
Analysis of detection results from different models
In the preliminary analysis of the multispectral imaging data, this study utilized four deep learning models (CNN, ResNet50, VGG16, and VGG19) for data classification. Deep neural network architectures such as VGG16 can better capture features within multispectral images, achieving high classification accuracy. Multispectral images contain rich plant information, and neural network models can help uncover and utilize this information, enabling early diagnosis of eggplant Verticillium wilt. For this experiment, models such as VGG16 required fine-tuning, including modifying the number of input image channels to 5; the first convolutional layers of VGG16 and VGG19 were adjusted accordingly to accommodate the composite input image channels. For each round of training, the training set was randomly split 8:2, with 80% of the data used for model learning and 20% for validating and evaluating the model. Table 2 shows the validation results of model training, namely the validation set results, and Table 3 shows the results on the test set.
According to the results presented in Tables 2 and 3, VGG16 and VGG19 exhibited higher accuracy and better metrics on both the validation and test sets than the CNN. On the validation set, ResNet50 showed an accuracy comparable to VGG16, only slightly lower than the 82.79% accuracy of the VGG19 model. However, ResNet50's F1-score, recall, and precision were lower on the test set. As shown in Fig. 9, ResNet50 correctly classified the wilted leaves in the test set but correctly classified relatively few healthy leaves, resulting in a lower F1-score, recall, and precision than the VGG16 and VGG19 models. ResNet50 also had the longest testing time.
In conclusion, VGG16 and VGG19 exhibited the most stable and outstanding performance in the early classification of eggplant Verticillium wilt from multispectral images, particularly on the test set. VGG19 slightly outperformed VGG16 overall but correctly classified the same number of Verticillium wilt-infected leaf samples on the test set, and the two models performed similarly in F1-score, recall, and precision. According to Table 3, VGG16 has a slightly shorter running time than VGG19. The main difference between the two is depth: VGG19 has 19 weight layers, three more convolutional layers than VGG16, making it deeper and more complex, with approximately 143 million parameters versus approximately 138 million for VGG16. Owing to its deeper architecture, VGG19 requires more computational resources and time for both training and inference than VGG16. In many practical applications, VGG16 provides sufficient accuracy while maintaining lower computational cost and faster training. Based on their performance on the validation and test sets and the comparison with the other models, both VGG16 and VGG19 perform excellently in the early classification of eggplant Verticillium wilt, and both were therefore selected for further modification and comparative testing.
Analysis of improved model recognition results
In this experiment, four different attention mechanisms, namely SENet, CBAM, ECANet, and triplet attention, were incorporated into the VGG16 and VGG19 models, resulting in eight models with distinct functionalities and performance characteristics: VGG16-SE, VGG16-CBAM, VGG16-ECA, VGG16-triplet attention, VGG19-SE, VGG19-CBAM, VGG19-ECA, and VGG19-triplet attention. Each of these models exhibited different performance in the experiment.
By comparing the validation set accuracy between all improved models and the two original models, it was observed that each improved model outperformed the original models in terms of accuracy, as shown in Table 4. Both the VGG16 and VGG19 models enhanced with triplet attention and CBAM modules achieved higher accuracies on the validation set, exceeding 85%. The VGG19 model with triplet attention module achieved the highest accuracy on the validation set, reaching 86.54%.
In the test set, all improved models outperformed the original models in the various metrics, as shown in Table 5. In particular, the VGG16-triplet attention model achieved an F1-score, recall, and precision of 85.86%, 85%, and 86.73%, respectively, correctly classifying 85 Verticillium wilt-infected leaf samples. All metrics of VGG16-triplet attention were higher than those of the other models. VGG19-CBAM and VGG19-triplet attention also performed well but required more processing time, taking 2.97 s and 1.58 s longer, respectively, than the VGG16-triplet attention model. Overall, the improved VGG16 models had shorter testing times than the improved VGG19 models, with no significant difference in the other metrics; when testing large amounts of data, VGG19 and its modified models may therefore exhibit significant drawbacks. Among all models, VGG16-triplet attention performed best with a shorter processing time, so we further validated the performance of VGG16 and its modified models.
Model validation
The trained models were subjected to early diagnostic validation to further assess their early diagnostic performance. Nearly half of the Class B eggplant leaves exhibited sudden, localized light yellowing and became visually distinguishable on the fourth day, which had not occurred on the third day; therefore, images from the second and third days were selected to create a dataset for model evaluation. Five-fold cross-validation was conducted by dividing the data from each of these two days into five parts, followed by data augmentation to obtain 200 images (100 Class B and 100 Class G) for each round of model testing. Table 6 presents the experimental results, showing the highest and lowest accuracies of the five runs.
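The five-fold partition described above could be implemented as follows (an illustrative sketch; the sample list and random seed are assumptions):

```python
import random

def five_fold_splits(samples, seed=0):
    """Shuffle a day's samples and divide them into five equal parts;
    each part serves once as the held-out fold for one validation round."""
    rng = random.Random(seed)
    pool = list(samples)
    rng.shuffle(pool)
    n = len(pool) // 5
    folds = [pool[i * n:(i + 1) * n] for i in range(5)]
    for k in range(5):
        held_out = folds[k]
        rest = [s for i, fold in enumerate(folds) if i != k for s in fold]
        yield rest, held_out
```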
Table 6 indicates that VGG16-triplet attention still demonstrates excellent performance, with the lowest accuracies in the five-fold cross-validation reaching 75% and 82% at 48 and 72 h, respectively, and the highest precisions reaching 78.5% and 84%. This confirms the method's effectiveness in the early diagnosis of eggplant Verticillium wilt and highlights the potential of deep learning for early disease detection. Figure 10 shows examples of correctly classified leaves.
Discussion
Verticillium wilt is highly contagious and seriously threatens eggplant production and agricultural economy, necessitating timely and effective prevention and control measures. The traditional approach to early disease diagnosis involves using multispectral disease-imaging equipment in conjunction with machine learning technology. However, this approach has significant limitations in early disease diagnosis, including challenges such as high costs, complex operation, and low model performance. This study achieved a high level of accuracy in the early diagnosis of Verticillium wilt in eggplant seedlings using low-cost multispectral cameras. Compared with traditional multispectral imaging devices, Manual FS-3200T-10GE-NNC cameras offer advantages such as lower cost, portability, and ease of operation. They enable noncontact monitoring of plant health, greatly enhancing detection efficiency. Multispectral imaging can capture subtle physiological changes in the early stages of disease development, enabling timely prevention and control measures [32]. Although this study achieved positive results to some extent, certain issues still need to be addressed. For example, environmental factors such as lighting conditions can affect image data and require further research and calibration [33]. The improved models used in this experiment have increased the detection accuracy to some extent but also increased the computational complexity, potentially increasing the time cost of training and inference.
Before conducting the experiment, eggplant seedling leaves were photographed to determine the optimal light intensity. Because the images were taken in a dark box without natural light, the light intensity was adjusted manually. When the light intensity was too low, the spectral data obtained from the two NIR images were largely unaffected and remained usable, but the RGB images were significantly affected, becoming darker and making colors difficult to distinguish. Excessive light intensity, in contrast, had little impact on the RGB images but greatly affected the NIR images, causing a loss of spectral information in the leaf images that hindered the early diagnosis of eggplant Verticillium wilt. It is therefore necessary to determine the appropriate light intensity for each environment so that sufficient color and spectral information can be captured for model classification.
We compared two training strategies: in one, a separate model was trained on the images captured each day; in the other, a single model was trained on the combined dataset of images from all days. Evaluating both approaches on the daily test sets using accuracy and other metrics, we found that training and testing on the combined dataset produced better results. We believe the main reason is that, when trained on images from all days, the model learns the temporal pattern of the disease across its onset, progression, and late stages, and can therefore extract feature information common to all three. Under supervised learning, the model focuses on these commonalities, which act like the intersection of three sets: they narrow the range of learned features, so the model makes judgments based on a small but accurate feature set. A model trained on a single stage in isolation, by contrast, attends to a wider range of feature information, much of it only weakly correlated or uncorrelated with the detection result, which degrades feature extraction and thus the detection performance. This is also why this study introduced four different attention mechanisms, each of which focuses the model on specific information, which is exactly the desired effect.
In the early diagnosis of eggplant seedling wilt disease, VGG16 has the advantage of a deep network structure that captures image features better than a plain CNN, thus improving classification performance; a plain CNN, by contrast, may be easier to train with fewer parameters. A CNN is a generic neural network structure, whereas VGG16 and VGG19 are specific CNN architectures whose deeper structures can extract more complex features, giving them certain advantages in image tasks [34]. ResNet50 achieved an accuracy of 82.56% on the validation set. However, the number of correctly classified Class B eggplant wilt disease leaves on the test set was not particularly outstanding, and metrics such as the F1-score, recall, and precision did not match its validation-set performance. As shown in Fig. 9, ResNet50 correctly classified wilted eggplant leaves in the test set, but the number of correctly classified healthy leaves was relatively low, resulting in a lower F1-score, recall, and precision than the VGG16 and VGG19 models. The performance of ResNet50 in detecting eggplant wilt disease in multispectral images therefore needs improvement. ResNet50 introduces residual blocks, which allow information to flow more efficiently through the network and alleviate the vanishing-gradient problem in deep networks. However, ResNet50 and VGG16 differ in feature representation ability: for this study, ResNet50 was less effective than VGG16 at representing features of the five-channel multispectral images, despite achieving higher classification accuracy on the training set. Future work could improve the ResNet50 network structure or investigate feature fusion at different levels of the network.
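The residual blocks mentioned above can be illustrated with a minimal numpy sketch (this is not the authors' implementation or ResNet50 itself; the two-layer transformation and the weights are placeholders chosen only to show the identity skip connection that lets information bypass the learned transformation):

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def residual_block(x, w1, w2):
    """Minimal residual block: y = relu(x + F(x)), where F is two
    linear layers. The identity skip connection lets information (and,
    during training, gradients) bypass F, which is what alleviates the
    vanishing-gradient problem in deep networks such as ResNet50."""
    h = relu(x @ w1)       # first transformation
    fx = h @ w2            # second transformation (pre-activation)
    return relu(x + fx)    # skip connection adds the input back

# Toy demonstration: with zero weights, F(x) = 0, so the block reduces
# to the identity (after ReLU) -- the signal still flows through.
x = np.array([[1.0, 2.0, 3.0]])
w_zero = np.zeros((3, 3))
y = residual_block(x, w_zero, w_zero)
print(np.allclose(y, relu(x)))  # True
```

Even when the learned transformation contributes nothing, the block passes its input forward unchanged, which is why stacking many such blocks remains trainable where a plain deep stack would degrade.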
On the second day after infection, leaves affected by eggplant wilt disease showed no evident color differences compared with healthy leaves, exhibiting only wilting and a lack of expansion. By the third day, very few leaves showed faint color changes. On the fourth day, discoloration along the leaf margins and veins was detected, with some parts of the margins turning brown. In VGG16-CBAM and VGG19-CBAM, CBAM combines channel attention with spatial attention. This enables better discrimination of wilted leaves from healthy ones, especially in images of leaves that lack expansion. Compared with the other attention modules, the added spatial attention in VGG16-CBAM and VGG19-CBAM increased network depth and computational cost, but yielded higher F1-scores of 84% and 84.58% and the correct detection of 84 and 85 wilted leaves, respectively.
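The channel-then-spatial gating that CBAM performs can be sketched in a few lines of numpy. This is a deliberately simplified illustration, not the module used in the paper: the real CBAM applies a shared two-layer MLP in the channel branch and a 7x7 convolution in the spatial branch, whereas here the MLP is a single placeholder weight matrix and the convolution is replaced by summing the two pooled maps:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def channel_attention(x, w):
    """CBAM-style channel attention: squeeze the spatial dimensions with
    both average- and max-pooling, project each through a shared weight
    matrix, and gate the channels. x has shape (C, H, W); w is (C, C)."""
    avg = x.mean(axis=(1, 2))             # (C,)
    mx = x.max(axis=(1, 2))               # (C,)
    gate = sigmoid(avg @ w + mx @ w)      # (C,) gate in (0, 1)
    return x * gate[:, None, None]

def spatial_attention(x):
    """CBAM-style spatial attention: pool across channels into average
    and max maps, then gate each pixel. (CBAM learns a 7x7 conv here;
    this sketch just sums the two pooled maps.)"""
    avg = x.mean(axis=0)                  # (H, W)
    mx = x.max(axis=0)                    # (H, W)
    gate = sigmoid(avg + mx)              # (H, W) gate in (0, 1)
    return x * gate[None, :, :]

rng = np.random.default_rng(0)
x = rng.normal(size=(5, 8, 8))            # a five-channel feature map
w = np.eye(5)                             # placeholder shared weights
y = spatial_attention(channel_attention(x, w))
print(y.shape)  # (5, 8, 8)
```

Because both gates lie in (0, 1), the module never amplifies a feature; it only suppresses the channels and pixels it deems less informative, which is how it steers the classifier toward the wilting-related regions.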
The VGG16 model with triplet attention (VGG16-triplet attention) is an innovative multi-dimensional interaction model. Through its three-branch structure, it captures interactions between the channel and spatial dimensions, enhancing the model's understanding of the different dimensions of an image and enabling it to learn spatial and color features better [35]. Because leaves in the early stage of disease differ spatially from healthy leaves, VGG16-triplet attention achieved the highest performance among all models, with an F1-score, recall, and precision of 85.86%, 85%, and 86.73%, respectively. This indicates that the model performed well in detecting wilted leaves while maintaining high accuracy on healthy leaves, effectively learning the spatial and color variations of the disease. The validation results further demonstrated the superior performance of VGG16 after incorporating triplet attention. Using a multispectral camera for the early diagnosis of eggplant wilt disease, VGG16-triplet attention classified the disease with high precision, highlighting the potential of combining low-cost multispectral cameras with deep learning techniques for disease diagnosis.
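The three-branch rotation idea behind triplet attention can be sketched as follows. This is a simplified numpy illustration, not the module from [35]: real triplet attention applies a learned 7x7 convolution to the Z-pooled (max plus mean) pair in each branch, whereas this sketch simply averages the two pooled maps before the sigmoid gate:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def branch_gate(x):
    """Z-pool the leading axis (combine max and mean) and turn the
    result into a sigmoid gate over the remaining two axes. The real
    module applies a 7x7 conv to the pooled pair; this sketch averages
    the two maps instead."""
    pooled = 0.5 * (x.max(axis=0) + x.mean(axis=0))
    return sigmoid(pooled)

def triplet_attention(x):
    """Simplified triplet attention over a (C, H, W) tensor: three
    branches capture (H, W), (C, W), and (C, H) interactions by
    rotating which axis is pooled; the gated outputs are averaged."""
    g_hw = x * branch_gate(x)[None, :, :]                      # pool C
    g_cw = x * branch_gate(x.transpose(1, 0, 2))[:, None, :]   # pool H
    g_ch = x * branch_gate(x.transpose(2, 0, 1))[:, :, None]   # pool W
    return (g_hw + g_cw + g_ch) / 3.0

rng = np.random.default_rng(1)
x = rng.normal(size=(5, 8, 8))   # five-channel feature map
y = triplet_attention(x)
print(y.shape)  # (5, 8, 8)
```

Rotating the pooled axis is what lets each branch weigh a different pair of dimensions, so the module sees channel-spatial interactions that a purely channel-wise gate (such as SENet) cannot.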
The improved models enhanced the recognition and classification of multispectral images by incorporating different attention mechanisms, adapting the networks better to the characteristics of multispectral images. The various attention mechanisms improve the model's ability to extract key features from images, thereby enhancing classification accuracy, and the improved models based on VGG16 exhibited clear performance gains. VGG16-triplet attention, whose three-branch structure captures cross-dimensional interactions, effectively learned the evolving features of eggplant wilt disease and better captured spectral and color information. By integrating that spectral and color information, it learned the differences between wilted and healthy leaves more effectively, emerging as the best-performing model across all datasets and one capable of the early diagnosis of eggplant wilt disease in seedlings.
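The five-channel fusion that feeds these models amounts to stacking the color and spectral bands into one array. The sketch below is illustrative, not the paper's pipeline; the arrays stand in for an aligned RGB image and two registered NIR bands of the same leaf, and random data is used only as a placeholder:

```python
import numpy as np

# Placeholder inputs: an aligned RGB image and two registered NIR
# bands of the same leaf, values in [0, 1] (illustrative only).
h, w = 64, 64
rng = np.random.default_rng(2)
rgb = rng.random((h, w, 3))
nir1 = rng.random((h, w))
nir2 = rng.random((h, w))

# Channel-wise fusion: stack the three RGB channels with the two NIR
# bands into a single five-channel array. A CNN can consume this once
# its first convolution is configured for 5 input channels instead of 3.
fused = np.dstack([rgb, nir1, nir2])
print(fused.shape)  # (64, 64, 5)
```

Fusing at the input level like this lets a single backbone learn joint color-spectral features, rather than training separate models per band and merging their predictions afterward.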
Additionally, because of the differences in eggplant varieties and the lack of relevant samples, we could not determine whether the detection ability of the model would decrease for other eggplant varieties. To overcome this limitation, we will increase the sample size of other eggplant varieties and use the model for transfer learning between different varieties to further improve the model performance. Moreover, it is necessary to analyze the data on eggplant wilt disease in complex environments to enhance the generalizability of the model, making it more suitable for detecting wilt disease under different conditions and varieties.
This study demonstrated the effectiveness of early detection methods for eggplant wilt disease based on multispectral imaging technology, providing a new technical method for disease monitoring. The application of statistical analysis and deep learning models further confirmed the potential of multispectral cameras for the early diagnosis of wilt disease. These findings provide scientific evidence for the non-destructive early detection of eggplant wilt disease, which is of great significance for guiding agricultural production practices and reducing disease loss. Despite certain challenges, with technological advancements and cost reduction, multispectral imaging is expected to play a significant role in agricultural disease management. Future research should explore economically practical and versatile multispectral imaging devices and develop more robust data-processing algorithms to minimize the impact of external conditions.
Conclusion
In this study, we successfully developed a method for the non-destructive detection of the early stages of eggplant wilt disease by combining multispectral imaging technology with deep learning algorithms. While ensuring high accuracy, this method significantly reduces the cost of experimental equipment. By fusing the RGB, NIR1, and NIR2 images of eggplant leaves at the infection stage into five-channel images, we found that the fused images contained sufficient disease-related information. A precision of 86.73% was achieved on the test set using deep learning algorithms. In addition, the trained VGG16-triplet attention model achieved accuracies of 75% and 82% on the 48- and 72-h data, respectively. These experimental results further support the use of these spectral indicators as effective methods for the early detection of plant physiological abnormalities.
Data availability
No datasets were generated or analysed during the current study.
Abbreviations
- ELISA: Enzyme-Linked Immunosorbent Assay
- PCR: Polymerase Chain Reaction
- SVM: Support Vector Machine
- PCA: Principal Component Analysis
- SVIs: Spectral Vegetation Indices
- KNN: K-Nearest Neighbour
- NIR: Near-Infrared
- SENet: Squeeze-and-Excitation Networks
- CBAM: Convolutional Block Attention Module
- ECANet: Efficient Channel Attention Network
- CAM: Channel Attention Module
- SAM: Spatial Attention Module
- ICAM: Interactive Channel Attention Module
- ICAF: Interactive Channel Attention Fusion Module
- CNN: Convolutional Neural Network
References
Yang X, Zhang Y, Cheng Y, Chen X. Transcriptome analysis reveals multiple signal network contributing to the Verticillium wilt resistance in eggplant. Sci Hortic. 2019;256:108576. https://doi.org/10.1016/j.scienta.2019.108576.
Scholz SS, Schmidt-Heck W, Guthke R, Furch ACU, Reichelt M, Gershenzon J, Oelmüller R. Verticillium Dahliae-Arabidopsis interaction causes changes in gene expression profiles and jasmonate levels on different time scales. Front Microbiol. 2018;9:217. https://doi.org/10.3389/fmicb.2018.00217.
Bubici G, Amenduni M, Colella C, D’Amico M, Cirulli M. Efficacy of acibenzolar-s-methyl and two strobilurins, azoxystrobin and trifloxystrobin, for the control of corky root of tomato and verticillium wilt of eggplant. Crop Prot. 2006;25:814–20. https://doi.org/10.1016/j.cropro.2005.06.008.
Yang S, Xing Z, Wang H, Gao X, Dong X, Yao Y, Zhang R, Zhang X, Li S, Zhao Y, Liu Z. Classification and localization of maize leaf spot disease based on weakly supervised learning. Front Plant Sci. 2023;14:1128399. https://doi.org/10.3389/fpls.2023.1128399.
Shin M-Y, Viejo G, Tongson C, Wiechel E, Taylor T, Fuentes PWJ, S. Early detection of Verticillium wilt of potatoes using near-infrared spectroscopy and machine learning modeling. Comput Electron Agric. 2023;204:107567. https://doi.org/10.1016/j.compag.2022.107567.
Ariza Ramirez W, Mishra G, Panda BK, Jung H-W, Lee S-H, Lee I, Singh CB. Multispectral camera system design for replacement of hyperspectral cameras for detection of aflatoxin B 1. Comput Electron Agric. 2022;198:107078. https://doi.org/10.1016/j.compag.2022.107078.
Suzuki A, Vettori S, Giorgi S, Carretti E, Di Benedetto F, Dei L, Benvenuti M, Moretti S, Pecchioni E, Costagliola P. Laboratory study of the sulfation of carbonate stones through SWIR hyperspectral investigation. J Cult Herit. 2018;32:30–7. https://doi.org/10.1016/j.culher.2018.01.006.
Xie Y, Plett D, Evans M, Garrard T, Butt M, Clarke K, Liu H. Hyperspectral imaging detects biological stress of wheat for early diagnosis of crown rot disease. Comput Electron Agric. 2024;217:108571. https://doi.org/10.1016/j.compag.2023.108571.
Hu X, Yang L, Zhang Z. Non-destructive identification of single hard seed via multispectral imaging analysis in six legume species. Plant Methods. 2020;16:116. https://doi.org/10.1186/s13007-020-00659-5.
Qin J, Chao K, Kim MS, Lu R, Burks TF. Hyperspectral and multispectral imaging for evaluating food safety and quality. J Food Eng. 2013;118:157–71. https://doi.org/10.1016/j.jfoodeng.2013.04.001.
Bagheri N. Application of aerial remote sensing technology for detection of fire blight infected pear trees. Comput Electron Agric. 2020;168:105147. https://doi.org/10.1016/j.compag.2019.105147.
Steiner H, Sporrer S, Kolb A, Jung N. Design of an active multispectral SWIR camera system for skin detection and face verification. J Sens. 2016;2016:9682453. https://doi.org/10.1155/2016/9682453.
Shafiee S, Mroz T, Burud I, Lillemo M. Evaluation of UAV multispectral cameras for yield and biomass prediction in wheat under different sun elevation angles and phenological stages. Comput Electron Agric. 2023;210:107874. https://doi.org/10.1016/j.compag.2023.107874.
Zhang K, Yan F, Liu P. The application of hyperspectral imaging for wheat biotic and abiotic stress analysis: a review. Comput Electron Agric. 2024;221:109008. https://doi.org/10.1016/j.compag.2024.109008.
Xuan G, Li Q, Shao Y, Shi Y. Early diagnosis and pathogenesis monitoring of wheat powdery mildew caused by Blumeria graminis using hyperspectral imaging. Comput Electron Agric. 2022;197:106921. https://doi.org/10.1016/j.compag.2022.106921.
Thomas S, Behmann J, Steier A, Kraska T, Muller O, Rascher U, Mahlein A-K. Quantitative assessment of disease severity and rating of barley cultivars based on hyperspectral imaging in a non-invasive, automated phenotyping platform. Plant Methods. 2018;14:45. https://doi.org/10.1186/s13007-018-0313-8.
Chen S, Zhai L, Zhou Y, Xie J, Shao Y, Wang W, Li H, He Y, Cen H. Early diagnosis and mechanistic understanding of citrus Huanglongbing via sun-induced chlorophyll fluorescence. Comput Electron Agric. 2023;215:108357. https://doi.org/10.1016/j.compag.2023.108357.
Kitić G, Tagarakis A, Cselyuszka N, Panić M, Birgermajer S, Sakulski D, Matović J. Non-invasive presymptomatic detection of Cercospora beticola infection and identification of early metabolic responses in Sugar Beet. Front Plant Sci. 2019;7:1377. https://doi.org/10.3389/fpls.2016.01377.
Pérez-Roncal C, López-Maestresalas A, Lopez-Molina C, Jarén C, Urrestarazu J, Santesteban LG, Arazuri S. Hyperspectral imaging to assess the Presence of Powdery Mildew (Erysiphe necator) in cv. Carignan Noir Grapevine Bunches Agron. 2020;10:88. https://doi.org/10.3390/agronomy10010088.
Sun Q, Sun L, Shu M, Gu X, Yang G, Zhou L. Monitoring maize lodging grades via unmanned aerial vehicle multispectral image. Plant Phenomics. 2019;2019:5704154. https://doi.org/10.34133/2019/5704154.
Lu J, Ehsani R, Shi Y, De Castro AI, Wang S. Detection of multi-tomato leaf diseases (late blight, target and bacterial spots) in different stages by using a spectral-based sensor. Sci Rep. 2018;8:2793. https://doi.org/10.1038/s41598-018-21191-6.
Conrad AO, Li W, Lee D-Y, Wang G-L, Rodriguez-Saona L, Bonello P. Machine learning-based presymptomatic detection of rice sheath blight using spectral profiles. Plant Phenomics. 2020;2020:8954085. https://doi.org/10.34133/2020/8954085.
Fradin EF, Thomma BPHJ. Physiology and molecular aspects of Verticillium wilt diseases caused by V. dahliae and V. albo-atrum. Mol Plant Pathol. 2006;7:71–86. https://doi.org/10.1111/j.1364-3703.2006.00323.x.
Gao S, Gruev V. Bilinear and bicubic interpolation methods for division of focal plane polarimeters. Opt Express. 2011;19:26161–73. https://doi.org/10.1364/OE.19.026161.
Jekauc D, Burkart D, Fritsch J, Hesenius M, Meyer O, Sarfraz S, Stiefelhagen R. Recognizing affective states from the expressive behavior of tennis players using convolutional neural networks. Knowl Based Syst. 2024;295:111856. https://doi.org/10.1016/j.knosys.2024.111856.
Yu Q, Zhang Y, Xu J, Zhao Y, Zhou Y. Intelligent damage classification for tensile membrane structure based on continuous wavelet transform and improved ResNet50. Measurement. 2024;227:114260. https://doi.org/10.1016/j.measurement.2024.114260.
Sarker S, Tushar SNB, Chen H. High accuracy keyway angle identification using VGG16-based learning method. J Manuf Processes. 2023;98:223–33. https://doi.org/10.1016/j.jmapro.2023.04.019.
Hu J, Shen L, Albanie S, Sun G, Wu E. Squeeze-and-excitation networks. IEEE Trans Pattern Anal Mach Intell. 2020;42:2011–23. https://doi.org/10.1109/TPAMI.2019.2913372.
Xue M, Chen M, Peng D, Guo Y, Chen H. One spatio-temporal sharpening attention mechanism for light-weight YOLO models based on sharpening spatial attention. Sens (Basel). 2021;21:7949. https://doi.org/10.3390/s21237949.
Wang Q, Wu B, Zhu P, Li P, Zuo W, Hu Q. ECA-Net: efficient channel attention for deep convolutional neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA; 2020. p. 11531–9. https://doi.org/10.1109/CVPR42600.2020.01155.
Cui L, Dong Z, Xu H, Zhao D. Triplet attention-enhanced residual tree-inspired decision network: a hierarchical fault diagnosis model for unbalanced bearing datasets. Adv Eng Inf. 2024;59:102322. https://doi.org/10.1016/j.aei.2023.102322.
Peng Y, Dallas MM, Ascencio-Ibáñez JT, Hoyer JS, Legg J, Hanley-Bowdoin L, Grieve B, Yin H. Early detection of plant virus infection using multispectral imaging and spatial–spectral machine learning. Sci Rep. 2022;12:3113. https://doi.org/10.1038/s41598-022-06372-8.
Xiao-Hua J, Yao-Yao C, Yong-Jia X, Zhi-Le H. Adaptive optics multispectral photoacoustic imaging. Acta Phys Sin. 2012;61:217801.
Wu Y, He Y, Wang Y. Multi-class weed recognition using hybrid CNN-SVM classifier. Sens (Basel). 2023;23. https://doi.org/10.3390/s23167153.
Misra D, Nalamada T, Arasanipalai AU, Hou Q. Rotate to attend: convolutional triplet attention module. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV); 2021. p. 3138–47. https://doi.org/10.1109/WACV48630.2021.00318.
Acknowledgements
We would like to thank Editage (www.editage.cn) for its linguistic assistance during the preparation of this manuscript.
Funding
This study was supported by the National Natural Science Foundation of China (32072572), the earmarked fund for CARS (CARS-23), the Innovative Research Group Project of Hebei Natural Science Foundation (C2020204111), and the Hebei Province Graduate Innovation Ability Cultivation Funding Project (CXZZBS2024069).
Author information
Authors and Affiliations
Contributions
YWZ collected and analyzed the data and wrote the manuscript. DFZ and YFZ revised the manuscript. FQC provided experimental materials. XMZ and MW revised the manuscript. XFF provided financial support, supervised the entire study, and revised the manuscript. XFF also contributed to editing and enhancing the manuscript and provided guidance for experimental procedures. All authors read and approved the final manuscript.
Corresponding author
Ethics declarations
Ethics approval and consent to participate
Not applicable.
Consent for publication
Not applicable.
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Zhang, Y., Zhang, D., Zhang, Y. et al. Early detection of verticillium wilt in eggplant leaves by fusing five image channels: a deep learning approach. Plant Methods 20, 173 (2024). https://doi.org/10.1186/s13007-024-01291-3