Plant materials, culture, and disease inoculation
Images used to develop and validate the phenotyping pipeline were taken from two different experiments. In the first experiment, plant materials consisted of 329 diverse soybean accessions selected from the USDA Soybean Germplasm Collection, of which four replicates were grown. The second experiment consisted of 180 recombinant inbred lines (RILs) that were derived from a ‘Forrest’ × ‘Davis’ cross, which were grown in six replicates. ‘Forrest’ is a susceptible soybean cultivar, while ‘Davis’ is a resistant cultivar that carries the major resistance gene Rcs3 [29].
All disease assays were conducted in the Plant Pathology greenhouse at the University of Georgia Griffin campus in Griffin, GA. Experiments were laid out in a randomized complete block design, with two replicates planted at a time. Four seeds were planted in each 10 cm square plastic pot, and 12 pots were arranged in a 15-cell tray, leaving the middle three positions empty to maximize light distribution. After emergence, pots were thinned to two plants each. The greenhouse was maintained at approximately 27 °C during the day and 21 °C at night, with supplemental light from metal halide lamps for 13 h during the winter and spring and 3 h during the summer. Plants were grown to the V2–V3 growth stage and inoculated with isolate S23 (race 8) of C. sojina from the University of Georgia C. sojina culture collection [4]. To produce inoculum, colonies of C. sojina growing on V8 agar media were flooded with 0.04% Tween-20 and lightly scraped with a scalpel to dislodge conidia. The conidia-Tween suspension was passed through two layers of cheesecloth to remove large pieces of mycelium. The conidia concentration was measured on a hemocytometer and adjusted to 9 × 10⁴ spores mL⁻¹. At the time of inoculation, plants were moved to plastic-covered inoculation chambers, which maintained humidity near 100% and were placed under 95% shade cloth to regulate temperature. In each chamber, 150 mL of prepared inoculum was evenly sprayed onto the trifoliolates of the plants. Inoculation was repeated 24 h later following the same procedure. Plants remained in the inoculation chamber for an additional 24 h after the second inoculation and were then moved back to the greenhouse bench. Disease symptoms appeared on susceptible plants 14–21 days after inoculation.
Image acquisition
Fourteen to 21 days after inoculation, RGB images were acquired with a Canon EOS Rebel T4i/EOS 650D digital single-lens reflex (DSLR) camera with a 17.9 megapixel resolution. The camera was mounted overhead 0.75 m above the subject. Images were taken on a plain white background with two LED lights set at 45° angles on either side. White balance of the camera was adjusted according to the manufacturer’s instructions before images were captured for the experiments. The camera was set to auto mode with flash disabled so that shutter speed, aperture, and ISO speed were adjusted automatically. For each plant, the most diseased leaf was removed from the plant and placed on the white background. To keep the leaf flat, one piece of 20 cm × 20 cm nonreflective glass (ArtToFinish, New York, USA) was placed on top of the leaf during imaging. Each image also included a QR code for each sample indicating the experiment, pot number, and genotype, as well as a ruler for scale. Images were saved in JPEG format for analysis. For comparison with the results of the image analysis, the disease severity of each leaf was also estimated visually by a single rater using a 1–5 scale, where 1 = disease free, 2 = small lesions without a differentiated light center, 3 = < 10% of leaf area covered with lesions, 4 = ≥ 10% to < 20% of leaf area covered with lesions, and 5 = ≥ 20% of leaf area covered with lesions. The same 1–5 scale has been used to phenotype soybean breeding lines at the University of Georgia because it is simpler than a 0–100% continuous scale, and similar scales are used to phenotype other crop leaf diseases [17, 18]. Visual estimates were based on a set of standard area diagrams [14].
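The class boundaries of the 1–5 scale can be written out as a short Python sketch. In the study the score is assigned by eye by a rater; the function below only encodes the thresholds stated above. The function name and its two boolean arguments are hypothetical and are not part of the published protocol.

```python
def visual_score(percent_lesion_area, has_lesions=True, differentiated_center=True):
    """Map a leaf to the 1-5 visual scale using the stated class boundaries.

    All argument names are illustrative assumptions; a human rater makes
    these judgments visually in the actual assay.
    """
    if not has_lesions:
        return 1  # disease free
    if not differentiated_center:
        return 2  # small lesions without a differentiated light center
    if percent_lesion_area < 10:
        return 3  # < 10% of leaf area covered with lesions
    if percent_lesion_area < 20:
        return 4  # >= 10% to < 20% of leaf area covered with lesions
    return 5      # >= 20% of leaf area covered with lesions
```

Because classes 3 to 5 are defined by percent area, they are the part of the scale that relates most directly to the image-based percent of diseased leaf area.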
Image analysis
The image analysis method (Fig. 4, Additional file 2) was developed in FIJI software [26], a free, open-source, and highly customizable distribution of ImageJ [27] for scientific image processing. During preprocessing, each image is first renamed and compressed. If a QR code was included in the image, it is decoded with the Barcode_Codec plugin [30], and the image file is renamed with the text decoded from the QR code. If no QR code is present, the original image file name is used. Next, the image is rescaled so that the width is 1500 pixels. At this resolution, details of the leaves and lesions can still be detected, but the storage space and computational power required are reduced.
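As an illustration of the preprocessing logic, the following Python sketch computes the rescaled dimensions and chooses the sample name. It is not the FIJI macro itself: the function names are hypothetical, and it assumes that the aspect ratio is preserved and that the file name is used without its extension when no QR text is available.

```python
import os


def rescaled_size(width, height, target_width=1500):
    """Return (width, height) after rescaling to target_width.

    Assumes the aspect ratio is preserved and the height is rounded
    to the nearest pixel.
    """
    return target_width, round(height * target_width / width)


def sample_name(qr_text, file_name):
    """Use the decoded QR text if present, otherwise the file name stem."""
    if qr_text:
        return qr_text
    return os.path.splitext(os.path.basename(file_name))[0]
```

For the 5184 × 3456 pixel originals, `rescaled_size(5184, 3456)` gives 1500 × 1000 pixels.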
After preprocessing, the RGB image is converted to a three-slice HSB (hue, saturation, brightness) color space stack to partition the leaf from the white background. In the saturation (S) slice of the HSB stack, segmentation with a threshold of 85 to 255 is applied, and the image is converted to a binary mask. A median filter with a radius of 4 is used to remove any remaining black pixels within the leaf area and white pixels in the background. To remove the petiole, morphological opening is used with an element of 9 to erode and subsequently dilate the binary image. By doing this, any structures in the image that are 18 pixels or narrower, such as the petiole, are removed and the leaf area remains almost unchanged. After the leaf has been isolated and the petiole has been removed, the remaining white pixels in the image are counted with the “Measure” function; this value is stored as the leaf area.
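The saturation threshold and the leaf area count can be sketched in plain Python as follows. This is a simplified illustration, not the FIJI implementation: it omits the median filter and the morphological opening, the function names are hypothetical, and the 8-bit saturation values may differ slightly from FIJI's because of rounding.

```python
import colorsys


def saturation_8bit(r, g, b):
    """Saturation of an 8-bit RGB pixel, rescaled to the 0-255 range."""
    _, s, _ = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
    return round(s * 255)


def leaf_mask(rgb_rows, low=85, high=255):
    """Binary mask: True where saturation falls inside the threshold."""
    return [
        [low <= saturation_8bit(*pixel) <= high for pixel in row]
        for row in rgb_rows
    ]


def leaf_area(mask):
    """Leaf area as the number of foreground (True) pixels in the mask."""
    return sum(sum(row) for row in mask)
```

A white background pixel has a saturation near 0 and falls below the threshold, whereas a green leaf pixel such as (60, 140, 50) has an 8-bit saturation of about 164 and is kept.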
To isolate, measure, and count the lesions on each leaf, the compressed RGB image is converted to an L*a*b* (lightness, a*-chrominance, b*-chrominance) color space stack, and the a*-chrominance channel is isolated. The background is first removed using the selection created in the leaf segmentation step, and the brightness and contrast of the leaf area are optimized to allow 0.35% of the pixels to become fully saturated. Next, a threshold of −8 to 100 is applied to the a* channel to select the lesions. Then the “Analyze Particles” function is used to count the number of lesions with an area of 16 pixels or greater and a circularity of 0.3 or higher, where \(\text{circularity}=4\pi \frac{\text{area}}{\text{perimeter}^{2}}\) and a value of 1 represents a perfect circle. The minimum area and circularity constraints prevent small debris or other irregularities with an area of less than 16 pixels or a circularity of less than 0.3 from being classified as lesions. The “Measure” function is then applied to the selections to measure the pixel area of each lesion.
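The area and circularity filter can be illustrated with a short Python sketch. It assumes that each candidate particle is already described by its pixel area and perimeter; the dictionary layout and function names are hypothetical and do not reproduce the internals of “Analyze Particles”.

```python
import math


def circularity(area, perimeter):
    """Circularity = 4 * pi * area / perimeter^2 (1 for a perfect circle)."""
    return 4 * math.pi * area / perimeter ** 2


def keep_lesions(particles, min_area=16, min_circularity=0.3):
    """Keep particles that meet both the area and circularity limits.

    Each particle is assumed to be a dict with "area" and "perimeter"
    keys, both measured in pixels.
    """
    return [
        p for p in particles
        if p["area"] >= min_area
        and circularity(p["area"], p["perimeter"]) >= min_circularity
    ]
```

For example, a 4 × 4 pixel square (area 16, perimeter 16) has a circularity of π/4 ≈ 0.79 and is kept, whereas a thin, elongated particle with area 40 and perimeter 100 has a circularity of about 0.05 and is discarded.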
After processing each image, the percent of leaf area that is infected with lesions is calculated as \(\frac{\text{total lesion area}}{\text{total leaf area}}\times 100\). Total leaf area, total lesion area, percent of diseased leaf area, and lesion number are saved in a combined file along with the sample name as determined by the QR code or input file name. For each image processed, a file containing the individual measurements of each lesion is also saved, as well as a result image that shows the leaf outline, the outline of each lesion, and the number of each lesion. This information can be used downstream to visually verify the accuracy of the image processing results.
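The per-image summary can be sketched in Python as follows. The function name and the dictionary keys are illustrative assumptions and do not reflect the column names of the actual output file.

```python
def summarize_leaf(sample, leaf_area, lesion_areas):
    """Build one summary record from the leaf area and lesion areas (pixels)."""
    if leaf_area <= 0:
        raise ValueError("leaf_area must be positive")
    total_lesion_area = sum(lesion_areas)
    return {
        "sample": sample,
        "leaf_area": leaf_area,
        "lesion_area": total_lesion_area,
        # percent of diseased leaf area = total lesion area / total leaf area * 100
        "percent_diseased": 100 * total_lesion_area / leaf_area,
        "lesion_number": len(lesion_areas),
    }
```

A leaf of 200,000 pixels with two lesions of 1,000 and 3,000 pixels would give a diseased leaf area of 2% and a lesion number of 2.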
Optimization of image compression
To optimize the time to process each image and the amount of storage space required to store the result images, 51 images were tested using nine different resolutions in the image processing workflow. Compression levels ranged from 5184 × 3456 pixels (uncompressed original image) to 400 × 267 pixels, and steps in the image processing that rely on pixel dimensions were scaled accordingly. To determine the optimal compression, the lowest resolution that maintained an accurate count of lesions and measurement of lesion area was selected. Pearson’s correlations were used to compare the results from compressed images to the results of the uncompressed images for the percent of diseased leaf area and lesion number traits. One-way ANOVA and Tukey’s HSD were used to determine significant changes in image processing time and file size of the result image.
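One possible way to scale the pixel-dependent parameters across resolutions is sketched below in Python. The text does not state the exact scaling rule, so this sketch assumes that length-like parameters (filter radius, structuring element size) scale linearly with image width and that the minimum lesion area scales with the square of that ratio. The reference values are those given for the 1500-pixel-wide image; the function name is hypothetical.

```python
def scale_parameters(width, ref_width=1500):
    """Scale pixel-based parameters from the 1500-pixel reference width.

    Assumption (not stated in the source): lengths scale with
    width / ref_width and areas with its square. Values are rounded
    and floored at 1 pixel.
    """
    factor = width / ref_width
    return {
        "median_radius": max(1, round(4 * factor)),
        "opening_element": max(1, round(9 * factor)),
        "min_lesion_area": max(1, round(16 * factor ** 2)),
    }
```

Under this assumption, doubling the width to 3000 pixels would double the median radius to 8 and quadruple the minimum lesion area to 64 pixels.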
Image-based phenotyping and comparison to visual phenotyping
To assess the agreement between image-based disease severity estimates and the actual disease severity, the 75 result images were visually inspected for healthy leaf areas detected as lesions or lesions that were not detected. To obtain an estimated true value for lesion area and lesion number in each image, lesions that were not detected and healthy areas that were marked as lesions were manually corrected and measured in ImageJ. Pearson correlations between the automatic measurements and true values were calculated for percent of diseased leaf area and lesion number using the base R cor function. False positives were defined as any non-diseased area of the image that was marked as a lesion by the image processing software, and false negatives were defined as any lesion present in the image that was not marked by the software. False positives and false negatives were reported for lesion area and lesion number.
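The comparison can be sketched in Python as follows. The analysis itself was run in R; this block only illustrates the arithmetic. It assumes that the estimated true value equals the automatic value minus the false positives plus the false negatives, which is one reading of the manual correction described above, and the function names are hypothetical.

```python
import math


def corrected_value(automatic, false_positive, false_negative):
    """Estimated true value after removing false positives and adding
    false negatives (works for lesion area or lesion number)."""
    return automatic - false_positive + false_negative


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)
```

For example, an automatic count of 12 lesions with one false positive and two missed lesions would correspond to an estimated true count of 13.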
Spearman’s rank correlation coefficient was used to compare the image analysis traits with the 1–5 visual scale. Spearman’s rank correlation can be used to assess the relationship between continuous and ordinal variables and is based on the ranked value for each variable [31]. For 2096 samples that had data from the image processing and visual assessment traits, Spearman’s rank correlation coefficients were calculated for pairs of traits using the base R cor function.
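Because Spearman’s coefficient is the Pearson correlation of the ranked values, it can be illustrated with a small Python sketch. The analysis itself used R; the function names here are hypothetical, and ties are assumed to receive the average of the ranks they span, which matters for the 1–5 visual scale because many leaves share the same score.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the full run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sxx = sum((a - mx) ** 2 for a in rx)
    syy = sum((b - my) ** 2 for b in ry)
    return sxy / math.sqrt(sxx * syy)
```

Any strictly increasing relationship between percent of diseased leaf area and visual score yields a coefficient of 1, regardless of whether the relationship is linear.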