 Research
 Open Access
 Published:
Predicting grain yield using canopy hyperspectral reflectance in wheat breeding data
Plant Methods volume 13, Article number: 4 (2017)
Abstract
Background
Modern agriculture uses hyperspectral cameras to obtain hundreds of reflectance data measured at discrete narrow bands to cover the whole visible light spectrum and part of the infrared and ultraviolet light spectra, depending on the camera. This information is used to construct vegetation indices (VI) (e.g., green normalized difference vegetation index or GNDVI, simple ratio or SRa, etc.) which are used for the prediction of primary traits (e.g., biomass). However, these indices only use some bands and are cultivarspecific; therefore they lose considerable information and are not robust for all cultivars.
Results
This study proposes models that use all available bands as predictors to increase prediction accuracy; we compared these approaches with eight conventional vegetation indexes (VIs) constructed using only some bands. The data set we used comes from CIMMYT’s global wheat program and comprises 1170 genotypes evaluated for grain yield (ton/ha) in five environments (Drought, Irrigated, EarlyHeat, Melgas and Reduced Irrigated); the reflectance data were measured in 250 discrete narrow bands ranging between 392 and 851 nm. The proposed models for the simultaneous analysis of all the bands were ordinal least square (OLS), Bayes B, principal components with Bayes B, functional Bspline, functional Fourier and functional partial least square. The results of these models were compared with the OLS performed using as predictors each of the eight VIs individually and combined.
Conclusions
We found that using all bands simultaneously increased prediction accuracy more than using VI alone. The Splines and Fourier models had the best prediction accuracy for each of the nine timepoints under study. Combining image data collected at different timepoints led to a small increase in prediction accuracy relative to models that use data from a single timepoint. Also, using bands with heritabilities larger than 0.5 only in Drought as predictor variables showed improvements in prediction accuracy.
Background
Plant breeding programs routinely perform early field evaluations of large numbers of candidates for selection based not only on the main primary trait—grain yield measured in different environments—but also on several secondary traits related to yield, such as disease resistance. Methods that could help breeders measure grain yield based on other secondary traits in the early stages of plant growth could be of value to help reduce evaluation time and cost [25]. In recent years, the use of remote or proximal sensing, hyperspectral imaging, and laser scanners has helped develop lowcost, efficient highthroughput phenotyping platforms (HTPP) [1] which aim to collect data at low cost on many phenotypes of large numbers of breeding individuals at different stages of plant growth under different environmental conditions. This could drastically increase the number of traits that can be quantified on fieldgrown plants, selection intensity, prediction accuracy, and, therefore, the response to selection [25].
The basis of remote sensing and spectral science is the ability to measure electromagnetic energy at varying wavelengths that interact with different parts of the growing plant. The goal of spectral science is to measure a phenotype quantitatively through the interaction between light and plants, such as reflected, absorbed, transmitted and/or emitted photons. This is possible because each component of plant cells and tissues has wavelengthspecific absorbance, reflectance and transmittance properties [16]. For example, a healthy plant interacts (absorbs, reflects, emits, transmits and fluoresces) with electromagnetic radiation in a different way than an infected plant [16].
In practical applications of HTPP to agriculture, the reflectance of electromagnetic energy at different wavelength bands is usually summarized in scores of spectral vegetative indices (VI) that are further used to predict plant physiological issues or agronomic traits. Spectral VI are a simple and convenient way of extracting information from remotely sensed data that facilitates the processing and analysis of large amounts of data acquired by modern cameras and satellite platforms [13, 18]. Significant advances have been achieved in understanding the nature and proper interpretation of spectral VI [18, 21] and theoretical frameworks have been proposed to support the development of indices optimized for particular applications. Popular VIs are the Normalized Difference Vegetation Index (NDVI), the Canopy Water Mass Index [30] and the Modified Normalized Difference at 705 nm wavelength (mND; e.g., [26]). Vegetative indices have been applied successfully in some crops [3,4,5,6,7,9, 31]. For example, the canopy temperature (CT) and NDVI indices have been applied to estimate yield, taking advantage of the correlation between yield and these two VIs [2, 15, 17, 22]. Also, it has been documented that the aircanopy temperature difference index can be used as a selection criterion in wheat (Triticum aestivum L.) breeding programs to estimate yield [24]. The NDVI has also been used successfully to estimate wheat yield before harvest at the regional and farm scale [15].
Although several spectral VIs are positively correlated with grain yield and other important agronomic and physiological traits, they do not consider all the spectral bands from the hyperspectral sensors [28, 29]. Nevertheless, cameras with high spectral resolution can produce data on hundreds of spectral bands that can be further used to capture a wide range of information. Also, despite some successful applications of spectral VIs, most of these indices tend to be speciesspecific and, therefore, are not robust when applied across different species that have different canopy architectures and leaf structures because they use only a fraction of the available information on the measured wavelengths.
The important idea is that for sensor data to be meaningful, algorithms need to be developed to interpret the data and extract the most useful information to be translated into important traits for plant breeding. Thus, the use of highresolution images is important to develop prediction models for grain yield, yield components, and relevant physiological and agronomical traits. However, the enormous volume, variety, and velocity of HTPP data generated by such platforms make it a ‘big data’ problem. Big data generated by these near realtime platforms must be efficiently archived and retrieved for analysis [27]. The analysis and interpretation of these large datasets is quite challenging, although several authors have proposed using sensor data through a linear regression based on standard ordinary least squares [29]. To overcome collinearity among predictors (bands from the sensor), Hernandez et al. [14] concluded that penalized ridge regression models from spectral reflectance data at anthesis or grainfilling predict grain yield well under different water levels. Recently, Ferragina et al. [12] proved that highdimensional Bayesian regression models (similar to those used in genomic selection for predicting the performance of unobserved individuals based on a large number of markers) can be used to derive functions of hundreds of wavelengths. However, no Bayesian regression models or other functional regression models that define the function of wavelength for the prediction of grain yield and other traits in different environmental conditions have been studied in plant breeding [6].
Based on the above considerations, the main objectives of this study are: (1) to compare prediction accuracy of eight conventional spectral VIs (see Table 1) versus seven models that include ordinary least squared regressions for each spectral VI and all spectral VIs combined, Bayes B with all bands, Principal Components (PC) Bayes B regression, and three functional regression models, spline regression, Fourier regression and Partial Least Regression (PLS); (2) to identify the best models in terms of prediction accuracy; and (3) to identify timepoints of plant growth before harvesting from which accurate predictions for wheat grain yield can be obtained. We illustrate the use of the different methods and models with data on grain yield collected on 1170 CIMMYT wheat lines evaluated in five contrasting environments. A total of 250 wavelengths were used on nine different time data points of crop growth.
Methods
Data
All 1170 lines were evaluated in all environments [Drought (severe drought), EarlyHeat (irrigated early planting for heat at sowing), Irrigated (irrigated bed planting), Melgas (irrigated flat planting) and Reduced Irrigated (moderate drought)]. In each environment, the lines were included in 39 trials, each comprising 30 lines; in each trial, the lines were studied using an alphalattice design with three replicates and six experimental blocks. The planting dates were all in 2013 as follows: EarlyHeat on October 30, Irrigated and Reduced Irrigated on November 21, Drought on November 26 and Melgas on December the 1st. The traits measured on each line were grain yield (GY) and days to heading (DH), but only GY was analyzed in this study.
The image data was obtained using a Piper PA16 Clipper flight that was fitted with a Hyperspectral camera (Model: Aseries, MicroHyperspec Airborne sensor, VNIR Headwall Photonics, www.headwallphotonics.com, Fitchburg, Massachusetts, US) and thermal camera (A600 series Infrared camera, FLIR, www.flir.com, Boston, US). The plane flew at 270 m above the surface.
The aerial high throughput phenotyping (HTP) data was measured around solar noon time every date, aligning the plane to the solar azimuth for the data acquisition. Images of the experimental fields were obtained and formatted to tabular data by calculating the mean value of the pixels inside the center of each individual trial plot represented as a polygon area on a map. The software used to achieve this was ArcMap (ESRI, USA, CA).
On the data processing, the 38 cm per pixel CT data was corrected with a linear calibration of slope 1.2253 and Y intercept −6767.9 with the software ImapQ (Alava Ingenieros, Madrid, Spain). The several individual images of each flight were used to compose a unique mosaic per date with the software Autopano Giga (Kolor SARL, France). Then they were manually georeferenced using ArcMap (ESRI, USA, CA). The original image data is stored in kelvin units ×100, the next formula was applied with ENVI software (Excelis VIS, USA, CO) to convert the pixel values to Celsius degrees: (Pixel value)/100 − 273.15.
As well, the 30 cm per pixel hyperspectral data was processed with the software HyproQ (Alava Ingenieros, Madrid, Spain). First the images had a radiometric calibration with coefficients provided by the Laboratory for Research Methods in Quantitative Remote Sensing of the Consejo Superior de Investigaciones Científicas (QuantaLab, IASCSIC, Spain) derived with a calibrated uniform light source; additionally the dark frame subtraction was performed to reduce the noise of the sensor. Corrections to decrease the effects of the atmosphere conditions in the images was performed modeling irradiance based on sunphotometer field measurements (Microtops II, Solar Light Company, PA, USA). The images were orthorectified and coarsely georeferenced based on the builtin Inertial Navigation System (INS/GPS). For the data extraction, where the image did not overlay the plots polygons because of INS inaccuracy, they were manually aligned to it in ArcMap.
The bands were measured on nine different dates (January 10, 2015; January 17, 2015; January 30, 2015; February 7, 2015; February 14, 2015; February 19, 2015; February 27, 2015; March 11, 2015; and March 17, 2015; which we called timepoints 1, 2, 3,…, 9, respectively) using 250 discrete narrow wavelengths. In each plot, 250 wavelengths λ_{1}, … λ_{250} from 392.03 to 850.69 nm were measured for each wheat line. The ith discretized spectrometic curve is given by x_{1}(λ_{1}), …, x_{n}(λ_{250}). We used the notation x(780) without subscripts to denote the response of the band measured at a wavelength of 780 nm, x(670) to denote the response of the band measured at a wavelength of 670 nm, and so on.
Note that early heat trial was planted in average 26 days earlier than the other trials; therefore, comparisons between individual timepoints between heat trial and the others environments should consider data on the number of weeks since sowing. The comparison between the environments except early heat trial can be done more less fairly since the heading date ranged from 77 to 82 days after sowing, a period of five days in average. In all the environments heading date happened after the timepoint four except in the Melgas environment in which heading date occurred at the same moment of timepoint six.
Definition and computation of spectral vegetation indices
Eight different VIs were constructed with the 250 bands and are described in Table 1.
Statistical methods
Adjusting the original data set
The lines were evaluated using an alphalattice design with three replicates and six incomplete blocks each, with five wheat lines randomly distributed within the incomplete block. This alphalattice design was established for each of the five environments. First, the design effect was removed in each environment and the BLUPs (Best Linear Unbiased Predictor) of genotypes for GY, for each of the 250 wavelengths and for each of the eight VIs were obtained in each of the nine timepoints using the following model
where μ is the overall mean,\( y_{ijkl} \) is the response variable (GY, wavelength measure and VIs) for the ith genotype, jth trial, kth replicate, and lth block, g _{ i } is the random genetic effect of genotype i with normal distribution \( N\left( {0,\upsigma_{g}^{2} } \right) \), t _{ j } is the random effect of trial j with normal distribution \( N\left( {0,\upsigma_{t}^{2} } \right) \), \( r_{k\left( j \right)} \) is the random effect of replicate k nested within trial j with normal distribution \( N\left( {0,\upsigma_{r}^{2} } \right) \), b _{ l(k,j)} is the random effect of the incomplete block l nested within replicate k and trial j with normal distribution \( N\left( {0,\upsigma_{b}^{2} } \right) \), and \( \epsilon_{ijkl} \) is the residual effect with normal distribution \( N\left( {0,\upsigma_{e}^{2} } \right) \). After these preadjustments in each environment, we obtained BLUPs for each of the 1170 genotypes for GY, for each of the 250 bands and for each of the eight VIs. The BLUPs of genotypes were obtained for each of the nine timepoints under study. Also, from fitting the alphalattice experimental model expressed above, we used the variance components of genotypes and of the error term to calculate the broadsense heritability using the expression \( H^{2} = \frac{{\upsigma_{g}^{2} }}{{\upsigma_{g}^{2} +\upsigma_{e}^{2} }} \) [10]; this was calculated for each of the 250 bands in each timepoint in each environment.
Proposed single timepoint models for the adjusted data
With the preadjusted data (BLUPs of genotypes for GY, for each of the 250 bands and the eight VIs), we propose to evaluate prediction accuracy for each timepoint using the following statistical models:

Model 1 Index ordinal least square (OLS) regression: \( y_{i} = \mu + z_{im} \alpha_{m} + \epsilon_{i} \),

Model 2 Joint index OLS regression: \( y_{i} = \mu + \sum \nolimits_{m = 1}^{8} z_{im} \delta_{m} + \epsilon_{i} \),

Model 3 All bands Bayes B regression: \( y_{i} = \mu + \sum\nolimits_{k = 1}^{250} x_{ik} \beta_{k} + \epsilon_{i} \),

Model 4 PC Bayes B regression: \( y_{i} = \mu + \sum\nolimits_{l = 1}^{NPC} PC_{il} \gamma_{k} + \epsilon_{i} \),

Model 5 All bands functional Bspline regression: \( y_{i} = \mu + \int x_{i} \left(k\right)\beta_{1} \left(k\right) dk + \epsilon_{i} \),

Model 6 All bands functional Fourier regression: \( y_{i} = \mu + \int x_{i} \left(k\right) \beta_{2} \left(k\right) dk + \epsilon_{i} \),

Model 7 All bands functional PLS regression: \( y_{i} = \mu + \int x_{i} \left(k\right) \beta_{3} \left(k\right) dk + \epsilon_{i} \),
where i = 1, …, n, with \( n = 1170, \,k = 1, \ldots ,K \) with \( K = 250, NPC \) denotes the number of principal components (PC) used and we used 6 (5, 10, 20, 35, 45 and 55). \( x_{ik} \) represents reflectance at the kth band collected in the ith genotype, \( PC_{il} \) are the loadings of the lth PC on the ith genotype derived from the spectra data collected, z _{ im } is the mth index (RNDVI, GNDVI, SRa, RARSa, RARSb, RARSc, NPQI and PRI) derived from data collected at the ith genotype, while x _{ i }(k) is the functional predictor collected at the ith genotype and its corresponding functional data set is the sample [\( x_{1} \left( k \right), \ldots ,x_{n} \left( k \right)] \). The error terms \( \epsilon_{i} \) were assumed to be independent with null mean and variance σ _{ E } ^{2} . α _{ m }, δ _{ m }, β _{ k }, γ _{ k }, are the regression coefficients for models 1, 2, 3 and 4, respectively, while β _{1}(k), \( \beta_{2} \left( k \right), \beta_{3} \left( k \right) \) are the coefficient functions for functional models 5, 6 and 7, respectively.
The three proposed functional regression models (Models 5, 6 and 7) are the most popular functional regression models, where the responses are scalars and the covariates are functions. For this reason, the response variable (y _{ i }) is a scalar in all the proposed models and represents grain yield (GY). Also, the difference between Models 5, 6 and 7 is the basis used for representing β _{ o }(k), with o = 1, 2, 3. Here a basis is understood to be a set of standard functions \( (\phi_{w} )_{{w \in {\mathbb{N}}}} \) that are used to approximate any function of interest by a linear combination of a sufficiently large r _{ w } of these functions [23]. In Model 6, we assumed the Bspline basis; in Model 7, we used the Fourier basis; and in Model 8, the basis is the PLS. More details on the theory behind Models 5, 6, and 7 and their basis can be found in Ramsay and Silverman [23].
The parameter estimation of Models 1 and 2 was performed using OLS and implemented in the R software with the function lm() of the library MAS, while for Models 3 and 4, a Bayesian shrinkagevariable selection procedure (called Bayes B method) using a prior with a point of mass at zero and a tslab was implemented in the BGLR Rpackage [20]. The functional models (Models 5, 6 and 7) were estimated with OLS and implemented in the Rpackage fda.usc [11] with 21 basis. First, models were fitted to the entire data set to evaluate goodnessoffit to the training data and were then implemented through the crossvalidation described in the next section. Using these 7 models, we created 19 methods (described in Table 2) according to the type of data they were applied to. The 19 methods were implemented in each of the five environments and per timepoint.
It is important to point out that methods M1–M8 used only one of the 8 VIs, M9 used all 8 VIs simultaneously, M10–M19 used all 250 bands, but methods M11–M16 used all bands to perform a principal component analysis and then used different numbers of principal components (5, 10, 20, 35, 45, 55 PCs), as shown in Table 2. Additionally, methods M17 and M18 were implemented with only those bands whose heritability is >0.5.
Assessing prediction accuracy
For the prediction accuracies of the 19 proposed methods presented in Table 2, we implemented a tenfold crossvalidation—with 1053 (90%) lines for training and 117 (10%) for testing in each fold—that was assessed by the Pearson correlation between the observed BLUPs of GY and their predicted values using the testing data set. We reported the average of the tenfold crossvalidation of the Pearson correlation (APC) as measure of prediction accuracy as well as the quantiles 2.5 (LL) and 97.5% (UL) (see “Appendices 1, 2”). It is important to point out that we used the same Split (of the tenfold crossvalidation) in the 19 methods to ensure fair comparisons between methods.
Results
The results are given in two sections: the first section presents the heritability estimates of each of the 250 wavelengths for each environment, while the second section presents the prediction accuracies estimated under the implemented methods.
Heritability estimates
The highest heritability estimates were found in the Irrigated (Fig. 1b) and EarlyHeat (Fig. 1c) environments, with values between 0.6 and 0.8 for most of the timepoints. In these environments, heritability estimates are quite homogeneous across wavelengths, although in Irrigated, the lowest heritabilities were higher than 0.4 and were observed for wavelengths before 570 nm and those in the 580–700 nm range, while in EarlyHeat, the wavelengths with the lowest heritability were found before wavelengths of 480 nm and between wavelengths of 680–730 nm and all were higher than 0.4. On the other hand, the environment with the lowest heritability was Drought (Fig. 1b); the heritability before 450 nm and those in the 600–700 nm range are very low (around 0.2), while the rest of the bands with the highest heritability show values of around 0.6. The rest of the environments [Melgas (Fig. 1d) and Reduced Irrigated (Fig. 9; “Appendix 3”)] have intermediate heritability although they are very heterogeneous between timepoints and across bands. For example, in the Melgas environment, we observed heterogeneity of heritabilities between timepoints and across wavelengths and for wavelengths >750 nm for all timepoints. While in Reduced Irrigated, the lowest heritabities (around 0.3) were observed in the 590–700 nm wavelength range, for six timepoints (Fig. 9; “Appendix 3”).
In Early heat all timepoints correspond to after heading stage while in the other trials timepoints one to four were taken before heading, time points five and six during heading stage and after heading seven to nine time points. There is not a clear relationship between the heritability and the stage of the crop in which the images were taken.
Prediction accuracies of the proposed methods
Comparing vegetation indices versus all bands
Figure 2 shows the prediction accuracy of Methods 1–19 in the four environments for three timepoints (1, 5 and 9). In the Drought environment, the methods with the best prediction accuracy were those that used all the bands (M10–M19), while in the Irrigated environment in timepoint 9, most of the methods that use all bands were the best in terms of prediction accuracy (methods M11–M19); however, in timepoint 5, only methods M17 and M18 were better in terms of prediction accuracy than the methods that were built using the vegetation indices (M1–M9), but in timepoint 1, methods M1, M3 and M4 built using the vegetation indices had the best prediction accuracy. In the EarlyHeat environment (Fig. 2c), the methods with the best prediction accuracy were methods M17 and M18, which use all the available wavelengths, although it is important to point out that timepoint 1 was better than timepoints 5 and 9 in methods M10 to M16, which use all the bands. Also, in the Melgas (Fig. 2d) and Reduced Irrigated environments (Fig. 10; “Appendix 3”), methods M17 and M18 had the best prediction accuracy. However, in these environments the best prediction was observed in timepoint 9 and the worst, in timepoint 1. Appendices 1 and 2 show the rest of the prediction accuracies for timepoints 2, 3, 4, 6, 7 and 8 for all methods.
Comparing some methods for all timepoints
Compared in this section are methods M1, M2, M9, M10, M17, M18, and M19. Methods M1 and M2 were chosen because they were built using two of the most widely used vegetation indices (RNDVI, GNDVI), whereas M9 uses all 8 VI simultaneously; M10 uses all bands and Bayes B. Methods M17 and M18 were included because they provided the best prediction accuracies in all the environments using all bands, while method M19 was used because it performed well in the last section using all bands.
Drought (Fig. 3), Melgas (Fig. 4) and Reduced Irrigated (Fig. 11; “Appendix 3”) environments show that timepoints below 6 had lower prediction accuracies and the best predictions were from points 7, 9 and the joint timepoints 79, 89, 789, and 6789. Timepoint 79, 89, 789 and 6789 were obtained as the average of the timepoints 7 and 9, 8 and 9, 7, 8 and 9, and 6, 7, 8 and 9 respectively; this nomenclature is used in the rest of the manuscript. It is important to point out that the prediction accuracies of timepoint 8 were considerably lower than those of timepoints 7, 9, and 6. In the Irrigated environment (Fig. 3), a similar trend is observed, yet for some methods (M17 and M18), timepoint 7 provides the best predictions. In this environment (Irrigated), the differences in prediction accuracy between timepoint 5 and timepoints 6, 7, 8, 9, 79, 89, 789, and 6789 are not strong. This indicates that even with timepoint 5, we can generate good prediction accuracies for grain yield. In the EarlyHeat environment (Fig. 4), all timepoints produced good predictions, although methods M1 and M2 produced lower predictions in timepoints 7, 9, 79, 89, 789 and 6789. It is important to point out that methods M17 and M18 were the best in all timepoints in all environments, although in environments EarlyHeat and Melgas, the superiority of these methods is clearer.
Comparing environments for timepoints 5 and 9
Figures 5 and 6 show that there are differences in prediction accuracy between environments. In timepoint 5 (Fig. 5), EarlyHeat was the environment with the best predictions, followed by Irrigated and Drought, while the worst predictions were observed in Melgas and Reduced Irrigation. In timepoint 3 (Fig. 6), the behavior was similar to that of timepoint 5, since EarlyHeat was also the best in terms of prediction accuracy; however, here Melgas was the worst and the other three environments were in the middle. In timepoints 7 (Fig. 6) and 9 (Fig. 5), the pattern was different since here the best predictions were in the Drought environment, and the second best was EarlyHeat, since in four of the seven methods presented in Fig. 5, this environment had the second best predictions. In third place is the Irrigated environment, while the worst predictions were observed in Melgas and Reduced Irrigated. It is important to point out that methods M17 and M18 were consistently the best in the five environments. Furthermore, it should be noted that the planting date in EarlyHeat is around 5 weeks earlier than the planting dates in the other four environments; thus the comparison of prediction performance at the same timepoints does not represent a comparison at the same crop development stage.
Comparing methods M17 and M18 using all the bands and bands with heritabilities >0.5
Figure 7 compares methods M17 and M18 for all time points using all bands and only those bands with heritabilities >0.5. We observe in Drought that when only the bands with heritabilities >0.5 were used, prediction accuracies were better than when using all bands for both methods (M17 and M18). However, in the Irrigated environment using all bands, prediction accuracies were slightly better than when using only the bands with heritability >0.5; however, the difference is not relevant.
In Fig. 8, we observe that in both environments (EarlyHeat and Melgas), using all the bands was a little better than using only the bands with heritabilities >0.5, although the differences were not significant. The same pattern is observed for the Reduced Irrigated environment (Fig. 12; “Appendix 3”).
Discussion
Heritability estimates of the bands
Results indicated that the heritabilities of each wavelength are not homogeneous across groups. The Irrigated and EarlyHeat environments had the highest heritabilities (with values between 0.6 and 0.8), which were homogenous across wavelengths, while Drought had the lowest heritabilities (with lowest values around 0.2), which were heterogeneous across wavelengths and timepoints. Results in “Comparing vegetation indices vs all bands” section indicate that using all bands simultaneously as explanatory variables produced better prediction accuracies that using the VIs alone or combined. However, predictions were better when using only those bands with heritabilities >0.5 compared with using all bands only in Drought, while in the other 4 environments, using all bands produced slightly better prediction accuracies (“Comparing methods M17 and M18 using all the bands and bands with heritabilities>0.5” section). Therefore, the evidence indicates that using all bands simultaneously provided better prediction accuracies than using the VI alone or combined and even than using those bands with heritabilities >0.5. However, it is important to point out that the methods that used VI alone as predictor variables are very heterogeneous in terms of prediction accuracy, since some performed very poorly, while others produced reasonable predictions (for example, M6).
Prediction accuracy of the methods
Since we now have enough evidence to say that using all bands produced better predictions than using individual, combined VI and even when we restrict the models to less noisy features (H2 > 0.5), we compared the methods that used all the bands. Based on the prediction accuracy of the methods, results indicate that for this data set, methods M17 and M18 are the best for prediction. These two methods were better in all environments and in most of the nine timepoints, and were also considerably better than the PC methods (M11 to M16), the Bayes B method (M10), and a little better than the functional PLS method (M19). The best two methods (M17 and M18) are functional regression models and correspond to models 5 and 6 described in “Proposed single timepoint models for the adjusted data” section. Functional regression models nowadays have become an increasingly important statistical tool when the number of covariates is larger than the number of observations, where the unit of observation is generally viewed as a function or a curve defined based on some underlying continuous domain, and the observed data consist of a sample of functions taken from some population, sampled on a discrete grid.
Given the nature of our data, the functional regression that we implemented only considered functional predictors; however, this regression method can also be used when both the predictors and the responses are functions. For this reason, functional regression models have been implemented successfully in many research areas (spectroscopy, economics, environmental studies, bioscience, system engineering, etc.). Functional regression is also very attractive because it is a nondestructive technology that measures numerous chemical compounds in a variety of products (plant, soil, food, petroleum, wood products, etc.) and can be used in large databases in experimental and nonexperimental settings.
Prediction accuracy for timepoints
Regarding the prediction accuracy for timepoints, in general, prediction accuracies before timepoint 6 were poor in four environments, and all timepoints produced good predictions only in the EarlyHeat environment; a likely explanation for this may be that in this environment the sowing date was around 5 weeks earlier than the sowing dates in the other 4 environments, that is, the development of the crop for all timepoints was more advanced in EarlyHeat. For this reason, the empirical evidence indicates that, for this dataset, timepoint 6 achieved good prediction accuracy. Also, in general, timepoint 6 predictions are better than timepoint 8 predictions. However, we need to be careful when interpreting timepoint 6, since sowing time was different in each environment and the plants were at different growth stages when the bands were measured. Using this timepoint can be helpful for breeders, since it is around 28 days before timepoint 9. Also, it is important to point out that the predictions of the average timepoints under study (79, 89, 789, and 6789) are a little better than those of timepoints 6, 7 and 9 in methods that used all the bands; however, the increase in prediction accuracy is not large.
Conclusions
In this research, we proposed using all the bands simultaneously as predictor variables instead of using only one VI alone or all the VI together. First, we found that the heritabilities of the bands were heterogeneous across timepoints and environments and that the best heritabilities were observed in the Irrigated and EarlyHeat environments and the worst in the Drought environment. We found that using all the bands simultaneously produced better predictions than using one VI alone or all the VI together. When we used only the less noisy bands (H2 > 0.5) in Drought, the predictions improved, while in the rest of the environments the results were similar. Out of the methods that used all the bands, the best methods across timepoints and environments were M17 and M18 (functional Bspline and Fourier, respectively). Also, timepoint 6 and 8 predictions were slightly lower than those of timepoints 7 and 9, yet were close enough to be used for the prediction of wheat lines before harvesting. Finally, results show that the approach used to analyze highresolution image data used in this study is promising; however, it is also clear that its application in this context is not straightforward, since the Bayes B method, which is popular for genomic selection, did not produce the best predictions. There are many challenges that need to be considered in future research using functional regression models, such as the inclusion of genotype × environment interaction, random effects, traits not normally distributed and multiple traits as response variables. Also, other conventional methods (GBLUP, Bayes A, Ridge Regression, Bayes C) used in genomicenabled prediction should be tested in the context of highresolution imaging data.
Abbreviations
 APC:

average of the tenfold crossvalidation of the Pearson correlation
 BLUP:

best linear unbiased predictor
 CIMMYT:

Centro Internacional de Mejoramiento de Maíz y Trigo
 CT:

canopy temperature
 DH:

days to heading
 GNDVI:

green normalized difference vegetation index
 GY:

grain yield
 H2:

broadsense heritability
 HTPP:

highthroughput phenotyping platforms
 mND:

modified normalized difference at 705 nm wavelength
 NDVI:

normalized difference vegetation index
 NPQI:

normalized pheophytinization index
 OLS:

ordinary least square
 PC:

principal components
 PLS:

partial least sqaure
 PRI:

photochemical reflectance index
 RARSa:

ratio analysis of reflectance spectra chlorophyll a
 RARSb:

ratio analysis of reflectance spectra chlorophyll b
 RARSc:

ratio analysis of reflectance spectra chlorophyll c
 RNDVI:

red normalized difference vegetation index
 SRa:

simple ratio
 VI:

vegetative index
References
 1.
Araus JL, Cairns JE. Field highthroughput phenotyping: the new crop breeding frontier. Trends Plant Sci. 2014;19(1):52–61.
 2.
Bahar B, Yildirim M, Barutcular C, Ibrahim G. Effect of canopy temperature depression on grain yield and yield components in bread and durum wheat. Notulae Botanicae Horti Agrobotanici ClujNapoca. 2008;36:34.
 3.
Boegh E, Soegaard H, Broge N, Hasager CB, Jensen NO, Schelde K, et al. Airborne multispectral data for quantifying leaf area index, nitrogen concentration, and photosynthetic efficiency in agriculture. Remote Sens Environ. 2002;81:179–93.
 4.
Broge NH, Mortensen JV. Deriving green crop area index and canopy chlorophyll density of winter wheat from spectral reflectance data. Remote Sens Environ. 2002;81:45–57.
 5.
Clevers JGPW. The application of a weighted infraredred vegetation index for estimating leafarea index by correcting for soilmoisture. Remote Sens Environ. 1989;29:25–37.
 6.
Clevers JGPW, van der Heijden GWAM, Verzakov S, Schaepman ME. Estimating grassland biomass using SVM band shaving of hyperspectral data. Photogramm Eng Remote Sens. 2007;73(10):1141–8.
 7.
Colombo R, Bellingeri D, Fasolini D, Marino CM. Retrieval of leaf area index in different vegetation types using high resolution satellite data. Remote Sens Environ. 2003;86:120–31.
 8.
Curran PJ. Estimating green LAI from multispectral aerialphotography. Photogramm Eng Remote Sens. 1983;49:1709–20.
 9.
Curran PJ. Multispectral remotesensing for the estimation of green leaf area index. Philos Trans R Soc Lond Ser A Math Phys Eng Sci. 1983;309:257–70.
 10.
Falconer DS, Mackay TFC. Introduction to quantitative genetics. Harlow: Longman; 1996.
 11.
FebreroBande M, Oviedo de la Fuente M. Statistical computing in functional data analysis: the R package fda. usc. J Stat Softw. 2012;51(4):1–28.
 12.
Ferragina A, de Los Campos G, Vazquez AI, Cecchinato A, Bittante G. Bayesian regression models outperform partial least squares methods for predicting milk components and technological properties using infrared spectral data. J Dairy Sci. 2015;98(11):8133–51.
 13.
Govaerts YM, Verstraete MM, Pinty B, Gobron N. Designing optimal spectral indices: a feasibility and proof of concept study. Int J Remote Sens. 1999;20:1853–73.
 14.
Hernandez J, Lobos GA, Matus I, del Pozo A, Silva P, Galleguillos M. Using ridge regression models to estimate grain yield from field spectral data in bread wheat (Triticum aestivum L.) grown under three water regimes. Remote Sens. 2015;7(2):2109–26.
 15.
Labus M, Nielsen G, Lawrence R, Engel R, Long D. Wheat yield estimates using multitemporal NDVI satellite imagery. Int J Remote Sens. 2002;23:4169–80.
 16.
Li L, Zhang Q, Huang D. A review of imaging techniques for plant phenotyping. Sensors. 2014;14(11):20078–111.
 17.
Mason RE, Singh RP. Considerations when deploying canopy temperature to select high yielding wheat breeding lines under drought and heat stress. Agronomy. 2014;4:191–201.
 18.
Myneni RB, Hall FG, Sellers PJ, Marshak AL. The interpretation of spectral vegetation indexes. IEEE Trans Geosci Remote Sens. 1995;33:481–6.
 19.
Pask AJD, Pietragalla J, Mullan DM, Reynolds MP. Physiological breeding II: a field guide to wheat phenotyping. Mexico: CIMMYT; 2012.
 20.
Pérez P, de los Campos G. BGLR: a statistical package for whole genome regression and prediction. R package version 1. 0.2. 2013.
 21.
Pinty B, Leprieur C, Verstraete MM. Towards a quantitative interpretation of spectral vegetation indexes. Remote Sens Rev. 1993;7:127–50.
 22.
Quarmby N, Milnes M, Hindle T, Silleos N. The use of multitemporal NDVI measurements from AVHRR data for crop yield estimation and prediction. Int J Remote Sens. 1993;14:199–210.
 23.
Ramsay JO, Silverman BW. Applied functional data analysis: methods and case studies, vol. 77. New York: Springer; 2002.
 24.
Rees D, Sayre K, Acevedo E, Nava Sanchez T, Lu Z, Zeiger E, et al. Canopy temperatures of wheat: relationships with yield and potential as a technique for early generation selection. Mexico: CIMMYT; 1993.
 25.
Rutkoski J, Poland J, Mondal S, Autrique E, Crossa J, Reynolds MP, Singh RP. Predictor traits from highthroughput phenotyping improve accuracy of pedigree and genomic selection for yield in wheat. G3:GenesGenomesGenetics (accepted). 2016.
 26.
Sims DA, Gamon JA. Estimation of vegetation water content and photosynthetic tissue area from spectral reflectance: a comparison of indices based on liquid water and chlorophyll absorption features. Remote Sens Environ. 2003;84(4):526–37.
 27.
Singh A, Ganapathysubramanian B, Singh AK, Sarkar S. Machine learning for highthroughput stress phenotyping in plants. Trends Plant Sci. 2016;21(2):110–24.
 28.
Tucker CJ. Red and photographic infrared linear combinations for monitoring vegetation. Remote Sens Environ. 1979;8:127–50.
 29.
Viña A, Gitelson AA, NguyRobertson AL, Peng Y. Comparison of different vegetation indices for the remote assessment of green leaf area index of crops. Remote Sens Environ. 2011;115(12):3468–78.
 30.
Winterhalter L, Mistele B, Jampatong S, Schmidhalter U. High throughput sensing of aerial biomass and aboveground nitrogen uptake in the vegetative stage of wellwatered and drought stressed tropical maize hybrids. Crop Sci. 2011;51:479–89.
 31.
Xiao X, He L, Salas W, Li C, Moore B, Zhao R, et al. Quantitative relationships between fieldmeasured leaf area index and vegetation index derived from vegetation images for paddy rice fields. Int J Remote Sens. 2002;23:3595–604.
Authors’ contributions
OAML and AML performed all analysis. JR and MS designed the research and collected the data. OAML, JC, GA, GdC and JB were involved in drafting and writing the manuscript. All authors read and approved the final manuscript.
Source of funding
Funding was provided by “Centro Internacional de Mejoramiento de Maíz y Trigo”.
Competing interests
The authors declare that they have no competing interests.
Availability of data and materials
The data and materials used in this study can be downloaded from the link: http://hdl.handle.net/11529/10693. The links contains file corresponding to the phenotypic and bands data for each environments, Drought.Phe_and_Bands.RData, EarlyHeat.Phe_and_Bands.RData, Irrigated.Phe_and_Bands.RData, Irrigated.Phe_and_Bands.RData.
Author information
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
About this article
Cite this article
MontesinosLópez, O.A., MontesinosLópez, A., Crossa, J. et al. Predicting grain yield using canopy hyperspectral reflectance in wheat breeding data. Plant Methods 13, 4 (2017) doi:10.1186/s1300701601542
Received
Accepted
Published
DOI
Keywords
 Spectral data
 Vegetation indexes
 Prediction accuracy
 Genome selection
 Bayes B
 Spline regression
 Fourier regression
 Wheat