Predicting grain yield using canopy hyperspectral reflectance in wheat breeding data
 Osval A. MontesinosLópez^{1, 2},
 Abelardo MontesinosLópez^{3},
 José Crossa^{1},
 Gustavo de los Campos^{4},
 Gregorio Alvarado^{1},
 Mondal Suchismita^{1},
 Jessica Rutkoski^{1, 5},
 Lorena GonzálezPérez^{1} and
 Juan Burgueño^{1}Email authorView ORCID ID profile
DOI: 10.1186/s1300701601542
© The Author(s) 2017
Received: 22 July 2016
Accepted: 1 December 2016
Published: 3 January 2017
Abstract
Background
Modern agriculture uses hyperspectral cameras to obtain hundreds of reflectance data measured at discrete narrow bands to cover the whole visible light spectrum and part of the infrared and ultraviolet light spectra, depending on the camera. This information is used to construct vegetation indices (VI) (e.g., green normalized difference vegetation index or GNDVI, simple ratio or SRa, etc.) which are used for the prediction of primary traits (e.g., biomass). However, these indices only use some bands and are cultivarspecific; therefore they lose considerable information and are not robust for all cultivars.
Results
This study proposes models that use all available bands as predictors to increase prediction accuracy; we compared these approaches with eight conventional vegetation indexes (VIs) constructed using only some bands. The data set we used comes from CIMMYT’s global wheat program and comprises 1170 genotypes evaluated for grain yield (ton/ha) in five environments (Drought, Irrigated, EarlyHeat, Melgas and Reduced Irrigated); the reflectance data were measured in 250 discrete narrow bands ranging between 392 and 851 nm. The proposed models for the simultaneous analysis of all the bands were ordinal least square (OLS), Bayes B, principal components with Bayes B, functional Bspline, functional Fourier and functional partial least square. The results of these models were compared with the OLS performed using as predictors each of the eight VIs individually and combined.
Conclusions
We found that using all bands simultaneously increased prediction accuracy more than using VI alone. The Splines and Fourier models had the best prediction accuracy for each of the nine timepoints under study. Combining image data collected at different timepoints led to a small increase in prediction accuracy relative to models that use data from a single timepoint. Also, using bands with heritabilities larger than 0.5 only in Drought as predictor variables showed improvements in prediction accuracy.
Keywords
Spectral data Vegetation indexes Prediction accuracy Genome selection Bayes B Spline regression Fourier regression WheatBackground
Plant breeding programs routinely perform early field evaluations of large numbers of candidates for selection based not only on the main primary trait—grain yield measured in different environments—but also on several secondary traits related to yield, such as disease resistance. Methods that could help breeders measure grain yield based on other secondary traits in the early stages of plant growth could be of value to help reduce evaluation time and cost [25]. In recent years, the use of remote or proximal sensing, hyperspectral imaging, and laser scanners has helped develop lowcost, efficient highthroughput phenotyping platforms (HTPP) [1] which aim to collect data at low cost on many phenotypes of large numbers of breeding individuals at different stages of plant growth under different environmental conditions. This could drastically increase the number of traits that can be quantified on fieldgrown plants, selection intensity, prediction accuracy, and, therefore, the response to selection [25].
The basis of remote sensing and spectral science is the ability to measure electromagnetic energy at varying wavelengths that interact with different parts of the growing plant. The goal of spectral science is to measure a phenotype quantitatively through the interaction between light and plants, such as reflected, absorbed, transmitted and/or emitted photons. This is possible because each component of plant cells and tissues has wavelengthspecific absorbance, reflectance and transmittance properties [16]. For example, a healthy plant interacts (absorbs, reflects, emits, transmits and fluoresces) with electromagnetic radiation in a different way than an infected plant [16].
In practical applications of HTPP to agriculture, the reflectance of electromagnetic energy at different wavelength bands is usually summarized in scores of spectral vegetative indices (VI) that are further used to predict plant physiological issues or agronomic traits. Spectral VI are a simple and convenient way of extracting information from remotely sensed data that facilitates the processing and analysis of large amounts of data acquired by modern cameras and satellite platforms [13, 18]. Significant advances have been achieved in understanding the nature and proper interpretation of spectral VI [18, 21] and theoretical frameworks have been proposed to support the development of indices optimized for particular applications. Popular VIs are the Normalized Difference Vegetation Index (NDVI), the Canopy Water Mass Index [30] and the Modified Normalized Difference at 705 nm wavelength (mND; e.g., [26]). Vegetative indices have been applied successfully in some crops [3–9, 31]. For example, the canopy temperature (CT) and NDVI indices have been applied to estimate yield, taking advantage of the correlation between yield and these two VIs [2, 15, 17, 22]. Also, it has been documented that the aircanopy temperature difference index can be used as a selection criterion in wheat (Triticum aestivum L.) breeding programs to estimate yield [24]. The NDVI has also been used successfully to estimate wheat yield before harvest at the regional and farm scale [15].
Although several spectral VIs are positively correlated with grain yield and other important agronomic and physiological traits, they do not consider all the spectral bands from the hyperspectral sensors [28, 29]. Nevertheless, cameras with high spectral resolution can produce data on hundreds of spectral bands that can be further used to capture a wide range of information. Also, despite some successful applications of spectral VIs, most of these indices tend to be speciesspecific and, therefore, are not robust when applied across different species that have different canopy architectures and leaf structures because they use only a fraction of the available information on the measured wavelengths.
The important idea is that for sensor data to be meaningful, algorithms need to be developed to interpret the data and extract the most useful information to be translated into important traits for plant breeding. Thus, the use of highresolution images is important to develop prediction models for grain yield, yield components, and relevant physiological and agronomical traits. However, the enormous volume, variety, and velocity of HTPP data generated by such platforms make it a ‘big data’ problem. Big data generated by these near realtime platforms must be efficiently archived and retrieved for analysis [27]. The analysis and interpretation of these large datasets is quite challenging, although several authors have proposed using sensor data through a linear regression based on standard ordinary least squares [29]. To overcome collinearity among predictors (bands from the sensor), Hernandez et al. [14] concluded that penalized ridge regression models from spectral reflectance data at anthesis or grainfilling predict grain yield well under different water levels. Recently, Ferragina et al. [12] proved that highdimensional Bayesian regression models (similar to those used in genomic selection for predicting the performance of unobserved individuals based on a large number of markers) can be used to derive functions of hundreds of wavelengths. However, no Bayesian regression models or other functional regression models that define the function of wavelength for the prediction of grain yield and other traits in different environmental conditions have been studied in plant breeding [6].
Spectral reflectance indices
Index  Name  Physiological process  Type  Calculation 

RNDVI  Red normalized difference vegetation index  Green area, photosynthetic capacity, N status  VI  [x(780) − x(670)]/[x(780) + x(670)] 
GNDVI  Green normalized difference vegetation index  Green area, photosynthetic capacity, N status  VI  [x(780) − x(550)]/[x(780) + x(550)]/ 
SRa  Simple ratio  Green biomass  VI  [x(800)/]/[x(680)] and [x(900)]/[x(680)] 
RARSa  Ratio analysis of reflectance spectra chlorophyll a  Chlorophyll a content  PI  [x(675)]/[x(700)] 
RARSb  Ratio analysis of reflectance spectra chlorophyll b  Chlorophyll b content  PI  [x(675)]/[x(650) × x(700)] 
RARSc  Ratio analysis of reflectance spectra chlorophyll c  Chlorophyll c content  PI  [x(760)]/[x(500)] 
NPQI  Normalized pheophytinization index  Normal chlorophyll degradation; can be used to estimate phenology, pest and diseases  PI  [x(415) − x(435)]/[x(415) + x(435)] 
PRI  Photochemical reflectance index  Dissipation of excess radiation  PI  [x(531) − x(570)]/[x(531) + x(570)] 
Methods
Data
All 1170 lines were evaluated in all environments [Drought (severe drought), EarlyHeat (irrigated early planting for heat at sowing), Irrigated (irrigated bed planting), Melgas (irrigated flat planting) and Reduced Irrigated (moderate drought)]. In each environment, the lines were included in 39 trials, each comprising 30 lines; in each trial, the lines were studied using an alphalattice design with three replicates and six experimental blocks. The planting dates were all in 2013 as follows: EarlyHeat on October 30, Irrigated and Reduced Irrigated on November 21, Drought on November 26 and Melgas on December the 1st. The traits measured on each line were grain yield (GY) and days to heading (DH), but only GY was analyzed in this study.
The image data was obtained using a Piper PA16 Clipper flight that was fitted with a Hyperspectral camera (Model: Aseries, MicroHyperspec Airborne sensor, VNIR Headwall Photonics, www.headwallphotonics.com, Fitchburg, Massachusetts, US) and thermal camera (A600 series Infrared camera, FLIR, www.flir.com, Boston, US). The plane flew at 270 m above the surface.
The aerial high throughput phenotyping (HTP) data was measured around solar noon time every date, aligning the plane to the solar azimuth for the data acquisition. Images of the experimental fields were obtained and formatted to tabular data by calculating the mean value of the pixels inside the center of each individual trial plot represented as a polygon area on a map. The software used to achieve this was ArcMap (ESRI, USA, CA).
On the data processing, the 38 cm per pixel CT data was corrected with a linear calibration of slope 1.2253 and Y intercept −6767.9 with the software ImapQ (Alava Ingenieros, Madrid, Spain). The several individual images of each flight were used to compose a unique mosaic per date with the software Autopano Giga (Kolor SARL, France). Then they were manually georeferenced using ArcMap (ESRI, USA, CA). The original image data is stored in kelvin units ×100, the next formula was applied with ENVI software (Excelis VIS, USA, CO) to convert the pixel values to Celsius degrees: (Pixel value)/100 − 273.15.
As well, the 30 cm per pixel hyperspectral data was processed with the software HyproQ (Alava Ingenieros, Madrid, Spain). First the images had a radiometric calibration with coefficients provided by the Laboratory for Research Methods in Quantitative Remote Sensing of the Consejo Superior de Investigaciones Científicas (QuantaLab, IASCSIC, Spain) derived with a calibrated uniform light source; additionally the dark frame subtraction was performed to reduce the noise of the sensor. Corrections to decrease the effects of the atmosphere conditions in the images was performed modeling irradiance based on sunphotometer field measurements (Microtops II, Solar Light Company, PA, USA). The images were orthorectified and coarsely georeferenced based on the builtin Inertial Navigation System (INS/GPS). For the data extraction, where the image did not overlay the plots polygons because of INS inaccuracy, they were manually aligned to it in ArcMap.
The bands were measured on nine different dates (January 10, 2015; January 17, 2015; January 30, 2015; February 7, 2015; February 14, 2015; February 19, 2015; February 27, 2015; March 11, 2015; and March 17, 2015; which we called timepoints 1, 2, 3,…, 9, respectively) using 250 discrete narrow wavelengths. In each plot, 250 wavelengths λ_{1}, … λ_{250} from 392.03 to 850.69 nm were measured for each wheat line. The ith discretized spectrometic curve is given by x_{1}(λ_{1}), …, x_{n}(λ_{250}). We used the notation x(780) without subscripts to denote the response of the band measured at a wavelength of 780 nm, x(670) to denote the response of the band measured at a wavelength of 670 nm, and so on.
Note that early heat trial was planted in average 26 days earlier than the other trials; therefore, comparisons between individual timepoints between heat trial and the others environments should consider data on the number of weeks since sowing. The comparison between the environments except early heat trial can be done more less fairly since the heading date ranged from 77 to 82 days after sowing, a period of five days in average. In all the environments heading date happened after the timepoint four except in the Melgas environment in which heading date occurred at the same moment of timepoint six.
Definition and computation of spectral vegetation indices
Eight different VIs were constructed with the 250 bands and are described in Table 1.
Statistical methods
Adjusting the original data set
Proposed single timepoint models for the adjusted data

Model 1 Index ordinal least square (OLS) regression: \( y_{i} = \mu + z_{im} \alpha_{m} + \epsilon_{i} \),

Model 2 Joint index OLS regression: \( y_{i} = \mu + \sum \nolimits_{m = 1}^{8} z_{im} \delta_{m} + \epsilon_{i} \),

Model 3 All bands Bayes B regression: \( y_{i} = \mu + \sum\nolimits_{k = 1}^{250} x_{ik} \beta_{k} + \epsilon_{i} \),

Model 4 PC Bayes B regression: \( y_{i} = \mu + \sum\nolimits_{l = 1}^{NPC} PC_{il} \gamma_{k} + \epsilon_{i} \),

Model 5 All bands functional Bspline regression: \( y_{i} = \mu + \int x_{i} \left(k\right)\beta_{1} \left(k\right) dk + \epsilon_{i} \),

Model 6 All bands functional Fourier regression: \( y_{i} = \mu + \int x_{i} \left(k\right) \beta_{2} \left(k\right) dk + \epsilon_{i} \),

Model 7 All bands functional PLS regression: \( y_{i} = \mu + \int x_{i} \left(k\right) \beta_{3} \left(k\right) dk + \epsilon_{i} \),
The three proposed functional regression models (Models 5, 6 and 7) are the most popular functional regression models, where the responses are scalars and the covariates are functions. For this reason, the response variable (y _{ i }) is a scalar in all the proposed models and represents grain yield (GY). Also, the difference between Models 5, 6 and 7 is the basis used for representing β _{ o }(k), with o = 1, 2, 3. Here a basis is understood to be a set of standard functions \( (\phi_{w} )_{{w \in {\mathbb{N}}}} \) that are used to approximate any function of interest by a linear combination of a sufficiently large r _{ w } of these functions [23]. In Model 6, we assumed the Bspline basis; in Model 7, we used the Fourier basis; and in Model 8, the basis is the PLS. More details on the theory behind Models 5, 6, and 7 and their basis can be found in Ramsay and Silverman [23].
Methods implemented for the analyses in each environment
Method  Data  Model  Method  Data  Model 

M1  RNDVI  Model 1  M11  All bands with 5PC  Model 4 
M2  GNDVI  Model 1  M12  All bands with 10PC  Model 4 
M3  SRa  Model 1  M13  All bands with 20PC  Model 4 
M4  RARSa  Model 1  M14  All bands with 35PC  Model 4 
M5  RARSb  Model 1  M15  All bands with 45PC  Model 4 
M6  RARSc  Model 1  M16  All bands with 55PC  Model 4 
M7  NPQI  Model 1  M17  All bands  Model 5 
M8  PRI  Model 1  M18  All bands  Model 6 
M9  All indices  Model 2  M19  All bands  Model 7 
M10  All bands  Model 3 
It is important to point out that methods M1–M8 used only one of the 8 VIs, M9 used all 8 VIs simultaneously, M10–M19 used all 250 bands, but methods M11–M16 used all bands to perform a principal component analysis and then used different numbers of principal components (5, 10, 20, 35, 45, 55 PCs), as shown in Table 2. Additionally, methods M17 and M18 were implemented with only those bands whose heritability is >0.5.
Assessing prediction accuracy
For the prediction accuracies of the 19 proposed methods presented in Table 2, we implemented a tenfold crossvalidation—with 1053 (90%) lines for training and 117 (10%) for testing in each fold—that was assessed by the Pearson correlation between the observed BLUPs of GY and their predicted values using the testing data set. We reported the average of the tenfold crossvalidation of the Pearson correlation (APC) as measure of prediction accuracy as well as the quantiles 2.5 (LL) and 97.5% (UL) (see “Appendices 1, 2”). It is important to point out that we used the same Split (of the tenfold crossvalidation) in the 19 methods to ensure fair comparisons between methods.
Results
The results are given in two sections: the first section presents the heritability estimates of each of the 250 wavelengths for each environment, while the second section presents the prediction accuracies estimated under the implemented methods.
Heritability estimates
In Early heat all timepoints correspond to after heading stage while in the other trials timepoints one to four were taken before heading, time points five and six during heading stage and after heading seven to nine time points. There is not a clear relationship between the heritability and the stage of the crop in which the images were taken.
Prediction accuracies of the proposed methods
Comparing vegetation indices versus all bands
Comparing some methods for all timepoints
Compared in this section are methods M1, M2, M9, M10, M17, M18, and M19. Methods M1 and M2 were chosen because they were built using two of the most widely used vegetation indices (RNDVI, GNDVI), whereas M9 uses all 8 VI simultaneously; M10 uses all bands and Bayes B. Methods M17 and M18 were included because they provided the best prediction accuracies in all the environments using all bands, while method M19 was used because it performed well in the last section using all bands.
Comparing environments for timepoints 5 and 9
Comparing methods M17 and M18 using all the bands and bands with heritabilities >0.5
Discussion
Heritability estimates of the bands
Results indicated that the heritabilities of each wavelength are not homogeneous across groups. The Irrigated and EarlyHeat environments had the highest heritabilities (with values between 0.6 and 0.8), which were homogenous across wavelengths, while Drought had the lowest heritabilities (with lowest values around 0.2), which were heterogeneous across wavelengths and timepoints. Results in “Comparing vegetation indices vs all bands” section indicate that using all bands simultaneously as explanatory variables produced better prediction accuracies that using the VIs alone or combined. However, predictions were better when using only those bands with heritabilities >0.5 compared with using all bands only in Drought, while in the other 4 environments, using all bands produced slightly better prediction accuracies (“Comparing methods M17 and M18 using all the bands and bands with heritabilities>0.5” section). Therefore, the evidence indicates that using all bands simultaneously provided better prediction accuracies than using the VI alone or combined and even than using those bands with heritabilities >0.5. However, it is important to point out that the methods that used VI alone as predictor variables are very heterogeneous in terms of prediction accuracy, since some performed very poorly, while others produced reasonable predictions (for example, M6).
Prediction accuracy of the methods
Since we now have enough evidence to say that using all bands produced better predictions than using individual, combined VI and even when we restrict the models to less noisy features (H2 > 0.5), we compared the methods that used all the bands. Based on the prediction accuracy of the methods, results indicate that for this data set, methods M17 and M18 are the best for prediction. These two methods were better in all environments and in most of the nine timepoints, and were also considerably better than the PC methods (M11 to M16), the Bayes B method (M10), and a little better than the functional PLS method (M19). The best two methods (M17 and M18) are functional regression models and correspond to models 5 and 6 described in “Proposed single timepoint models for the adjusted data” section. Functional regression models nowadays have become an increasingly important statistical tool when the number of covariates is larger than the number of observations, where the unit of observation is generally viewed as a function or a curve defined based on some underlying continuous domain, and the observed data consist of a sample of functions taken from some population, sampled on a discrete grid.
Given the nature of our data, the functional regression that we implemented only considered functional predictors; however, this regression method can also be used when both the predictors and the responses are functions. For this reason, functional regression models have been implemented successfully in many research areas (spectroscopy, economics, environmental studies, bioscience, system engineering, etc.). Functional regression is also very attractive because it is a nondestructive technology that measures numerous chemical compounds in a variety of products (plant, soil, food, petroleum, wood products, etc.) and can be used in large databases in experimental and nonexperimental settings.
Prediction accuracy for timepoints
Regarding the prediction accuracy for timepoints, in general, prediction accuracies before timepoint 6 were poor in four environments, and all timepoints produced good predictions only in the EarlyHeat environment; a likely explanation for this may be that in this environment the sowing date was around 5 weeks earlier than the sowing dates in the other 4 environments, that is, the development of the crop for all timepoints was more advanced in EarlyHeat. For this reason, the empirical evidence indicates that, for this dataset, timepoint 6 achieved good prediction accuracy. Also, in general, timepoint 6 predictions are better than timepoint 8 predictions. However, we need to be careful when interpreting timepoint 6, since sowing time was different in each environment and the plants were at different growth stages when the bands were measured. Using this timepoint can be helpful for breeders, since it is around 28 days before timepoint 9. Also, it is important to point out that the predictions of the average timepoints under study (79, 89, 789, and 6789) are a little better than those of timepoints 6, 7 and 9 in methods that used all the bands; however, the increase in prediction accuracy is not large.
Conclusions
In this research, we proposed using all the bands simultaneously as predictor variables instead of using only one VI alone or all the VI together. First, we found that the heritabilities of the bands were heterogeneous across timepoints and environments and that the best heritabilities were observed in the Irrigated and EarlyHeat environments and the worst in the Drought environment. We found that using all the bands simultaneously produced better predictions than using one VI alone or all the VI together. When we used only the less noisy bands (H2 > 0.5) in Drought, the predictions improved, while in the rest of the environments the results were similar. Out of the methods that used all the bands, the best methods across timepoints and environments were M17 and M18 (functional Bspline and Fourier, respectively). Also, timepoint 6 and 8 predictions were slightly lower than those of timepoints 7 and 9, yet were close enough to be used for the prediction of wheat lines before harvesting. Finally, results show that the approach used to analyze highresolution image data used in this study is promising; however, it is also clear that its application in this context is not straightforward, since the Bayes B method, which is popular for genomic selection, did not produce the best predictions. There are many challenges that need to be considered in future research using functional regression models, such as the inclusion of genotype × environment interaction, random effects, traits not normally distributed and multiple traits as response variables. Also, other conventional methods (GBLUP, Bayes A, Ridge Regression, Bayes C) used in genomicenabled prediction should be tested in the context of highresolution imaging data.
Abbreviations
 APC:

average of the tenfold crossvalidation of the Pearson correlation
 BLUP:

best linear unbiased predictor
 CIMMYT:

Centro Internacional de Mejoramiento de Maíz y Trigo
 CT:

canopy temperature
 DH:

days to heading
 GNDVI:

green normalized difference vegetation index
 GY:

grain yield
 H2:

broadsense heritability
 HTPP:

highthroughput phenotyping platforms
 mND:

modified normalized difference at 705 nm wavelength
 NDVI:

normalized difference vegetation index
 NPQI:

normalized pheophytinization index
 OLS:

ordinary least square
 PC:

principal components
 PLS:

partial least sqaure
 PRI:

photochemical reflectance index
 RARSa:

ratio analysis of reflectance spectra chlorophyll a
 RARSb:

ratio analysis of reflectance spectra chlorophyll b
 RARSc:

ratio analysis of reflectance spectra chlorophyll c
 RNDVI:

red normalized difference vegetation index
 SRa:

simple ratio
 VI:

vegetative index
Declarations
Authors’ contributions
OAML and AML performed all analysis. JR and MS designed the research and collected the data. OAML, JC, GA, GdC and JB were involved in drafting and writing the manuscript. All authors read and approved the final manuscript.
Source of funding
Funding was provided by “Centro Internacional de Mejoramiento de Maíz y Trigo”.
Competing interests
The authors declare that they have no competing interests.
Availability of data and materials
The data and materials used in this study can be downloaded from the link: http://hdl.handle.net/11529/10693. The links contains file corresponding to the phenotypic and bands data for each environments, Drought.Phe_and_Bands.RData, EarlyHeat.Phe_and_Bands.RData, Irrigated.Phe_and_Bands.RData, Irrigated.Phe_and_Bands.RData.
Open AccessThis article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
Authors’ Affiliations
References
 Araus JL, Cairns JE. Field highthroughput phenotyping: the new crop breeding frontier. Trends Plant Sci. 2014;19(1):52–61.View ArticlePubMedGoogle Scholar
 Bahar B, Yildirim M, Barutcular C, Ibrahim G. Effect of canopy temperature depression on grain yield and yield components in bread and durum wheat. Notulae Botanicae Horti Agrobotanici ClujNapoca. 2008;36:34.Google Scholar
 Boegh E, Soegaard H, Broge N, Hasager CB, Jensen NO, Schelde K, et al. Airborne multispectral data for quantifying leaf area index, nitrogen concentration, and photosynthetic efficiency in agriculture. Remote Sens Environ. 2002;81:179–93.View ArticleGoogle Scholar
 Broge NH, Mortensen JV. Deriving green crop area index and canopy chlorophyll density of winter wheat from spectral reflectance data. Remote Sens Environ. 2002;81:45–57.View ArticleGoogle Scholar
 Clevers JGPW. The application of a weighted infraredred vegetation index for estimating leafarea index by correcting for soilmoisture. Remote Sens Environ. 1989;29:25–37.View ArticleGoogle Scholar
 Clevers JGPW, van der Heijden GWAM, Verzakov S, Schaepman ME. Estimating grassland biomass using SVM band shaving of hyperspectral data. Photogramm Eng Remote Sens. 2007;73(10):1141–8.View ArticleGoogle Scholar
 Colombo R, Bellingeri D, Fasolini D, Marino CM. Retrieval of leaf area index in different vegetation types using high resolution satellite data. Remote Sens Environ. 2003;86:120–31.View ArticleGoogle Scholar
 Curran PJ. Estimating green LAI from multispectral aerialphotography. Photogramm Eng Remote Sens. 1983;49:1709–20.Google Scholar
 Curran PJ. Multispectral remotesensing for the estimation of green leaf area index. Philos Trans R Soc Lond Ser A Math Phys Eng Sci. 1983;309:257–70.View ArticleGoogle Scholar
 Falconer DS, Mackay TFC. Introduction to quantitative genetics. Harlow: Longman; 1996.Google Scholar
 FebreroBande M, Oviedo de la Fuente M. Statistical computing in functional data analysis: the R package fda. usc. J Stat Softw. 2012;51(4):1–28.View ArticleGoogle Scholar
 Ferragina A, de Los Campos G, Vazquez AI, Cecchinato A, Bittante G. Bayesian regression models outperform partial least squares methods for predicting milk components and technological properties using infrared spectral data. J Dairy Sci. 2015;98(11):8133–51.View ArticlePubMedGoogle Scholar
 Govaerts YM, Verstraete MM, Pinty B, Gobron N. Designing optimal spectral indices: a feasibility and proof of concept study. Int J Remote Sens. 1999;20:1853–73.View ArticleGoogle Scholar
 Hernandez J, Lobos GA, Matus I, del Pozo A, Silva P, Galleguillos M. Using ridge regression models to estimate grain yield from field spectral data in bread wheat (Triticum aestivum L.) grown under three water regimes. Remote Sens. 2015;7(2):2109–26.View ArticleGoogle Scholar
 Labus M, Nielsen G, Lawrence R, Engel R, Long D. Wheat yield estimates using multitemporal NDVI satellite imagery. Int J Remote Sens. 2002;23:4169–80.View ArticleGoogle Scholar
 Li L, Zhang Q, Huang D. A review of imaging techniques for plant phenotyping. Sensors. 2014;14(11):20078–111.View ArticlePubMedPubMed CentralGoogle Scholar
 Mason RE, Singh RP. Considerations when deploying canopy temperature to select high yielding wheat breeding lines under drought and heat stress. Agronomy. 2014;4:191–201.View ArticleGoogle Scholar
 Myneni RB, Hall FG, Sellers PJ, Marshak AL. The interpretation of spectral vegetation indexes. IEEE Trans Geosci Remote Sens. 1995;33:481–6.View ArticleGoogle Scholar
 Pask AJD, Pietragalla J, Mullan DM, Reynolds MP. Physiological breeding II: a field guide to wheat phenotyping. Mexico: CIMMYT; 2012.Google Scholar
 Pérez P, de los Campos G. BGLR: a statistical package for whole genome regression and prediction. R package version 1. 0.2. 2013.
 Pinty B, Leprieur C, Verstraete MM. Towards a quantitative interpretation of spectral vegetation indexes. Remote Sens Rev. 1993;7:127–50.View ArticleGoogle Scholar
 Quarmby N, Milnes M, Hindle T, Silleos N. The use of multitemporal NDVI measurements from AVHRR data for crop yield estimation and prediction. Int J Remote Sens. 1993;14:199–210.View ArticleGoogle Scholar
 Ramsay JO, Silverman BW. Applied functional data analysis: methods and case studies, vol. 77. New York: Springer; 2002.View ArticleGoogle Scholar
 Rees D, Sayre K, Acevedo E, Nava Sanchez T, Lu Z, Zeiger E, et al. Canopy temperatures of wheat: relationships with yield and potential as a technique for early generation selection. Mexico: CIMMYT; 1993.Google Scholar
 Rutkoski J, Poland J, Mondal S, Autrique E, Crossa J, Reynolds MP, Singh RP. Predictor traits from highthroughput phenotyping improve accuracy of pedigree and genomic selection for yield in wheat. G3:GenesGenomesGenetics (accepted). 2016.
 Sims DA, Gamon JA. Estimation of vegetation water content and photosynthetic tissue area from spectral reflectance: a comparison of indices based on liquid water and chlorophyll absorption features. Remote Sens Environ. 2003;84(4):526–37.View ArticleGoogle Scholar
 Singh A, Ganapathysubramanian B, Singh AK, Sarkar S. Machine learning for highthroughput stress phenotyping in plants. Trends Plant Sci. 2016;21(2):110–24.View ArticlePubMedGoogle Scholar
 Tucker CJ. Red and photographic infrared linear combinations for monitoring vegetation. Remote Sens Environ. 1979;8:127–50.View ArticleGoogle Scholar
 Viña A, Gitelson AA, NguyRobertson AL, Peng Y. Comparison of different vegetation indices for the remote assessment of green leaf area index of crops. Remote Sens Environ. 2011;115(12):3468–78.View ArticleGoogle Scholar
 Winterhalter L, Mistele B, Jampatong S, Schmidhalter U. High throughput sensing of aerial biomass and aboveground nitrogen uptake in the vegetative stage of wellwatered and drought stressed tropical maize hybrids. Crop Sci. 2011;51:479–89.View ArticleGoogle Scholar
 Xiao X, He L, Salas W, Li C, Moore B, Zhao R, et al. Quantitative relationships between fieldmeasured leaf area index and vegetation index derived from vegetation images for paddy rice fields. Int J Remote Sens. 2002;23:3595–604.View ArticleGoogle Scholar