Predicting the quality of ryegrass using hyperspectral imaging

Background The quality of forage plants is a crucial component of animal performance and a limiting factor in pasture based production systems. Key forage attributes that may require improvement include the sugar, lipid, protein and energy contents of the vegetative parts of these plants. The aim of this study was to evaluate the potential capacity of hyperspectral imaging (HSI) for non-invasive assessment of forage chemical composition. Hyperspectral image data within the visible near-infrared range into the extended near-infrared covering 550–1700 nm wavelengths were obtained from 185 accessions of ryegrass (Lolium perenne), which were also analysed for 13 forage quality attributes. Results Medium to high predictive power was observed for the HSI models of total sugars (R2 validation of 0.58), high molecular weight sugars (R2 validation of 0.63), %Ash (R2 validation of 0.50) and %Nitrogen (R2 validation of 0.70). Significant HSI models with low R2 validation of 0.1–0.5 were also obtained for low molecular weight sugars, NDF (%), ADF (%), DOMD (% DM), ME (MJ/kg DM), DM (%), Ca (mg/g) and OM (%). We also observed significant differences in the chemical composition between the pseudostems and leaves of the plants for each accession. The power of HSI for prediction of these differences within plants was also demonstrated. Conclusion This study paves the way for the HSI technology to be used for in-field estimation of forage composition attributes in perennial ryegrass. This will allow more rapid genetic-based selection and breeding for a trait that is normally expensive to measure providing a cheaper, non-destructive and high throughput screening tool.


Background
Genomic selection can be effectively used to increase forage and animal productivity by selecting traits of interest in forages [1]. Among these traits, forage composition is one of the hardest and most costly traits to measure as it requires harvest and manual separation. Another limitation to the measurement of this trait is that the standard method of measuring forage composition requires collection and destruction of the plant material. Therefore, non-invasive technologies from which models can be developed to predict the composition of forage would be highly valuable [2]. A body of research has demonstrated the potential of non-invasive near-infrared spectroscopy (NIRS) based methods (dried and ground samples) to estimate the components of forage [3][4][5][6]. Other noninvasive technologies including Hyperspectral Imaging (HSI) systems [7][8][9][10] have been used to develop predictive models for forage quality and quantity. If accurate prediction models can be developed, such technologies will offer the potential for faster and higher throughput prediction of forage quality attributes and hence streamlining more rapid genomic selection for quality traits.
Near-infrared spectroscopy (NIRS) is based on utilizing the interaction of electromagnetic radiation with matter to detect the characteristic chemical signatures of target materials [2]. NIRS instruments based on single point measurement typically require a multitude of replicate scans of the target object for improved prediction of attributes. NIRS instruments can be used to predict the attributes of heterogeneous materials provided a sufficient number of replicate scans are obtained to account for this heterogeneity [11] although this adds extra time and cost to applications. NIRS has been used in perennial Open Access Plant Methods *Correspondence: paul.shorten@agresearch.co.nz 1 AgResearch, Ruakura Research Centre, Private Bag 3123, Hamilton 3240, New Zealand Full list of author information is available at the end of the article ryegrass to estimate leaf relative water content [12]. HSI instruments, however, are capable of capturing both spectral and spatial information [2]. These features make the technology suitable for automated, rapid and largescale screening of forage. HSI systems have been used at plot, paddock, farm and catchment scales to determine type of forage as well as the quality of forage [7-10, 13, 14], although there is less information at the plant scale, which is the scale of interest for plant breeders. The deployment of the HSI technology, like any other technology, for forage analysis is dependent on the time and spatial scales of interest. Furthermore, because model calibrations are dependent on the spatial scale of interest as well as the lighting conditions, care must be taken when transferring calibration equations between spatial scales. This is specifically true when the forage is heterogeneous, and lighting and forage geometry only allow for observation of a subset of the sward in which case, a calibration transfer technique must be employed.
The targeted selection of new forages for optimal animal production and environmental outcomes in pastoralbased systems requires detailed information on forage composition. Information on the temporal (diurnal and seasonal) and spatial (between plants in the field as well as within the sward) variation in the composition of forage is also required to assess the performance of new forages and their interaction with the farm system. Forage quality traits of interest include the contents of total sugars, high molecular weight (HMW) sugars, low molecular weight (LMW) sugars, ash, calcium (Ca) and nitrogen [1,15]. In addition, more complex and calculated composition traits such as neutral detergent fibre (NDF), acid detergent fibre (ADF), digestible organic matter in dry matter (DOMD), metabolisable energy (ME), and organic matter (OM) are the gold standard traits in forages [1,15]. The ME content of forage plays the central role in animal growth, maintenance and production. It also provides a measure of production potential, although production is dependent on the other components of the forage as well [15]. NDF is an indication of the digestibility of cellulose, hemicellulose and lignin which comprise the cell wall components of the plant structure. ADF is a measure of the digestibility of the cellulose and lignin components only. The ash in forage consists of mineral elements such as sulphates, chlorides, and phosphates, which provide no energetic value to the animal. DOMD represents the organic matter in dry matter (DM), which is residual dry weight of the forage after the removal of moisture. The water soluble carbohydrate (WSC) includes mono-and di-saccharides, oligosaccharides and fructans (LMW and HMW sugars). Water soluble carbohydrates are the most digestible components of ryegrass and play an important role in the complex process of ruminant digestion [16]. Water soluble carbohydrates also provide a primary substrate for ruminal microbiota to breakdown of plant protein, which then becomes available for animal growth and maintenance. High WSC forage also potentially increases milk production of dairy cows while reducing the excretion of urinary nitrogen. The latter is a significant environmental problem in pastoral-based agricultural systems [16]. The concentration of the soluble sugars within ryegrass also exhibits a diurnal pattern that is seasonally dependent on the time course of photosynthesis [17][18][19][20]. Reducing the nitrogen content of forage also provides another strategy to reduce the nitrogen intake of the animal and therefore reduce urinary nitrogen levels and nitrogen losses from pasture [21][22][23].
The purpose of this study is to develop and test HSI models to predict the chemical composition of different genotypes of perennial ryegrass. Another goal is to examine the differences in the chemical composition of the leaves and pseudostems of the plants. The role of different wavelengths in the detection of different forage quality traits was investigated. Also, the likely predictive power of different wave ranges of very near-infrared (VNIR) and extended NIR was explored. An examination of the utility of a range of chemometric methods to predict the chemical composition of ryegrass forage was also conducted.

Results
Raw HSI images of forage blades and pseudostems are shown in Fig. 1a, b respectively. The forage heat map at 1080 nm is shown in Fig. 1c for the blades image in Fig. 1a. The region of interest (ROI) for the blades image in Fig. 1a is shown in Fig. 1d. The distribution in mean forage reflectance spectra in the ROI across all 37 genotypes is shown in Fig. 2. There is significant variability between genotypes in the mean forage reflectance spectra in the ROI for blades samples (P < 0.001). Reflectance minima are at 1000 nm, 1200 nm and 1450 nm and are consistent with water absorbance bands and forage reflectance spectra obtained from dried and milled samples [3].
Prediction models were calibrated and validated using the hyperspectral images and the wet chemistry data (partial least squares regression (PLSR) models). Comparisons of the best candidate calibration models for the BL + PS (Fig. 3) and BL datasets are described in Table 1) where model performance was similar for each dataset. The performance of the model for the prediction of total sugars in the BL + PS dataset is shown in Fig. 3a (R 2 validation = 0.58, n = 66, LV = 12, RMSE = 34.4 mg/g). Average-high model performance was also observed for the prediction of HMW sugars for the BL + PS dataset (R 2 validation = 0.63, n = 66, LV = 12, RMSE = 21.6 mg/g). The model performance for % Nitrogen for the BL + PS dataset is shown in Fig. 3b (R 2 = 0.70, n = 64, LV = 20, RMSE = 0.35%). HSI models (R 2 validation of 0.1-0.5) are also obtained for Ash (%), NDF (%), ADF (%), DOMD (% DM), ME (MJ/kg DM), DM (%), Ca (mg/g) and OM (%).
We also examined the ability of a variety of multivariate methods to predict the forage attributes. Our comparison of the different methods for nitrogen prediction on the BL + PS dataset is summarized in Table 2. Partial least squares regression methods provided the best predictions on the validation dataset. Gaussian process regression and RF methods require larger dataset for training [24] and did not perform as well as PLSR. Support vector machine provided reduced performance compared to PLSR, and would also be expected to improve in performance relative to PLSR with a larger dataset. The various multiple linear regression models provided validation R 2 of around 0.6, although the performance of these models would likely decrease on an independent dataset as they less effectively deal with multi-collinear data compared to PLSR based methods.
We found that wavelengths in the range 900-1700 nm provided a much better prediction of total sugars than wavelengths in the 550-900 nm wavelength range with a calibration R 2 of 0.53 (900-1700 nm) compared to 0.18 Fig. 1 a Raw HSI image (320 pixels per line by 400 lines per image) of the ryegrass blades (at 558, 740 and 937 nm wavelengths). The image was captured directly above the plant. b Raw HSI image of forage of the ryegrass pseudostems after harvesting the top half of the plant (at 558, 740 and 937 nm wavelengths). The image was captured directly above the plant. c Forage heat map of ryegrass at 1080 nm (red denotes high reflectance and blue denotes low reflectance). d Region of interest (ROI) for the ryegrass (white pixels) (550-900 nm) (Table 3). Conversely, we found that wavelengths in the 550-900 nm and the 900-1700 nm ranges each provided a similar prediction of nitrogen % with a calibration R 2 of 0.62 (900-1700 nm) compared to 0.71 (550-900 nm) ( Table 3). This highlights that the relative importance of wavelength ranges for the prediction of nitrogen and total sugars is dependent on the attribute. However, total sugars and nitrogen models based on the full wavelength range 550-1700 nm were superior to the respective individual VNIR (550-900 nm) and extended NIR (900-1700 nm) models, which highlights the value of obtaining spectral information over a broad range of wavelengths for accurate prediction of these attributes. Key wavelengths for the prediction of total sugars (based on PLSR(VIP)) are 548-573, 642-750, 1332-1460, 1509-1519, 1558-1627, and 1676-1696 nm. Key wavelengths for the prediction of nitrogen (based on PLSR(VIP)) are 548-582, 637-755, 888-893, 932-937, 1351-1415, and 1514-1696 nm.
There were differences in the nitrogen, total sugar, LMW and HMW sugar concentration between pseudostems and leaves in the 15 plants with wet chemistry measurements (Fig. 4). The pseudostems had a nitrogen concentration of 2.1 ± 0.35% whereas the blades had a nitrogen concentration of 3.2 ± 0.32% (difference of 1.1 ± 0.13% (P < 0.001)). Higher nitrogen concentrations in the blades are consistent with other studies that observed that the nitrogen concentration in leaves was ~ 50% greater than the stems of 99 day old perennial ryegrass plants [25] and mature ryegrass [26], although this is dependent on season and the time period after harvesting. In contrast, the pseudostems had a larger total sugar concentration of 153 ± 53 mg/g compared to leaves with a concentration of 60 ± 18 mg/g (difference of 93 ± 15 mg/g (P < 0.001)). This difference was associated with a difference (P < 0.001) in the HMW sugars between the pseudostems (75 ± 39 mg/g) and leaves (17 ± 14 mg/g) and a similar difference (P < 0.001) in the LMW sugars between the pseudostems (77 ± 15 mg/g) and leaves (43 ± 7.5 mg/g). LMW sugars are 2.5 times more concentrated in the leaves than the pseudostems compared to HMW sugars. Plants with high concentrations of nitrogen, total sugar, LMW or HMW sugar in the leaves also had corresponding high concentrations of nitrogen, total sugar, LMW or HMW sugar in the pseudostems (R 2 = 0.50-0.89; P < 0.01).
There are significant differences between the mean spectra of PS and BL plants (Fig. 5a). Pseudostems have a characteristic high reflectance at 800-1100 nm    (reflectance of 1.1 for PS compared to 0.7 for BL at 1100 nm) and a characteristic low reflectance at 1200-1500 nm respectively (reflectance of -1.2 for PS compared to -0.7 for BL at 1500 nm). The spectra were used to classify the genotypes into PS and BL groups (Fig. 5b). This highlights that the spectral profiles of the pseudostems and leaves of ryegrass plants are different. The HSI predicted distribution in the concentration of nitrogen and total sugars prior to the first cut are shown in Fig. 6a, b respectively for the image in Fig. 1a. These predictions are independent of the lighting gradients inherent with the experimental setup illustrated in Fig. 1a. Nitrogen concentration was predicted to be 1-2.5% at the base of the plant (PS) and 2.5-4% at the top of the plant (BL). The total sugar concentration was 100-250 mg/g at the base of the plant (PS) but only 40-100 mg/g at the top of the plant (BL). These predictions for the spatial distribution of nitrogen and total sugars within the plant are consistent with the significant differences in the wet chemistry measurements of nitrogen and total sugars between the blades and pseudostems (Fig. 4a, b). This highlights the ability of hyperspectral systems to predict not only between plant differences in attributes, but also the variation in these attributes within a single plant.

Discussion
The models developed in this study provide a rapid screening tool for selection of the best genotypes without the need for ongoing wet chemistry measurements. However, our calibration dataset of 120 plants is not large enough to cover a broader range of these traits. Therefore, these prediction models are expected to improve for calibration datasets with a larger number of samples such as the 1000-4000 laboratory analyzed samples in larger studies [9,27]. Despite this, our models provide accurate, repeatable and rapid prediction of ryegrass quality traits under field conditions. In addition, our predictions of plant chemistry potentially provides a method to classify different plant species that have different chemical composition [28][29][30] (e.g. clover and/or weeds) for mixed sward-based applications such as high/low nitrogen species content and/or nitrogen fertilizer delivery [13,14]. Hyperspectral data can be used to obtain information on the quality of forage on both spatial and temporal scales [8,10]. This extra information can be used to parameterize mathematical models of forage growth and their response to environmental and management variables (e.g. climate and grazing) [31]. Hyperspectral data also provide information for testing the underlying assumptions of these mathematical models and revision and improvement of the models accordingly. The amalgamation of hyperspectral data with plant modelling will also allow for in-field assessment of plants and their diurnal and seasonal nutrient flows. These are traits and interactions that are difficult and time-consuming to objectively measure otherwise.
Our current models were based on 550-1700 nm waveband range and are likely to be less informative than models calibrated over a broader range of wavelengths (350-2500 nm) [3,7,32]. We found that wavelengths in the extended NIR range 900-1700 nm provided a much better prediction of total sugars than wavelengths in the visible near-infrared 550-900 nm wavelength range. However, the two ranges provided a similar prediction of % nitrogen. This indicates that low Fig. 6 a The HSI predicted distribution in % nitrogen in the same plant shown in Fig. 1a. White denotes the background and regions with very low reflectance. Nitrogen concentration is 1-2.5% at the base of the sward (PS) and 2.5-4% at the top of the sward (BL). b The HSI predicted distribution in the total concentration of sugars (mg/g) in the same plant shown in Fig. 1a. White denotes the background and regions with very low reflectance. Total sugar concentration is 100-250 mg/g at the base of the sward (PS) and 40-100 mg/g at the top of the sward (BL) cost visible near-infrared 400-1000 nm devices [24] are likely to provide satisfactory predictions of nitrogen but are likely to provide poorer prediction of the concentration of sugar in forage.
Differences in the nitrogen, total sugar, LMW and HMW sugar concentration between pseudostems and leaves of the same genotypes and even individual plants show the imbalanced distribution of these components in perennial ryegrass. The significantly higher nitrogen concentration in leaves than in the pseudostems was consistent with other studies [25]. The very high sugar concentration in pseudostems compared to the leaves was also consistent with previous studies using destructive methods, although this was dependent on the ryegrass cultivar, plant age and time of day [20,26,33].
We also used our model calibrations to predict the distributions in the concentrations of nitrogen and sugar within an individual plant at an individual pixel scale (sub leaf scale). Our predictions in the spatial distribution in nitrogen and total sugars within the plant were consistent with the measured differences in nitrogen and total sugars between stems and leaves in studies using destructive measures [20,25,26,33]. This highlights the potential for hyperspectral imaging to predict the concentration and distribution of the components within an individual plant and the opportunity to conduct non-destructive and continuous experiments on the changes in the nutrient distribution within ryegrass plants.
The high positive correlation of nitrogen, total sugar, LMW or HMW sugar in the leaves with those in the pseudostems indicates that models calibrated to material from the upper zones of the plant will provide biased predictions of the average concentration of these components within the entire plant. The partitioning of sugar is consistent with previous studies of the distribution of sugars within ryegrass that found that the concentration of total sugar in the pseudostem was more than twice that in the leaf, although dependent on the cultivar, plant age and time of day [20,26,33]. However, the whole plant predictions of these traits will be correlated with the actual concentrations of these traits. The slope, intercept and error variance for this relationship, however, will depend on the ryegrass cultivar, plant age, time of day and plant structure. This highlights that bias and error can be introduced when extrapolating model predictions to regions of forage that have not been subject to hyperspectral calibration analysis. The identification of the differences in the spectral signatures of the pseudostems and leaves can be used to define the region of interest in an image for trait prediction (e.g. only in the leaves). It can also be used for spectral un-mixing procedures for forage imaged at greater distances from the ground [34].
Although the same approach and procedures can be used for other species of grasses, there will be speciesspecific challenges as the shape, shading and leaf overlap will vary across species. Furthermore, each species has its own chemical variation in different organs or sections of the plant. Therefore, species-specific models may be required for more accurate prediction of traits of interest.

Conclusions
This study examines the utility of Hyperspectral Imaging (HSI) based methods for non-invasive assessment of the composition of ryegrass. The quality of forage is an important component of animal performance and environmental impact in pasture based production systems. Hyperspectral image data (550-1700 nm) were obtained from 185 individual ryegrass (Lolium perenne) plants that were also assessed for 13 forage quality attributes including nitrogen and sugar content. We used this data to develop models to predict these quality attributes of ryegrass and these provide for more accurate, repeatable and rapid prediction of ryegrass quality attributes under field conditions. We also examined ten different chemometric methods for predicting the forage attributes and established that partial least squares regression models performed favorably for the size of our calibration dataset. We also examined the relative importance of different wavelengths for the prediction of the different quality attributes and these can be used to make informed decisions about suitable sensors for field deployment of spectral systems. We also observed significant differences in the concentrations of nitrogen and sugars between the pseudostems and leaves of the plants and demonstrated the ability for hyperspectral systems to predict these differences within the plant. The use of hyperspectral systems will allow for more rapid genetic selection of desirable ryegrass attributes and provides an in-field tool to investigate the interactions between forage genetics, animal production and environmental outcomes in pastoral-based agricultural systems.

Plant material
Thirty-seven wild accessions and cultivars of perennial ryegrass (Lolium perenne L.) plants were clonally replicated (n = 5) and grown in individual pots in a fully-randomised complete block (n = 185 plants in total), outdoors at the AgResearch Grasslands campus in Palmerston North, New Zealand (40.3804°S, 175.6138°E). Plant seeds originated from different European countries, NZ and Australia. Clones were produced by dividing one plant into five 5-tiller plants and potted in prepared soil mix with slow release Osmocote. Plants were kept outside but under overhead irrigation so that the plants did not dry out during periods with no rain. Plants were 3 months old when harvested. For each plant, total leaf material was harvested, within 1 min after hyperspectral scanning, by cutting at 4 cm above the soil surface. Leaf material was immediately snap-frozen in liquid nitrogen. For a subset of three genotypes × 5 clonal replicates (n = 15 plants), pseudostems (PS) and upper leaf blades (BL) were scanned and harvested separately. All samples were held at − 80 °C prior to freeze-drying. Freezedried material was milled and subsequently analysed by wet chemistry. The ratio of leaf blade: pseudostem was also recorded for each plant.
WSC were determined by liquid/gas chromatography-mass spectrometry based technique [35]. Dry matter and Ash/OM were determined by AOAC 930.15/925.10/942.05. For Ash determination the sample was ignited at 500 °C to burn off all organic material. The inorganic material not volatilized at this temperature is the ash. For dry matter determination the moisture of the sample was removed by volatilization caused by heating at 105 °C for 16 h. The amount of material left after the removal of the moisture was defined as the dry matter. Total Nitrogen was determined by Leco CN analyzer, AOAC 968.06 [36]. The sample was weighed into tin foil capsule, loaded into the furnace and combusted in a stream of oxygen. The products of combustion were then passed through a secondary furnace for further oxidation and particulate removal. The moisture free gases were then swept through a heated copper catalyst under helium flow to remove oxygen and convert NOx to N2, and the nitrogen content was determined with a thermal conductivity cell. The crude protein content of the sample was obtained by multiplying total nitrogen content by 6.25. Calcium was determined by preparation AOAC 968.08D followed by colourimetric analysis. The estimation of DOMD followed the method of Roughan and Holland [37]. NDF/ADF were determined by Ankom. Metabolizable Energy (ME) was determined by calculation.

Hyperspectral imaging
The Hyperspectral Line Scanning Imaging System (Hyperspec ® Extended VNIR, Headwall Photonics, Fitchburg, MA, USA) with 235 wavebands captured between 550 and 1700 nm for each of 320 pixels in a line and 400 lines per image was used to capture spectra from the samples. Hyperspectral images were captured before and after harvesting the top half of the plant (BL and PS images respectively). A 50 mm lens with a focal length of 25 mm, pixel pitch of 30 µm and an f-stop of 2.8 were used for imaging all samples. A moving stage system was used at a speed of 11.1 mm/s and with an image write speed of 25 frames per second. The spatial resolution was calculated by the Hyperspec ™ software and the HSI camera exposure setting was fixed at 28 ms. A single line beam halogen light source (30 o angle of incidence from vertical, 400 W) was used to illuminate samples and all 185 plants were scanned. Measurements were undertaken on the 22nd of March 2016 in a dark room at the AgResearch Grasslands site, Palmerston North, New Zealand. A total of 452 images were collected with either two or three images per plant from directly above depending on the quality of image capture and the general morphology of each plant. White and dark reference samples were collected for each image with a 200 mm × 50 mm Spectralon white reference tile and the lens cap on respectively. The reflectance (R) of each pixel/ sample was calculated using these white (W) and dark (D) reference samples according to the equation where the calibrated image reflectance is obtained from the raw image irradiance (I) and the dark and white reference images.

Analysis
Approximately 128,000 spectra were collected per hyperspectral image, however, only a portion of these contain information about the sample of interest. An algorithm was developed to automatically segment the images into ryegrass regions of high spectral reflectance. This was defined by the region with R ≥ 0.3 (reflectance at 1080 nm, which was chosen as this wavelength is not much affected by water content), which was representative of the harvested region.
Multivariate analysis of the spectra was conducted using partial least squares regression (PLSR). Partial least squares regression is a widely accepted regression modelling approach that effectively deals with multicollinear data [38]. Models tested were based on the mean spectrum. The mean spectrum were calculated from 29,228 ± 18,255 HSI pixels, which represents on