Improved estimation of aboveground biomass in wheat from RGB imagery and point cloud data acquired with a low-cost unmanned aerial vehicle system
Plant Methods, volume 15, Article number: 17 (2019)
Aboveground biomass (AGB) is a widely used agronomic parameter for characterizing crop growth status and predicting grain yield. The rapid and accurate estimation of AGB in a non-destructive way is useful for making informed decisions on precision crop management. Previous studies have investigated vegetation indices (VIs) and canopy height metrics derived from Unmanned Aerial Vehicle (UAV) data to estimate the AGB of various crops. However, the input variables were derived either from one type of data or from different sensors on board UAVs. Whether the combination of VIs and canopy height metrics derived from a single low-cost UAV system can improve the AGB estimation accuracy remains unclear. This study used a low-cost UAV system to acquire imagery at 30 m flight altitude at critical growth stages of wheat in Rugao of eastern China. The experiments were conducted in 2016 and 2017 and involved 36 field plots representing variations in cultivar, nitrogen fertilization level and sowing density. We evaluated the performance of VIs, canopy height metrics and their combination for AGB estimation in wheat with the stepwise multiple linear regression (SMLR) and three types of machine learning algorithms (support vector regression, SVR; extreme learning machine, ELM; random forest, RF).
Our results demonstrated that the combination of VIs and canopy height metrics improved the estimation accuracy for AGB of wheat over the use of VIs or canopy height metrics alone. Specifically, RF performed the best among the SMLR and three machine learning algorithms regardless of using all the original variables or selected variables by the SMLR. The best accuracy (R2 = 0.78, RMSE = 1.34 t/ha, rRMSE = 28.98%) was obtained when applying RF to the combination of VIs and canopy height metrics.
Our findings imply that an inexpensive approach, consisting of the RF algorithm and the combination of RGB imagery and point cloud data derived from a consumer-grade, low-cost UAV system, can improve the accuracy of AGB estimation and has potential for practical application in the rapid estimation of other growth parameters.
Aboveground biomass (AGB) is a critical indicator in crop growth status monitoring and grain yield prediction . Accurate and rapid estimation of AGB is crucial for the assessment of crop nutrition status and the improvement of crop management strategies. The conventional estimation of AGB is based on destructive measurements , which are not only time consuming and labor intensive, but also hard to apply over large areas . Remote sensing as a non-destructive technique has been proved to have great potential in AGB estimation for crops, such as wheat [4, 5], barley , maize  and rice .
The majority of previous studies on the remote estimation of AGB focused on the use of remotely sensed data acquired from ground [4, 8], manned aircraft and satellite platforms. For instance, Cheng et al. reported an R2 up to 0.81 for the relationship between the red-edge chlorophyll index (CIRed-edge) and rice biomass using ground-based hyperspectral data. Although ground-based remote sensing can yield satisfactory estimation accuracy for crop growth parameters, the data are costly to acquire and unsuitable for monitoring over large areas. In contrast, the satellite platform has great advantages in acquiring crop growth information over regional and large scales. As reported by Wang et al., a high accuracy (R2 = 0.79) could be obtained from HJ-1 satellite imagery for the estimation of wheat AGB at the anthesis stage over four counties of Jiangsu province, China. However, as the growth status of crops varies rapidly across critical growth stages, multi-temporal and timely acquisition of remotely sensed data is necessary for crop monitoring. It is challenging to acquire suitable satellite imagery for monitoring over multiple growth stages due to the frequent cloud cover and a spatial resolution that is inadequate for the relatively small field sizes in China, especially in the lower reaches of the Yangtze River. Using manned airborne platforms may overcome these limitations and acquire images with high temporal and spatial resolutions, but it is often complex and costly to allocate aircraft and instrument resources.
The advent of unmanned aerial vehicles (UAVs) makes it possible to acquire high temporal and spatial resolution remotely sensed data in an affordable way. In recent years, multiple types of cameras for acquiring RGB, color-infrared (CIR), multispectral, and hyperspectral images have been mounted on various UAV platforms for monitoring crop growth status [14,15,16]. Particularly, much attention has been paid to low-cost UAV systems consisting of RGB or modified CIR cameras and light-weight drones. The low-cost UAV systems were widely used due to the most significant advantages in affordability, ease of operation, and simplicity in image processing [11, 17,18,19]. People often use the visible or CIR images collected with these low-cost UAV systems to generate orthophotos and point cloud data for crop growth monitoring. While the former type of data could be used to extract vegetation indices (VIs) for estimating crop biophysical and biochemical parameters with moderate accuracies, such as biomass [1, 20], the latter could be used to construct crop surface models (CSMs) for estimating crop structural parameters with high accuracies, such as plant height (PH) [21, 22].
Previous studies have demonstrated that both VIs and canopy height metrics (e.g., height percentiles) derived from UAV images are critical variables for estimating crop biomass [1, 7]. However, the majority of those studies utilized either VIs [19, 23] or canopy height metrics alone  for model establishment . While the VIs composed of visible or CIR bands characterized the spectral properties of the top canopy, the height metrics reflected the vertical structure properties of the entire canopy. Although these two types of variables were used to extract different sources of information about the crop canopy, the performance of either type of variables for biomass estimation might be limited by the insensitivity of VIs at high biomass conditions and the stability of plant height at reproductive stages [1, 25, 26]. The estimation of crop biomass from UAV images might be improved by using the two complementary data sources simultaneously.
In recent years, there have been some attempts to improve the estimation of crop biomass by combining VIs and canopy height metrics (Table 1). For instance, Bendig et al. combined the GnyLi index derived from ground-based hyperspectral data and canopy height metrics acquired from a low-cost UAV system to obtain an R2 of 0.82 for barley biomass estimation. Tilly et al. also achieved a high accuracy with the fusion of ground-based hyperspectral data and canopy height acquired from terrestrial LiDAR data. Nevertheless, these studies focused on the combination of VIs and canopy height metrics derived from two different sensors, which may limit the applications over large areas due to the high cost of an expensive sensor. In contrast, Li et al. reported the fusion of VIs and canopy height metrics acquired from a low-cost UAV system for the estimation of maize biomass with three regression techniques (simple linear regression, stepwise linear regression and random forest regression), but they did not explicitly compare the performances among VIs, canopy height metrics and their combination. Therefore, it remains unclear whether the combined data generated from a single sensor could lead to improved estimation of biomass without any additional cost in instrumentation. In addition, their data cover only one growth stage of maize and are inadequate for assessing the performance of those input variables over all critical growth stages.
For the use of the comprehensive information in multiple types of remotely sensed data, multivariate regression techniques could be an essential approach for establishing direct relationships between remotely sensed variables and crop parameters , including multiple linear regression (MLR) , stepwise multiple linear regression (SMLR)  and partial least squares regression (PLSR) . However, these regression techniques are more suitable for the data that exhibit linear or exponential relationships between remotely sensed variables and crop biophysical/biochemical parameters [35, 36]. Moreover, the VIs and canopy height metrics derived from a consumer-grade camera may be redundant and highly autocorrelated. In contrast to conventional regression techniques, machine learning regression algorithms such as random forest (RF) , support vector regression (SVR)  and extreme learning machine (ELM)  are typically better at handling high-dimensional data and the non-linear relationships [40, 41]. Recent studies have also found that machine learning algorithms could yield higher accuracies for biomass estimation than conventional ones [7, 42].
To the best of our knowledge, few studies have examined machine learning techniques for the estimation of wheat AGB by combining the canopy spectral information and vertical structure information from UAV-derived VIs and canopy height metrics. Since such combined data could be obtained from an RGB camera on board a small UAV, it becomes necessary to investigate the performance of a consumer-grade UAV system at an even lower cost. Thus, the objectives of this study were: (1) to examine the feasibility of combining spectral indices and canopy height metrics from an RGB camera mounted on a consumer-grade UAV for the improved estimation of AGB in wheat; (2) to evaluate the performance of three machine learning regression techniques over critical growth stages and spatial resolutions in comparison to the traditional SMLR.
Two experiments were conducted at the experimental station of the National Engineering and Technology Center for Information Agriculture (NETCIA) located in Rugao, Jiangsu province of eastern China (120°45′E, 32°16′N) (Fig. 1). A total of 36 plots were used for the experiments spanning two wheat growing seasons. The plots, each 6 m × 5 m in size, covered different wheat cultivars, planting densities and nitrogen (N) rates. In order to avoid the complexity of soil N levels, we applied the same N level as that in the preceding rice growing season for each plot. The details of the experimental design are given below.
Experiment 1 was conducted in the winter wheat season of 2015–2016 with the sowing date of October 30, 2015. One wheat cultivar with the erectophile leaf type, ‘Yangmai 18’, was used for all plots. Four N rates (0, 80, 150, 220 kg N ha−1) and three planting densities (2.0 × 10^6 plants ha−1, 1.3 × 10^6 plants ha−1 and 1.0 × 10^6 plants ha−1, corresponding to 0.2 m, 0.3 m and 0.4 m row spacings) were applied with three replications. These N levels and planting densities could cover the possible rates used in local agronomic practices and lead to variations in AGB, canopy cover and background materials between plots. Half of the N fertilizer was applied as basal fertilizer on the sowing day and half at the jointing stage.
Experiment 2 was conducted in the winter wheat season of 2016–2017 with the sowing date of November 15, 2016. Two winter wheat cultivars with different canopy structures, ‘Yangmai 15’ and ‘Yangmai 16’, were selected to represent planophile and erectophile leaf types, respectively. Three N rates (0, 150, 300 kg N ha−1) with two planting densities (1.6 × 10^6 plants ha−1 and 1.0 × 10^6 plants ha−1, corresponding to 0.25 m and 0.4 m row spacings) were applied with three replications. Half of the N fertilizer was applied on the sowing day and half at the jointing stage.
The UAV system for image collection was the DJI Phantom series (Edition 3 in 2015 and Edition 4 in 2016, the latter with added obstacle avoidance for flight safety and a slight upgrade in camera specifications as shown in Table 2), both of which represent a low-cost UAV system consisting of a four-rotor drone and a digital camera (SZ DJI Technology Co., Shenzhen, China). Before the initial flight, we set 25 ground control points (GCPs) with marked signs on the concrete roads across the study site to georeference the UAV images from different growth stages. The geographic coordinates were obtained from an RTK-GPS (Real-Time Kinematic Global Positioning System, CHC X900 GNSS) with horizontal and vertical errors within 1 cm and 2 cm, respectively. In our campaigns, the UAV was set to automatic flying mode and followed a pre-defined flight plan to acquire imagery with approximately 80% forward overlap and 60% side overlap. The images were captured in automatic mode at 1 frame per 5 s in JPEG format. The ISO of the camera was set to 100 and the exposure was set according to the weather conditions. The aperture of the camera was the default of f/5. The same flight path and camera setup, excluding exposure time, were applied throughout the season. The UAV was flown over the study site at critical growth stages (Table 3) at a height of 30 m above ground level. The speed of the UAV was set at 0.5 m/s and it took about 12 min to cover the whole study area. Each flight campaign was carried out between 11:00 and 14:00 local time on sunny days and acquired approximately 58 images with a spatial resolution of 1.66 cm. In order to generate a digital terrain model (DTM) of the study site, an extra flight campaign was conducted after wheat sowing on November 16, 2016.
Field sampling of AGB from the 36 plots was conducted within 1 day of the UAV campaigns. Since the destructive sampling was conducted four times in each growing season, only a total of 30 plants were randomly harvested from each sampling region in Fig. 1 to represent each of the homogeneous plots. The plants from each plot were cut at ground level and then separated into leaves, stems and panicles (for post-heading stages only). All components were oven-dried at 105 °C for 30 min and afterwards at 80 °C for about 48 h until a constant weight was reached. The dry biomass of each wheat organ (leaves, stems and panicles) was weighed separately. Moreover, the number of plants per unit ground area was counted manually in the experimental fields. The AGB in tons per hectare (t/ha) was determined as the product of the dry weight per sampled plant and the number of plants per unit area. The basic statistics of the field-measured AGB are shown in Table 4. The plant height was measured with a ruler as the distance from the bottom to the top of the wheat canopy. Five plants were randomly selected to represent the canopy height of each plot.
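The scaling from sampled plants to plot-level AGB can be illustrated with a short Python sketch. It is not the authors' processing script: the function name `agb_t_per_ha` and the sample numbers are hypothetical, and the sketch assumes that dry weight is recorded in grams per plant and plant density in plants per square metre.

```python
def agb_t_per_ha(dry_weights_g, plants_per_m2):
    """Aboveground biomass (t/ha) from sampled plants and plant density.

    dry_weights_g: oven-dry weight of each sampled plant in grams
                   (leaves + stems + panicles summed per plant).
    plants_per_m2: manually counted plants per square metre of ground.
    Both units are assumptions made for this illustration.
    """
    if not dry_weights_g:
        raise ValueError("at least one sampled plant is required")
    mean_weight_g = sum(dry_weights_g) / len(dry_weights_g)
    grams_per_m2 = mean_weight_g * plants_per_m2
    # 1 g/m^2 = 10 kg/ha = 0.01 t/ha
    return grams_per_m2 / 100.0


# Hypothetical sample: three plants at 200 plants per square metre
print(agb_t_per_ha([2.0, 2.5, 3.0], 200))
```

Averaging over the sampled plants before scaling keeps the result independent of how many plants were harvested from a plot.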
Generation of orthophotos and crop surface models
The UAV images were processed within the software Agisoft Photoscan 1.2.6 (Agisoft LLC, St. Petersburg, Russia) to generate orthophotos and digital surface models (DSMs). The key processing steps included image alignment, camera calibration, construction of dense point clouds, and generation of orthophotos and digital elevation models (DEMs). Firstly, the software automatically aligned the overlapping images using a feature point matching algorithm. Secondly, seven of the twenty-five evenly distributed GCPs were used to georeference each image. The camera internal parameters were estimated in Agisoft Photoscan based on the image alignment and the GCP positions. The estimated parameters were then used to compensate for linear model misalignment while georeferencing the model. Since the top of the wheat canopy is sharp and small, we chose the ‘Mild’ depth filtering, which is recommended for reconstructing small details, to build the dense point cloud. Lastly, the orthophotos and DEMs used as crop surface models (CSMs) were generated after building the mesh and texture with default parameters and exported in TIFF format for subsequent analysis. The details of the processing steps and parameter settings can be found in Table 5.
Calculation of spectral indices
This study examined ten published VIs for the estimation of wheat AGB (Table 6). Most of the selected VIs have been related to crop biophysical and biochemical parameters, such as LAI , vegetation fraction , grain yield , biomass [1, 7], and nitrogen accumulation . These VIs were directly calculated using digital numbers from the orthophotos. In addition, a region of interest (ROI) was delineated from each plot within the orthophotos using ArcGIS 10.2.2 (Esri, Redlands, CA, USA) to exclude the border effect and the sampling region. The mean VIs of each ROI were extracted to represent the values of each plot.
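As an illustration of how such indices are computed from digital numbers, the Python sketch below evaluates the eight indices named in the Results (GRVI, MGRVI, RGBVI, GLI, VARI, ExG, ExR, ExB) per pixel and averages them over a plot ROI. The formulas are the commonly cited forms from the literature and are an assumption here; Table 6 remains the authoritative definition for this study. The function names `rgb_indices` and `roi_mean_indices` and the sample pixel values are hypothetical.

```python
def rgb_indices(r, g, b):
    """RGB vegetation indices for one pixel (digital numbers, not reflectance).

    Formulas follow commonly cited definitions (assumed, not taken from Table 6).
    Pixels with zero denominators (e.g. pure black) must be masked beforehand.
    """
    total = r + g + b
    # Normalised chromatic coordinates used by the excess-colour indices
    rn, gn, bn = r / total, g / total, b / total
    return {
        "GRVI": (g - r) / (g + r),
        "MGRVI": (g ** 2 - r ** 2) / (g ** 2 + r ** 2),
        "RGBVI": (g ** 2 - b * r) / (g ** 2 + b * r),
        "GLI": (2 * g - r - b) / (2 * g + r + b),
        "VARI": (g - r) / (g + r - b),
        "ExG": 2 * gn - rn - bn,
        "ExR": 1.4 * rn - gn,
        "ExB": 1.4 * bn - gn,
    }


def roi_mean_indices(pixels):
    """Average each index over the (r, g, b) pixels of one plot ROI."""
    per_pixel = [rgb_indices(r, g, b) for r, g, b in pixels]
    return {
        name: sum(p[name] for p in per_pixel) / len(per_pixel)
        for name in per_pixel[0]
    }


# Hypothetical ROI with two green-dominated pixels
print(roi_mean_indices([(100, 150, 50), (90, 160, 60)]))
```

Computing the index per pixel and then averaging matches the description of "mean VIs of each ROI"; averaging the digital numbers first and computing one index per plot would give slightly different values.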
Determination of canopy height metrics
To estimate the AGB of wheat, eight canopy height metrics (mean, median, standard deviation, coefficient of variation and percentiles 25%, 50%, 75%, 95%) were calculated from each canopy height model (CHM) (Table 7). The CHM was determined as the difference between CSM and DTM excluding outliers for each flight survey. The DTM for the entire season was determined from the images acquired during the post-sowing flight on November 16, 2016, while the CSM was derived from the UAV images for each growth stage to reflect crop growth dynamics. The same ROIs used for VI calculation were applied to the CHMs to extract plot-level canopy height metrics within ArcGIS 10.2.2 (Esri, Redlands, CA, USA).
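A minimal Python sketch of the plot-level metric extraction is given below. It assumes that the CSM and DTM elevations of one ROI are already available as two equally long lists of co-registered pixel values in metres. The outlier exclusion mentioned above is omitted because its rule is not specified here, and the choices of population standard deviation and linear-interpolation percentiles are assumptions. The names `percentile` and `canopy_height_metrics` are hypothetical.

```python
import math


def percentile(sorted_vals, p):
    """Linear-interpolation percentile (p in 0..100) of a sorted list."""
    if len(sorted_vals) == 1:
        return sorted_vals[0]
    pos = (len(sorted_vals) - 1) * p / 100.0
    lo = math.floor(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    frac = pos - lo
    return sorted_vals[lo] + (sorted_vals[hi] - sorted_vals[lo]) * frac


def canopy_height_metrics(csm, dtm):
    """Eight height metrics from paired CSM and DTM elevations (metres)."""
    # CHM = CSM - DTM, computed pixel by pixel
    heights = sorted(c - d for c, d in zip(csm, dtm))
    n = len(heights)
    mean = sum(heights) / n
    # Population standard deviation (assumed form)
    std = math.sqrt(sum((h - mean) ** 2 for h in heights) / n)
    return {
        "mean": mean,
        "median": percentile(heights, 50),
        "std": std,
        "cv": std / mean if mean else float("nan"),
        "P25": percentile(heights, 25),
        "P50": percentile(heights, 50),
        "P75": percentile(heights, 75),
        "P95": percentile(heights, 95),
    }


# Hypothetical ROI of five pixels over flat terrain at 1.0 m elevation
print(canopy_height_metrics([1.5, 1.6, 1.7, 1.8, 1.9], [1.0] * 5))
```

Note that the median and the 50th percentile are identical by construction, so these two metrics carry the same information for any regression model.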
Machine learning algorithms are widely used to handle the strong non-linearity between crop biophysical/biochemical parameters and remotely sensed variables. Compared to parametric regression techniques, machine learning algorithms are well suited for establishing predictive models with multiple input variables. To establish individual models for AGB estimation using ten VIs, eight canopy height metrics and their combination, we used three machine learning techniques implemented with the caret package in the R environment (x64, version 3.4.0; R Development Core Team, 2017).
RF is an ensemble learning method that combines a large number of decision trees to improve the accuracy of classification and regression trees (CART) . Each tree is built with a deterministic algorithm by selecting a random set of variables and a random sample from the calibration dataset. RF regression not only handles a large number of input variables, but also obtains a reasonable prediction accuracy using a small subset of variables . In addition, RF regression is beneficial to overcome the over-fitting problem of simple decision trees. For implementation, the two significant parameters (mtry and ntree) need to be optimized to obtain the best predictive power.
ELM is a single-hidden layer feed forward neural network (SLFN), whose learning speed is relatively faster than the conventional feed forward network . ELM is composed of an input layer, a hidden layer, and an output layer. Unlike traditional neural network algorithms, ELM aims to reach the smallest training error and the smallest norm of output weights . The weights of its hidden layer can be randomly generated without iterative optimization, which makes it suitable for real-time training. In addition, ELM is capable of handling complex data and is robust for regressions with multiple highly inter-correlated variables. Its potential in crop monitoring has been demonstrated in a recent study on the estimation of soybean biophysical and biochemical parameters from fused multi-sensor data .
SVR is an effective predictive tool based on statistical learning theory. The advantage of SVR is its ability to handle high-dimensional data and a small number of training samples. Many studies in remote sensing have used SVR to estimate crop biophysical and biochemical parameters [56, 57]. As the critical parameter, the kernel function was set to the radial basis function (RBF) to account for the nonlinear relationships in the wheat data.
Since the variables derived from UAV images might be inter-correlated, Pearson’s correlation coefficients were calculated for the relationships among individual variables and between each variable and AGB. In addition, SMLR was used as the reference for evaluating the performance of the machine learning algorithms relative to traditional techniques.
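The pairwise correlation analysis can be sketched in plain Python as follows. This is a minimal illustration under the assumption that each variable is stored as a list of plot-level values in the same plot order; the names `pearson_r` and `correlation_matrix` and the sample values are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    # A constant variable has zero variance and raises ZeroDivisionError here
    return sxy / math.sqrt(sxx * syy)


def correlation_matrix(columns):
    """Map every ordered pair of variable names to its Pearson r."""
    names = list(columns)
    return {(a, b): pearson_r(columns[a], columns[b]) for a in names for b in names}


# Hypothetical plot-level values for two variables and AGB
data = {
    "VARI": [0.10, 0.15, 0.22, 0.30],
    "P95": [0.35, 0.50, 0.62, 0.80],
    "AGB": [1.2, 2.9, 4.1, 6.0],
}
matrix = correlation_matrix(data)
print(round(matrix[("VARI", "AGB")], 3), round(matrix[("P95", "AGB")], 3))
```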
Since the goal was to build global models with various regression techniques across multiple treatments, growth stages, and seasons, we pooled the data from 2 years and all growing conditions to form a comprehensive dataset. Global models are more practical than local ones since frequent model calibrations for different growing conditions could be avoided. The pooled dataset was split into two parts, with 70% for model calibration and the remaining 30% for model validation. The accuracy of model calibration was evaluated with the coefficient of determination (R2), the root mean square error (RMSE) and the Akaike information criterion (AIC). The estimation accuracies were assessed by the R2, RMSE and the relative RMSE (rRMSE) with validation data.
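The split and the validation statistics can be sketched as follows. Several details are assumptions rather than statements of the authors' procedure: the split is taken to be a simple random shuffle, R2 is computed as one minus the ratio of residual to total sum of squares (the squared correlation between observed and predicted values is another common choice), and rRMSE is taken as the RMSE divided by the mean observed AGB, expressed as a percentage. The function names and sample values are hypothetical.

```python
import math
import random


def split_70_30(samples, seed=0):
    """Randomly split samples into 70% calibration and 30% validation."""
    rng = random.Random(seed)  # fixed seed for a reproducible split
    shuffled = list(samples)
    rng.shuffle(shuffled)
    cut = round(0.7 * len(shuffled))
    return shuffled[:cut], shuffled[cut:]


def validation_metrics(observed, predicted):
    """R2, RMSE (same unit as the input) and rRMSE (%) for paired values."""
    n = len(observed)
    mean_obs = sum(observed) / n
    sse = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    sst = sum((o - mean_obs) ** 2 for o in observed)
    rmse = math.sqrt(sse / n)
    return {
        "R2": 1.0 - sse / sst,
        "RMSE": rmse,
        "rRMSE": 100.0 * rmse / mean_obs,
    }


# Hypothetical observed and predicted AGB values in t/ha
print(validation_metrics([2.0, 4.0, 6.0], [2.5, 3.5, 6.5]))
calibration, validation = split_70_30(range(10))
print(len(calibration), len(validation))
```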
Determination of wheat canopy height model
Figure 2 shows a comparison of measured and DSM-derived elevation for the 25 GCPs as an assessment of the DSM elevation accuracy. The RMSE of the GCP elevation estimated with UAV images was 0.02 m for the pooled data from 2016 and 2017. Subsequently, the PH for the field plots derived from the CHM matched well with the field measurements (Fig. 3), exhibiting an R2 value of 0.89 and an RMSE value of 0.06 m for the 2 years. Overall, the CHM-derived PH was slightly lower than the measured PH (Bias = 0.06 m).
Correlation between UAV-derived variables
Figure 4 shows a matrix of Pearson’s correlation coefficients (r) for the relationships between UAV-derived variables and their relationships with AGB. For the VI group, eight of the ten VIs showed highly positive or negative correlations with extreme r values up to 1 (GRVI vs. MGRVI, GLI vs. RGBVI, and GLI vs. ExG) or -1 (MGRVI vs. ExR and GRVI vs. ExR). VARI and ExB were the most strongly and weakly correlated to AGB, respectively (VARI: r = 0.79, p-value < 0.0001; ExB: r = 0.19, p-value < 0.005). For the height metric group, six of the eight metrics showed highly positive correlations with the maximum r value up to 1 (P50 vs. median, P50 vs. mean, median vs. mean, and std vs. cv). The highest correlations with AGB were found for P95 and P75 (r = 0.83, p-value < 0.05), with the lowest for cv (r = 0.07, p-value > 0.05). Generally, these correlations were stronger than those of VIs with AGB.
Comparison of AGB estimation performance with the SMLR and machine learning techniques
Table 8 shows a comparison of the SMLR and three machine learning techniques for the estimation of AGB over the critical growth stages. Using the VIs alone, RF achieved the best calibration (R2 = 0.70, RMSE = 1.51 t/ha, AIC = 369.23) and validation (R2 = 0.69, RMSE = 1.61 t/ha, rRMSE = 34.06%) performance among the three regression techniques, while the SMLR achieved similar accuracy (validation: R2 = 0.70, RMSE = 1.58 t/ha, rRMSE = 34.49%). When using the canopy height metrics, the best performance was still found for RF, with close performance for SVR and ELM. SVR and ELM performed worse than the SMLR, while RF achieved the highest accuracy. Moreover, this accuracy for RF (calibration: R2 = 0.73, RMSE = 1.44 t/ha, AIC = 398.32; validation: R2 = 0.74, RMSE = 1.39 t/ha, rRMSE = 30.95%) was even higher than that obtained using the VIs, with an increase of 0.05 in R2 and a decrease of 0.22 t/ha in RMSE for the validation data.
The combination of VIs and canopy height metrics yielded further improvement for all regression techniques. Their accuracies were significantly higher than those achieved with the traditional approach of merely using VIs, with a uniform increment of 0.09 in validation R2 for all three regression techniques. Consistently, RF yielded the highest accuracy in calibration (R2 = 0.76, RMSE = 1.34 t/ha, AIC = 369.23) and validation (R2 = 0.78, RMSE = 1.34 t/ha, rRMSE = 28.98%). The scatter plots in Fig. 5 show that the data points are generally closer to the 1:1 line when the VIs and canopy height metrics are combined, but a small portion of them associated with high values of measured AGB fall below the diagonal.
Performance for individual growth stages
Figure 6 shows that the performance of AGB estimation was inconsistent across individual growth stages for the three types of input data and three regression techniques. The accuracy generally degraded from the jointing stage (highest) to the anthesis stage (lowest). The degradation of estimation accuracy from booting to heading was the most prominent change between neighboring stages. In addition, the accuracy at the booting stage was stable across all three machine learning algorithms. Among the three regression techniques, RF was mostly the best performing one and SVR was the most sensitive to growth stage. In contrast to the multi-stage situation, using the VIs as the input data for RF yielded better accuracies than using the canopy height metrics. However, the combination of VIs and canopy height metrics consistently performed better than either type of input data alone. A similar trend was observed when assessing the accuracies with RMSE.
Effect of spatial resolution on AGB estimation
The performance of AGB estimation for a number of image resolutions is displayed in Fig. 7. Generally, the accuracies were more variable for smaller pixel sizes and the comparable accuracy was obtained for 13.28 cm using canopy height metrics or the combined data as input for RF. Canopy height metrics performed better than VIs over the series of pixel sizes for AGB estimation at each critical growing stage. By combining the VIs and canopy height metrics, the performance was less sensitive to pixel size and their combination performed slightly better than the use of VIs and canopy height metrics alone.
Comparison of SMLR and the machine learning techniques
SMLR is a commonly used method for selecting explanatory variables in multivariate regression and is prone to overfitting in quantifying vegetation parameters. Our results demonstrated that the performance of SMLR was worse than that of RF and comparable to those of SVR and ELM, which was consistent with the findings in Li et al. The stable regression performance of SMLR was probably attributable to the relatively small number of input variables (no more than 18) as compared to the hundreds or thousands of bands in spectroscopy analysis. To evaluate its performance in variable selection, we tested the machine learning regression techniques with the variables selected by SMLR as the input data. The calibration and validation accuracies with selected variables (Table 9) were not consistently higher across the three groups of input variables than those with all the original variables (Table 8), which means that the variable selection did not significantly improve the regression performance. In fact, variable selection for the machine learning algorithms would not only increase the complexity of data processing, but also introduce uncertainty due to the removal of highly inter-correlated variables. Using all 18 variables for the combined data would not impose a large burden, since the machine learning algorithms are able to handle high-dimensional data. Since variable selection may be useful for reducing the data volume in the case of large data sets, future research may include searching for advanced selection procedures other than SMLR.
The improvement from the combination of VIs and canopy height metrics
The use of VIs represents a widely used approach to estimating crop AGB from UAV images, but its performance remains to be improved, especially when only RGB images are available. The reasons for the lower accuracy obtained by using VIs alone can be attributed to three aspects. Firstly, the VIs were derived only from the RGB bands and the lack of near-infrared (NIR) bands precluded the enhancement of contrast in vegetation vigor [19, 59]. Secondly, VIs were prone to saturate in high biomass conditions . Thirdly, VIs were directly calculated from digital number (DN) images and it was hard to convert DNs to reflectance due to the wide spectral ranges of visible bands and inaccurate spectral response functions . Moreover, the spectral information in the VIs was mainly from the leaves or panicles in the top layer of wheat canopies. Since the AGB in wheat encompassed the biomass of leaves, stems and panicles, the VIs might not reflect the information from stems that have a higher proportion of AGB compared to leaves in the middle to late growing season.
Canopy height is an important metric for characterizing vertical structure. Previous studies have shown a moderate relationship between canopy height and biomass for barley, grassland and maize. This was confirmed by the good performance of canopy height metrics in the current study, even though the CHM-derived plant height was slightly lower than the field measurements (Fig. 3). Similar underestimations were also reported by Bendig et al. The underestimations could be explained in two ways. Firstly, recurring wind in the field might move the leaves in the canopy so that the position of the same leaves would change between overlapping images. Secondly, the top of a wheat plant is sharp and it was challenging to capture the canopy top at the spatial resolution of 1.66 cm in the UAV images.
The VIs and canopy height metrics used in this study were separately derived from orthophotos and CSMs, which were both generated with overlapping RGB images acquired from a consumer-grade UAV system. The orthophotos recorded canopy surface spectral properties in three visible bands , while the CSMs characterized canopy vertical structure [59, 62]. Combining VIs and canopy height metrics as the input data for regression techniques enabled the use of two types of information sources which are spectral information and structural information. Our results suggest that the use of combined information exhibited better performance for AGB estimation than the use of spectral or structural information alone. Li et al.  investigated the combination of spectral and structural information for AGB estimation in maize, but they did not provide an explicit comparison among the three types of input data. Their study only covered one growth stage in maize and did not consider the performance of the combined information for multiple growth stages. Our study used a combination of ten VIs and eight canopy height metrics to estimate wheat AGB for the critical growth stages for 2 years. Such a number of input variables for the regression techniques provided sufficient spectral information about the top canopy and the structural information about the canopy vertical gradient.
The optimal spatial resolution for AGB estimation
This study used the RGB imagery acquired with a low-cost UAV system to estimate AGB and obtained an R2 up to 0.78 for the multiple growth stages with the RF regression technique. Such high resolution images (1.66 cm pixel−1) would have to be collected at low altitudes with that system, which may be a limiting factor for the efficiency of image collection over large areas. This problem can be overcome by using a higher resolution camera or flying at a higher altitude. Nevertheless, a higher resolution camera may lead to an increase in cost and weight, which may shorten the UAV flight duration compared with a lightweight, consumer-grade camera. Therefore, the sensitivity of AGB estimation accuracy to image spatial resolution is an important reference for configuring the UAV flight altitude. As indicated in Zarco-Tejada et al., a relatively lower image resolution may still yield an acceptable accuracy. Our results suggested that it was feasible to adjust the flight altitude and maintain comparable performance at the same time (Fig. 7).
The results demonstrated that the image resolution at 13.28 cm pixel−1 was the optimal for AGB estimation. Compared with original resolution of 1.66 cm pixel−1 acquired at 30 m, the shape of wheat canopy changed slightly but it was still easy to be identified at 13.28 cm pixel−1 (Fig. 8). However, when degrading to the lower resolutions, the mixed pixels from soil background and wheat led to the lower accuracy for AGB estimation (Fig. 7). Therefore, we suggest that UAV campaigns be carried out at 240 m to achieve the resolution of 13.28 cm pixel−1. That means we might be able to increase the efficiency to eight times with the same UAV system without a compromise of the estimation accuracy. The images at lower resolutions (e.g., 13.28 cm pixel−1) could be obtained by either increasing the flight altitude or using a lower-definition camera. Flying at 240 m is technically feasible and currently allowed by the local aviation regulation policy. Considering the difficulties in locating GCPs on the 13.28 cm pixel−1 image, one solution for future work would be to use new UAV systems with embedded RTK unit and avoid the use of GCPs.
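The relation between flight altitude and pixel size used in this section can be checked with a short calculation. The sketch below assumes a fixed camera, so that the ground sampling distance grows linearly with altitude, and it simulates a coarser image by block averaging. Block averaging is an assumption made for illustration and is not necessarily the resampling method used in this study; the function names are hypothetical.

```python
def gsd_at_altitude(gsd_ref_cm, alt_ref_m, alt_m):
    """Scale the ground sampling distance linearly with altitude (same camera)."""
    return gsd_ref_cm * alt_m / alt_ref_m


def altitude_for_gsd(gsd_ref_cm, alt_ref_m, gsd_target_cm):
    """Altitude needed to reach a target ground sampling distance."""
    return alt_ref_m * gsd_target_cm / gsd_ref_cm


def block_mean(image, k):
    """Degrade a 2-D image (list of rows) by averaging non-overlapping k x k blocks."""
    rows, cols = len(image) // k, len(image[0]) // k
    return [
        [
            sum(image[i * k + di][j * k + dj] for di in range(k) for dj in range(k))
            / (k * k)
            for j in range(cols)
        ]
        for i in range(rows)
    ]


# Reference values from the text: 1.66 cm per pixel at 30 m.
for factor in (1, 2, 4, 8, 16):
    alt = 30 * factor
    print(f"{alt:4d} m -> {gsd_at_altitude(1.66, 30, alt):6.2f} cm per pixel")

# Toy mask (1 = wheat, 0 = soil): averaging over larger blocks mixes the two classes.
mask = [[1, 1, 0, 0], [1, 1, 0, 0], [0, 0, 1, 1], [0, 0, 1, 1]]
print(block_mean(mask, 2))  # blocks are still pure wheat or pure soil
print(block_mean(mask, 4))  # one mixed pixel
```

Under the linear assumption, the eightfold increase in altitude from 30 m to 240 m gives the 13.28 cm pixel size discussed above, and the toy mask shows how mixed soil and wheat pixels appear once the blocks become larger than the canopy pattern.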
Comparison of RF to SVR and ELM for AGB estimation
Machine learning techniques have proved to be powerful for non-linear regression between remotely sensed data and biomass [64, 65]. This study evaluated the performance of three machine learning techniques for AGB estimation with VIs, canopy height metrics and their combination as the input variables. The results demonstrated that RF consistently outperformed ELM and SVR. In related studies on the estimation of crop AGB, RF was also found to be superior to SVR and artificial neural network (ANN) for wheat and to stepwise multiple linear regression (SMLR) for maize [7, 12] and wetland vegetation.
RF regression is one of the popular ensemble learning algorithms, combining a large set of regression sub-models. It is capable of modeling a large number of inter-correlated input variables and is not sensitive to noise or over-fitting [37, 67]. SVR tries to fit a hyperplane to as many calibration data as possible based on statistical learning principles. The estimation accuracy of SVR depends on proper settings of the meta-parameters and selection of the kernel function. The optimal parameters can be obtained by grid search and iterative tuning. ELM is an efficient and rapid learning algorithm that requires little human intervention and does not need a kernel function. In this study, most of the variables derived from the UAV images were inter-correlated. RF is more suitable for dealing with two or more mutually correlated variables because of its insensitivity to collinearity. Previous studies have also shown that RF is more likely to achieve high accuracy owing to its stability and robustness for complex and non-linear regressions [12, 64, 66]. The performance of RF for AGB estimation in wheat still needs to be validated with data sets from more study sites and varieties.
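Whether a set of input variables is inter-correlated can be screened before model fitting. The sketch below is not part of the RF algorithm and is not taken from this study; it simply lists pairs of variables whose absolute Pearson correlation exceeds a threshold. The threshold of 0.9, the variable names and the sample values are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def correlated_pairs(features, threshold=0.9):
    """Return (name_a, name_b, r) for every pair with |r| >= threshold."""
    names = sorted(features)
    pairs = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            r = pearson(features[a], features[b])
            if abs(r) >= threshold:
                pairs.append((a, b, r))
    return pairs


# Hypothetical per-plot values of two VIs and one canopy height metric.
features = {
    "ExG": [0.1, 0.2, 0.3, 0.4],
    "NGRDI": [0.05, 0.10, 0.15, 0.20],
    "H_mean": [0.5, 0.2, 0.6, 0.3],
}
print(correlated_pairs(features))
```

A long list of flagged pairs would indicate the kind of collinearity that makes linear methods such as SMLR unstable, whereas RF is described above as insensitive to it.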
This study compared the performance of the SMLR and three machine learning techniques for AGB estimation with VIs, canopy height metrics and their combination derived from highly overlapping imagery acquired with a low-cost UAV system. The results demonstrated that, with all regression techniques, the combination of VIs and canopy height metrics improved the estimation accuracy over the use of VIs or canopy height metrics alone. In addition, RF yielded the most accurate estimates among the four regression techniques. Using RF, we demonstrated that a comparable accuracy for AGB estimation was obtained at a resolution of 13.28 cm pixel−1, which is one-eighth of the resolution of the original orthophotos.
The findings imply that a consumer-grade camera mounted on a lightweight UAV could yield an R2 of up to 0.78 and an RMSE as low as 1.34 t/ha for AGB estimation in wheat. We proposed an inexpensive approach consisting of the RF algorithm and the combination of VIs and canopy height metrics derived from a low-cost UAV system at the consumer-grade level. This approach can be assessed for the efficient and economic monitoring of other growth parameters such as leaf area index in future research.
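For readers who wish to reproduce accuracy figures of this kind on their own data, the sketch below computes R2, RMSE and rRMSE from paired observed and predicted AGB values. The definitions are assumptions: R2 is taken as the coefficient of determination (1 minus the ratio of residual to total sum of squares), and rRMSE as the RMSE divided by the mean observed value, expressed in percent. The sample values are hypothetical.

```python
from math import sqrt


def accuracy_metrics(observed, predicted):
    """Return R2, RMSE and rRMSE (%) for paired observed and predicted values."""
    n = len(observed)
    mean_obs = sum(observed) / n
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))  # residual SS
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)              # total SS
    rmse = sqrt(ss_res / n)
    return {
        "R2": 1 - ss_res / ss_tot,
        "RMSE": rmse,
        "rRMSE": 100 * rmse / mean_obs,
    }


# Hypothetical AGB values in t/ha.
observed = [2.0, 4.0, 6.0, 8.0]
predicted = [3.0, 4.0, 5.0, 8.0]
print(accuracy_metrics(observed, predicted))
```

Because rRMSE is normalized by the mean observed value under this definition, it allows errors to be compared across data sets with different biomass ranges.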
Tilly N, Aasen H, Bareth G. Fusion of plant height and vegetation indices for the estimation of barley biomass. Remote Sens. 2015;7(9):11449–80.
Gnyp ML, Bareth G, Li F, Lenz-Wiedemann VIS, Koppe W, Miao Y, Hennig SD, Jia L, Laudien R, Chen X, et al. Development and implementation of a multiscale biomass model using hyperspectral vegetation indices for winter wheat in the North China Plain. Int J Appl Earth Obs Geoinf. 2014;33:232–42.
Boschetti M, Bocchi S, Brivio PA. Assessment of pasture production in the Italian Alps using spectrometric and remote sensing information. Agric Ecosyst Environ. 2007;118(1):267–72.
Jin X, Kumar L, Li Z, Xu X, Yang G, Wang J. Estimation of winter wheat biomass and yield by combining the AquaCrop model and field hyperspectral data. Remote Sens. 2016;8(12):1–15.
Fu Y, Yang G, Wang J, Song X, Feng H. Winter wheat biomass estimation based on spectral indices, band depth analysis and partial least squares regression using hyperspectral measurements. Comput Electron Agric. 2014;100:51–9.
Bendig J, Yu K, Aasen H, Bolten A, Bennertz S, Broscheit J, Gnyp ML, Bareth G. Combining UAV-based plant height from crop surface models, visible, and near infrared vegetation indices for biomass monitoring in barley. Int J Appl Earth Obs Geoinf. 2015;39:79–87.
Li W, Niu Z, Chen H, Li D, Wu M, Zhao W. Remote estimation of canopy height and aboveground biomass of maize using high-resolution stereo images from a low-cost unmanned aerial vehicle system. Ecol Indic. 2016;67:637–48.
Cheng T, Song R, Li D, Zhou K, Zheng H, Yao X, Tian Y, Cao W, Zhu Y. Spectroscopic estimation of biomass in canopy components of paddy rice using dry matter and chlorophyll indices. Remote Sens. 2017;9(4):319.
Greaves HE, Vierling LA, Eitel JUH, Boelman NT, Magney TS, Prager CM, Griffin KL. High-resolution mapping of aboveground shrub biomass in Arctic tundra using airborne lidar and imagery. Remote Sens Environ. 2016;184:361–73.
Mutanga O, Adam E, Cho MA. High density biomass estimation for wetland vegetation using WorldView-2 imagery and random forest regression algorithm. Int J Appl Earth Obs Geoinf. 2012;18(1):399–406.
Zheng H, Cheng T, Li D, Zhou X, Yao X, Tian Y, Cao W, Zhu Y. Evaluation of RGB, color-infrared and multispectral images acquired from unmanned aerial systems for the estimation of nitrogen accumulation in rice. Remote Sens. 2018;10(6):824.
Wang LA, Zhou X, Zhu X, Dong Z, Guo W. Estimation of biomass in wheat using random forest regression algorithm and remote sensing data. Crop J. 2016;4(3):212–9.
Wu G, Leeuw JD, Skidmore AK, Prins HHT, Liu Y. Exploring the possibility of estimating the aboveground biomass of Vallisneria spiralis L. using Landsat TM image in Dahuchi, Jiangxi Province, China. In: MIPPR 2005: geospatial information, data mining, and applications. International Society for Optics and Photonics; 2006. p. 60452P.
van Iersel W, Straatsma M, Addink E, Middelkoop H. Monitoring height and greenness of non-woody floodplain vegetation with UAV time series. ISPRS J Photogramm Remote Sens. 2018;141:112–23.
Schut AGT, Traore PCS, Blaes X, de By RA. Assessing yield and fertilizer response in heterogeneous smallholder fields with UAVs and satellites. Field Crops Res. 2018;221:98–107.
Moeckel T, Dayananda S, Nidamanuri R, Nautiyal S, Hanumaiah N, Buerkert A, Wachendorf M. Estimation of vegetable crop parameter by multi-temporal UAV-borne images. Remote Sens. 2018;10(5):805.
Roth L, Streit B. Predicting cover crop biomass by lightweight UAS-based RGB and NIR photography: an applied photogrammetric approach. Precis Agric. 2017;19(1):93–114.
Weiss M, Baret F. Using 3D point clouds derived from UAV RGB imagery to describe vineyard 3D macro-structure. Remote Sens. 2017;9(2):111.
Hunt JER, Hively WD, Fujikawa SJ, Linden DS, Daughtry CST, McCarty GW. Acquisition of NIR-green-blue digital photographs from unmanned aircraft for crop monitoring. Remote Sens. 2010;2(1):290–305.
Hunt ER, Cavigelli M, Daughtry CST, Mcmurtrey JE, Walthall CL. Evaluation of digital photography from model aircraft for remote sensing of crop biomass and nitrogen status. Precis Agric. 2005;6(4):359–78.
Bendig J, Bolten A, Bennertz S, Broscheit J, Eichfuss S, Bareth G. Estimating biomass of barley using crop surface models (CSMs) derived from UAV-based RGB imaging. Remote Sens. 2014;6(11):10395–412.
Iqbal F, Lucieer A, Barry K, Wells R. Poppy crop height and capsule volume estimation from a single UAS flight. Remote Sens. 2017;9(7):647.
Miller CD, Fox-Rabinovitz JR, Allen NF, Carr JL, Kratochvil RJ, Forrestal PJ, Daughtry CST, McCarty GW, Hively WD, Hunt ER. NIR-green-blue high-resolution digital images for assessment of winter cover crop biomass. GISci Remote Sens. 2011;48(1):86–98.
Jing R, Gong Z, Zhao W, Pu R, Deng L. Above-bottom biomass retrieval of aquatic plants with regression models and SfM data acquired by a UAV platform: a case study in Wild Duck Lake Wetland, Beijing, China. ISPRS J Photogramm Remote Sens. 2017;134:122–34.
Thenkabail PS, Smith RB, Pauw ED. Hyperspectral vegetation indices and their relationships with agricultural crop characteristics. Remote Sens Environ. 2000;71(2):158–82.
Reddersen B, Fricke T, Wachendorf M. A multi-sensor approach for predicting biomass of extensively managed grassland. Comput Electron Agric. 2014;109:247–60.
Watanabe K, Guo W, Arai K, Takanashi H, Kajiyakanegae H, Kobayashi M, Yano K, Tokunaga T, Fujiwara T, Tsutsumi N. High-throughput phenotyping of sorghum plant height using an unmanned aerial vehicle and its application to genomic prediction modeling. Front Plant Sci. 2017;8:421.
Schirrmann M, Giebel A, Gleiniger F, Pflanz M, Lentschke J, Dammer K-H. Monitoring agronomic parameters of winter wheat crops with low-cost UAV imagery. Remote Sens. 2016;8(9):706.
Kim D-W, Yun H, Jeong S-J, Kwon Y-S, Kim S-G, Lee W, Kim H-J. Modeling and testing of growth status for Chinese cabbage and white radish with UAV-based RGB imagery. Remote Sens. 2018;10(4):563.
Holman FH, Riche AB, Michalski A, Castle M, Wooster MJ, Hawkesford MJ. High throughput field phenotyping of wheat plant height and growth rate in field plot trials using UAV based remote sensing. Remote Sens. 2016;8(12):1031.
Madec S, Baret F, de Solan B, Thomas S, Dutartre D, Jezequel S, Hemmerlé M, Colombeau G, Comar A. High-throughput phenotyping of plant height: comparing unmanned aerial vehicles and ground LiDAR estimates. Front Plant Sci. 2017;8:2002.
Rivera J, Verrelst J, Delegido J, Veroustraete F, Moreno J. On the semi-automatic retrieval of biophysical parameters based on spectral index optimization. Remote Sens. 2014;6(6):4927–51.
Ali I, Greifeneder F, Stamenkovic J, Neumann M, Notarnicola C. Review of machine learning approaches for biomass and soil moisture retrievals from remote sensing data. Remote Sens. 2015;7(12):15841.
Grossman YL, Ustin SL, Jacquemoud S, Sanderson EW, Schmuck G, Verdebout J. Critique of stepwise multiple linear regression for the extraction of leaf biochemistry information from leaf reflectance data. Remote Sens Environ. 1996;56(3):182–93.
Yue J, Feng H, Yang G, Li Z. A comparison of regression techniques for estimation of above-ground winter wheat biomass using near-surface spectroscopy. Remote Sens. 2018;10(1):66.
Atzberger C, Guerif M, Baret F, Werner W. Comparative analysis of three chemometric techniques for the spectroradiometric assessment of canopy chlorophyll content in winter wheat. Comput Electron Agric. 2010;73(2):165–73.
Breiman L. Random forests. Mach Learn. 2001;45(1):5–32.
Mountrakis G, Im J, Ogole C. Support vector machines in remote sensing: a review. ISPRS J Photogramm Remote Sens. 2011;66(3):247–59.
Huang G-B, Zhu Q-Y, Siew C-K. Extreme learning machine: theory and applications. Neurocomputing. 2006;70(1):489–501.
Hansen PM, Schjoerring JK. Reflectance measurement of canopy biomass and nitrogen status in wheat crops using normalized difference vegetation indices and partial least squares regression. Remote Sens Environ. 2003;86(4):542–53.
Prabhakara K, Hively WD, Mccarty GW. Evaluating the relationship between biomass, percent groundcover and remote sensing indices across six winter cover crop fields in Maryland, United States. Int J Appl Earth Obs Geoinf. 2015;39:88–102.
Maimaitijiang M, Ghulam A, Sidike P, Hartling S, Maimaitiyiming M, Peterson K, Shavers E, Fishman J, Peterson J, Kadam S, et al. Unmanned Aerial System (UAS)-based phenotyping of soybean using multi-sensor data fusion and extreme learning machine. ISPRS J Photogramm Remote Sens. 2017;134:43–58.
Gonsamo A. Leaf area index retrieval using gap fractions obtained from high resolution satellite data: comparisons of approaches, scales and atmospheric effects. Int J Appl Earth Obs Geoinf. 2010;12(4):233–48.
Torres-Sanchez J, Pena JM, de Castro AI, Lopez-Granados F. Multi-temporal mapping of the vegetation fraction in early-season wheat fields using images from UAV. Comput Electron Agric. 2014;103:104–13.
Zhou X, Zheng HB, Xu XQ, He JY, Ge XK, Yao X, Cheng T, Zhu Y, Cao WX, Tian YC. Predicting grain yield in rice using multi-temporal vegetation indices from UAV-based multispectral and digital imagery. ISPRS J Photogramm Remote Sens. 2017;130:246–55.
Gitelson AA, Kaufman YJ, Stark R, Rundquist D. Novel algorithms for remote estimation of vegetation fraction. Remote Sens Environ. 2002;80(1):76–87.
Woebbecke DM, Meyer GE, Bargen KV, Mortensen DA. Color indices for weed identification under various soil, residue, and lighting conditions, vol. 38. St. Joseph, MI: ETATS-UNIS: American Society of Agricultural Engineers; 1995.
Meyer GE, Neto JC. Verification of color vegetation indices for automated crop imaging applications. Comput Electron Agric. 2008;63(2):282–93.
Mao W, Wang Y, Wang Y. Real-time detection of between-row weeds using machine vision. In: 2003, Las Vegas, NV July 27–30, 2003.
Neto JC. A combined statistical-soft computing approach for classification and mapping weed species in minimum-tillage systems (Doctoral dissertation). University of Nebraska - Lincoln. 2004. Retrieved from http://digitalcommons.unl.edu/dissertations/AAI3147135.
Tucker CJ. Red and photographic infrared linear combinations for monitoring vegetation. Remote Sens Environ. 1979;8(2):127–50.
Louhaichi M, Borman MM, Johnson DE. Spatially located platform and aerial photography for documentation of grazing impacts on wheat. Geocarto Int. 2001;16(1):65–70.
Kawashima S, Nakatani M. An algorithm for estimating chlorophyll content in leaves using a video camera. Ann Bot. 1998;81(1):49–54.
Huang GB, Zhou H, Ding X, Zhang R. Extreme learning machine for regression and multiclass classification. IEEE Trans Syst Man Cybern Part B. 2012;42(2):513–29.
Vapnik VN. The nature of statistical learning theory. Berlin: Springer; 1995.
Yao X, Huang Y, Shang G, Zhou C, Cheng T, Tian Y, Cao W, Zhu Y. Evaluation of six algorithms to monitor wheat leaf nitrogen concentration. Remote Sens. 2015;7(11):14939–66.
Gao Y, Lu D, Li G, Wang G, Chen Q, Liu L, Li D. Comparative analysis of modeling algorithms for forest aboveground biomass estimation in a subtropical region. Remote Sens. 2018;10(4):627.
Jia F, Liu G, Liu D, Zhang Y, Fan W, Xing X. Comparison of different methods for estimating nitrogen concentration in flue-cured tobacco leaves based on hyperspectral reflectance. Field Crops Res. 2013;150(15):108–14.
Tilly N, Hoffmeister D, Cao Q, Huang S, Lenz-Wiedemann V, Miao Y, Bareth G. Multitemporal crop surface models: accurate plant height measurement and biomass estimation with terrestrial laser scanning in paddy rice. J Appl Remote Sens. 2014;8(1):083671.
Hatfield JL, Gitelson AA, Schepers JS, Walthall CL. Application of spectral remote sensing for agronomic decisions. Agron J. 2008;100(3):117–31.
Ehlert D, Horn H-J, Adamek R. Measuring crop biomass density by laser triangulation. Comput Electron Agric. 2008;61(2):117–25.
Li W. Correlating the horizontal and vertical distribution of LiDAR point clouds with components of biomass in a picea crassifolia forest. Forests. 2014;5(8):1910–30.
Zarco-Tejada PJ, Diaz-Varela R, Angileri V, Loudjani P. Tree height quantification using very high resolution imagery acquired from an unmanned aerial vehicle (UAV) and automatic 3D photo-reconstruction methods. Eur J Agron. 2014;55(2):89–99.
Gleason CJ, Im J. Forest biomass estimation from airborne LiDAR data using machine learning approaches. Remote Sens Environ. 2012;125:80–91.
Prasad R, Pandey A, Singh KP, Singh VP, Mishra RK, Singh D. Retrieval of spinach crop parameters by microwave remote sensing with back propagation artificial neural networks: a comparison of different transfer functions. Adv Space Res. 2012;50(3):363–70.
Lin L, Wang F, Xie X, Zhong S. Random forests-based extreme learning machine ensemble for multi-regime time series prediction. Expert Syst Appl. 2017;83(C):164–76.
Jin X, Diao W, Xiao C, Wang F, Chen B, Wang K, Li S. Estimation of wheat agronomic parameters using new spectral indices. PLoS ONE. 2013;8(8):e72736.
All authors have made significant contributions to this research. TC, QC, XY, YT, YZ and WC conceived and designed the experiments; NL, JZ, ZH and DL performed the experiments; NL, JZ, DL and ZH processed and analyzed the data; TC and NL wrote the paper.
We would like to thank Min Jia, Xiao Zhang, Jie Zhu, Chaojie Niu, Chunchen Ma and Hengbiao Zheng for their help in the data collection. The authors also thank two anonymous reviewers for their detailed suggestions for improving the manuscript.
The authors declare that they have no competing interests.
Availability of data and materials
The datasets used and/or analyzed during the current study are available from the corresponding author on reasonable request.
Consent for publication
Ethics approval and consent to participate
This work was supported by the National Key R&D Program (2016YFD0300608), the National Natural Science Foundation of China (31601222, 31725020), the Academic Program Development of Jiangsu Higher Education Institutions (PAPD), and the project for Student Research Training (SRT) in the College of Agriculture at Nanjing Agricultural University (1711A20).