High-throughput phenotyping allows the selection of soybean genotypes for earliness and high grain yield



Precision agriculture techniques are widely used to optimize fertilizer and soil applications. Furthermore, these techniques could also be combined with new statistical tools to assist in phenotyping in breeding programs. In this study, the research hypothesis was that soybean cultivars show phenotypic differences concerning wavelength and vegetation index measurements.


In this research, we associate variables obtained via high-throughput phenotyping with the grain yield and cycle of soybean genotypes. The experiment was carried out during the 2018/2019 and 2019/2020 crop seasons, under a randomized block design with four replications. The evaluated soybean genotypes included 7067, 7110, 7739, 8372, Bonus, Desafio, Maracai, Foco, Pop, and Soyouro. The phenotypic traits evaluated were: first pod height (FPH), plant height (PH), number of branches (NB), stem diameter (SD), days to maturity (DM), and grain yield (YIE). The spectral variables evaluated were wavelengths and vegetation indices (NDVI, SAVI, GNDVI, NDRE, SCCCI, EVI, and MSAVI). The genotypes Maracai and Foco showed the highest grain yields throughout the crop seasons, in addition to belonging to the groups with the highest means for all VIs. YIE was positively correlated with the NDVI and certain wavelengths (735 and 790 nm), indicating that genotypes with higher values for these spectral variables are more productive. By path analyses, GNDVI and NDRE had the highest direct effects on the dependent variable DM, while NDVI had a higher direct effect on YIE.


Our findings revealed that early and productive genotypes can be selected based on vegetation indices and wavelengths. Soybean genotypes with a high grain yield have higher means for NDVI and certain wavelengths (735 and 790 nm). Early genotypes have higher means for NDRE and GNDVI. These results reinforce the importance of high-throughput phenotyping as an essential tool in soybean breeding programs.


Soybean (Glycine max (L.) Merrill) is the most important oilseed crop and a major commodity worldwide. The crop complex in Brazil [a combination of grains and main derivatives (oil and bran)] surpassed that of the USA in the 2019/2020 crop season [1]. Due to large-scale world population growth combined with unstable product prices, the demand for quality raw materials and fair prices has increased [2], requiring highly productive cultivars and increasingly efficient farming systems.

In this sense, Brazilian soybean breeding programs have sought cultivars that combine high grain yield and earliness. Earliness has been a target because it allows farmers to grow corn or cotton in the off-season, after soybean cultivation (in-season). Furthermore, early-cycle genotypes remain less time in the field and are subject to less disease pressure [3]. However, plant cycle characterization is a time and labor-demanding task since it requires on-field counting of the number of days from emergence until flowering or maturation of each genotype. This is because hundreds of soybean genotypes are evaluated annually in the breeding programs, and the cycle monitoring for each plot must be performed daily. To overcome these difficulties, the use of remote sensing techniques emerges as a high potential tool, providing specific and large-scale information for crop assessment [4,5,6,7].

Remote sensing-based high-throughput phenotyping (HTP) is a reliable and fast approach to real-time and large-scale plant trait measurements [8,9,10]. Adopting this approach is essential to achieve greater efficiency of plant breeding, as it provides monitoring and decision support with applications in several scenarios, such as monitoring the plant status [8, 11,12,13], discriminating cultivars [4, 6], predicting crop yield [7, 14, 15], and selecting genotypes for traits of interest [9, 16, 17]. Unmanned Aerial Vehicles (UAVs) are essential for remote sensing-based HTP since they provide fast real-time data via remote sensors. These tools are required for obtaining Vegetation Indices (VIs), which are mathematical combinations of different wavelengths [7, 10, 18]. UAVs can estimate the spectral component of vegetation through combinations between red and near-infrared spectral bands [19] and can be assembled to assess the growth vigor, nutrient status, and photosynthetic activity of the plants in the field [20,21,22]. Thus, the use of spectral variables obtained by UAV imaging appears to be a promising approach for reliable, faster, and cost-effective measurements of the cycle and yield-related traits in soybean.

Santana et al. [11] assessed the relationship between VIs obtained from UAV multispectral imagery and leaf N content and yield-related traits in corn varieties grown under different N topdressing levels, and they verified a positive relationship between NDVI and NDRE and grain yield under adequate N levels. Da Silva et al. [7], in a study aiming to identify which VIs can be used in soybean grain yield prediction using a UAV and a remote multispectral sensor, verified that NDVI and SAVI had the highest direct effects on grain yield. However, further studies assessing the relationship between VIs and cycle and yield-related traits in soybean cultivars are still needed. Identifying the cause-and-effect relationship between spectral and agronomic variables provides an easier and faster phenotyping process in breeding programs since efforts can be directed only to the wavelengths and VIs showing the highest cause-and-effect relationship with cycle and yield. Additionally, genotypes with better means for these spectral variables should be identified to achieve an efficient selection for yield and earliness.

The research hypothesis was that soybean cultivars show phenotypic differences concerning the measurements of wavelengths and VIs. Thus, the objective of this study was to identify variables obtained by UAV-based HTP that are related to the grain yield and cycle of soybean genotypes.


Field trials

During the 2018/2019 and 2019/2020 crop seasons, two experiments were carried out at the experimental field of the Federal University of Mato Grosso do Sul, campus of Chapadão do Sul (18° 46′ 26″ S, 52° 37′ 28″ W, mean altitude of 810 m). According to the Köppen classification system, the climate is classified as tropical savanna (Aw). The soil of the experimental area is classified as dystrophic Red Latosol with a clay texture [23], and has the following chemical characteristics in the 0–20 cm layer, according to International System of Units: pH (CaCl2) = 5.0; H + Al = 35.0 mmolc dm−3; Ca = 26.0 mmolc dm−3; Mg = 6.0 mmolc dm−3; K = 55.0 mg dm−3; P = 16.0 mg dm−3; S = 24.0 mg dm−3; B = 0.46 mg dm−3; Cu = 0.7 mg dm−3; Fe = 25.0 mg dm−3; Mn = 10.4 mg dm−3; Zn = 5.2 mg dm−3; OM = 24.0 g dm−3; CEC = 68.0 mmolc dm−3; and base saturation = 48.8%.

The experimental design consisted of randomized blocks with four replications. The evaluated soybean genotypes included 7067, 7110, 7739, 8372, Bonus, Desafio, Maracai, Foco, Pop, and Soyouro. The main phenotypic traits of the cultivars are shown in Table 1. The plots consisted of five rows (4 m long) with a spacing of 0.45 m. The sowing density was 15 plants m−1. The climatic conditions during the experiments are shown in Fig. 1.

Table 1 Characterization of evaluated soybean cultivars
Fig. 1 Weather conditions during the 2018/2019 and 2019/2020 crop seasons

Evaluated variables

During both crop seasons, the high-throughput phenotyping measurements were performed at 60 days after emergence (DAE). On this date, soybean cultivars were in R1 and R2 according to the phenological scale of Fehr and Caviness [24], showing the highest vegetative peak and maximum physiological development [25]. For spectral data acquisition, we used a SenseFly eBee RTK fixed-wing unmanned aerial vehicle (UAV), with autonomous take-off control, flight plan, and landing. The overflights were performed with 75% lateral and 80% longitudinal overlap of the images taken. The overflight was performed at 100 m altitude, allowing a spatial image resolution of 0.10 m. The aircraft can cover between 50 and 400 ha per flight (with a fully charged battery and depending on the required spatial resolution, wind speed, and flight altitude). The flight autonomy, which can reach 45 min, depends on the desired spatial resolution of the images and the overlap of the passes or flight lines. The aircraft has a wingspan of 0.96 m, and its weight without camera and battery is 0.46 kg. The nominal take-off weight with camera and battery is 0.73 kg. The nominal cruise speed is between 40 and 90 km h−1 (wind dependent).

The SenseFly eBee RTK was equipped with a luminosity sensor and the Parrot Sequoia multispectral camera, with 1280 × 960 pixels and a pixel size of 3.75 × 3.75 µm (focal length of 3.98 mm). The Parrot Sequoia includes a sunshine sensor at the top of the equipment, which registers the sun’s total spectral irradiance at sensor level and thus facilitates the automatic determination of the at-sensor reflectance. The spectral bands, given as the assumed Full-Width at Half Maximum (FWHM) ranges provided in the specification sheet and obtained by assuming the shape of the relative spectral response function [26], are: Green 530–570 nm; Red 640–680 nm; Red-edge 730–740 nm; and NIR 770–810 nm. The overflights were carried out at 11 a.m., with the sun near the zenith, to minimize shadows from the trees, given that the multispectral sensor is of the passive type, that is, dependent on solar luminosity.
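The reported spatial resolution of about 0.10 m can be cross-checked from the flight altitude, pixel size, and focal length given above. The sketch below assumes a nadir-looking camera over flat terrain and uses the usual pinhole relation (ground sampling distance = altitude × pixel size / focal length); the function name is illustrative and not part of the study.

```python
def ground_sampling_distance(altitude_m: float, pixel_size_um: float, focal_length_mm: float) -> float:
    """Ground sampling distance in metres per pixel (pinhole model, nadir view)."""
    pixel_size_m = pixel_size_um * 1e-6      # micrometres -> metres
    focal_length_m = focal_length_mm * 1e-3  # millimetres -> metres
    return altitude_m * pixel_size_m / focal_length_m


# Values taken from the text: 100 m altitude, 3.75 um pixels, 3.98 mm focal length.
gsd = ground_sampling_distance(100.0, 3.75, 3.98)
print(f"GSD = {gsd:.3f} m per pixel")
```

Under these assumptions the result is roughly 0.094 m per pixel, which agrees with the 0.10 m resolution stated for the 100 m overflight.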

The following wavelengths were evaluated: green (550 nm), red (660 nm), near-infrared (735 nm), and infrared (790 nm). The information acquired in these wavelengths allowed calculating the different vegetation indices, as shown in Table 2. The aerial survey was carried out using Real-Time Kinematics (RTK) technology, which was used to estimate the position of the camera at the time of image collection, with an accuracy of 2.5 cm. The images were mosaicked and orthorectified using the Pix4Dmapper software package. The positional accuracy of the orthoimages was verified using ground control points (GCP), obtained via data surveys in combination with RTK. A calibration reference plaque (calibration target) was also used, in the Pix4Dmapper software, to convert the digital numbers of the pixels into reflectance values.

Table 2 Vegetation index (VI) equations generated from high-throughput phenotyping and their respective references
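As an illustration of how band reflectances are turned into vegetation indices, the sketch below uses the formulations commonly found in the literature for six of the indices named above. Table 2 is not reproduced in this text, so these formulas are an assumption and may differ in detail from those used in the study. The soil factor of 0.5 for SAVI and the reflectance values are likewise illustrative. EVI is left out because its usual form requires a blue band, which is not among the sensor bands listed.

```python
import numpy as np


def vegetation_indices(green, red, red_edge, nir, soil_factor=0.5):
    """Compute common vegetation indices from reflectance values (scalars or arrays)."""
    green, red, red_edge, nir = (
        np.asarray(band, dtype=float) for band in (green, red, red_edge, nir)
    )
    ndvi = (nir - red) / (nir + red)
    gndvi = (nir - green) / (nir + green)
    ndre = (nir - red_edge) / (nir + red_edge)
    # SAVI adds a soil-brightness correction factor (assumed 0.5 here).
    savi = (1.0 + soil_factor) * (nir - red) / (nir + red + soil_factor)
    # MSAVI (self-adjusting soil correction, often written MSAVI2).
    msavi = (2.0 * nir + 1.0 - np.sqrt((2.0 * nir + 1.0) ** 2 - 8.0 * (nir - red))) / 2.0
    # SCCCI as the ratio of NDRE to NDVI.
    sccci = ndre / ndvi
    return {"NDVI": ndvi, "GNDVI": gndvi, "NDRE": ndre,
            "SAVI": savi, "MSAVI": msavi, "SCCCI": sccci}


# Hypothetical plot-mean reflectances at 550, 660, 735 and 790 nm.
vi = vegetation_indices(green=0.08, red=0.04, red_edge=0.30, nir=0.50)
for name, value in vi.items():
    print(f"{name}: {float(value):.3f}")
```

Because the function accepts arrays, the same call can be applied to whole orthomosaic bands or to per-plot means.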

From each plot, five plants were randomly selected to evaluate the following agronomic traits: first pod height (FPH, cm), plant height (PH, cm), main stem diameter (SD, cm), hundred grain mass (HGM), days to maturity (DM), and grain yield (YIE). A measuring tape was used to evaluate both the FPH and PH. The SD was assessed with the aid of a digital caliper. The DM corresponded to the number of days between the emergence and maturity of the plants. The HGM was assessed using an analytical precision balance and corrected to 13% moisture. The central row of each plot was manually harvested to evaluate the YIE, which was then corrected to 13% moisture and extrapolated to kg ha−1. Figure 2 shows a diagram of ground data collection.
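The moisture correction and extrapolation of plot yield can be sketched as follows. The text does not give the formula, so this uses the standard dry-matter-preserving correction and assumes the harvested area is one 4 m row at 0.45 m spacing (1.8 m²). The sample mass and moisture are invented for illustration.

```python
def correct_to_moisture(mass_kg: float, measured_moisture_pct: float,
                        target_moisture_pct: float = 13.0) -> float:
    """Rescale grain mass to the target moisture while keeping dry matter constant."""
    return mass_kg * (100.0 - measured_moisture_pct) / (100.0 - target_moisture_pct)


def yield_kg_per_ha(mass_kg: float, row_length_m: float = 4.0,
                    row_spacing_m: float = 0.45) -> float:
    """Extrapolate the mass harvested from one row to kg per hectare."""
    harvested_area_m2 = row_length_m * row_spacing_m  # assumed harvested area
    return mass_kg / harvested_area_m2 * 10_000.0


# Hypothetical sample: 0.70 kg of grain harvested at 16% moisture.
corrected_mass = correct_to_moisture(0.70, 16.0)
plot_yield = yield_kg_per_ha(corrected_mass)
print(f"{plot_yield:.0f} kg/ha at 13% moisture")
```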

Fig. 2 Diagram of ground data collection

Statistical analysis

The data were submitted to individual analyses of variance, considering all effects as fixed. After verifying that the ratio between the largest and smallest mean squared errors did not exceed 7.0, a joint analysis was performed in accordance with the model described in Eq. 1. The Scott–Knott test [27] was used for grouping the means.

$$ Y_{ijk} = \mu + B_{k} + G_{i} + S_{j} + G \times S_{ij} + \varepsilon_{ijk} , $$

where \(Y_{ijk}\) is the observation for the k-th block evaluated in the i-th cultivar during the j-th crop season; \(\mu\) is the overall mean; \(B_{k}\) is the fixed block effect; \(G_{i}\) is the fixed genotype effect; \(S_{j}\) is the random crop season effect; \(G \times S_{ij}\) is the random interaction between genotypes and crop seasons; and \( \varepsilon_{ijk}\) is the error associated with observation \(Y_{ijk}\).
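The homogeneity check that precedes the joint analysis (ratio between the largest and smallest mean squared errors not exceeding 7.0) can be written as a small helper. This is a minimal sketch; the function name and the error mean squares are hypothetical, not values from the study.

```python
def can_pool_seasons(mse_by_season, max_ratio=7.0):
    """Return the largest/smallest MSE ratio and whether a joint analysis is admissible."""
    ratio = max(mse_by_season) / min(mse_by_season)
    return ratio, ratio <= max_ratio


# Hypothetical error mean squares from the individual analyses of each crop season.
ratio, poolable = can_pool_seasons([120.0, 310.0])
print(f"ratio = {ratio:.2f}, joint analysis allowed: {poolable}")
```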

Pearson’s correlations (r) between the evaluated trait pairs were estimated according to Eq. 2:

$$ r = \frac{{COV_{{\left( {xy} \right)}} }}{{\sqrt {\hat{\sigma }_{x}^{2} \times \hat{\sigma }_{y}^{2} } }}, $$

where COV(xy) is the covariance between traits X and Y; \(\hat{\sigma }_{x}^{2}\) is the variance of trait X; and \(\hat{\sigma }_{y}^{2}\) is the variance of trait Y.
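The correlation in Eq. 2 can be computed directly from the covariance and the two variances. The sketch below does this with NumPy on invented plot values (not data from the study).

```python
import numpy as np


def pearson_r(x, y):
    """Pearson correlation as covariance divided by the root of the product of variances."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    cov_xy = np.cov(x, y, ddof=1)[0, 1]  # sample covariance between x and y
    return cov_xy / np.sqrt(np.var(x, ddof=1) * np.var(y, ddof=1))


# Hypothetical plot means: NDVI and grain yield (kg/ha).
ndvi = [0.80, 0.82, 0.85, 0.87, 0.90]
grain_yield = [3100, 3300, 3400, 3700, 3900]
r = pearson_r(ndvi, grain_yield)
print(f"r = {r:.3f}")
```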

The correlations were represented graphically, using the functional relationship between the correlation coefficient estimates of the different environments, as a correlation network generated with the Rbio software [28], in which the proximity between the nodes (traits) was proportional to the absolute value of the correlation between these nodes [29]. The thickness of the edges was controlled by applying a 0.60 cut-off value, so that only correlations with |rij| ≥ 0.60 had their edges highlighted. Positive correlations were highlighted in green, while negative correlations were highlighted in red.
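The edge-selection rule of the correlation network (keep |rij| ≥ 0.60, green for positive, red for negative) can be sketched without any plotting library. This is not the Rbio implementation, only an illustration of the rule, and the correlation values are invented.

```python
def network_edges(corr, names, cutoff=0.60):
    """List the trait pairs whose absolute correlation reaches the cut-off."""
    edges = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            r = corr[i][j]
            if abs(r) >= cutoff:
                color = "green" if r > 0 else "red"
                edges.append((names[i], names[j], r, color))
    return edges


# Hypothetical correlation matrix for four traits.
names = ["YIE", "NDVI", "DM", "NDRE"]
corr = [
    [1.00, 0.72, 0.10, 0.30],
    [0.72, 1.00, -0.20, 0.55],
    [0.10, -0.20, 1.00, -0.65],
    [0.30, 0.55, -0.65, 1.00],
]
edges = network_edges(corr, names)
for a, b, r, color in edges:
    print(f"{a} -- {b}: r = {r:+.2f} ({color})")
```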

The path analysis, considering YIE or DM as the principal dependent variable and the wavelengths and VIs as explanatory variables, was conducted according to the model described in Eqs. 3 and 4:

$$ YIE = \hat{\beta }_{1} 550 + \hat{\beta }_{2} 660 + \cdots + \hat{\beta }_{11} MSAVI + p_{\varepsilon } , $$
$$ DM = \hat{\beta }_{1} 550 + \hat{\beta }_{2} 660 + \cdots + \hat{\beta }_{11} MSAVI + p_{\varepsilon } , $$

where \(\upbeta _{1} , \,\upbeta _{2} , \cdots\upbeta _{11}\) are the direct effects for the variables 550, 660, 735, 790, NDVI, SAVI, GNDVI, NDRE, SCCCI, EVI, and MSAVI; and \({\text{p}}_{\upvarepsilon }\) is the residual effect. All statistical analyses were performed using Genes [30], Sisvar [31], and Rbio software, following the criteria recommended by Cruz et al. [32].
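In the classical formulation of path analysis on standardized variables, the direct effects are obtained by solving the normal equations built from correlations, and each correlation with the dependent variable splits into a direct effect plus indirect effects via the other explanatory variables. The sketch below illustrates this with two explanatory variables and invented correlations. It is not the Genes or Rbio implementation, and the function name is hypothetical.

```python
import numpy as np


def path_analysis(r_xx, r_xy):
    """Direct effects, indirect effects, R^2 and residual effect from correlations."""
    r_xx = np.asarray(r_xx, dtype=float)  # correlations among explanatory variables
    r_xy = np.asarray(r_xy, dtype=float)  # correlations with the dependent variable
    direct = np.linalg.solve(r_xx, r_xy)
    # indirect[i, j] = effect of variable i on y via variable j (r_ij * beta_j).
    indirect = r_xx * direct[np.newaxis, :]
    np.fill_diagonal(indirect, 0.0)
    r_squared = float(direct @ r_xy)
    residual = float(np.sqrt(1.0 - r_squared))
    return direct, indirect, r_squared, residual


# Hypothetical correlations for two explanatory variables.
r_xx = [[1.0, 0.5], [0.5, 1.0]]
r_xy = [0.6, 0.7]
direct, indirect, r_squared, residual = path_analysis(r_xx, r_xy)
print("direct effects:", direct.round(3))
print("R^2 =", round(r_squared, 3), "residual effect =", round(residual, 3))
```

For each explanatory variable, the direct effect plus the row sum of its indirect effects reproduces its correlation with the dependent variable, which is what allows a correlation to be interpreted in terms of direct and indirect contributions.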


Table 3 shows the analyses of variance for the agronomic traits, wavelengths, and vegetation indices evaluated in ten soybean cultivars. There were significant differences (p-value ≤ 0.05) between the genotypes (G) for all analyzed variables. The crop season (S) was not significant for the FPH, GNDVI, and SCCCI. It is important to emphasize that all evaluated variables showed a coefficient of variation (CV) below 20%. The wavelengths and VIs showed the lowest CVs, varying from 1.15% (NDVI) to 7.88% (EVI). The CVs of the agronomic traits varied from 9.33% (DM) to 18.89% (SD). These results reveal a high precision of the measurements, especially for the spectral variables, and the possibility of an accurate association between the spectral variables and cycle and yield-related variables.

Table 3 P-values for first pod height (FPH), plant height (PH), number of branches (NB), main stem diameter (SD), days to maturity (DM), grain yield (YIE), wavelengths (550, 660, 735 and 790 nm) and vegetation indices (NDVI, SAVI, GNDVI, NDRE, SCCCI, EVI and MSAVI), evaluated in 10 soybean genotypes cultivated in the 2018/2019 and 2019/2020 crop seasons
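For reference, the coefficient of variation in an analysis of variance is usually computed as the square root of the error mean square divided by the overall mean, expressed as a percentage. The text does not state the formula, so this definition is assumed, and the numbers below are invented rather than taken from Table 3.

```python
import math


def coefficient_of_variation(mean_squared_error: float, grand_mean: float) -> float:
    """Experimental coefficient of variation, in percent."""
    return 100.0 * math.sqrt(mean_squared_error) / grand_mean


# Hypothetical values for a spectral index with small residual variation.
cv = coefficient_of_variation(mean_squared_error=0.0001, grand_mean=0.85)
print(f"CV = {cv:.2f}%")
```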

Regarding the grouping of the agronomic trait means (Table 4), the genotype 7739 had the highest FPH, NB, and main stem diameter (SD). The genotype 8372 had the highest FPH, NB, and DM, while the genotype Bonus showed the highest PH and SD. The genotype Pop had higher PH, NB, and SD means along with a lower DM. The genotypes Maracai and Foco showed the highest grain yields throughout the crop seasons.

Table 4 Grouping of means ± standard deviation for first pod height (FPH), plant height (PH), number of branches (NB), main stem diameter (SD), days to maturity (DM), and grain yield (YIE), evaluated in 10 soybean genotypes cultivated in the 2018/2019 and 2019/2020 crop seasons

The genotype 7739 presented the highest means for all assessed wavelengths, as shown in Table 5. Other genotypes obtained high means for two of the wavelengths, including 8372 (660 and 735 nm), Bonus (550 and 790 nm), Foco, and Maracai (735 and 790 nm). Table 6 shows the mean groupings of the VIs between the genotypes. It is important to note that the genotypes Maracai and Foco belonged to the groups with the highest means for all VIs. The genotypes 7067, 7110, Bonus, Desafio, Maracai, Foco, and Soyouro obtained the highest means for the GNDVI. For the NDVI, the genotypes 7739, 8372, Bonus, Maracai, and Foco presented the best results.

Table 5 Grouping of means ± standard deviation for wavelengths green (550 nm), red (660 nm), near-infrared (735 nm), and infrared (790 nm), evaluated in 10 soybean genotypes cultivated in the 2018/2019 and 2019/2020 crop seasons
Table 6 Grouping of means ± standard deviation for NDVI, SAVI, GNDVI, NDRE, SCCCI, EVI, and MSAVI, evaluated in 10 soybean genotypes in the 2018/2019 and 2019/2020 crop seasons

The Pearson’s correlation network between the evaluated variables is shown in Fig. 3. The YIE was positively correlated with the NDVI and certain wavelengths (735 and 790 nm). The path analysis considering the DM as the principal dependent variable is shown in Table 7. The GNDVI and NDRE vegetation indices had the highest direct effects (in absolute value), which were also in the same direction as their correlations with the DM. Table 8 shows the direct and indirect effects of the wavelengths and VIs on the YIE. The NDVI had the highest direct effect (in absolute value), which was in the same direction as its correlations with the YIE and MSAVI. The coefficient of determination (R²) for the path analysis considering DM as the principal dependent variable (Table 7) was 0.71, while the R² of the analysis considering YIE as the principal dependent variable (Table 8) was 0.81. Both R² values are considered adequate, revealing that the evaluated variables explained most of the data variation.

Fig. 3 Pearson correlation network between first pod height (FPH), plant height (PH), number of branches (NB), main stem diameter (SD), days to maturity (DM), grain yield (YIE), wavelengths (550, 660, 735 and 790 nm) and vegetation indices (NDVI, SAVI, GNDVI, NDRE, SCCCI, EVI, and MSAVI), evaluated in 10 soybean genotypes in the 2018/2019 and 2019/2020 crop seasons

Table 7 Path analysis for the effects of wavelengths (550, 660, 735 and 790 nm) and vegetation indices (NDVI, SAVI, GNDVI, NDRE, SCCCI, EVI, and MSAVI) on days to maturity (DM), evaluated in 10 soybean genotypes in the 2018/2019 and 2019/2020 crop seasons
Table 8 Path analysis for the effects of wavelengths (550, 660, 735, and 790 nm) and vegetation indices (NDVI, SAVI, GNDVI, NDRE, SCCCI, EVI, and MSAVI) on grain yield (YIE), evaluated in 10 soybean genotypes in the 2018/2019 and 2019/2020 crop seasons

The relationships between the NDVI, NDRE, and GNDVI with the grain yield and days to maturity are shown in Fig. 4. The dashed lines between the VIs show a positive, high-magnitude correlation between the NDRE and GNDVI. The NDVI presents a direct positive effect on the grain yield. As previously mentioned, there is a direct relationship between both variables, so it is possible to estimate the final production of a crop using NDVI data. The NDRE and GNDVI showed direct negative effects on the days to maturity, meaning that the higher the NDRE and GNDVI, the lower the DM of the crop.

Fig. 4 Path diagram for results from Tables 7 and 8 for the dependent variables (grain yield, YIE, and days to maturity, DM) as a function of selected vegetation indices (NDVI, NDRE and GNDVI). Dashed lines indicate Pearson’s correlations between variables, while continuous lines show the direct effects obtained by the previously performed path analysis


The significance of the other variables can be related to the distinct weather conditions between the crop seasons, such as rainfall and temperature. The interaction (G × S) was significant for the first pod height (FPH), plant height (PH), number of branches (NB), days to maturity (DM), grain yield (YIE), 550 nm wavelength, and NDVI.

The variables differ between the genotypes due to their genetic and morphological characteristics and the environmental conditions to which they were subjected. Overall, the plants showed a low reflectivity in the visible spectrum range (400 to 700 nm) due to the influence of chlorophyll, the most abundant pigment in leaves. The chlorophyll presented a high absorption of blue and red wavelengths, while reflecting the green wavelength, which is remarkable, especially in healthy plants [33].

The presence of genotypes Maracai and Foco in the groups with the highest means for all VIs is interesting, as these genotypes were the ones with the highest grain yield means. This finding supports the existence of a high correlation between the YIE and VIs during the reproductive stage of soybean, in which the plant reaches the maximum leaf area index and consequently has a high photosynthetic rate [34]. Another meaningful relationship was observed for Pop and Soyouro, which were the earliest genotypes and showed the highest means for the NDRE and SCCCI indices. This finding is supported by the association between soybean reflectance and phenological crop stage, in which cultivars with short cycles have faster development and higher chlorophyll concentration [35].

For the VIs, high values were only obtained for the NDVI and GNDVI. This result can be explained by the greater sensitivity of these VIs to canopy biomass, since both the GNDVI and NDVI are more sensitive to differences in the plant canopy [36], especially in terms of chlorophyll content and photosynthetic activity [37]. The VIs EVI, SAVI and MSAVI differ from NDVI and GNDVI mainly by using correction factors, for example for areas with a high presence of bare soil [6], while NDRE is more sensitive to differences in late stages of growth, which is one of the possible reasons why the NDVI and GNDVI values are higher.

The positive correlation of the YIE with the NDVI and the 735 and 790 nm wavelengths indicates that the higher the estimates of these wavelengths and the NDVI, the higher the grain yield achieved by the evaluated genotypes. Such results are relevant because, although the grain yield is the most crucial trait in a soybean breeding program, it has low heritability due to the high environmental effect and is laborious to measure [16, 38].

In this sense, including the NDVI and 735/790 nm wavelength measurements as auxiliary variables for selecting soybean genotypes is a promising strategy since they are easier to measure, faster to obtain, require less labor, and provide more accurate results compared to grain yield measurements [7, 39]. The NDVI and 735/790 nm wavelengths can be measured remotely for a large number of selection candidates [16], which can improve the efficiency of breeding programs. In addition, the NDVI was positively correlated with the 735, 790, and 550 nm wavelengths, which in turn showed a positive correlation with the 660 nm wavelength. There was also a strong negative correlation between the DM and both the GNDVI and NDRE.

Although important, Pearson’s correlation coefficients can produce misunderstandings regarding the relationship between two variables, which may not be a true cause-and-effect relationship. A high or low correlation coefficient between two variables may result from the effects of a third variable or group of variables, thus not giving the exact relative importance of the direct and indirect effects of these factors [32]. Therefore, we performed path analysis, which investigates cause-and-effect relationships. This analysis promotes a detailed understanding of the effects of the variables involved and justifies the existence of positive and negative correlations (of a high or low magnitude) between the studied variables [40].

However, to obtain the direct and indirect effects by path analysis, the matrix X′X must be well-conditioned. In the presence of multicollinearity, the variances associated with the path coefficient estimators can become very large, making the estimates unreliable. Furthermore, the parameter estimates can assume values beyond the parametric space [32]. According to the criteria established in Montgomery and Peck [41], the estimated phenotypic correlation matrix has strong multicollinearity since the condition number (CN) was equal to 521 and 223 when considering the YIE and DM as principal dependent variables, respectively. The CN of the phenotypic correlation matrix is calculated as the ratio of its highest eigenvalue to its lowest eigenvalue. When the condition number is less than 100, multicollinearity is weak; between 100 and 1000, multicollinearity is moderate to strong; finally, when greater than 1000, multicollinearity is severe [41]. Thus, a constant k = 0.05 was added to the diagonal of the X′X matrix to correct the multicollinearity in both cases.
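The multicollinearity diagnosis and the correction with k = 0.05 can be illustrated with a small numerical sketch. The thresholds follow the classification quoted above. The correlation values are invented and deliberately near-collinear, and the function names are hypothetical rather than taken from the software used in the study.

```python
import numpy as np


def condition_number(corr):
    """Ratio of the largest to the smallest eigenvalue of a correlation matrix."""
    eigenvalues = np.linalg.eigvalsh(np.asarray(corr, dtype=float))
    return eigenvalues.max() / eigenvalues.min()


def classify_multicollinearity(cn):
    """Classes as quoted in the text: <100 weak, 100-1000 moderate to strong, >1000 severe."""
    if cn < 100:
        return "weak"
    if cn <= 1000:
        return "moderate to strong"
    return "severe"


def ridge_direct_effects(r_xx, r_xy, k=0.05):
    """Path coefficients after adding the constant k to the diagonal of the correlation matrix."""
    r_xx = np.asarray(r_xx, dtype=float)
    r_xy = np.asarray(r_xy, dtype=float)
    return np.linalg.solve(r_xx + k * np.eye(r_xx.shape[0]), r_xy)


# Hypothetical pair of highly correlated explanatory variables.
r_xx = [[1.0, 0.99], [0.99, 1.0]]
r_xy = [0.60, 0.62]
cn = condition_number(r_xx)
plain = ridge_direct_effects(r_xx, r_xy, k=0.0)
ridge = ridge_direct_effects(r_xx, r_xy, k=0.05)
print(f"CN = {cn:.0f} ({classify_multicollinearity(cn)})")
print("direct effects without k:", plain.round(3))
print("direct effects with k = 0.05:", ridge.round(3))
```

In this toy case the uncorrected direct effects have opposite signs and one exceeds 1 in absolute value, whereas the corrected estimates are both positive and within the parametric space, which is the kind of stabilization the constant k is meant to provide.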

These results of path analysis on days to maturity reveal a negative cause-and-effect relationship between the VIs and DM. Thus, the higher the values of these indices, the earlier the soybean genotypes are. This is due to the rapid initial development and higher chlorophyll concentration of these genotypes [35].

For path analysis on grain yield, we found a positive cause-and-effect relationship between NDVI and YIE. Thus, the higher the NDVI values, the higher the yield of the soybean genotypes. Lopresti et al. [14] reported that wheat crop monitoring using grain yield maps (obtained using the NDVI) could predict the grain yield 30 days before harvest. The NDVI allows for the monitoring of the soybean biomass growth, which provides information throughout the sub-periods of the crop cycle, thus establishing production estimates [42].

Soybean breeding programs increasingly seek to develop early soybean genotypes to facilitate the cultivation of crops, such as maize and cotton, during the second harvest. Thus, the DM is a continuously evaluated trait in hundreds of genotypes, but there is a lack of information concerning its relationships with the emitted wavelength and vegetation indices. The results provided by the correlation network demonstrate a negative association between the NDRE, GNDVI, and SCCCI with the DM, indicating that genotypes with higher values for these VIs can be selected for earliness. This is an important finding for soybean breeding, as it reveals the possibility of identifying early genotypes by UAV-based HTP using the VIs mentioned. Whereas traditional phenotyping of the soybean cycle is a time and labor-consuming task, requiring daily field visits to count the number of days to maturity, the use of VIs as a tool for selecting early genotypes can contribute to a significant decrease in the time and effort spent on this step of the program.

The acquisition of large-scale phenotypic data has become one of the major bottlenecks hindering crop breeding [22]. Our study provides relevant information to support management and decision-making in soybean breeding since we demonstrate that it is possible to select genotypes for earliness and yield through an easy and economical high-throughput phenotyping approach. The approach employed here, which combines statistical techniques for studying the relationship between agronomic traits and VIs with the selection of genotypes based on VIs obtained by UAV imagery, can increase the efficiency of current breeding programs by enabling large-scale evaluations with time and labor savings. In this sense, further studies addressing yield and maturity prediction of soybean genotypes based on the vegetation indices studied here are very promising and could be used to improve the efficiency of phenotypic evaluations in breeding programs.


Soybean genotypes with a high grain yield (Maracai and Foco) have higher values for the spectral variables, especially the 735 and 790 nm wavelengths and the NDVI. The NDVI has a cause-and-effect relationship with the grain yield of soybean. Our findings suggest that NDVI can be used for high-throughput phenotyping to select genotypes for high grain yield in soybean breeding programs.

The earliest soybean genotypes have a higher NDRE and GNDVI. Due to the requirement for earlier genotypes, the number of days to maturity has been increasingly evaluated in soybean breeding programs. Given their cause-and-effect relationship with the DM, we recommend that the NDRE and GNDVI vegetation indices be used for high-throughput phenotyping in soybean breeding programs seeking to select earlier genotypes.

Availability of data and materials

The datasets used and/or analysed during the current study are available from the corresponding author on reasonable request.


  1. Conab. Monitoring of the Brazilian crop seasons. 2020;7:1–25.

  2. Li W, Clark B, Taylor JA, Kendall H, Jones G, Li Z, et al. A hybrid modelling approach to understanding adoption of precision agriculture technologies in Chinese cropping systems. Comput Electron Agric. 2020;172: 105305.

  3. Finoto EL, Carrega WC, Sediyama T, Alves de Albuquerque JA, Cecon PR, Reis MS. Efeito da aplicação de fungicida sobre caracteres agronômicos e severidade das doenças de final de ciclo na cultura da soja. Rev Agro@Mbiente on-Line. 2011;5:44.

  4. da Silva Junior CA, Nanni MR, Teodoro PE, Silva GFC. Vegetation indices for discrimination of soybean areas: a new approach. Agron J. 2017;109:1331–43.

  5. Li Z, Taylor J, Frewer L, Zhao C, Yang G, Liu Z, Gaulton R, Wicks D, Mortimer H, Cheng X, Yu C. A comparative review of the state and advancement of site-specific crop management in the UK and China. Front Agric Sci Eng. 2019.

  6. da Silva Junior CA, Teodoro LPR, Teodoro PE, Baio FHR, de Andrea Pantaleão A, Capristo-Silva GF, et al. Simulating multispectral MSI bandsets (Sentinel-2) from hyperspectral observations via spectroradiometer for identifying soybean cultivars. Remote Sens Appl Soc Environ. 2020;19: 100328.

  7. da Silva EE, Rojo Baio FH, Ribeiro Teodoro LP, da Silva Junior CA, Borges RS, Teodoro PE. UAV-multispectral and vegetation indices in soybean grain yield prediction based on in situ observation. Remote Sens Appl Soc Environ. 2020;18: 100318.

  8. Bai G, Ge Y, Hussain W, Baenziger PS, Graef G. A multi-sensor system for high throughput field phenotyping in soybean and wheat breeding. Comput Electron Agric. 2016;128:181–92.

  9. Andrade SM, Teodoro LPR, Baio FHR, Campos CNS, Roque CG, da Silva Júnior CA, et al. High-throughput phenotyping of soybean genotypes under base saturation stress conditions. J Agron Crop Sci. 2021.

  10. Borges MVV, de Oliveira Garcia J, Batista TS, Silva ANM, Baio FHR, da Silva Junior CA, et al. High-throughput phenotyping of two plant-size traits of Eucalyptus species using neural networks. J For Res. 2021.

  11. Santana DC, Cotrim MF, Flores MS, Rojo Baio FH, Shiratsuchi LS, da Silva Junior CA, et al. UAV-based multispectral sensor to measure variations in corn as a function of nitrogen topdressing. Remote Sens Appl Soc Environ. 2021;23: 100534.

  12. Osco LP, Junior JM, Ramos AP, Furuya DE, Santana DC, Teodoro LP, et al. Leaf nitrogen concentration and plant height prediction for maize using UAV-based multispectral imagery and machine learning techniques. Remote Sens. 2020;12(19):3237.

  13. da Silva Junior CA, Teodoro PE, Teodoro LPR, Della-Silva JL, Shiratsuchi LS, Baio FHR, et al. Is it possible to detect boron deficiency in eucalyptus using hyper and multispectral sensors? Infrared Phys Technol. 2021;116: 103810.

  14. Lopresti MF, Di Bella CM, Degioanni AJ. Relationship between MODIS-NDVI data and wheat yield: a case study in Northern Buenos Aires province, Argentina. Inf Process Agric. 2015;2:73–84.

  15. Marques Ramos AP, Prado Osco L, Elis Garcia Furuya D, Nunes Gonçalves W, Cordeiro Santana D, Pereira Ribeiro Teodoro L, et al. A random forest ranking approach to predict yield in maize with UAV-based vegetation spectral indices. Comput Electron Agric. 2020;178: 105791.

  16. Rutkoski J, Poland J, Mondal S, Autrique E, Pérez LG, Crossa J, et al. Canopy temperature and vegetation indices from high-throughput phenotyping improve accuracy of pedigree and genomic selection for grain yield in wheat. G3 Genes Genomes Genet. 2016;6:2799–808.

  17. Taveira A, Pantaleão A, Campos C, Baio F, Teodoro L, Teodoro P. Selection of soybean F3 populations for agronomic and physiological traits and vegetation indices using multivariate approaches. Rev la Fac Ciencias Agrar UNCuyo. 2020;52:22–31.

  18. Franchini JC, Jorge LDC, Correa FB, Greggio F, Debiasi H, Balbinot Junior AA, et al. Estimativa da produtividade da soja pelo uso de imagens aéreas. Embrapa Soja-Artigo em An Congr. In: Congresso Brasileiro de soja, 8, 2018, Goiânia. Inovação, tecnologias…; 2018.

  19. de Almeida AQ, Ribeiro A, Delgado RC, Rody YP, de Oliveira AS, Leite FP. Eucalyptus leaf area index estimated by vegetation indices using Landsat-5 TM images. Floresta e Ambient. 2015;22:368–76.

  20. Maes WH, Steppe K. Perspectives for remote sensing with unmanned aerial vehicles in precision agriculture. Trends Plant Sci. 2019;24:152–64.

  21. Zhao C, Zhang Y, Du J, Guo X, Wen W, Gu S, et al. Crop phenomics: current status and perspectives. Front Plant Sci. 2019;10:714.

  22. Yang W, Feng H, Zhang X, Zhang J, Doonan JH, Batchelor WD, et al. Crop phenomics and high-throughput phenotyping: past decades, current challenges, and future perspectives. Mol Plant. 2020;13:187–214.

  23. Pavan MA, Bloch MDF, Zempulski HDC, Miyazawa M, Zocoler DC. Man análise química solo e control qualidade. Londrina: IAPAR; 1992.

  24. Fehr WR, Caviness CE. Stages of soybean development. Special report. Iowa State Univ. 1977;80:1–12.

  25. Teodoro LPR, Bhering LL, Gomes BEL, Campos CNS, Baio FHR, Gava R, et al. Understanding the combining ability for physiological traits in soybean. PLoS ONE. 2019;14: e0226523.

  26. Padró J-C, Carabassa V, Balagué J, Brotons L, Alcañiz JM, Pons X. Monitoring opencast mine restorations using unmanned aerial system (UAS) imagery. Sci Total Environ. 2019;657:1602–14.

  27. Scott AJ, Knott M. A cluster analysis method for grouping means in the analysis of variance. Biometrics. 1974;30:507–12.

  28. Bhering LL. Rbio: a tool for biometric and statistical analysis using the R platform. Crop Breed Appl Biotechnol. 2017;17:187–90.

  29. Fruchterman TMJ, Reingold EM. Graph drawing by force-directed placement. Softw Pract Exp. 1991;21:1129–64.

  30. Cruz CD. Genes: a software package for analysis in experimental statistics and quantitative genetics. Acta Sci Agron. 2013;35:271–6.

  31. Ferreira DF. Sisvar: a computer statistical analysis system. Ciência e Agrotecnologia. 2011;35:1039–42.

  32. Cruz CD, Regazzi AJ, Carneiro PCS. Modelos biométricos aplicados ao melhoramento. Viçosa: UFV; 2012.

  33. de Barros Santo R, Holzhausen PPP, Rocha JV, Figueiredo GKDA, Lamparelli RAC. Análise dos fatores de reflectância de folhas de cana de açúcar, gerados a partir de dados de espectoradiômetro.

  34. Franchini JC, Balbinot Junior AA, Jorge LADC, Debiasi H, Dias WP, Godoy CV, et al. Use of aerial imagery obtained with drones in soyabean production systems. Doc Soja. Embrapa Centro Nacional de Pesquisa de Soja; 2018.

  35. Gross JA, dos Santos FC, Toniolo GR, Gaida W, Filho WP, Breunig FM. Resposta espectral de dosséis de soja em função de concentrações de clorofilas. Rev Bras Cartogr. 2016;68.

  36. Pinto DG, Fontana DC, Dalmago GA, Vicari MB, Fochessato E. Análise da dinâmica temporal dos índices de vegetação NDVI, GNDVI e EVI obtidos em dossel de canola. Embrapa Trigo-Artigo em An Congr. In: Congresso brasileiro de agrometeorologia, 19, 2015, Lavras, MG. Anais…; 2015.

  37. Gitelson AA, Kaufman YJ, Merzlyak MN. Use of a green channel in remote sensing of global vegetation from EOS-MODIS. Remote Sens Environ. 1996;58:289–98.

  38. Costa MM, Di Mauro AO, Unêda-Trevisoli SH, Arriel NHC, Bárbaro IM, Silveira GD, et al. Heritability estimation in early generations of two-way crosses in soybean. Bragantia. 2008;67:101–8.

  39. Seo B, Lee J, Lee K-D, Hong S, Kang S. Improving remotely-sensed crop monitoring by NDVI-based crop phenology estimators for corn and soybeans in Iowa and Illinois, USA. F Crop Res. 2019;238:113–28.

  40. Teodoro PE, da Silva CA, Corrêa CC, Ribeiro LP, de Oliveira EP, Lima MF, et al. Path analysis and correlation of two genetic classes of maize (Zea mays L.). J Agron. 2014;13:23–8.

  41. Montgomery DC, Peck EA, Geoffrey Vining G. Introduction to linear regression analysis. 3rd ed. New York: Wiley; 2001.

  42. Fontana DC, Pinto DG, Junges AH, Bremm C. Using temporal NDVI/MODIS profiles for inferences on the crop soybean calendar. Bragantia. 2015;74:350–8.

  43. Rouse JW, Haas RH, Schell JA, Deering DW. Monitoring vegetation systems in the Great Plains with ERTS. NASA Spec Publ. 1974;351:309.

  44. Huete A, Justice C, Van Leeuwen W. MODIS vegetation index (MOD13). Algorithm Theor Basis Doc. 1999;3:295–309.

  45. Gitelson AA, Merzlyak MN. Remote estimation of chlorophyll content in higher plant leaves. Int J Remote Sens. 1997;18:2691–7.

  46. Raper TB, Varco JJ. Canopy-scale wavelength and vegetative index sensitivities to cotton growth parameters and nitrogen status. Precis Agric. 2015;16:62–76.

  47. Justice CO, Vermote E, Townshend JRG, Defries R, Roy DP, Hall DK, et al. The moderate resolution imaging spectroradiometer (MODIS): land remote sensing for global change research. IEEE Trans Geosci Remote Sens. 1998;36:1228–49.

  48. Qi J, Chehbouni A, Huete AR, Kerr YH, Sorooshian S. A modified soil adjusted vegetation index. Remote Sens Environ. 1994;48:119–26.


The authors would like to thank the Federal University of Mato Grosso do Sul (UFMS) and the National Council for Scientific and Technological Development (CNPq), Grant Number 303767/2020-0. This study was financed in part by the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior, Brazil (CAPES), Finance Code 001, and by the Fundação de Apoio ao Desenvolvimento do Ensino, Ciência e Tecnologia do Estado de Mato Grosso do Sul (FUNDECT), process number 71/019.039/2021.


Not applicable.

Author information


MPOC and RGS designed the field trials and collected the phenotypic data. LPRT, CASJ, FHRB and PET planned the manuscript, performed all statistical analyses and produced the first draft. DCS and MFC critically reviewed the manuscript. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Paulo Eduardo Teodoro.

Ethics declarations

Ethics approval and consent to participate

Not applicable.

Consent for publication

All authors of the manuscript have read and agreed to its content and are accountable for all aspects of the accuracy and integrity of the manuscript.

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. The Creative Commons Public Domain Dedication waiver applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article

Santana, D.C., de Oliveira Cunha, M.P., dos Santos, R.G. et al. High-throughput phenotyping allows the selection of soybean genotypes for earliness and high grain yield. Plant Methods 18, 13 (2022).
