Improving the sorting efficiency of maize haploid kernels using an NMR-based method with oil content double thresholds

Background Maize haploid breeding technology can be used to rapidly develop homozygous lines, significantly shorten the breeding cycle and improve breeding efficiency. Rapid and accurate sorting haploid kernels is a prerequisite for the large-scale application of this technology. At present, the automatic haploid sorting based on nuclear magnetic resonance (NMR) using a single threshold method has been realized. However, embryo-aborted (EmA) kernels are usually produced during in vivo haploid induction, and both haploids and EmA kernels have lower oil content and are separated together using a single threshold method based on NMR. This leads to a higher haploid false discrimination rate (FDR) and requires secondary manual sorting to select the haploid kernels from the mixtures, which increases the sorting cost and decreases the haploid sorting efficiency. In order to improve the correct discrimination rate (CDR) in sorting haploids, a method to distinguish EmA kernels is required. Results Single kernel weight and oil content were measured for the diploid, haploid, and EmA kernels derived from three maize hybrids and nine inbred lines by in vivo induction. The results showed that the distribution of oil content showed defined boundaries between the three types of kernels, while the single kernel weight didn't. According to the distribution of oil content in the three types of kernels, a double-threshold method was proposed to distinguish the embryo-aborted kernels, haploid and diploid kernels based on NMR and their oil content. The double thresholds were set based on the minimum oil content of diploid kernels and the maximum content of EmA kernels as the upper and lower boundary values, respectively. The CDR of EmA kernels in different maize materials was > 97.8%, and the average FDR was reduced by 27.9 percent. Conclusions The oil content is an appropriate indicator to discriminate diploid, haploid and EmA kernels. An oil content double-threshold method based on NMR was first developed in this study to identify the three types of kernels. This methodology could reduce the FDR of haploids and improve the sorting efficiency of automated sorting system. Thus, this technique represents a potentially efficient method for haploid sorting and provides a reference for the process of automated sorting of haploid kernels with high efficiency using NMR.

use of haploid breeding technology and is also a key point in haploid engineering breeding. Early haploid sorting methods can be performed at various seedling and plant growth stages. At the seedling stage, the traits of radicle length, coleoptile length, and the number of lateral seminal roots of haploid seedlings are significantly different from diploids [2]. In addition, the number of chromosomes, stomata and chloroplasts and the DNA content of haploids are significantly lower than those of diploids [3,4], so flow cytometry and microscopy are usually used to identify haploid and diploid individuals. However, these methods are carried out after seed germination or even at the seedling stage, the processes are complicated, and the efficiency is low. Haploid identification can also be conducted at the plant growth stage based on morphological characteristics; haploid plants have upright and narrow leaves and grow slowly, and plant height and ear position are significantly lower than those of diploid plants [5][6][7]. Nevertheless, the above methods are only feasible in the field or greenhouse; additionally, these methods delay haploid identification and lead to resource waste, so they are generally not used in maize haploid breeding.
In maize haploid breeding, it is ideal to discriminate haploids at the kernel stage. Presently, haploid kernels are mainly sorted by hand based on the dominant anthocyanin synthesis gene R1-nj (Navajo), which imparts a purple color to both the embryo and endosperm in double fertilized (diploid) kernels but only in the endosperm of haploid kernels, allowing haploids to be identified and screened by their colorless embryos [8][9][10]. However, manual sorting is not efficient for haploid kernels with unclear color expression in the embryo and endosperm [11,12], and results in high labor costs and high timeconsumption. In order to improve the efficiency of haploid sorting, the first automatic sorting equipment was developed by means of computer vision. This system was based on the R1-nj color marker expressed in the embryo to identify haploid and diploid kernels, the correct discrimination rate (CDR) of sorting was ~ 80% [13,14]. Nevertheless, haploid kernels cannot be identified by imaging when the embryo side is facing away from the light source. For this reason, the rugged slope module was appended, and a high-speed camera was used to acquire multiple pieces of information from each kernel, which resulted in the CDR of 95% [15], but the haploids could not be sorted in flint corn and tropical germplasms, where the expression of the R1-nj (Navajo) marker is usually inhibited [16][17][18]. Moreover, deep convolutional networks and convolutional neural networks based on R1-nj color, texture and morphology were used to discriminate haploids with an accuracy of more than 94% [19,20], but embryo-aborted (EmA) kernels were not identified. Furthermore, fluorescence tags, as another novel indicator, have been used to detect haploids in immature embryos. Diploid embryos show a fluorescence signal after pollination using a fluorescent inducer with the yellow fluorescent protein (YFP) signal as the male, while haploid embryos do not [21][22][23]. The fluorescence signal of YFP is usually weak in mature embryos, so it is limited in application. The spectral method has also been used for screening of haploid kernels, after obtaining the spectral features by near infrared spectroscopy (NIR) [24][25][26][27], diffuse transmission [28], kernel locality preserving projection [29], and hyperspectral imaging technology [30,31]. Machine learning algorithms or deep belief networks were used to construct a haploid selection model to analyze the NIR spectral data, and the haploid sorting accuracy reached over 90% [32,33]. A quantitative analysis method based on the spectral features of the oil content of kernels was developed to sort haploids, and the haploid sorting accuracy was above 90% [34,35]. The spectral method provides a new concept in haploid sorting; however, repeated modeling will take a long time, and no equipment has yet been developed based on the spectra for the automatic sorting of maize haploid kernels.
The xenia effect is a common phenomenon in hybrid seed formation in maize, and the effect of xenia on the oil content in maize kernels is especially obvious because it can significantly increase the oil content of hybrid seeds when high oil lines are used as pollen donors [36]. China Agricultural University first reported that the oil xenia effect could be an indicator to discriminate haploids from hybrid kernels by their oil content, and the first high oil inducer line (CAUHOI) was successfully developed [37][38][39]. The xenia effect on oil content can lead to a significant difference in oil content between haploid (3.86%) and diploid (5.26%) kernels [36]. According to this principle, an automated haploid sorting system was developed based on nuclear magnetic resonance (NMR), and the CDR reached more than 92% [40,41]. The feasibility of sorting haploid kernels using NMR was further confirmed [42] and shown to be a simple and reliable method to identify haploid kernels from large quantities of hybrid kernels, greatly improving the efficiency of haploid sorting [43]. Recent researches on sorting haploids based on NMR spectrum and manifold learning showed that the CDR reached as high as 98 and 90% for haploids induced by high-oil and conventional inducers, respectively [44]. Although the haploid CDR has been improved to some extent, the EmA kernels have not been sorted with this level of success. At present, oil content analysis is usually employed with a single threshold to discriminate haploid kernels using NMR. However, EmA kernels are produced along with haploid and diploid kernels on the same ears during in vivo haploid induction [45][46][47]. There is a highly positive correlation between the haploid induction rate (HIR) and the embryo abortion rate (EmAR) [47,48]. The haploid and EmA kernels, both of which have lower oil content than diploid kernels, are usually mixed together during the process of haploid sorting by NMR. This leads to a higher haploid false discrimination rate (FDR), and manual secondary sorting is then needed to select the haploid kernels from the mixtures, which increases the sorting cost and decreases the haploid sorting efficiency. Therefore, the proper identification of EmA kernels is an important step in automated sorting systems based on NMR. However, little information is available about sorting EmA kernels. The objectives of this study were to (1) analyze the embryo abortion rate (EmAR) in different parental materials by in vivo haploid induction; (2) measure the single kernel weight and oil content of EmA, haploid, and diploid kernels; and (3) propose and verify the effectiveness of a double-threshold method for sorting haploids to reduce the FDR in maize haploid kernel sorting.

Variations in the embryo abortion rate
EmA kernels appear in different crosses during the process of haploid induction. In the crosses using three hybrids as females and four different inducers as males, the EmARs of the three hybrids were 4.14, 2.20, and 1.47%, respectively. The hybrid G1 (4F1/Zheng58) had the highest EmAR at 4.14% when the M1 (CAUHOI) inducer was used as the male. The three hybrids had the lowest EmARs of 2.31, 1.14, and 0.71% when the M2 (CHOI2) inducer was used as the male. The EmARs of the different inbred lines differed after in vivo induction, and ranged from 4.32-11.93%; the inbred line G6 (Lx9801) had the highest EmAR at 11.93%, while the inbred line G12 (L217) had the lowest EmAR at 4.32%. The average EmAR and HIR were 4.64% and 7.77%, respectively (Table 1). These results indicated that the identification of EmA kernels is important during haploid kernel sorting.

Comparison of single kernel weight and oil content in the different kernel types
The single kernel weight of diploid, haploid, and EmA kernels from the hybrid G1 (4F1/Zheng58) were 0.44 g, 0.45 g, and 0.36 g, respectively, and these were the highest weight of each kernel type in the experiment. The three kernel types from the inbred line G10 (Dan 598) had the lowest weight; the single kernel weight of diploid, haploid, and EmA kernels were 0.26 g, 0.26 g, and 0.21 g, respectively. The single kernel weight of diploid and haploid kernels was similar; the mean single kernel weight for EmA kernels was 0.26 g, which was lower than the mean weight for single diploid and haploid kernels ( Table 2). The results of statistical analyses (Student's t test) showed that the differences in single kernel weight were highly significant between EmA and diploid kernels (t = 8.01, p < 0.01) and between EmA and haploid kernels (t = 7.10, p < 0.01), but there was no significant difference between diploid and haploid kernels (t = 0, p > 0.05). Therefore, haploid sorting cannot be achieved based on single kernel weight (Fig. 1). For the kernel oil content, we found that the oil content of diploid kernels from the hybrid G1 (4F1/Zheng58) and the inbred lines G9 (P138) and G11 (Zheng22) were 6.04, 6.20, and 5.93%, respectively. These were the highest values for diploid kernels, while the oil content of haploid kernels from inbred line G6 (Lx9801) was the lowest at 2.19%, and the average oil contents of all haploid except for G8 and EmA kernels were < 3.5% and 1%, respectively. Student's t-test showed that the difference in oil content was highly significant in comparisons of EmA and diploid kernels (t = 30.88, p < 0.01) and EmA and haploid kernels (t = 15.50, p < 0.01) and between diploid and haploid kernels (t = 16.71, p < 0.01). Therefore, the oil content can be used to identify haploid, diploid, and EmA kernels with a high degree of confidence.

Distribution of the single kernel weight and oil content in the different kernel types
The oil content of diploid, haploid, and EmA kernels showed a clear distribution range in each experimental material (Fig. 2). In all panels, the diploid kernels had the highest oil content, the haploids had the middle oil content, the EmA kernels had the lowest oil content, and the boundaries between the three were clear (Fig. 2). Therefore, it was easy to identify the three different kernel types on the basis of their oil content.
Single kernel weight appeared to show a continuous  . 2), and there was no obvious discontinuous distribution among the three kernel types because there were overlaps between any two types of kernels. Therefore, the oil content is an appropriate indicator to distinguish diploid, haploid, and EmA kernels during in vivo haploid induction using high oil inducers. The minimum oil content of diploid kernels was set as the upper limit, and the maximum oil content of EmA kernels was set as the lower limit; this double threshold for the oil content could be used to discriminate diploid, haploid, and EmA kernels in the NMR automated sorting system. Any kernels with oil contents above the upper limit or below the lower limit would be considered as non-haploids and sorted together, while kernels with oil content between the upper and lower limits would be regarded as haploids.

Analysis of the CDR for EmA kernels
Based on the distribution of the oil content for EmA and haploid kernels (Fig. 2), the oil content of EmA kernels was < 2.0%, while it was > 2.0% for haploid kernels (Fig. 2). Therefore, an oil content of 2.0% was set as the lower threshold to test the sorting of EmA kernels, and the results showed that the average CDR for EmA kernels was 97.8%. Among the 12 maize hybrids and inbred lines used in this study, the lowest CDR for EmA kernels was 78% for G6 (Lx9801), followed by 95% for G5 (Zheng58), while the kernels from the remaining 10 materials were sorted completely with a 100% CDR for EmA kernels (Table 3).

Analysis of CDR H , CRR D and FDR
In order to determine the upper limit of the doublethreshold method, five different oil contents of 3.0, 3.5, 4.0, 4.5, and 5.0% were set to test their effects on the correct discrimination rate of haploids (CDR H ) and the correct rejection rate of diploids (CRR D ), which are two important standards for judging the efficiency of haploid sorting. The CDR H increased with an increasing upper limit value; when the upper limit threshold for the oil content was set at 5.0%, the average CDR H was 93.9%, and nine out of 12 materials reached 100% ( Table 4). The CDR H was lower, only 32.5%, at an oil content threshold of 3.0%, and the CDR was < 50% for most of the materials. However, the trend for CRR D was the opposite of that for CDR H ; at the 3.0% oil content, all hybrids and inbred lines had CRR D values of 100%, while the lowest CRR D at 5.0% oil content was only 82.3%. A higher oil content threshold was better for sorting haploid kernels, and a lower threshold was better for sorting diploid kernels.
The CDR H and CRR D would allow haploid kernels to be sorted correctly, and analysis of the CDR H and CRR D combination showed that the average rate could be > 90% at a threshold oil content of 4.5%. Thus, the upper limit of the double threshold for the different maize materials is between 3.5 and 4.5% (Table 4). For a single threshold, the average FDR was > 50% at five different oil content thresholds, and the FDR was the highest (65.1%) at a 3% oil content threshold. For the double thresholds, the FDR was 44.6% lower than that of the single-threshold method, and the lowest rate was only 0.9% at a 3.0% oil content threshold. Thus, using a double-threshold method can significantly reduce the FDR in contrast to using a single threshold (Table 4).

Comparison between single-and double-threshold methods
The FDR, another important indicator to evaluate haploid sorting efficiency, is defined as the percent of nonhaploid kernels in the haploid kernel group. CDR H and CRR D were the same for both the single-and doublethreshold methods, but the FDR of the double-threshold method was reduced by 27.9% compared to that when using a single-threshold method. The minimum FDR for the hybrid G1 (4F1/Zheng58) and inbred line G9 (P138) was 40% (Table 5) using a single threshold because the haploid and EmA kernels were mixed together, and five of the 12 maize inbred lines and hybrids had FDRs > 50%. Compared with a single threshold, using a doublethreshold can effectively reduce the FDR; kernels from G1 (4F1/Zheng58) and G9 (P138) showed the greatest reduction, with an FDR for both materials of 0. The FDR also declined in the other materials with the use of a double-threshold (Table 5). This indicates that the double-threshold method can effectively reduce the FDR of haploids by correct identification of EmA kernels.

Discussion
In previous studies, the R1-nj color marker was usually used for automated sorting of maize haploid kernels. The principle behind this method is based on a computer vision system to discriminate haploid and diploid kernels by obtaining images of each kernel, and the CDR of haploids was ~ 80% [13,14]. This color marker was also used to discriminate haploid kernels by artificial methods. However, the expression of this color marker is usually suppressed in maize kernels with the C-I gene; thus, it is not feasible to use this marker in some maize inbred Table 3 The CDR EmA of the 12 different maize materials using the double-threshold method    lines or hybrids that carry inhibitors of R1-nj expression. China Agricultural University first reported that the xenia effect could be an indicator to identify haploid kernels [36]; the effects of oil mass, seed weight, and oil content on the sorting of haploid and diploid kernels were analyzed, and oil content was found to be optimal for classification [42]. In the automated NMR system based on oil content, the CDR for haploid kernels was 94% [41]. EmA kernels are a common phenomenon in in vivo haploid induction [21,47,49], and EmA kernels with lower oil contents are usually mixed with haploid kernels when a single threshold for oil content is used in sorting. In the present study, the percent of EmA to haploid kernels in maize hybrids and inbred lines were 53.6% and 87.4%, respectively. This implies that > 50.0% of the sorted kernels will be EmA kernels, requiring a second manual sorting, which increases the cost of sorting. Therefore, the CDR of haploid and EmA kernels from diploid kernels is an important part of automated haploid sorting. To this end, single kernel weight and oil content were measured for EmA, haploid, and diploid kernels during haploid induction. The results showed that a single kernel weight alone is not suitable for identifying the three types of kernels because there is no significant difference between diploid and haploid kernel weights, and there are overlaps between the weights of any two groups of kernels. However, the oil content showed an obvious difference between the diploid (5.5%), haploid (3.1%), and EmA (0.61%) kernels, so it could be used as an indicator. Based on this, a double threshold method based on evaluation of the oil content is first proposed for haploid kernel sorting in the present study; we found that this method reduced the haploid FDR by 27.9%, which improved the accuracy of haploid sorting and advanced haploid breeding technology in practice.

Materials
There is a certain degree of overlap in the oil content between haploid and diploid kernels [41,43], which leads to an increase in the haploid FDR. Therefore, the determination of threshold T1 is important, and previous studies have reported the determination of threshold T1 using the Bayesian classifier or the least squares error method [11,50]. In our study, the average rate for the combination of CRR D and CDR H was > 90% as a standard for setting the upper limit thresholds. Threshold T1 for the different maize materials was 3.5-4.5%. However, EmA kernels could not be screened with a single threshold method. To sort EmA kernels, we had to also set a lower limit threshold, T2. There were obvious distinctions between the range of oil contents in EmA, haploid, and diploid kernels (Fig. 1b), and the oil content of EmA kernels was usually < 2.0% (Figs. 2, 3). The oil contents of diploid and haploid kernels were > 2.0%, which indicated that an oil content of 2% can be set as the lower limit, T2, to screen for EmA kernels. The kernels with oil contents between T1 and T2 were regarded as haploid, and kernels in which the oil content was < T2 and > T1 were regarded as EmA and diploid kernels, respectively, and were sorted together. Therefore, rapid and accurate sorting of haploid kernels can be realized by setting the appropriate oil content thresholds using the double-threshold method.
Initially, an oil content threshold was set to sort haploids using NMR, which neglected the existence of EmA Fig. 3 Distribution of oil contents for haploid, diploid, and EmA kernels. Dashed vertical lines labeled T1 and T2 indicate the upper and lower oil content thresholds for haploid kernels, respectively kernels. Due to the lower oil content of EmA kernels, they were mistakenly recognized as haploid kernels during sorting, which resulted in a higher FDR of haploid discrimination. Although the phenotype of EmA kernels is easy to determine [51,52], secondary sorting is needed, which increases the cost of sorting haploid kernels. Therefore, it was necessary to improve the haploid FDR. At present, there is no information about EmA kernel discrimination based on NMR. Based on the results from the single threshold method, we proposed the doublethreshold method to sort haploid kernels, where in seeds produced by high oil inducers would be divided into two groups: haploid kernels group and a mixture of diploid and EmA kernels. By implementing the double-threshold sorting method, the CDR for EmA kernels (CDR EmA ) reached 97.8%. The effect of sorting haploid kernels using a double-threshold was evaluated at five different oil content levels for the upper limit, and the haploid FDR was reduced by 44.6% compared with that of the single threshold method.
The maize haploid sorting system based on NMR using a double-threshold can save sorting costs in practice. Presently, the sorting system using the single threshold has a sorting speed of 4 s/kernel and can sort more than 20,000 kernels/day (24-h working time/day), it takes only one person less than 1 h to add kernel samples into the sample hopper, and the total cost is ¥ 15 yuan/day. Less than 3000 kernels/day are sorted by hand for each person, which will cost ¥ 120 yuan/day (8 h of working time a day). The automated sorting system can sort the same number of kernels as approximately 7 people sorting by hand, and the cost of manual sorting is ¥ 840 yuan/day. Therefore, the automated sorting system can reduced costs by ¥ 825 yuan per day compared to manual sorting. The double-threshold method can eliminate the step of manual secondary sorting and save ¥ 120 yuan/day compared to the single-threshold method and ¥ 945 yuan/ day compared to manual sorting, which indicates that the double-threshold method can significantly reduce the cost of sorting haploid kernels. Furthermore, another advantage of the double-threshold method is that it can reduce the haploid FDR and reduce the workload required for manual secondary sorting of EmA kernels from haploids. Furthermore, the automated haploid sorting system using NMR was based on the xenia effect of the oil content, which results in a difference in oil contents between diploid and haploid kernels. However, the xenia effect on the oil content depends on the genetic backgrounds of the male and female parents [53]. Lambert et al. used nine hybrids as females and pollinated them with both a high-oil and a normal-oil male to compare the effects of the two pollinators. They found that the oil contents of four normal-oil hybrids were 6.0-7.0% after pollination with a high oil pollinator, which was an increase of 1.7% in oil content in these same hybrids pollinated with the normal-oil male [54]. The oil content of hybrid seeds increases with increasing oil content of the male parent, and the influence of the male parent is greater than that of the female parent [55]. Thus, an effective method would be to select inducers with high seed oil contents and a stronger xenia effect, which will increase the difference in oil content between haploid and diploid seeds and reduce the overlap between them, effectively preventing diploid kernels from mixing with the haploids and further reducing the haploid FDR.

Conclusions
The automated sorting of maize haploid kernels using NMR is a promising way to replace human labor. In order to discriminate between the diploid, haploid, and EmA kernels during haploid induction, single kernel weight and oil content were measured for the three types of kernels, and the results showed that the oil content was the better indicator in the haploid sorting system. Using a double-threshold for the oil content to sort haploid kernels was first proposed in the present study. The average CDR EmA was > 97.8%, and the FDR was reduced by 27.9% with the double-threshold method compared to that with the single-threshold method. An oil content of 2.0% as the lower limit in the double-threshold method was suitable for most donors, and the mean overlap in the oil content between the haploid and diploid kernels can be used as the upper limit as a compromise. Our results verified the feasibility of discriminating between haploid, diploid, and EmA kernels with the double-threshold method, which will improve the haploid kernel sorting efficiency and reduce the cost of haploid identification.

Experimental materials
Three maize inducer inbred lines, CAUHOI, CHOI2, YHI-1, and the inducer hybrid CAUHOI/CHOI2, were used as the male parents in this study. The haploid induction rates (HIRs) of CAUHOI and CHOI2 are ~ 2.0% and ~ 7%, respectively [38,56], and both inducers were developed at China Agricultural University. The maize inducer YHI-1 with a HIR > 10% was developed at Henan Agricultural University [47,57]. Three maize hybrids, 4F1/Zheng58, Mo17/L119A, and Xun9058/Dan598, were used as females in crosses with each inducer, and nine inbred lines (TieC8605-2, Zheng58; Lx9801, K12, Dan598, Zheng22, Yuzi87-1, P138, and L217) were selected for use as females to cross with CHOI2 (Table 6). Crosses were performed (Fig. 4) at the Hainan Experimental Station of Henan Agricultural University (18°21 N, 109°10E) during the winter of 2017. Plants were sown in single rows that were 4 m long with a 0.60 m distance between the rows and 20 plants per row. The female donors were detasseled before flowering. Standard agronomic practices in maize production such as irrigation, fertilization, and weeding were used during the entire growth period.

Identification of the different kernel types
The ears of each cross pollinated with inducer lines were harvested at maturity. Haploid kernels were selected based on the phenotype conferred by the dominant marker gene R1-nj [9]. Kernels displaying purple endosperm and a purple embryo were classified as diploid, kernels with purple endosperm and a colorless embryo were classified as haploid, and kernels with purple endosperm and no embryo were classified as EmA kernels (Figs. 4, 5). SPSS 19.0 software was used for analysis of the phenotypic data.

Kernel sorting system based on a double-threshold for the oil content
The automated haploid sorting system was based on NMR (model No.: Online MR20-015 V) from Shanghai Niumai Technology Co., Ltd. The equipment mainly

HIR(%) =
Haploid kernels Diploid kernels + Haploid kernels × 100% EmAR(%) = EmA Diploid + EmA + Haploid kernels × 100% consisted of eight parts (Fig. 6): (a) sample hopper, which stored the sample grain to be screened; (b) automatic sampling system, which was composed of a vibration plate and propeller; (c) automatic weighing system, which measured the single kernel weight; (d) air compressor, which provided power for automatic sampling and sorting; (e) constant temperature system, which provided a 35℃ temperature environment for the magnet; (f ) a nuclear magnetic resonance (NMR) device, which provided a magnetic field; (g) control system, which controlled the kernel sorting system; and (h) measuring software, which converted the oil component signal into a numerical value, displayed and saved the measured values. The single kernel weight and oil content for diploid, haploid and EmA kernels were measured by the flow chart of the NMR sorting system (Fig. 7). There were three main steps to the operation: 1. Sampling: A vibrating plate was used to transmit kernels, and compressed air pushed them onto the weighing sensor; the weight of the kernel was taken.
2. NMR: Compressed air was used to push the kernel into the sample pipeline and NMR sample holder; NMR measurement was then performed.
3. Sorting: The oil content was determined by a combination of signals for oil based on NMR and the single kernel weight, and the test value of the oil content was compared with the threshold.
Single-threshold: only one threshold (T1) was set. If the test value (t) of the sample was > T1, the kernel was designated as diploid and sorted into the diploid group. If the test value (t) was < T1, the kernel was considered to be haploid and will be sorted into the non-diploid group (Fig. 8a). Double-threshold: The software set two thresholds, T1 and T2, which were the upper and lower limits of the oil content, respectively. If the test value (t) of the sample was greater than the lower limit value and less than the upper limit value, the sample was considered a haploid kernel and was sorted into the haploid group. If the test value (t) was less than the lower limit value or more than the upper limit, the sample was sorted into the non-haploid group (Fig. 8b).

Sorting haploid and EmA kernels based on the double-threshold method
Based on the R1-nj marker and embryo shape, 150 diploid kernels, 30 haploid kernels, and 20 EmA kernels were selected. The sorting efficiency of EmA kernels was tested using the double-threshold method. A total of 200 kernels were measured five times, and the number of sorted EmA kernels was recorded to calculate the CDR for the sorting of EmA kernels (CDR EmA ) by the following formula: Fig. 4 The process of maternal haploid induction by in vivo induction CDR EmA , correct discrimination rate of EmA kernels; N EmA , the number of EmA kernels in the non-haploid group; T EA , the total number of EmA kernels.
Five oil content thresholds, 5.0%, 4.5%, 4.0%, 3.5%, and 3.0%, were set as the upper limits for the double-threshold method to determine the CDR H , CRR D and FDR. The lower limit was set at 2.0%. The number of diploid, CDR EmA (%) = N EmA T EmA × 100% haploid, and EmA kernels was recorded to calculate the haploid CDR, FDR, and CRR D .
CDR H , correct discrimination rate of haploid kernels; N H , the number of haploid kernels in the haploid group; T H , the total number of haploid kernels.