Plant cell walls are a complex mixture of polysaccharides, proteins and the phenolic polymer lignin, and have recently been targeted as possible sources of fermentable sugars for the production of biofuels and other bio-materials. The development of a lignocellulosic biomass-based biofuels industry depends in part on the genetic engineering and breeding of a next generation of crops that have, among other traits, easily extractable cell wall sugars. Thus, a better understanding of how plants synthesize, deposit and modify their cell walls is necessary for the selection of traits important for biofuel crop improvement.
The identification of plants with altered cell wall composition or structure can prove useful in the discovery of novel genes involved in the biosynthesis and modification of the cell wall. Such plants can be isolated by genome-wide association mapping of diverse populations or from forward genetic screens, in which a subset of individuals with the desired traits is selected from a large pool of mutagenized individuals. However, identifying these individuals requires a well-constructed screening process that is both robust and, because of the large sample population, high-throughput. Several successful plant cell wall mutant screens using different screening methodologies have been described over the years. These include acid hydrolysis followed by monosaccharide composition analysis using gas-liquid chromatography, microscopic observation of xylem stem sections [4, 5], seedling growth on medium containing cell wall hydrolyzing enzymes, and Fourier-Transform Infrared (FT-IR) microspectroscopy [7, 8]. Most of these approaches either required at least some sample processing or were not amenable to high-throughput screening, especially when dealing with, in some cases, thousands of mutagenized plant samples. In addition, most of these screens were performed on the model species Arabidopsis thaliana, a dicot, which is known to have a different cell wall type than grasses.
Recently, various infrared spectroscopy techniques such as Fourier Transform Mid-Infrared (FT-MIR) have been used to characterize plant cell wall model compounds and mutants [7, 8, 10–16]. Because of the chemical specificity of this infrared region (400 to 4000 cm⁻¹), one can directly identify certain peaks related to cell wall components. However, the use of FT-MIR in these studies involved careful plant cell wall extraction and/or probing of individual plant cells with an FT-MIR microscopy objective. Though very effective and informative, FT-MIR is not practical as a high-throughput cell wall screening technique for a large population because of the need for meticulous sample handling.
Significantly, another region of the infrared spectrum, the near-infrared (NIR), has shown promise for classifying and characterizing plant material more rapidly. In contrast to MIR, the NIR region (12000 to 4000 cm⁻¹) does not reveal discrete signature peaks; instead, it excites several harmonic overtones of methyl, aromatic CH-OH, with minor features in methoxy and carbonyl CH bonds, generating spectra that have no easily distinguishable chemical features. However, with the help of multivariate analysis to deconvolve the spectrum, FT-NIR has been successfully applied to rapidly quantify and classify numerous known components in complex mixtures [18–20]. In this manner, cell wall components such as carbohydrates, ash content, and lignin have been successfully modeled and cross-validated from a defined plant set of various tissue types [21–27]. To correlate NIR spectra with chemical features and eventually quantify individual components in a mixture, a robust training set containing NIR spectra of samples spanning a range of known concentrations is required. Using Partial Least Squares (PLS) regression, a model can then be developed to determine the concentration of these components in unknown mixtures, within the same range, from NIR spectra alone. Successful applications of FT-NIR techniques for fast chemical characterization involve acquiring accurate sample spectra, applying robust chemometric/multivariate analysis for spectra processing and obtaining reliable calibration sets for modeling. Recently, FT-NIR and linear discriminant analysis (Mahalanobis distance) were used to screen a mutant maize population to identify putative mutants [29, 30]. In that study, approximately 1.8% of the samples were identified as putative mutants and 6 of these (17% validation rate) were confirmed by pyrolysis-molecular beam mass spectrometry. While highlighting the effectiveness of FT-NIR analysis in the discrimination of plant samples, the procedures outlined in these publications [30, 31] were limited in application details, and no chemometric analysis (e.g. PLS modeling) was performed.
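To make the PLS calibration idea concrete, the following is a minimal sketch, not the procedure used in any of the cited studies. It fits a single-response PLS model (NIPALS algorithm) to simulated spectra with known concentrations and then predicts the concentrations of held-out samples within the same range. The function names (`pls1_fit`, `pls1_predict`), the Gaussian band shapes, band positions, noise level and number of latent variables are all assumptions chosen for illustration.

```python
import numpy as np


def pls1_fit(X, y, n_components):
    """Fit a single-response PLS model with NIPALS; return (coef, intercept)."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xk, yk = X - x_mean, y - y_mean
    W, P, q = [], [], []
    for _ in range(n_components):
        w = Xk.T @ yk                  # weight: direction of max covariance with y
        w /= np.linalg.norm(w)
        t = Xk @ w                     # sample scores on this latent variable
        tt = t @ t
        p = Xk.T @ t / tt              # spectral loading
        qk = (yk @ t) / tt             # concentration loading
        Xk = Xk - np.outer(t, p)       # deflate spectra
        yk = yk - qk * t               # deflate concentrations
        W.append(w)
        P.append(p)
        q.append(qk)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    coef = W @ np.linalg.solve(P.T @ W, q)   # regression vector in spectral space
    intercept = y_mean - x_mean @ coef
    return coef, intercept


def pls1_predict(X, coef, intercept):
    """Predict concentrations from spectra alone."""
    return X @ coef + intercept


# Simulated data (assumption): two overlapping-free Gaussian bands on a NIR axis.
rng = np.random.default_rng(0)
wavenumbers = np.linspace(4000, 12000, 300)


def band(center, width=200.0):
    return np.exp(-0.5 * ((wavenumbers - center) / width) ** 2)


analyte, interferent = band(5200), band(7000)


def simulate(n):
    conc = rng.uniform(0.1, 0.9, n)    # known analyte concentrations
    other = rng.uniform(0.1, 0.9, n)   # independently varying second component
    spectra = np.outer(conc, analyte) + np.outer(other, interferent)
    spectra += rng.normal(0.0, 1e-3, spectra.shape)  # instrument noise
    return spectra, conc


X_train, y_train = simulate(40)        # calibration (training) set
X_test, y_test = simulate(10)          # "unknowns" within the same range

coef, intercept = pls1_fit(X_train, y_train, n_components=2)
max_err = np.max(np.abs(pls1_predict(X_test, coef, intercept) - y_test))
print(f"largest absolute prediction error: {max_err:.4f}")
```

In practice the number of latent variables would be chosen by cross-validation on real calibration spectra rather than fixed in advance as it is here.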
The non-destructive, fast and quantitative nature of NIR spectroscopy makes it a very attractive option for screening samples in large plant populations. This study outlines a detailed process for fast scanning of intact plant leaves by NIR spectroscopy followed by an outlier detection scheme combining linear discriminant analysis and PLS modeling. The approach was validated on known cell wall mutants of rice and Arabidopsis and then applied to a rice mutant collection consisting of thousands of uncharacterized samples. The technique first applies nonspecific outlier detection using Mahalanobis distance analysis of NIR spectra, and then develops a predictive model that could be readily implemented for a variety of analyses and applied to any collection of plant mutants or variants. We show that this approach significantly improves outlier detection over the Mahalanobis distance alone and allows the identification of specific cell wall variants in the mutant population.
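The nonspecific outlier detection step can be illustrated with a short sketch. This is an assumed, simplified version and not the exact scheme developed in this study. Because spectra have many more wavenumber points than is practical for a direct covariance inverse, the sketch first projects them onto a few principal components (an assumption) and then computes each sample's Mahalanobis distance from the population mean in that score space. The function name `mahalanobis_outliers`, the simulated bands, the number of components and the distance cutoff are hypothetical.

```python
import numpy as np


def mahalanobis_outliers(spectra, n_components=3, threshold=3.0):
    """Return (distances, flags) using Mahalanobis distance on PCA scores."""
    centered = spectra - spectra.mean(axis=0)
    # Principal axes from the SVD of the mean-centered spectra.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    scores = centered @ vt[:n_components].T
    cov = np.atleast_2d(np.cov(scores, rowvar=False))
    cov_inv = np.linalg.inv(cov)
    # Squared distance d_i^2 = s_i^T C^-1 s_i (scores are already mean-centered).
    d2 = np.einsum("ij,jk,ik->i", scores, cov_inv, scores)
    distances = np.sqrt(d2)
    return distances, distances > threshold


# Simulated population (assumption): 200 typical leaves plus one "mutant"
# whose spectrum carries an extra absorption band.
rng = np.random.default_rng(1)
wavenumbers = np.linspace(4000, 12000, 300)


def band(center, width=200.0):
    return np.exp(-0.5 * ((wavenumbers - center) / width) ** 2)


n_typical = 200
amplitudes = rng.normal(1.0, 0.05, size=(n_typical + 1, 2))
spectra = np.outer(amplitudes[:, 0], band(5200)) + np.outer(amplitudes[:, 1], band(7000))
spectra += rng.normal(0.0, 1e-3, spectra.shape)

mutant_idx = n_typical                  # last row is the simulated variant
spectra[mutant_idx] += 0.5 * band(8500)

distances, flagged = mahalanobis_outliers(spectra)
print("most distant sample:", int(np.argmax(distances)))
print("number of samples flagged:", int(flagged.sum()))
```

A fixed distance cutoff can also flag some typical samples that lie in the tails of the distribution, which is one reason a follow-up predictive model is useful for confirming which outliers reflect specific compositional changes.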