- Open Access
Comparison and extension of three methods for automated registration of multimodal plant images
© The Author(s) 2019
- Received: 15 September 2018
- Accepted: 17 April 2019
- Published: 29 April 2019
With the introduction of high-throughput multisensory imaging platforms, the automatization of multimodal image analysis has become the focus of quantitative plant research. Due to a number of natural and technical reasons (e.g., inhomogeneous scene illumination, shadows, and reflections), unsupervised identification of relevant plant structures (i.e., image segmentation) represents a nontrivial task that often requires extensive human-machine interaction. Registration of multimodal plant images enables the automatized segmentation of 'difficult' image modalities such as visible light or near-infrared images using the segmentation results of image modalities that exhibit higher contrast between plant and background regions (such as fluorescent images). Furthermore, registration of different image modalities is essential for assessment of a consistent multiparametric plant phenotype, where, for example, chlorophyll and water content as well as disease- and/or stress-related pigmentation can simultaneously be studied at a local scale. To automatically register thousands of images, efficient algorithmic solutions for the unsupervised alignment of two structurally similar but, in general, nonidentical images are required. For establishment of image correspondences, different algorithmic approaches based on different image features have been proposed. The particularity of plant image analysis lies, however, in the large variability of shapes and colors of different plants measured at different developmental stages from different views. While adult plant shoots typically have a unique structure, young shoots may have a nonspecific shape that can often hardly be distinguished from background structures. Consequently, it is not clear a priori what image features and registration techniques are suitable for the alignment of various multimodal plant images.
Furthermore, dynamically measured plants may exhibit nonuniform movements that require application of nonrigid registration techniques. Here, we investigate three common techniques for registration of visible light and fluorescence images that rely on finding correspondences between (i) feature-points, (ii) frequency domain features, and (iii) image intensity information. The performance of the registration methods is validated in terms of robustness and accuracy, measured by a direct comparison with manually segmented images of different plants. Our experimental results show that all three techniques are sensitive to structural image distortions and require additional preprocessing steps, including structural enhancement and characteristic scale selection. To overcome the limitations of conventional approaches, we develop an iterative algorithmic scheme that enables both rigid and slightly nonrigid registration of high-throughput plant images in a fully automated manner.
- Multimodal image registration
- Feature-point matching
- Phase correlation
- Mutual information
- Scale space
- High-throughput plant phenotyping
In the last decade, multisensory camera systems have become indispensable tools for the high-throughput screening of quantitative plant traits upon perturbation of environmental and/or molecular-genetic factors. Multimodal screening facilities enable plant scientists to generate large quantities of image data including visible light (VIS), fluorescence (FLU), near-infrared (NIR) and 3D images that are typically analyzed separately from each other. Some image modalities such as visible light or near-infrared images exhibit low contrast between plant and background image regions, which complicates automated detection of plant structures (i.e., image segmentation). Limited efficiency of existing manual and semi-automated approaches to image segmentation has been identified as the major bottleneck of quantitative plant phenotyping pipelines. Combining low-contrast image modalities with high-contrast ones (e.g., fluorescence images) by means of multimodal image registration can help to overcome the limitations of unimodal image processing. Once aligned, the binary mask of a segmented FLU image can be applied for extraction of plant regions in optically more heterogeneous VIS images. Consequently, multimodal image registration is an important tool for the automatization of plant image analysis and quantitative trait derivation from high-throughput phenotyping data.
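The mask-transfer idea described above can be sketched in a few lines. The following illustrative Python/NumPy snippet (not the paper's MATLAB pipeline; function name and toy data are ours) shows how a FLU-derived binary mask, once registered onto a VIS image, extracts the plant pixels:

```python
import numpy as np

def apply_flu_mask(vis_img, flu_mask):
    """Extract plant pixels from a VIS image using a registered binary FLU mask.

    vis_img:  H x W x 3 array (visible-light image)
    flu_mask: H x W boolean array (plant = True), already aligned to vis_img
    """
    out = np.zeros_like(vis_img)
    out[flu_mask] = vis_img[flu_mask]  # keep plant pixels, zero the background
    return out

# toy example: 4x4 "VIS" image with a 2x2 plant region marked by the mask
vis = np.full((4, 4, 3), 120, dtype=np.uint8)
mask = np.zeros((4, 4), dtype=bool)
mask[1:3, 1:3] = True
plant_only = apply_flu_mask(vis, mask)
```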
Multimodal image alignment begins with the establishment of mutual correspondences between two structurally similar but nonidentical images. Due to the large variability in optical appearance of different plants, as well as of the same plant in different image modalities, it is not evident what kind of image features and registration algorithms can be universally applied for the alignment of different multimodal plant images.
Differences in spatial camera resolution, position and orientation can, in general, be modeled by a combination of scaling, translations, and rotations. A plethora of methods for image registration has been developed in the past, particularly in the context of biomedical image analysis [2–6]. Depending on the type of image features or intrinsic algorithmic principles, different categorizations of registration techniques have been suggested in the literature. Here, we rely on the algorithm-focused classification of registration methods into three major groups: (i) feature-point, (ii) frequency domain and (iii) intensity-based techniques.
Methods based on the matching of feature-points (FPs) are applied when corresponding image regions exhibit local structural similarity. Pairwise correspondences between two sets of feature-points are then used for calculation of geometrical transformations. Common approaches for the detection of feature-points are based on edges and corners (e.g., FAST, Shi and Tomasi, Harris operators, SUSAN), blob detection (e.g., MSER, DoG, DoH), structure tensors, and generalized feature descriptions (e.g., SURF, HOG, SIFT). The main limitation of FP methods is the difficulty of finding a sufficient number of corresponding points in similar but nonidentical images of different modalities.
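To illustrate the FP principle, here is a minimal NumPy sketch of descriptor matching with a ratio test, followed by least-squares estimation of a similarity transform from the matched points. The detector/descriptor stage (SURF, SIFT, etc.) is abstracted away, and all names and parameters are illustrative, not the paper's implementation:

```python
import numpy as np

def match_descriptors(desc_a, desc_b, ratio=0.8):
    """Nearest-neighbor matching with a ratio test.
    desc_a: N x D, desc_b: M x D descriptor arrays. Returns index pairs (i, j)."""
    pairs = []
    for i, d in enumerate(desc_a):
        dist = np.linalg.norm(desc_b - d, axis=1)
        j, k = np.argsort(dist)[:2]
        if dist[j] < ratio * dist[k]:  # best match clearly better than 2nd best
            pairs.append((i, j))
    return pairs

def estimate_similarity(src, dst):
    """Least-squares similarity transform (scale + rotation + translation).
    Solves dst ~= [[a, -b], [b, a]] @ src + (tx, ty) from matched point pairs."""
    A, y = [], []
    for (x1, y1), (x2, y2) in zip(src, dst):
        A.append([x1, -y1, 1, 0]); y.append(x2)
        A.append([y1,  x1, 0, 1]); y.append(y2)
    a, b, tx, ty = np.linalg.lstsq(np.array(A, float),
                                   np.array(y, float), rcond=None)[0]
    return a, b, tx, ty
```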
Another prominent approach to image alignment relies on finding correspondences in the frequency domain. For example, Fourier- or Fourier-Mellin phase correlation (PC) techniques make use of the Fourier-shift theorem, which reformulates the problem of finding a shift in Cartesian or polar coordinates as a phase shift between Fourier transforms [15–17]. A closer analysis of PC methods shows that they essentially correlate all image structures that contribute to the synchronization of Fourier phases, such as edges and corners. Previous works reported that PC is surprisingly robust with respect to statistical structural image noise [19–21]. This remarkable feature of PC originates from the insensitivity of inverse Fourier integrals to distortions of just a few spectral bands, such as high- or low-frequency noise. However, PC is also known to be less accurate in the presence of multiple structurally similar patterns or considerable structural dissimilarities such as nonrigid image transformations. The necessity of additional preprocessing steps, including image filtering and scaling, for improved performance of multimodal image registration using PC was repeatedly reported in the previous literature [23, 24]. Downscaling to a proper size appears to improve the robustness and accuracy of image registration by suppressing modality-specific high-frequency noise, which effectively enhances image similarity.
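The Fourier-shift idea behind PC can be sketched directly with NumPy for the pure-translation case (the Fourier-Mellin extension handling rotation and scale is omitted; this is an illustrative sketch, not the toolbox implementation):

```python
import numpy as np

def phase_correlation(img_a, img_b):
    """Translation estimate via Fourier phase correlation.

    Returns (dy, dx) such that img_a ~= np.roll(img_b, (dy, dx), axis=(0, 1)),
    together with the normalized correlation peak height H in [0, 1].
    """
    cross = np.fft.fft2(img_a) * np.conj(np.fft.fft2(img_b))
    cross /= np.abs(cross) + 1e-12          # keep phase only (the shift lives here)
    corr = np.real(np.fft.ifft2(cross))     # delta-like peak at the shift
    peak = np.unravel_index(np.argmax(corr), corr.shape)
    # peaks past the half-size wrap around to negative shifts
    shift = tuple(p if p <= s // 2 else p - s for p, s in zip(peak, corr.shape))
    return shift, float(corr[peak])
```

For two identical images related by a pure shift, the peak height is close to 1; structural dissimilarity between modalities flattens the peak.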
Alternatively to landmarks and frequency domain features, intensity-based methods rely on maximization of global image similarity measures such as the normalized cross-correlation (NCC) [26, 27] or the mutual information (MI) [28–32]. As a dimensionless quantity characterizing structural image similarity, the mutual information has the considerable advantage of being independent of differences between image intensity functions and histograms. This property makes MI-based registration particularly suitable for the alignment of images that exhibit partial structural similarity but different intensity levels.
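The MI similarity measure itself is straightforward to compute from a joint intensity histogram. The following NumPy sketch (illustrative; not the internals of the MATLAB registration functions used in this study) shows why MI rewards structural correspondence regardless of absolute intensity levels:

```python
import numpy as np

def mutual_information(img_a, img_b, bins=32):
    """Mutual information between two images from their joint intensity histogram.

    MI = sum_ab p(a,b) * log( p(a,b) / (p(a) p(b)) ),
    a dimensionless score that is high when the intensities of one image are
    predictable from the other, regardless of the actual gray levels involved.
    """
    joint, _, _ = np.histogram2d(img_a.ravel(), img_b.ravel(), bins=bins)
    pab = joint / joint.sum()                 # joint probability p(a, b)
    pa = pab.sum(axis=1, keepdims=True)       # marginal p(a)
    pb = pab.sum(axis=0, keepdims=True)       # marginal p(b)
    nz = pab > 0                              # avoid log(0)
    return float(np.sum(pab[nz] * np.log(pab[nz] / (pa @ pb)[nz])))
```

An image compared with itself yields maximal MI (its entropy), while comparison with structurally unrelated content drives MI toward zero, which is the quantity an intensity-based optimizer maximizes over candidate transformations.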
The above registration techniques were previously applied for alignment of medical, microscopic and aerial images. Applications of image registration in the context of multimodal plant image analysis are, however, relatively scarce [34–36]. Structural differences between images of different modalities, the presence of nonuniform image motion and blurring make alignment of multimodal plant images a challenging task. Here, we investigate the performance of three registration methods by a direct comparison with manually segmented FLU and VIS images of different plants. The developed algorithmic scheme is, however, not limited to FLU/VIS images and can principally be applied to coregistration of other modalities (e.g., near-infrared, 3D projection images) as well. Our experimental results reveal the limitations of conventional approaches when applied straightforwardly to the registration of FLU/VIS plant images. Extensions of conventional algorithmic schemes are presented that improve the robustness and accuracy of image registration in the automated processing of large quantities of image data in the context of high-throughput plant phenotyping.
Image acquisition and preprocessing
An overview of image data used in this study, including three different experiments with three different species, each imaged in visible light and fluorescence with three different LemnaTec high-throughput phenotyping facilities for large, intermediate-size and small plants at the IPK Gatersleben
[Table: number of FLU/VIS image pairs and image resolutions per experiment; resolutions include 2056 × 2454, 1234 × 1624 and 1038 × 1390 pixels]
Image registration using built-in and extended MATLAB functions
For feature-point matching, several different edge-, corner- and blob-detectors were used. In addition to built-in MATLAB functions that rely on one particular feature detector, an integrative multifeature generator was introduced that merges the results of different feature-point detectors.
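The multifeature generator idea (pooling the outputs of several detectors into one point set) can be sketched as follows. The detector calls themselves are omitted, and the deduplication radius `min_dist` is an assumed illustrative parameter, not a value from the paper:

```python
import numpy as np

def merge_feature_points(point_sets, min_dist=3.0):
    """Merge keypoint sets produced by several detectors into one set,
    dropping near-duplicate points detected by more than one method.

    point_sets: list of (N_i x 2) arrays of (x, y) keypoint positions.
    """
    merged = []
    for pts in point_sets:
        for p in np.asarray(pts, dtype=float):
            # keep a point only if no already-kept point lies within min_dist
            if all(np.linalg.norm(p - q) >= min_dist for q in merged):
                merged.append(p)
    return np.array(merged)
```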
Alternative image registration techniques based on frequency domain features rely on the MATLAB imregcorr function, which performs Fourier-Mellin phase correlation of the corresponding spectral image transforms. For assessment of image transformation reliability, a fixed threshold on the maximum PC peak height (i.e., \(H>0.03\)) was used, as suggested in the literature. Transformations obtained with \(H<0.03\) typically indicate a failure of PC registration, for example, due to low or missing structural similarity between the two images.
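That reliability check reduces to comparing the normalized PC peak height against the fixed threshold. A pure-NumPy sketch of the rule (here `pc_peak_height` is an illustrative stand-in for the peak computed internally by imregcorr):

```python
import numpy as np

PC_PEAK_THRESHOLD = 0.03   # H threshold from the text; H < 0.03 => reject

def pc_peak_height(img_a, img_b):
    """Normalized phase-correlation peak height H (close to 1.0 for a perfect
    translational match, near 0 for structurally unrelated images)."""
    cross = np.fft.fft2(img_a) * np.conj(np.fft.fft2(img_b))
    corr = np.real(np.fft.ifft2(cross / (np.abs(cross) + 1e-12)))
    return float(corr.max())

def registration_is_reliable(img_a, img_b, threshold=PC_PEAK_THRESHOLD):
    """Accept a PC transformation only if the peak height exceeds the threshold."""
    return pc_peak_height(img_a, img_b) > threshold
```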
Overview of the three groups of image alignment methods, namely feature-point (FP) matching, phase correlation (PC) and intensity-based mutual information (INT) methods, the corresponding image features, and the MATLAB functions used for calculation of pairwise image correspondences
Evaluation of image registration
To evaluate the results of image registration, two criteria for characterizing the robustness and accuracy of image alignment are used.
Success rate of image registration
Accuracy of image registration
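The paper's exact definitions of the two criteria are not reproduced in this excerpt. As an illustration only, a common way to score alignment accuracy against a manually segmented reference is the Dice overlap of the binary plant masks, with the success rate taken as the percentage of image pairs exceeding an accuracy threshold; both the metric choice and the threshold below are our assumptions:

```python
import numpy as np

def dice(mask_a, mask_b):
    """Dice overlap of two binary masks: 2|A n B| / (|A| + |B|), in [0, 1]."""
    inter = np.logical_and(mask_a, mask_b).sum()
    total = mask_a.sum() + mask_b.sum()
    return 2.0 * inter / total if total else 1.0

def success_rate(accuracies, threshold=0.7):
    """Percentage of image pairs whose alignment accuracy exceeds a threshold
    (threshold value is an illustrative assumption)."""
    acc = np.asarray(accuracies, dtype=float)
    return 100.0 * float((acc > threshold).mean())
```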
Success rates and accuracy ratios of the successful alignment of originally sized Arabidopsis, wheat, and maize FLU/VIS images using FP/PC/INT registration techniques
Figure 7 shows success and accuracy statistics of image registration by combined application of all three methods (FP, PC, and INT) and both image representations, i.e., grayscale (GS) and color-edge (CE) images. From this diagram, it is evident that the majority of FLU/VIS image pairs can be successfully registered with more than one method and preprocessing condition. However, there are also some cases where only a few or even only one particular method is capable of successfully performing FLU/VIS image alignment. Again, background filtering in manually segmented images significantly improves success rates by combined application of different registration techniques; see Fig. 7a, b. To quantify the advantage of combined image registration, the maximum accuracy among all six techniques (i.e., FP-CE, FP-GS, PC-CE, PC-GS, INT-CE, and INT-GS) is calculated. From Fig. 7c, it is clearly visible that some plants (e.g., Arabidopsis) can generally be registered more accurately by one single registration step than others, and background elimination decisively improves the accuracy performance of FLU/VIS registration.
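Selecting the most accurate of the six technique/representation combinations for a given image pair amounts to a simple argmax over the per-pair accuracy scores; the numbers below are invented purely for illustration:

```python
# hypothetical accuracy scores of the six method/representation combinations
# for one FLU/VIS image pair (values invented for illustration)
results = {
    "FP-CE": 0.81, "FP-GS": 0.74, "PC-CE": 0.88,
    "PC-GS": 0.66, "INT-CE": 0.79, "INT-GS": 0.83,
}
best_method = max(results, key=results.get)  # combination with highest accuracy
best_accuracy = results[best_method]
```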
Depending on image preprocessing, registration algorithms may calculate quite different image transformations. Figure 9 shows component distributions of the transformation matrix that were assessed with different registration techniques and preprocessing conditions (i.e., scaling factors, background filtering). As one can see, the values of scaling, rotation and translation undergo considerable variations that correspond to both optimal and suboptimal FLU/VIS image alignments, such as those shown in Fig. 8. At first glance, registration dependency on structural image content and preprocessing appears to be disadvantageous. However, it turns out to be a very helpful feature. Here, we exploit the variability of geometrical transformations resulting from optimal and suboptimal image registration to construct an integrated registration mask that allows for a piecewise approximation of nonuniformly moving plant regions that otherwise could not be completely covered by a single-step registration; see Fig. 8e.
Multimodal image registration opens new possibilities for the automatization of image segmentation and analysis in high-throughput plant phenotyping. Using image registration, the result of a straightforward FLU image segmentation can, for example, be applied to automatically detect plant regions in optically more heterogeneous visible light images. Furthermore, the spatial alignment of different image modalities paves the way for consistent assessment of a multiparametric plant phenotype including information on local chlorophyll/water content and disease-/stress-related pigmentation. Our experimental results using three common registration techniques (FP, PC, and INT) show that the robustness and accuracy of FLU/VIS image alignment undergo substantial variations depending on the plant species, interplay between the background and plant intensities, and image preprocessing conditions. In general, background filtering, structural enhancement and downscaling significantly improve the performance of FLU/VIS image registration. However, none of the methods and preprocessing conditions offers universal advantages that guarantee optimal results of single-step registration by application to arbitrary image data. On the basis of insights gained in this study, we conclude that a combination of different registration techniques, scaling levels and image representations (i.e., grayscale and color-edge) enables significantly more robust and accurate results to be obtained when compared to single-step image alignment using one particular method and/or one particular image preprocessing filter. We began this study with the assumption of global rigid image transformations. However, it turned out that FLU/VIS images may exhibit nonuniform motion due to uncorrelated inertial movements of tillers and leaves after relocation or rotation of plant carriers during stepwise image acquisition. 
Integration of multiple registration results obtained for different preprocessing conditions into one single integrated mask allows this problem to be overcome by constructing a piecewise approximation of nonuniform image motion, which otherwise would require the application of significantly more expensive nonrigid registration.
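The integration step itself is not spelled out in this excerpt; at its core, combining the plant masks produced by several registration runs into one integrated mask can be sketched as a pixelwise union (illustrative NumPy, assuming each run yields a transformed binary plant mask):

```python
import numpy as np

def integrated_mask(registered_masks):
    """Combine plant masks from several registration runs (different methods
    and/or preprocessing conditions) into one integrated mask by pixelwise
    union, giving a piecewise cover of nonuniformly moving plant parts."""
    out = np.zeros_like(registered_masks[0], dtype=bool)
    for m in registered_masks:
        out |= m.astype(bool)
    return out
```

Each single-step registration may align only part of a nonuniformly moving plant; their union approximates the full plant region without resorting to nonrigid registration.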
The basic approach to automated alignment of plant images using a combination of feature detectors and preprocessing conditions presented in this work was evaluated with fluorescence and visible light images, but the results can principally be applied to coregistration of other image modalities, e.g., near-infrared images.
MH and EG conceived, designed and performed the computational experiments, analyzed the data, wrote the paper, prepared the figures and tables, and reviewed drafts of the paper. AJ and KN executed the laboratory experiments, acquired image data, co-wrote the paper, and reviewed drafts of the paper. TA co-conceptualized the project and reviewed drafts of the paper. All authors read and approved the final manuscript.
We would like to thank Mohammad-Reza Hajirezaei from the Molecular Plant Nutrition Group of IPK Gatersleben for kindly providing the image data of the Arabidopsis growth experiment.
The authors declare that they have no competing interests.
This work was performed within the German Plant-Phenotyping Network (DPPN), which is funded by the German Federal Ministry of Education and Research (BMBF) (Project Identification Number: 031A053). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
- Minervini M, Scharr H, Tsaftaris SA. Image analysis: the new bottleneck in plant phenotyping. IEEE Signal Proc Mag. 2015;32:126–31.
- Zitova B, Flusser J. Image registration methods: a survey. Image Vis Comput. 2003;21:977–1000.
- Xiong Z, Zhang J. A critical review of image registration methods. Intl J Image Data Fusion. 2010;1:137–58.
- Phogat RS, Dhamecha H, Pandya M, Chaudhary B, Potdar M. Different image registration methods—an overview. Int J Sci Eng Res. 2014;5:44–9.
- Lahat D, Adali T, Jutten C. Multimodal data fusion: an overview of methods. Proc IEEE. 2015;103:1449–77.
- Goshtasby AA. Theory and applications of image registration. Hoboken: Wiley; 2017.
- Rosten E, Drummond T. Machine learning for high-speed corner detection. In: 9th European conference on computer vision, 2006; vol. 1, pp. 430–443.
- Shi J, Tomasi C. Good features to track. In: Proceedings of the 9th IEEE conference on computer vision and pattern recognition, 1994; pp. 593–600.
- Harris C, Stephens M. A combined corner and edge detector. In: Proceedings of the 4th Alvey vision conference, 1988; pp. 147–151.
- Smith SM, Brady JM. SUSAN—a new approach to low level image processing. Int J Comput Vis. 1997;23(1):45–78.
- Matas J, Chum O, Urban M, Pajdla T. Robust wide baseline stereo from maximally stable extremal regions. In: Proceedings of British machine vision conference, 2002; pp. 384–396.
- Bay H, Ess A, Tuytelaars T, van Gool L. SURF: speeded up robust features. Comput Vis Image Underst. 2008;110(3):346–59.
- Lowe DG. Distinctive image features from scale-invariant keypoints. Int J Comput Vis. 2004;60(2):91–110.
- Cruz M, Aguilera C, Vintimilla B, Toledo R, Sappa A. Cross-spectral image registration and fusion: an evaluation study. In: 2nd international conference on machine vision and machine learning, 2015; pp. 331–15.
- Kuglin CD, Hines DC. The phase correlation image alignment method. In: Proceedings of international conference on cybernetics and society, 1975; vol. 1, pp. 163–5.
- Reddy BS, Chatterji BN. An FFT-based technique for translation, rotation, and scale-invariant image registration. IEEE Trans Image Process. 1996;5:1266–71.
- Wolberg G, Zokai S. Robust image registration using log-polar transform. In: Proceedings of IEEE international conference on image processing, 2000; vol. 1, pp. 493–6.
- Kovesi P. Phase congruency detects corners and edges. In: Proceedings of the Australian pattern recognition society conference: digital image computing: techniques and applications (DICTA 2003), 2003.
- Stone HS, Orchard MT, Chang EC, Martucci SA. A fast direct Fourier-based algorithm for subpixel registration of images. IEEE Trans Geosci Remote Sens. 2001;39:2235–43.
- Foroosh H, Zerubia JB, Berthod M. Extension of phase correlation to subpixel registration. IEEE Trans Image Process. 2002;11:188–200.
- Argyriou V, Vlachos T. A study of sub-pixel motion estimation using phase correlation. In: Proceedings of British machine vision conference, 2006; pp. 387–396.
- Gladilin E, Eils R. On the role of spatial phase and phase correlation in vision, illusion, and cognition. Front Comput Neurosci. 2015;9:45.
- Wisetphanichkij S, Dejhan K. Fast Fourier transform technique and affine transform estimation-based high precision image registration method. GESTS Intl Trans Comp Sci Eng. 2005;20:179.
- Almonacid-Caballer J, Pardo-Pascual JE, Ruiz LA. Evaluating Fourier cross-correlation sub-pixel registration in Landsat images. Remote Sens. 2017;9:1051.
- Wang J, Xu Z, Zhang J. Image registration with hyperspectral data based on Fourier–Mellin transform. Int J Signal Process Syst. 2013;1:107–10.
- Pratt WK. Digital image processing. 2nd ed. New York: Wiley; 1991.
- Berthilsson R. Affine correlation. In: Proceedings of the international conference on pattern recognition ICPR'98, Brisbane, Australia, 1998; pp. 1458–1461.
- Viola P, Wells WM III. Alignment by maximization of mutual information. Int J Comput Vis. 1997;24:137–54.
- Maes F, Collignon A, Vandermeulen D, Marchal G, Suetens P. Multimodality image registration by maximization of mutual information. IEEE Trans Med Imaging. 1997;16:187–98.
- Mattes D, Haynor DR, Vesselle H, Lewellen T, Eubank W. Non-rigid multimodality image registration. In: SPIE medical imaging 2001: image processing, vol. 4322; 2001, pp. 1609–1620.
- Smriti R, Stredney D, Schmalbrock P, Clymer BD. Image registration using rigid registration and maximization of mutual information. In: The 13th annual medicine meets virtual reality conference, 2005; pp. 26–29.
- Barrera F, Lumbreras F, Sappa AD. Multimodal template matching based on gradient and mutual information using scale space. In: IEEE international conference on image processing, 2010; pp. 2749–2752.
- Deshmukh MP, Bhosle U. A survey of image registration. Int J Image Process. 2011;5:245–69.
- De Vylder J, Douterloigne K, Prince G, Van Der Straeten D, Philips W. A non-rigid registration method for multispectral imaging of plants. In: Proceedings of SPIE sensing for agriculture and food quality and safety IV, 2012; vol. 8369, p. 6.
- Chéné Y, Rousseau D, Lucidarme P, Bertheloot J, Caffier V, Morel P, Belin E, Chapeau-Blondeau F. On the use of depth camera for 3D phenotyping of entire plants. Comput Electron Agric. 2012;82:122–7.
- Raza S, Sanchez V, Prince G, Clarkson JP, Rajpoot NM. Registration of thermal and visible light images of diseased plants using silhouette extraction in the wavelet domain. Pattern Recognit. 2015;48:2119–28.
- Henriques JF. COLOREDGES: edges of a color image by the max gradient method. https://de.mathworks.com/matlabcentral/fileexchange/28114, 2010.
- Leutenegger S, Chli M, Siegwart RY. BRISK: binary robust invariant scalable keypoints. In: International conference on computer vision, 2011; pp. 2548–2555.
- Alcantarilla PF, Bartoli A, Davison AJ. KAZE features. In: Computer vision – ECCV 2012: 12th European conference on computer vision, Florence, Italy, October 7–13, 2012, proceedings, part VI. Berlin, Heidelberg: Springer; 2012, pp. 214–227.