Digital imaging of root traits (DIRT): a high-throughput computing and collaboration platform for field-based root phenomics
© Das et al. 2015
Received: 25 August 2015
Accepted: 11 October 2015
Published: 2 November 2015
Plant root systems are key drivers of plant function and yield. They are also under-explored targets to meet global food and energy demands. Many new technologies have been developed to characterize crop root system architecture (CRSA). These technologies have the potential to accelerate the progress in understanding the genetic control and environmental response of CRSA. Putting this potential into practice requires new methods and algorithms to analyze CRSA in digital images. Most prior approaches have focused solely on the estimation of root traits from images, yet no integrated platform exists that allows easy and intuitive access to trait extraction and analysis methods from images combined with storage solutions linked to metadata. Automated high-throughput phenotyping methods are increasingly used in laboratory-based efforts to link plant genotype with phenotype, whereas similar field-based studies remain predominantly manual and low-throughput.
Here, we present an open-source phenomics platform “DIRT”, as a means to integrate scalable supercomputing architectures into field experiments and analysis pipelines. DIRT is an online platform that enables researchers to store images of plant roots, measure dicot and monocot root traits under field conditions, and share data and results within collaborative teams and the broader community. The DIRT platform seamlessly connects end-users with large-scale compute “commons” enabling the estimation and analysis of root phenotypes from field experiments of unprecedented size.
DIRT is an automated high-throughput computing and collaboration platform for field-based crop root phenomics. The platform is accessible at http://dirt.iplantcollaborative.org/ and hosted on the iPlant cyber-infrastructure using high-throughput grid computing resources of the Texas Advanced Computing Center (TACC). DIRT is a high-volume central repository and high-throughput RSA trait computation platform for plant scientists working on crop roots. It enables scientists to store, manage and share crop root images with metadata and to compute RSA traits from thousands of images in parallel. It makes high-throughput RSA trait computation available to the community with just a few button clicks. As such, it enables plant scientists to spend more time on science rather than on technology. All stored and computed data are easily accessible to the public and the broader scientific community. We hope that easy data accessibility will attract new tool developers and spur creative data usage that may even be applied to other fields of science.
Global food demand is projected to double by the year 2050 [1, 2]. Meeting this increased demand requires significant improvements in crop yield and the development of crop plants adapted to water stress and low-fertility soils [4, 5]. Breeding more efficient roots is increasingly recognized as a high-priority target to achieve yield improvements because roots are essential for nutrient and water uptake [7–9]. Yet, little is known regarding the relationship between root system architecture (RSA) and crop function, with few examples linking root phenotype with genotype and phenotypic advantages under given field conditions [10–12].
Developing new crop varieties includes both laboratory- and field-based studies [13, 14]. Field studies that characterize the RSA of mature field-grown crops, in particular, involve laborious manual tasks that limit the achievable sample size. Extending field-based studies and sample sizes is a widely shared goal for future phenotyping scenarios [15, 16]. Indeed, phenotyping rather than genotyping is recognized as the bottleneck limiting advances [17, 18], given that inexpensive next-generation sequencing technologies have paved the way for characterizing the genotypes of diversity panels of thousands of recombinant inbred lines. In response, a number of national and international efforts, including the International Plant Phenotyping Network, have established “plant phenomics” centers to quantify plant phenotypes and their genetic origin.
Similarly, despite some successes, there are relatively few publicly available root phenotyping datasets. Available large datasets are predominantly derived from laboratory-based root phenotyping platforms. Laboratory studies benefit from increased levels of control and, at least in a few cases, have identified loci with candidate genes underlying RSA in early root development [22, 23]. However, growth containers used in these studies, filled with real or artificial soil [24–27], limit observations spatially and temporally to small or immature root systems [28, 29].
Establishing a link between RSA and genotypes requires the measurement of root phenotypes, often derived from automatic analysis of two-dimensional and three-dimensional digital images [31–39]. A comprehensive overview of existing software for root image analysis is maintained at the site http://plant-image-analysis.org. The scope of this software collection is impressive, in that individual tools provide different degrees of computational automation, ranging from manual through semi-automatic to fully automatic. However, none of these tools provides an integrated platform that can (a) associate root images with environmental and phenotypic metadata, (b) provide seamless access to scalable supercomputing resources for non-technical users and (c) share information within a collaborative team and the plant science community.
In order to address these issues we have developed DIRT. The DIRT platform provides a number of major functionalities that enable researchers to: (a) manage root image collections and metadata; (b) interactively calibrate measurement pipelines; (c) compute crop root traits on scalable high-throughput compute platforms; and (d) analyze the results of computations. Broadly, DIRT enables researchers to process thousands of root images through the pipeline with custom parameters and to view and analyze the computed RSA output associated with the raw images. Thus, our platform makes scalable high-throughput computational platforms available to researchers without technical expertise.
The RSA trait computation pipeline available in DIRT is fully automated and includes automatic estimation of 78 traits in total (see Additional file 1: Section S3). Traits are categorized into common traits for all root system architectures, monocot traits, dicot traits and traits for excised root samples. We provide a separate, optional threshold calibration tool that allows the researcher to select a representative image from the marked collection and compute binary image masks using different segmentation threshold values. Within this calibration workflow, the user selects the most appropriate value by visually checking the image mask.
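The idea behind the threshold calibration can be illustrated with a small sketch. It is not the DIRT implementation: the function names, the toy grayscale image and the assumption that root pixels are brighter than the background are chosen here only for illustration. The sketch computes one candidate binary mask per threshold value, which a user would then compare visually.

```python
from typing import Dict, List

Image = List[List[int]]  # grayscale image as rows of 0-255 intensities


def binary_mask(image: Image, threshold: int) -> Image:
    """Return a mask with 1 where a pixel is brighter than the threshold."""
    return [[1 if pixel > threshold else 0 for pixel in row] for row in image]


def calibration_masks(image: Image, thresholds: List[int]) -> Dict[int, Image]:
    """Compute one candidate mask per threshold value for visual comparison."""
    return {t: binary_mask(image, t) for t in thresholds}


def foreground_fraction(mask: Image) -> float:
    """Fraction of pixels marked as foreground (hypothetical helper)."""
    total = sum(len(row) for row in mask)
    return sum(map(sum, mask)) / total if total else 0.0


if __name__ == "__main__":
    # Toy 3x4 image: a bright root on a dark background (made-up values).
    toy = [
        [10, 12, 200, 11],
        [9, 180, 210, 14],
        [8, 90, 190, 13],
    ]
    for threshold, mask in calibration_masks(toy, [50, 100, 185]).items():
        print(threshold, foreground_fraction(mask))
```

A threshold that is too low keeps background noise in the mask, whereas a threshold that is too high removes fine roots, which is why the final choice is left to visual inspection.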
In response to community requests, the original trait computation pipeline in DIRT was extended. The current pipeline includes previously unpublished algorithms to measure traits such as top and bottom angle in monocots (see Additional file 1: Section S3). The pipeline is best used by following the DIRT imaging protocol to process 2D root images. In brief, a washed root is imaged against a dark, diffusely reflecting background that contains a light-colored circle of known diameter. Additionally, a barcode, QR code or simple text can be placed above the root for automatic identification to be associated with trait computations (see Additional file 1: Figure S2). On completion of the computation, masked images, computed traits, and corresponding CSV and RSML files populate the computation view tab. See Additional files 2 and 3 for examples of produced CSV and RSML files.
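A circle of known diameter in the image can serve as a scale reference. The following sketch assumes that use and shows the arithmetic for converting pixel measurements into metric units. The function names and the example numbers are hypothetical and are not taken from the DIRT pipeline.

```python
def mm_per_pixel(circle_diameter_mm: float, circle_diameter_px: float) -> float:
    """Scale factor derived from a reference circle of known physical size."""
    if circle_diameter_mm <= 0 or circle_diameter_px <= 0:
        raise ValueError("diameters must be positive")
    return circle_diameter_mm / circle_diameter_px


def length_to_mm(length_px: float, scale: float) -> float:
    """Convert a length measured in pixels to millimetres."""
    return length_px * scale


def area_to_mm2(area_px: float, scale: float) -> float:
    """Convert an area measured in pixels to square millimetres."""
    return area_px * scale ** 2


if __name__ == "__main__":
    # Made-up example: a 25 mm circle that spans 100 pixels in the image.
    scale = mm_per_pixel(25.0, 100.0)
    print(length_to_mm(400.0, scale))   # root width in mm
    print(area_to_mm2(1600.0, scale))   # projected root area in mm^2
```

Lengths scale linearly with the factor and areas scale with its square, so the two conversions are kept as separate functions.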
DIRT was designed to give researchers full control over their data, whether they work individually or as part of collaborative teams. We therefore implemented sharing options in which each newly created collection is private by default. The owner of a collection can share data and computed results privately with one or many collaborators via the platform’s web interface, or publish collections and computations publicly under a chosen Creative Commons license. Furthermore, DIRT enables different functions based on user access rights. The owners of data can edit, upload, download and delete images and corresponding metadata. Metadata can be associated with whole experiments or data sets to document experiment conditions (e.g. FAO soil type, GPS location, soil moisture content). Metadata are either uploaded as a CSV file or entered via a web form directly in the web browser. In addition to the suggested standard experiment parameters, a dynamic form allows the documentation of non-standard parameters such as nitrogen content per depth level. Similarly, each root image can be annotated manually or by uploading a pre-formatted CSV file with specific metadata (e.g. genotype, dry biomass) and may contain RSML files of manual measurements to annotate the image, e.g. from RootNav (Additional file 1: Section S6.3.7).
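To illustrate per-image annotation from a pre-formatted CSV file, the sketch below parses such a file into a lookup table keyed by image name. The column names (`image`, `genotype`, `dry_biomass_g`) and the sample rows are assumptions made for this example; the format that DIRT actually expects is documented in Additional file 1.

```python
import csv
import io
from typing import Dict


def load_image_metadata(csv_text: str, key_column: str = "image") -> Dict[str, Dict[str, str]]:
    """Map each image name to its remaining metadata columns."""
    reader = csv.DictReader(io.StringIO(csv_text))
    if reader.fieldnames is None or key_column not in reader.fieldnames:
        raise ValueError(f"CSV must contain a '{key_column}' column")
    metadata: Dict[str, Dict[str, str]] = {}
    for row in reader:
        name = row.pop(key_column)  # the key column identifies the image
        metadata[name] = row        # all other columns are kept as metadata
    return metadata


if __name__ == "__main__":
    # Hypothetical metadata file with two annotated root images.
    sample = (
        "image,genotype,dry_biomass_g\n"
        "root_001.jpg,G1,12.5\n"
        "root_002.jpg,G2,9.8\n"
    )
    for image, fields in load_image_metadata(sample).items():
        print(image, fields)
```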
DIRT is hosted publicly on the iPlant cyber-infrastructure [45, 46], leveraging its cloud data storage and the Advanced Agave API to communicate with the Texas Advanced Computing Center (TACC) for high-throughput computation of stored root images. It is built as a multi-tiered application consisting of a web server, a database server, iPlant’s data store, middleware and grid computing. The core middleware components are the PHP modules interfacing the database, iPlant data store and grid-computing environment. DIRT’s web interface is developed using the widely adopted open-source content management system Drupal (http://drupal.org). DIRT’s graphical interfaces (Fig. 1) are accessible via standard web browsers and abstract the organization and storage of root images and their metadata in a MySQL database and iPlant’s data store from the user. The image-processing pipeline is developed in Python and runs on TACC. The trait computation pipeline is abstracted from the computational resources and from the aggregation and sharing of images. Hence, it is possible for developers to extend DIRT by incorporating new pipelines adapted to distinct imaging and experiment conditions (see Additional file 1: Section S7.3). The DIRT source code and installation instructions are available for download from the DIRT website (see Additional file 1: Section S7.2) to facilitate use of private supercomputing resources for the plant science community. As a proof of concept we have also released an installation of DIRT at Georgia Tech (http://dirt.biology.gatech.edu) that uses Georgia Tech’s high-performance computing environment; instructions for a local installation of DIRT on proprietary computing resources are described in Additional file 1: Section S7.3. Altogether, DIRT assembles a unique root phenotyping platform that is accessible to non-technical users via an interactive web-based interface.
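One way to picture the separation between the platform and its trait computation pipelines is a small registry in which each pipeline is a function with a common signature. The sketch below is only an illustration of that design idea. The names `register_pipeline`, `run_pipeline` and `example_pipeline` are hypothetical, and the actual extension mechanism is the one described in Additional file 1: Section S7.3.

```python
from typing import Callable, Dict

# A pipeline takes an image path and parameters and returns named trait values.
TraitPipeline = Callable[[str, Dict[str, float]], Dict[str, float]]

_PIPELINES: Dict[str, TraitPipeline] = {}


def register_pipeline(name: str) -> Callable[[TraitPipeline], TraitPipeline]:
    """Decorator that makes a pipeline available under a unique name."""
    def decorator(func: TraitPipeline) -> TraitPipeline:
        if name in _PIPELINES:
            raise ValueError(f"pipeline already registered: {name}")
        _PIPELINES[name] = func
        return func
    return decorator


def run_pipeline(name: str, image_path: str, parameters: Dict[str, float]) -> Dict[str, float]:
    """Look up a pipeline by name and apply it to one image."""
    if name not in _PIPELINES:
        raise KeyError(f"unknown pipeline: {name}")
    return _PIPELINES[name](image_path, parameters)


@register_pipeline("example")
def example_pipeline(image_path: str, parameters: Dict[str, float]) -> Dict[str, float]:
    # Placeholder only: echoes the threshold instead of measuring real traits.
    return {"threshold_used": parameters.get("threshold", 1.0)}


if __name__ == "__main__":
    print(run_pipeline("example", "root_001.jpg", {"threshold": 10.0}))
```

Because callers only depend on the common signature, a new pipeline for a different imaging setup can be added without changing the code that stores or shares images.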
Design and implementation
Private and public storage of large root image data sets with metadata for each image and data set. As a result, DIRT users do not have to be concerned with computational and storage needs.
The platform supports private virtual collections by selecting root images from different physical collections. Virtual collections have the potential to save time and money required for new field experiments, by simply combining existing experiments.
Up-scaling of RSA trait estimation to supercomputing platforms.
The DIRT platform should allow storage of different types of image data.
The DIRT platform is extensible to incorporate new RSA trait computation pipelines.
The source code of the DIRT platform is freely available under open source licenses to the science community.
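The notion of a virtual collection in the list above can be sketched as a set of references to images in existing physical collections, so that experiments are combined without copying files. The function name, the data layout and the file names below are assumptions made for illustration only.

```python
from typing import Callable, Dict, List, Tuple

ImageRef = Tuple[str, str]  # (physical collection name, image file name)


def build_virtual_collection(
    physical: Dict[str, List[str]],
    keep: Callable[[str, str], bool],
) -> List[ImageRef]:
    """Collect references to images that satisfy `keep`, without copying files."""
    selected: List[ImageRef] = []
    for collection, images in physical.items():
        for image in images:
            if keep(collection, image):
                selected.append((collection, image))
    return selected


if __name__ == "__main__":
    # Hypothetical physical collections from two field seasons.
    experiments = {
        "field_2013": ["bean_001.jpg", "maize_001.jpg"],
        "field_2014": ["bean_101.jpg", "bean_102.jpg"],
    }
    beans = build_virtual_collection(
        experiments, lambda _collection, image: image.startswith("bean_")
    )
    print(beans)
```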
For the RSA trait estimation we chose the pipeline developed in Python (see Additional file 1: Section S3) and ported it to grid computing infrastructure for high-throughput computation.
User interfaces, user management, access control, data management, application workflow, user task scheduling and system configuration were implemented as open-source Drupal modules.
The public DIRT installation on iPlant interfaces with the STAMPEDE high-performance computing platform at TACC.
For scalable storage and public infrastructure we chose the data store within iPlant’s cyber-infrastructure.
The communication between DIRT and STAMPEDE is realized with the Agave API and a secure shell connection.
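To give a rough impression of how a large image collection might be handed to a job scheduler, the sketch below splits a list of images into batches and serializes one job description per batch. The field names and the batch size are hypothetical and do not reproduce the Agave API request schema or the way DIRT actually submits jobs.

```python
import json
from typing import Dict, List


def build_jobs(
    images: List[str],
    pipeline: str,
    parameters: Dict[str, float],
    batch_size: int = 100,
) -> List[dict]:
    """Split a collection into batches and describe one job per batch."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    jobs: List[dict] = []
    for start in range(0, len(images), batch_size):
        jobs.append(
            {
                "name": f"{pipeline}-batch-{len(jobs)}",  # illustrative naming
                "pipeline": pipeline,
                "parameters": dict(parameters),           # copy per job
                "images": images[start:start + batch_size],
            }
        )
    return jobs


if __name__ == "__main__":
    # Made-up collection of 250 images processed in batches of 100.
    collection = [f"root_{i:04d}.jpg" for i in range(250)]
    jobs = build_jobs(collection, "rsa-traits", {"threshold": 10.0})
    print(len(jobs), [len(job["images"]) for job in jobs])
    print(json.dumps(jobs[-1])[:80])
```

Independent batches of this kind are what allow trait computation for thousands of images to run in parallel on a grid.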
In the following, we detail the content, component and deployment models of the DIRT system to inform developers about our extensions to Drupal.
NID (Node ID): Every node or content in the Drupal system has a unique ID assigned, irrespective of the content type.
Title: Every node or content in the system is required to have a title.
UID: Every node or content in the system is explicitly tied to its creator i.e. the user of the system who created it.
Status: Every node or content in the system has one of the two states, published or unpublished. This feature assures that content is kept offline, until the content is valid and complete to be taken online.
Created and changed: Timestamps record when a node or content was created and last changed.
VID (version ID): Every node or content in the system maintains its version information. If enabled, all changes to a content or node are stored and maintained.
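The common attributes listed above can be summarized as a single record type. The sketch below is a simplified Python stand-in and not Drupal code. In particular, incrementing the version ID by one on each revision is an assumption made to keep the example short.

```python
import time
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class Node:
    nid: int       # unique node ID, independent of content type
    vid: int       # version ID of the current revision
    title: str     # every node requires a title
    uid: int       # ID of the user who created the node
    status: bool   # True = published, False = unpublished
    created: int   # Unix timestamp of creation
    changed: int   # Unix timestamp of the last change


def revise(node: Node, title: str, now: Optional[int] = None) -> Node:
    """Return a new revision with an updated title, version ID and timestamp."""
    timestamp = int(time.time()) if now is None else now
    return replace(node, title=title, vid=node.vid + 1, changed=timestamp)


if __name__ == "__main__":
    draft = Node(nid=1, vid=1, title="Bean roots", uid=7,
                 status=False, created=100, changed=100)
    updated = revise(draft, "Bean roots 2014", now=200)
    print(draft)
    print(updated)
```

Keeping the record immutable means that the earlier revision remains available after a change, which mirrors the idea of stored version information.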
Calibrated mask images contains attributes that associate an image with the multiple image masks created during the calibration of an original root image, and an attribute referencing the original root image in the system database.
Computation refers to a marked collection, an RSA trait computation pipeline, the pipeline parameters and the traits available in a pipeline. Furthermore, Computation contains an attribute to define its visibility. A computation also contains a field of type file that links to a CSV file containing the computed RSA trait values of the referenced Marked Collection.
DIRT Output defines the output produced for each raw root image by the RSA trait computation pipeline. It contains attributes to refer to a computation and original root image. Additionally, the content type contains attributes to refer to the image mask of the original image, each RSA trait value and the output RSML file.
Image processing pipeline has attributes for the pipeline parameters and each available trait.
License defines attributes for the licenses supported by the DIRT platform. The License content type is associated with the computation and root image collection content types.
Marked collection has attributes that describe a list of root images.
Metadata has attributes that refer to a root image collection and a file that links a pre-formatted CSV file.
Root refers to an original root image within a root image collection. Hence, the attributes hold a reference to a root image, a root image collection and each associated metadata entry.
Root image collection has the attributes collection visibility, collection membership, collection license and all collection metadata.
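The relationships between some of the content types described above can be sketched with plain Python data classes. This is a simplified illustration under our own naming assumptions and not the Drupal schema used by DIRT. Fields such as `mask_file` and `traits_csv` and all example values are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class RootImageCollection:
    name: str
    public: bool = False                 # collections are private by default
    license: Optional[str] = None
    metadata: Dict[str, str] = field(default_factory=dict)


@dataclass
class Root:
    image_file: str
    collection: RootImageCollection      # the collection holding the image
    metadata: Dict[str, str] = field(default_factory=dict)


@dataclass
class MarkedCollection:
    roots: List[Root]                    # list of selected root images


@dataclass
class Computation:
    marked: MarkedCollection
    pipeline: str
    parameters: Dict[str, float]
    public: bool = False                 # visibility of the computation
    traits_csv: Optional[str] = None     # file with computed trait values


@dataclass
class DirtOutput:
    computation: Computation
    root: Root
    mask_file: str
    traits: Dict[str, float]
    rsml_file: Optional[str] = None


if __name__ == "__main__":
    collection = RootImageCollection("field_2014", metadata={"soil": "clay"})
    root = Root("bean_101.jpg", collection, {"genotype": "G1"})
    run = Computation(MarkedCollection([root]), "rsa-traits", {"threshold": 10.0})
    # The trait name and value below are made up for the example.
    output = DirtOutput(run, root, "bean_101_mask.png", {"example_trait": 1.0})
    print(output.root.collection.name, output.computation.pipeline)
```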
Web server component: These are the Drupal components, including core, community-contributed and custom DIRT modules, that orchestrate the whole platform in concert. The content model described in the previous section is designed and implemented using these module types.
RSA trait computation component: This is the Python code used for the trait computation. It is deployed to both the web server and the grid computing node to meet the calibration and trait computation system specifications, respectively.
Interface component: These are the shell scripts that reside on the web server and grid-computing node to interface between DIRT and the grid job scheduler.
The deployment model is the static view of the run-time configuration of the processing nodes and all executed components. The deployment model defines the distribution of all DIRT components across different physical nodes in terms of folder structures and access rights. This deployment model is largely automated. We therefore refer to Additional file 1: Section S7 for detailed practical information.
Discussion and conclusion
In our experience, the simple excavation and imaging protocol enables 2–3 persons to phenotype 500–700 common bean roots per day in soil with high clay content. Here, the limiting factors are soil properties such as clay content or compactness that impede root excavation, whereas sandy soils allow fast and easy root excavation. So far, we have not reached the limits of the computing resources. However, the growing community of DIRT users will increase the computational load on the computing resources and eventually reveal the limits of the current system.
We presented DIRT as an open online platform that stores and organizes root image data sets, executes RSA trait estimations and documents performed computations on root image data sets. DIRT allows contributions from the whole root phenotyping community, including users and developers, and enables sharing and documentation of experiments. Users are encouraged to submit images taken with the DIRT imaging protocol to make use of all DIRT features. However, proprietary imaging protocols are often supported with limitations. Additionally, our efforts to make DIRT an open-source, transparent and freely accessible tool will enable further development and adaptation of the platform in response to research demands of free public data sets. Overall, DIRT is a unique computational resource that promotes automated, yet researcher-independent, root phenotyping as a response to the demands of researchers working under field conditions, to discover novel links between root morphology and the plant genome.
Availability and requirements
DIRT is freely accessible and usable at http://dirt.iplantcollaborative.org. In the spirit of open-source development, we have hosted DIRT on iPlant’s cyber-infrastructure, which is open to the public. All source code is available on the DIRT GitHub repository (https://github.com/abucksch/DIRT) and on the DIRT website (http://dirt.iplantcollaborative.org/about-us?qt-about_us_quicktabs=2#qt-about_us_quicktabs). A user manual is included in Additional file 1.
AD conceived, designed and implemented the platform, and wrote the manuscript. HS performed trait validation, collected and contributed field data sets, tested and validated user interfaces and contributed to writing the manuscript. JB contributed field data sets and contributed to metadata design. CT, TW and AKMA collected and contributed field data sets, contributed to the design of the user interface and performed beta-testing. JSW and JL contributed to the design of the platform and co-wrote the manuscript. AB conceived the project, designed and implemented traits, contributed to the platform design and wrote the manuscript. All authors read and approved the final manuscript.
We thank Wesley Emeneker and Troy Hilley for providing access to and configuring our test environment at the Georgia Institute of Technology. Furthermore, we thank Andy Edmonds, Nirav Merchant and Martha Narro of the iPlant Collaborative for their technical support. This work was supported by the National Science Foundation (NSF) Plant Genome Research Program (grant no. NSF0820624 to J.P.L. and J.S.W.), the Howard G. Buffett Foundation, and the Center for Data Analytics, Georgia Institute of Technology, Spatial Networks in Biology: Organizing and Analyzing the Structure of Distributed Biological Systems (to A.B. and J.S.W.). DIRT is powered, in part, by iPlant.
The authors declare that they have no competing interests.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
- Godfray HCJ, et al. Food security: the challenge of feeding 9 billion people. Science. 2010;327(5967):812–8.
- Tilman D, et al. Global food demand and the sustainable intensification of agriculture. Proc Natl Acad Sci. 2011;108(50):20260–4.
- OECD. OECD Environmental Outlook to 2030. OECD Publishing; 2008.
- Lynch JP. Roots of the second green revolution. Aust J Bot. 2007;55(5):493–512.
- López-Arredondo D, González-Morales SI, Bello-Bello E, et al. Engineering food crops to grow in harsh environments [version 1; referees: 2 approved]. F1000Research. 2015;4(F1000 Faculty Rev):651. doi:10.12688/f1000research.6538.1.
- Araus JL, Cairns JE. Field high-throughput phenotyping: the new crop breeding frontier. Trends Plant Sci. 2014;19(1):52–61.
- Beebe SE, et al. Quantitative trait loci for root architecture traits correlated with phosphorus acquisition in common bean. Crop Sci. 2006;46(1):413–23.
- Saengwilai P, Tian XL, Lynch JP. Low crown root number enhances nitrogen acquisition from low-nitrogen soils in maize. Plant Physiol. 2014;166(2):581–9.
- Waisel Y, et al. Plant roots: the hidden half. Ann Bot. 2002;90(6):775–6.
- de Sousa SM, et al. A role for root morphology and related candidate genes in P acquisition efficiency in maize. Funct Plant Biol. 2012;39(10–11):925–35.
- Lynch J. Root architecture and plant productivity. Plant Physiol. 1995;109(1):7–13.
- Zhu JM, et al. From lab to field, new approaches to phenotyping root system architecture. Curr Opin Plant Biol. 2011;14(3):310–7.
- Jansen M, et al. Non-invasive phenotyping methodologies enable the accurate characterization of growth and performance of shoots and roots. Genomics of plant genetic resources. Netherlands: Springer; 2014. p. 173–206.
- Rogers ED, Benfey PN. Regulation of plant root system architecture: implications for crop advancement. Curr Opin Biotechnol. 2015;32:93–8.
- Fiorani F, Schurr U. Future scenarios for plant phenotyping. Annu Rev Plant Biol. 2013;64:267–91.
- Wuyts N, Dhondt S, Inzé D. Measurement of plant growth in view of an integrative analysis of regulatory networks. Curr Opin Plant Biol. 2015;25:90–7.
- Kuijken RCP, et al. Root phenotyping: from component trait in the lab to breeding. J Exp Bot. 2015;66(18):5389–401.
- Rahman H, et al. Phenomics: technologies and applications in plant and agriculture. PlantOmics: the omics of plant science. India: Springer; 2015. p. 385–411.
- McMullen MD, et al. Genetic properties of the maize nested association mapping population. Science. 2009;325(5941):737–40.
- Finkel E. IMAGING with ‘Phenomics’, plant scientists hope to shift breeding into overdrive. Science. 2009;325(5939):380–1.
- Fahlgren N, Gehan MA, Baxter I. Lights, camera, action: high-throughput plant phenotyping is ready for a close-up. Curr Opin Plant Biol. 2015;24:93–9.
- Topp CN, et al. 3D phenotyping and quantitative trait locus mapping identify core regions of the rice genome controlling root architecture. Proc Natl Acad Sci USA. 2013;110(18):E1695–704.
- Pace J, Yu X, Lübberstedt T. Genomic prediction of seedling root length in maize (Zea mays L.). Plant J. 2015;83(5):903–12.
- Clark RT, et al. Three-dimensional root phenotyping with a novel imaging and software platform. Plant Physiol. 2011;156(2):455–65.
- Downie H, et al. Transparent soil for imaging the rhizosphere. PLoS One. 2012;7(9):e44276.
- Iyer-Pascuzzi AS, et al. Imaging and analysis platform for automatic phenotyping and trait ranking of plant root systems. Plant Physiol. 2010;152(3):1148–57.
- Rellan-Alvarez R, et al. GLO-Roots: an imaging platform enabling multidimensional characterization of soil-grown root systems. Elife. 2015;4:016931.
- Judd LA, Jackson BE, Fonteno WC. Advancements in root growth measurement technologies and observation capabilities for container-grown plants. Plants. 2015;4(3):369–92.
- Pfeifer J, et al. Rapid phenotyping of crop root systems in undisturbed field soils using X-ray computed tomography. Plant Methods. 2015;11(1):41.
- Walter A, Liebisch F, Hund A. Plant phenotyping: from bean weighing to image analysis. Plant Methods. 2015;11:14.
- Cai J, et al. RootGraph: a graphic optimization tool for automated image analysis of plant roots. J Exp Bot. 2015. doi:10.1093/jxb/erv359.
- Clark RT, et al. High-throughput two-dimensional root system phenotyping platform facilitates genetic analysis of root growth and development. Plant Cell Environ. 2013;36(2):454–66.
- Colombi T, et al. Next generation shovelomics: set up a tent and REST. Plant Soil. 2015;388(1–2):1–20.
- Humplik JF, et al. Automated phenotyping of plant shoots using imaging methods for analysis of plant stress responses—a review. Plant Methods. 2015;11:29.
- Metzner R, et al. Direct comparison of MRI and X-ray CT technologies for 3D imaging of root systems in soil: potential and challenges for root trait quantification. Plant Methods. 2015;11(1):1–11.
- Mooney SJ, et al. Developing X-ray computed tomography to non-invasively image 3-D root systems architecture in soil. Plant Soil. 2012;352(1–2):1–22.
- Symonova O, Topp CN, Edelsbrunner H. DynamicRoots: a software platform for the reconstruction and analysis of growing plant roots. PLoS One. 2015;10(6):e0127657. doi:10.1371/journal.pone.0127657.
- Yazdanbakhsh N, Fisahn J. High throughput phenotyping of root growth dynamics, lateral root formation, root architecture and root hair development enabled by PlaRoM. Funct Plant Biol. 2009;36(10–11):938–46.
- Delory BM, et al. archiDART: an R package for the automated computation of plant root architectural traits. Plant Soil. 2015. doi:10.1007/s11104-015-2673-4.
- Lobet G, Draye X, Perilleux C. An online database for plant image analysis software tools. Plant Methods. 2013;9:38.
- Bucksch A, et al. Image-based high-throughput field phenotyping of crop roots. Plant Physiol. 2014;166(2):470–86.
- Trachsel S, et al. Shovelomics: high throughput phenotyping of maize (Zea mays L.) root architecture in the field. Plant Soil. 2011;341(1–2):75–87.
- Lobet G, et al. Root system markup language: toward a unified root architecture description language. Plant Physiol. 2015;167(3):617–27.
- Pound MP, et al. RootNav: navigating images of complex root architectures. Plant Physiol. 2013;162(4):1802–14.
- Goff SA, et al. The iPlant collaborative: cyberinfrastructure for plant biology. Front Plant Sci. 2011;2:34.
- Stanzione D. The iPlant collaborative: cyberinfrastructure to feed the world. Computer. 2011;44(11):44–52.
- Drupal. https://drupal.org/. Accessed 16 June 2015.
- STAMPEDE at TACC. https://tacc.utexas.edu/systems/stampede. Accessed 16 June 2015.
- Agave API. http://agaveapi.co. Accessed 16 June 2015.
- Fowler M. UML distilled: a brief guide to the standard object modeling language. Boston: Addison-Wesley Professional; 2004.