HIRROS: automated platform for RSA imaging
The HIgh Resolution ROot Scanner (HIRROS) setup is a device for automated and nondestructive visualisation of root growth and architecture of Arabidopsis thaliana plants or small seedlings (Fig. 2). The imaging automata is located in a dedicated growth chamber allowing temperature, hygrometry, photoperiod, and light intensity to be adjusted according to the user’s requirements (Root phenotyping platform webpage: [24]).
HIRROS is designed to use either 200 small (120 × 120 mm) or 72 large (245 × 245 mm) square Petri dishes, which are filled with agar medium and placed upright in holders to allow root growth along a vertical plane at the surface of the culture medium. Once the medium is solidified, a onecentimetre wide strip of medium is removed to avoid contact of cotyledons and leaves with the culture medium and maintain shoots vertically (Fig. 2). Plate lids are wiped with tween20 to reduce condensation and water droplet formation. Five pregerminated seedlings are evenly distributed, and the plates are sealed with four pieces of tape to allow gas exchanges. The Petri dishes are then placed into holders to maintain them vertically and make them transportable within the automated HIRROS setup. Holder motions are driven by worms allowing lateral and forward movements. The imaging system that is protected from parasitic light is located beside the culture zone (Fig. 2). Plates are transferred to the imaging cart. For illumination, the plates are backlighted with a white collimated LED (Fig. 2). This allows a high contrast (even for thin and almost translucent Arabidopsis roots) and reduces reflection on the plate's lid. Plates are imaged using a 16MP linear camera coupled to a telecentric lens located at 50 cm of the plate (Fig. 1). This optical setup increases the depth of field. It suppresses image distortions allowing imaging of small and large plates without adjustment. It has a resolution of 19 μm/px (image size 8 MB, TIFF for small plates and 32 MB for large plates). Metadata describing the experiment associated with each image and stored locally in a folder labelled with the experiment information and the imaging date. Illumination is switched on and off automatically and synchronised with the camera during image acquisition. Once imaged, the plate moves back to the holder that moves one position further to place the next Petri dish in the row in front of the imaging cart until all plates are recorded. The holder moves back to the culture zone, and the next holder enters the imaging process. The imaging frequency can be adjusted between 2 and 24 h, and up to 1000 plants can be imaged in less than 80 min.
Image processing pipeline
Our processing pipeline includes four steps (see Fig. 3): (1) time series registration, (2) 2D + t segmentation, (3) root topological tracking, and (4) architecture reconstruction. It is fully automatic. The batch computation starts as soon as the user provides the indication of the source image directory.
First, all the images in the time series are registered together in order to align root organs in all images to ease the tracking process. Then, in the second step, root pixels are separated from the background (segmentation), and an apparition time is associated to the root pixel that corresponds to the time associated with the image in the time series. A single 2D labelled image is output where each pixel is labelled with a root apparition time.
The two final steps work on graph and tree data structures. In step 3, the label image is converted into a graph representing the successive growing parts of roots. The adjacency relations of the root segments are investigated by an optimization algorithm to estimate the topology of the root system architecture. Finally, in step 4 we refine the geometrical description of this architecture by computing the central line of the identified organs. The detailed description of these steps is presented below.
Time series registration
The pipeline's first operation corrects the root system's misalignment in the 2D + t image sequence \(I=\{{I}_{i}{\}}_{i=1...{N}_{t}}\), induced by the displacement of the plate in the imaging automata. We identify two sources of misalignment: (i) movements of the box relative to the imaging device and (ii) deformations of the root system in the culture medium induced by plate displacement. We correct these distortions using a registration pipeline developed with the Fijiyama plugin libraries [8].
We fix the geometry of the last image \({I}_{{N}_{t}}\), considered the reference image because it contains all the structures present in the previous images. We determine \(({N}_{t }1)\) geometric transformations \(\{{T}_{t}\}\) to align the other images with the reference image \({I}_{{N}_{t}}\). Each estimated transformation \({T}_{t}\) comprises a rigid transformation (translation + rotation), and a nonlinear transformation (dense vector field) estimated successively. In this setup, we assume that the rigid transformation is required to correct the movement of the plate, while the nonlinear transformation is required to correct the slight deformations of the root system in the culture medium. These transformations are computed from local correspondences using the blockmatching algorithm [28], integrated into a daisychain strategy: each image is registered with the next image in the sequence. Then the successive transformations are composed and applied at once for each image.
Time series segmentation
The image time series shows the roots growing over time. The difference between two successive images highlights the newly appeared segments of the roots. By working on these differences over the image sequence, one can distinguish the roots from the image's background, thus segmenting roots and associating each root segment and apparition time. Thus, we define the two purposes of this segmentation step: (i) identify which image pixels are traversed by the root systems during the sequence (binary segmentation of RSA), and (ii) estimate the corresponding traversal times (see Fig. 4).
In our dataset, background pixels have medium intensity while root pixels have low intensity. However, segmenting roots by thresholding each image separately is not an option, as other structures (box border, condensation, dirtiness) may have similar characteristics to roots, and pixels intensity varies between the root central line (low intensities) and root border (medium intensities).
Nevertheless, the registered image sequence allows studying the time sequence of intensities of each pixel and detecting in this sequence the event associated with the detection of a root (i.e., when a root reaches this pixel). The time sequence of intensity of a pixel traversed by a root begins with high values corresponding to the background, then undergoes a sudden decrease of intensity at the moment when the root reaches this pixel, and finally finishes with low values corresponding to root tissues. We process each pixel individually and identify this pattern by computing the maximum mean shift over its time series of intensity. We identify as “root pixels” those whose mean shift value is greater than a fixed threshold \({s}_{1}=25\) (given that the average intensity difference measured between the centerlines of the roots and the background is approximately 100). Among these pixels, we compute the difference between successive values in the time sequence and identify the observation time following the maximal intensity drop. If this maximum intensity drop is lower than a fixed threshold \({s}_{2}=10\), the pixel is rejected and identified as background. Else, the index of the maximal intensity drop is identified as the “apparition time” of the root at this given pixel location. The pixels selected by these two thresholding operations constitute a binary segmentation of the root systems. Each selected pixel is given a label defining the root apparition time (see Fig. 4).
We run a topological and geometrical characterisation of the binary segmentation of the root systems to eliminate various outliers: nightly condensation and dirtiness falling on the dish during the experiment. We compute the 4connected components and reject the pixels of components smaller than 2000 pixels. In our setup, this threshold is barely equivalent to rejecting any candidate root system in which cumulative organ length at the last observation time point is lower than 38 mm, which is suited to our experiment.
This segmentation step results in a 2D labelled image \(S\) (see Fig. 4), whose pixels \(p\) are given a label \(S(p)\) between \(0\) and \({N}_{t}\); the label “0” identifies nonroot pixels, and the other possible labels identify the index of the first image where the root is present.
Root topological tracking
In this section, the pipeline converts the 2D labelled image into a graph representing the root system architecture. This graph is extracted by investigating the adjacency relations between the detected root growing sections, and connecting these sections in the most plausible way, using a connection metric used to compute a minimal cost tree with an optimisation algorithm. Multiparticle tracking can be challenging, especially if the number of particles is not constant, which requires managing the creation and destruction of tracks in time [18]. An additional difficulty comes when the temporal sampling is sparse with respect to the particle velocity, inducing possible confusion between crossing tracks or neighbouring tracks. For the tracking of roots in the time series, we propose a solution based on a global tracking algorithm working on the fulltime series. In the previous step, the pipeline outputs the growing architecture as a fixed architecture with timestamped elements. During root topological tracking, the method tracks structures by using these timestamps. In steps 3a, the algorithm computes a directed Region Adjacency Graph equivalent to the segmentation. In 3b, it extracts a covering forest of minimal weight from it, and provides the skeleton of the primary and secondary roots. Then in 3c, it processes the root crossings by a global optimization algorithm (see Fig. 5).
Representation of roots growth space using a weighted directed region adjacency graph
After registration, we expect that the root system appears in the sequence of images as a sequence of Russian dolls: assuming that a root does not disappear (by dying or being removed) if we note \({P}_{t}\) the set of pixels covered by the roots at the time\(t\), \({P}_{t}\subseteq {P}_{t+k}(k>0)\), and the difference \({P}_{t+k} \backslash {P}_{t}\) is the additional surface of root system generated by root tips growth between \(t\) and\(t+k\). Thus, we assume that a connected component of pixels in \(S\) sharing the same label \(t\) is an additional surface covered between the observation time number \(t\) and \(t+1\) by a part of the root systems. With this assumption, root trajectories can be tracked by identifying the optimal pixel paths in\(S\), beginning with pixels labelled 1 (sequence start) and finishing with the label \({N}_{t}\) (last observation time point). We formalise a global solution to the tracking problem by searching a minimum cost spanning forest in the weighted directed region adjacency graph \(G={(V}_{G} ,{E }_{G})\) of the labelled image \(S\) (see Fig. 5). We build \(G\) as follows:

Each 8connected component of pixels \(p\in S\) sharing the same label \(t(p)\) in \(S\) is represented by a vertex \(v\in {V}_{G}\) with label \(t(v)=t(p)\).

Two vertices \({v}_{1},{v}_{2}\) are connected with a (directed) edge \(e=({v}_{1},{v}_{2})\) if and only if \(t({v}_{1})<t({v}_{2})\) (edges are directed according to forward time direction) and the two corresponding 8connected components in \(S\) are 4neighbours (they share at least a pixel edge).
The weight \(p(a)\) of an edge \(e=({v}_{1},{v}_{2})\) is defined by:
$$p(e)=\frac{\#{v}_{1}\#{v}_{2}}{\#{v}_{1}+\#{v}_{2}}+(t({v}_{2})t({v}_{1})1)+\frac{\underline{{{v}_{1}{v}_{2}}}\cdot \underline{y}}{\underline{{{v}_{1}{v}_{2}}}}$$
with

\(\#v\) the number of pixels contained in the corresponding connected component in \(S\).

\(t(v)\) the label of \(v\).

\(\underline{{{v}_{1}}{{v}_{2}}}\) is the vector from \({v}_{1}\) to \({v}_{2}\) and \(\underline{y}\) is the Yaxis normalised vector.
The weight \(p(e)\) of an edge is small (the edge is more “likely” to belong to an actual root trajectory) if (i) the edge connects two vertices which have a comparable surface, (ii) the two vertices labels (structures apparition time) are close, and (iii) the vector connecting the target from the source is directed downwards, which is the preferred orientation of root growth.
Extraction of root system skeleton
At this step, we extract the root systems skeleton in the graph by identifying the paths corresponding to primary roots and lateral roots without considering lateral root crossings. We describe this reconstruction algorithm for architecture composed of primary roots and lateral roots (secondorder organs inserted on the primary), which is suited to arabidopsis seedlings.
Primary roots: The primary root of seedlings is the first organ of the root system and tends to be larger than the lateral roots, have a higher growth rate and point downwards. Consequently, the corresponding path in the graph is composed of edges of minimum weight p(e). At the beginning of the experiment, we assume that the primary axes of the N plants (Np = 5 plants per box in our setup) have no branching. We then identify the corresponding vertices in G, which are the Np vertices associated with the largest connected component of timestamp 1 in the segmentation S. From each of such vertices, we extract the path of the primary axis, which is the longest greedy mincost path. Vertices and edges traversed in this step are labelled primary, and all other edges and vertices in G are labelled lateral. In agreement with the root hierarchy, all lateral edges incident to a primary vertex are invalid and eliminated from the graph. Finally, the algorithm filters out outliers by computing an extraction of connected components in G from the primary vertices and rejecting all vertices not included in the final components.
Lateral roots: The algorithm identifies the path corresponding to lateral roots by selecting in the graph the vertices and edges that are likely to correspond to actual lateral root paths (see Fig. 5). This action of classifying between “relevant” and “irrelevant” edges is an action of graph pruning that we run in two steps. First, we aim to extract a subpart of the graph whose topology corresponds to that of a set of root systems, which is a forest structure. To that end, we extract a forest from the graph by computing the minimal directed spanning forest \(F={(V}_{F},{E}_{F})\) using the Edmonds algorithm [7]. In \(F\), each vertex has a single incoming arc. Second, we try to ensure that each path corresponding to a lateral root has no branching, according to our dataset composed of Arabidopsis seedlings. In order to avoid the branching of lateral roots, we check each lateral vertex and keep only their mincost outgoing arc.
Resolution of root crossing ambiguities
By pruning the graph, parts of actual root paths corresponding to root crossings could be disconnected. This point is investigated and processed in this step using a global algorithm. If a root \({r}_{2}\) crossed the path of another root \({r}_{1}\) during the time sequence, assuming that \({r}_{2}\) traversed the area at least one timestep after \({r}_{1}\), the region adjacency graph in 3a) shows a typical “X” pattern, where the path of \({r}_{2}\) cross at least one edge in the wrong direction. This pattern leads to an upstream disconnection of the \({r}_{2}\) path and a downstream disconnection of the \({r}_{2}\) path during graph pruning in 3b), which lets an additional root start and root end in the graph (see Fig. 5). To process these cases using a global algorithm, we first identify the two following vertices sets in : the “root starts” \({V}_{start}\), including every lateral vertex with no incoming arc, and the “root stops” \({V}_{stop}\), including every vertex with no outgoing arc. As these elements are possibly parts of disconnected pathways, we target to identify the best pairwise matches (root stops with root starts) in order to reconnect them. We define a connection cost function to compute the optimal set of pairwise matches between elements of these two sets. The cost function is defined to indicate the likelihood of the connection, computed as the weighted average of multiple geometrical and topological continuity features, replaced by majoring values when the elements to connect are not long enough to provide the needed information. The features used for the cost function are the edge weight function \(p(a)\) (see 3a), the angle between root start and root end, the relative speed difference between the root start and the root stop, and a topological score. The topological score is computed by evaluating the minimal path between the root end and root start in the original region adjacency graph, evaluating the number of edges crossed in the right or wrong direction, and the corresponding temporal distance. We evaluate this cost function for all the possible ordered pairs \({(v}_{stop},{v}_{start})\) in order to build a cost matrix of all possible onetoone connections of root starts and root stops. This cost matrix is used to compute the optimal pairwise matching using the Hungarian algorithm [14]. The algorithm works as follows:
While the Hungarian algorithm provides a global and optimal set of connections, we do not apply this set in one go. We iterate over these steps: computation of the Hungarian algorithm, connection of the pair of minimal cost, and update of the cost matrix. That way, each new connection adds knowledge to the system. This point is critical for tackling some challenging cases, like when a single root is reconstructed from a series of numerous small isolated elements comprising both a root start and a root end. This kind of situation happens for example, when a lateral root grows hidden below a primary root (see results, Fig. 7D).
Artefacts rejection
We compute statistics over the identified lateral root paths and proceed to detection of outliers in order to reject spurious lateral roots. To that end, we collect data characterising the geometry and dynamics of these paths, measured at each timestep: length, speed, and surface. These scalar values are binned with respect to the temporal distance from the emergence of the path. Then we compute robust statistics within these subgroups. We compute the doublesided MADe [15] and detect values exceeding twentyfive times the median absolute deviation around the median. These abnormal values are identified as markers of spurious roots. The corresponding roots are then rejected as artefacts.
Architecture reconstruction
The reconstruction of the spatiotemporal model of the RSA involves a precise description of the topology and geometry of the root systems. At step 4, the pipeline refines the geometric description of the root system architecture by identifying the successive positions of root tips and interpolating the path of the root tips to extract the central lines of the organs. Vertices and edges of the forest \(F\) resulting from step 3 describe the root system topology. However, they do not provide the precise spatiotemporal pathway: each vertex of \(F\) represents a growing region of root pixels between two observations; thus, it is not associated with any exact coordinates, neither in time nor space. To build a precise spatiotemporal model, we need to know the precise position of each root tip at each observation time point and then interpolate this position between the time points to follow the roots' actual pathway, whatever their speed and tortuosity.
The position of the root tip at each time point can be estimated from the edges of the forest \(F\), recalling that each edge \(e\in F\) corresponds to a transition between two consecutive growth areas of a root (see Fig. 6). Thus, edges of \(F\) correspond to the successive positions of root tips at each time point. The spatiotemporal model is then represented by the dual graph \(I=({V}_{I},{E}_{I})\) of \(F=({V}_{F},{E}_{F})\) where a vertex encodes the spatiotemporal position of a root tip and an edge encodes the root segment between two positions (see Fig. 6). In this dual graph (Fig. 6, right), a vertex \(w\in I\) corresponds to an edge in \(F\). Its temporal coordinate is welldefined; it is exactly an observation time point.
Conversely, the spatial coordinates of \(w\in I\) have to be defined. \(w\) corresponds to an adjacency relationship between two distinct connected components in \(S\) (see the blue line and the blue arrow in Fig. 6, centre). We define the geometry of this adjacency in the image space by extracting the set of pixel edges and corners separating the two adjacent components. We choose the spatial coordinates of \(w\in I\) to be the coordinates of the centralmost pixel corner or pixel edge over all the corresponding frontiers (see the dark blue circle in Fig. 6, centre). We process all the edges of \(F\) that way to compute the corresponding vertices of \(I\). Afterward, we create for each root an additional vertex \(w\in I\) to identify the root tip. We estimate its position by identifying the pixel of the last component that is the most distant (geodesically) from the nearest identified vertex \(w\in I\) (for an illustration, see the lower white circle in Fig. 6, right). Once all the vertices of \(I\) are computed this way, we connect them with edges corresponding to the vertices of \(F\). The geometry of the edges is defined by straight lines connecting successive positions of root tips. These edges constitute an oversimplified version of the root curves, which do not precisely follow the roots pathway, especially for primary roots, which are growing very fast. We refine \(I\) to ensure geometrical fidelity of the RSA model, allowing accurate measurements (especially root length). For each edge \(\varepsilon ={(w}_{1},{\omega }_{2})\in I\), we compute the shortest path between \({w}_{1}\) and \({w}_{2}\) in the corresponding connected component \(CC\in S\). We use the Dijkstra algorithm and compute the shortest path of 8connected pixels in \(CC\). We define the edge cost using the standard interpixel euclidean distance, weighted by a scalar function derived from the distance map of the traversed pixels to the contour of the \(CC\). This morphological feature forces the path to avoid understeering curves; it fosters representative RSA models.
Finally, we reduce the number of points in the pixel path by decimating it using the DouglasPeucker algorithm [6]. The temporal coordinate of the vertices is estimated by a linear interpolation computed from the nearest two bounding vertices whose temporal coordinates are known.
Upstream extrapolation of primary roots to the seed
Leaves sometimes fall into the medium during the time course, inducing false detection of new organs. We address this problem by artificially removing the falling leaves from the images using a series of morphological operations. However, this results in a loss of information about the primary and lateral root starts, causing the pipeline to detect the root system initiation point below the target point (the seed). The pipeline corrects this defect during this postprocessing operation by computing a pixel path of minimum cost from the target height to the detected initiation point using the Dijkstra algorithm. To that end, we define the cost of pixeltopixel traversal as the product of the interpixel distance (1 for pixel sharing an edge or \(\sqrt{2}\) for pixel sharing a corner) by the average intensity of the two pixels.
Downstream extrapolation of the stopped laterals
After computation, we check every root end, and investigate the one ending before the last observation time, probably corresponding to lateral roots stopping their growth before the sequence ending. Such an event is infrequent, thus suspect. If these earlystopping roots \({r}_{stopped}\) are not incident to any other root, we consider that \({r}_{stopped}\) have stopped its growth. In that case, we keep it in the model if it has a continuous growth during at least three time points, or we reject it instead. In the other case where \({r}_{stopped}\) is incident to another earlier root \({r}_{first}\), we assume that \({r}_{stopped}\) continued its growth hidden below or parallel to \({r}_{first}\). In this situation, we extrapolate the growth of \({r}_{stopped}\) to its estimated final position in the timesequence and add these new points to the RSA model.