Overview
The input to the TopoRoot pipeline is a 3D grayscale image, \(I\), of a maize root crown or root system, represented as a stack of 2D image slices. The pipeline assumes that \(I\) has sufficient contrast between the roots and their surroundings (e.g., soil), so that intensity thresholding can be used to segment the roots. If this assumption does not hold, such as when large chunks of soil are attached to the roots or in the case of in situ imaging in soil, TopoRoot may still be applied if the roots can be segmented by a third-party segmentation algorithm (see "Discussions" section). TopoRoot produces a root hierarchy and fine-grained root traits in four steps (see Fig. 1).
-
1.
Segmentation: This step produces a binary segmentation, \(B\), from image \(I\) that captures the root shape with few topological errors. It does so by combining a recently developed topological simplification algorithm [20] with a new algorithm that ensures the solidity of the root stem
-
2.
Skeletonization: This step computes a skeleton, \(S\), from the binary segmentation \(B\) such that \(S\) has the correct topology of a tree (that is, being connected and free of cycles). This is done by employing an existing skeletonization algorithm [21, 22] followed by a novel heuristic to remove cycles.
-
3.
Inferring hierarchy: A hierarchy, \(H\), is obtained from the skeleton \(S\) so that each edge of \(S\) is given an integer indicating its level in the hierarchy (e.g., 0 means the stem, 1 means the nodal roots, 2 means the lateral roots of first order, etc.). The hierarchy algorithm minimizes the depth of the hierarchy while favoring longer branches at lower levels.
-
4.
Computing traits: Finally, a suite of root traits, such as the count, lengths, angles, thickness, and tortuosity, are computed from the skeleton \(S\) at each level of the hierarchy \(H\).
These steps are detailed in the next few sections.
Segmentation
Each maize root system has a simple topology: it is connected, free of handles (“loops”) or voids (“air bubbles”). However, due to limits in imaging resolution, contrast, and/or noise level, simple thresholding of the input grayscale image often yields a segmentation with numerous disconnected components, handles and voids (see Fig. 2A). These erroneous topological features pose significant challenges for the subsequent hierarchy inference. Another issue with simple thresholding is that due to the relatively low intensity in the interior of thick roots (e.g., the stem), only the outer shell of these roots is included in the segmentation (see Fig. 2A). These hollow shells would lead to complex skeletons that do not accurately capture the tubular shape of the roots. Given an input image \(I\), the first step extracts a segmentation \(B\) that fills the hollow root interior and has minimal topological errors.
We start by filling the hollow interior of roots using an erosion approach. The observation is that these hollow spaces become voids (i.e., closed off by the root’s shell) when the image \(I\) is thresholded by a sufficiently low value. Our heuristic identifies these voids at a low threshold and fills them in after “growing” them back to the normal threshold. The heuristic takes in two user-specified thresholds, \(t_{mid}\) and \(t_{low}\), such that \(t_{mid}\) best captures of the shape of the root while \(t_{low} < t_{mid}\) closes off most of the hollow spaces within roots. We denote the segmented, voxelized shapes at these two thresholds as \(B_{mid}\) and \(B_{low}\) respectively (see Fig. 3A). To grow the voids in \(B_{low}\) back to the hollow spaces in \(B_{mid}\), we maximally erode the voxels in \(B_{low}\) while maintaining its topology and preventing voxels in \(B_{mid}\) from being eroded. This results in another shape, denoted as \(B_{mid} ^{\prime}\), which is a minimal superset of \(B_{mid}\) with its hollow spaces closed off (see Fig. 3B). We then take all voxels in the voids of \(B_{mid} ^{\prime}\), together with those voxels in \(B_{mid} ^{\prime}\backslash B_{mid}\) adjacent to the voids, and “fill” them by setting their intensity values to be \(t_{mid}\) (see Fig. 3C). We denote the resulting image as \(I^{\prime}\).
Next, we compute a segmentation of the hollow-space-filled image \(I^{\prime}\) using the algorithm of [20]. This algorithm uses global optimization to extract a segmentation bounded between two intensity thresholds that has the least number of topological features. It takes three thresholds with increasing values, \(t_{low} < t_{mid} < t_{high}\), where \(t_{low} ,t_{mid}\) are the same as in the filling algorithm above. Let the segmented shapes of \(I^{\prime}\) at these thresholds be \(B_{low} ,B_{mid} ,B_{high}\). The algorithm of [20] computes a shape \(B\) such that \(B_{high} \subseteq B \subseteq B_{low}\) and the following vector energy is minimal in lexicographical order,
$$\left\{ {\beta_{0} \left( B \right) + \beta_{1} \left( B \right) + \beta_{2} \left( B \right), diff\left( {B,B_{mid} } \right)} \right\}$$
(1)
Here, \(\beta_{0} ,\beta_{1} ,\beta_{2}\) counts the number of connected components, handles and voids of \(B\), and \(diff\) is a difference measure between two voxel sets that considers both the number and intensity of voxels that are in one of the sets but not the other. Intuitively, \(B\) makes the least change to the shape \(B_{mid}\) (in terms of \(diff\)) to remove as many topological features on \(B_{mid}\) as possible while sandwiched between \(B_{high}\) and \(B_{low}\).
An example result of this step (filling hollow spaces and then applying the algorithm of [20]) is shown in Fig. 2B. The three thresholds \(t_{low} , t_{mid} , t_{high}\) control the trade-off between topological simplicity and geometric fidelity of the segmentation \(B\). A larger gap between \(t_{mid}\) and \(t_{low} , t_{high}\) gives the algorithm [20] more room to remove topological features, and hence the result has fewer topological errors. But this comes at the cost of possibly large and undesirable geometric changes; for example, a root may be broken in the middle to remove a topological handle. We found that the best results are obtained by setting \(t_{low}\) to be the highest value such that thin roots remain connected in \(B_{low}\) and setting \(t_{high}\) to be the lowest value before roots start to merge in \(B_{high}\). The resulting segmentation \(B\) will not be completely free of topological errors but fixing these remaining errors in a geometrically correct way requires a more global context of the root shape. This will be addressed in the next step and with the help of a geometric skeleton.
Skeletonization
The tubular shape of roots makes them representable by curve skeletons. The graph structure of a skeleton is the key that enables subsequent analysis of branching hierarchy and traits. Given the segmentation \(B\) produced by the previous step, this step produces a geometric skeleton \(S\) capturing the shape and branching structure of the roots. We will utilize the structural information provided by the skeleton to resolve the topological errors that remain from the previous step, so that \(S\) is connected and free of cycles.
We first compute an initial curve skeleton \(S_{0}\) from the voxel shape \(B\) using the algorithms described in [21, 22, 34]. These methods have been recently used in skeleton-based phenotyping of sorghum panicles [23]. Specifically, the Voxel Core method of [22] extracts an approximate medial axis of \(B\), which is a triangulated 2-dimensional non-manifold that lies at the center of \(B\). Taking the medial axis as the input, the Erosion Thickness method of [21] reduces it to a polygonal 1-dimensional skeleton \(S_{0}\). Compared with other means for computing skeletons, this approach has several features important for root analysis. Both methods [21, 22] are robust to irregular shape boundaries, and hence spurious skeleton branches are minimized. Both methods also preserve the topology of \(B\) exactly, and hence \(S_{0}\) carries the same set of topological features as \(B\) without adding new features. Both methods are highly optimized and capable of processing large 3D images. Unlike methods that produce skeletons made up of voxels, \(S_{0}\) is made up of vertices and edges, and hence it can be conveniently processed by graph algorithms. Finally, the method of [21] also outputs a “thickness” measure for each edge of \(S_{0}\), which will be utilized later. An example of this initial skeleton is shown in Fig. 4A (also in Fig. 1C).
The remaining topological features on the segmentation \(B\) manifest as disconnected components and cycles on the initial skeleton \(S_{0}\). Figure 4B shows an example of a cycle caused by two touching roots. To remove these features, we take the largest component of \(S_{0}\), denoted by \(S_{1}\), and remove cycles in \(S_{1}\) using a graph-based approach. We first construct a graph \(G = \left\{ {N,A} \right\}\) with nodes \(N\) and arcs \(A\) as follows: each node represents either a junction (a vertex incident to three or more skeleton edges) or a branch (a sequence of skeleton edges between two junctions or between a junction and an end vertex) on \(S_{1}\), and each arc connects two nodes representing a junction and a branch at that junction (see Fig. 5A, B). Next, we extract a spanning tree of \(G\), which is a subset of arcs \(A^{\prime} \subseteq A\) that connect all nodes in \(N\) and have no cycles (see Fig. 5C). The final skeleton, \(S\), is then obtained from \(A^{\prime}\) as follows: for each arc \(a \in A\) that does not exist in \(A^{\prime}\), the pair of skeleton branch and junction presented by \(a\) is “detached” from each other (see Fig. 5D). Note that cycle-removal using this approach prevents a skeleton branch from being broken in the middle.
To find a spanning tree that best captures the structure of the root, we associate each arc \(a \in A\) with a positive weight \(w\left( a \right)\) and compute the spanning tree \(A^{\prime}\) with the minimal total arc weights. \(A^{\prime}\) is known as the minimal spanning tree (MST) of \(G\), and it can be computed efficiently using standard algorithms such as Prim’s or Kruskal’s. The weight \(w\left( a \right)\) measures the likelihood that the pair of skeleton junction and branch represented by \(a\) should be detached. It is defined as:
$$w\left( a \right) = w_{angle} \left( a \right) + \lambda w_{dis} \left( a \right)$$
(2)
The first term \(w_{angle} \left( a \right)\) measures the continuity of skeleton orientation at the junction. Let \(j, b\) be the skeleton junction and branch represented by the arc \(a\), \(\Omega\) be the set of all branches at \(j\), and \(\vec{b}\) be the unit vector representing the tangent direction of the branch \(b\) oriented towards \(j\). The angle term is defined as:
$$w_{angle} \left( a \right) = \mathop {min}\limits_{b^{\prime} \in \Omega } \left( {1 + \vec{b} \cdot \overrightarrow {b^{\prime}} } \right)$$
(3)
This term reaches the minimum of 0 if there is some other branch \(b^{\prime}\) at junction \(j\) that has the same tangent direction as \(b\). The second term \(w_{dis} \left( a \right)\) measures the distance from the junction \(j\) represented by the arc \(a\) to the root stem. This term exists to discourage detaching branches representing nodal roots from the stem, which would have a great impact on the branching hierarchy. We use the algorithm reported in [23] to identify the stem as the longest non-branching sequence of skeleton edges on the skeleton \(S_{1}\) whose thickness measure is above a given threshold \(\in\), which was set in our experiments to be 0.15 multiplied by the maximum thickness among any vertex in \(S_{1}\). We then compute \(w_{dis} \left( a \right)\) as the shortest distance on the skeleton between junction \(j\) and any vertex on the stem. Finally, \(\lambda\) is a balancing parameter between the angle and distance terms. We found that a small value such as 0.05 best satisfies the need for avoiding detachment of nodal roots without overpowering the angle term \(w_{angle} \left( a \right)\). Figure 4C shows the skeleton obtained from the MST of the weighted graph. Observe that the touching between two roots in the highlighted view are correctly detached.
Inferring hierarchy
Given the cycle-free skeleton \(S\) obtained from the previous step, we next label each vertex of the skeleton as either part of the stem, a nodal root, or a lateral root of a specific order. This results in a branching hierarchy denoted as \(H\). The hierarchy plays a key role in the final step to obtain traits concerning each type of roots.
We first identify the path of skeleton edges capturing the stem using the heuristic of [23], as already done in the previous step. This path is called a stem path. We then consider all skeleton edges within a cylindrical region around the stem path, where the radius of the cylinder varies along the stem path and is set to be 1.6 times the thickness measure stored on the skeleton vertices on the path. These skeleton edges make up the stem region. The factor 1.6 was chosen empirically to ensure that all skeleton junctions representing the beginning of nodal roots are included in the stem region. Both stem path and stem region are labelled as hierarchy level 0.
To identify nodal roots and lateral roots of different orders, we make the following two assumptions, which are shared by DynamicRoots and have been supported by studies of root systems [7, 19, 24]. First, roots higher up in the hierarchy are generally longer. For example, nodal roots are generally longer than 1st-order lateral roots, which in turn are generally longer than 2nd-order lateral roots, and so on. Second, the maximum number of hierarchy levels in a root system is generally kept low. With these assumptions, we developed a heuristic that minimizes the depth of the hierarchy while favoring longer roots higher up in the hierarchy.
Our heuristic proceeds in two stages, a bottom-up traversal of the skeleton and then a top-down traversal. They are illustrated in Fig. 6 using a cartoon example. We start with a skeleton \(S\) labelled only by the stem region (Fig. 6A). Recall that a branch is a sequence of skeleton edges between two junctions or between a junction and an end vertex. Since \(S\) has no cycles, it is a “tree”. We consider the stem region as the “root” of this tree, and this induces a partial ordering on the skeleton such that each junction (outside the stem region) is incident to exactly one parent branch and two or more children branches. The first stage of the heuristic computes, at each skeleton junction, the association between the parent branch with one of the children branches as the continuation of the same root (see arrows in Fig. 6B). The association is computed by visiting the skeleton branches from the leaves of the skeleton tree towards the stem and updating a depth d(b) (numbers in Fig. 6B) and a distance l(b) at each visited branch \(b\), as follows. First, for each branch \(b\) incident to an end vertex, we set d(b) = 0 and l(b) as the length of \(b\). We then iteratively visit parent branches whose children branches have already been visited. For a parent branch b whose children are b1, …., bn, we associate \(b\) with the child \({b}_{i}\) that has the maximal depth \(d\left({b}_{i}\right)\). If multiple children have the same maximal depth, \(b\) is associated with the \({b}_{i}\) with maximal length \(l\left({b}_{i}\right)\). We then set d(b) = d(bi) + 1 and l(b) to be l(bi) plus the length of \(b\). In the second stage, we visit all branches from the stem to the leaves and assign the hierarchy levels. We assign each branch attached to the stem region a hierarchy level of 1 (i.e., nodal roots). For each parent branch assigned with level \(k\), we assign level \(k\) to the child branch associated with the parent (computed from the first stage) and level \(k+1\) to all other children branches. The resulting hierarchy labelling is shown in Fig. 6C.
Computing traits
Given the skeleton \(S\) and the hierarchy labelling H, TopoRoot computes a suite of coarse-grained and fine-grained traits. Like existing works (e.g. [14]), we compute global traits which are aggregated over all roots regardless of their location in the hierarchy, including the total root length, number of roots, and average root length. For fine-grained traits, for each hierarchy level (e.g., nodal roots, 1st-order lateral roots, 2nd-order lateral roots, etc.), we compute the root count, average and total root length, average root tortuosity, average root thickness, average number of children, and the average emergence, midpoint, and tip angle. We also report the length and thickness of the stem. Some of these traits, such as stem length and per-level angle traits, have not been previously reported by existing tools (including DynamicRoots [16]). Details on how each of these traits is computed can be found in Additional file 1: Table S1.