Figure 4 | Plant Methods

Figure 4

From: Analysing complex Triticeae genomes – concepts and strategies

GenomeZipper workflow. The GenomeZipper approach can be divided into three steps that can be run independently: repeat masking (A), detection of syntenic conserved blocks (B), and the ‘zipping’/integration (C) of all data sets into a virtual linear gene order model. A) Repetitive sequences (grey boxes) were filtered out of the NGS data (orange boxes) using Vmatch and the MIPS REdat Poaceae v9.0 library. Only sequences free of repetitive elements, or containing a repeat-free region of at least 100 base pairs, were considered in the subsequent analysis steps. B) Syntenic conserved regions between the target and the reference genome(s) were determined using BLASTX and a sliding window approach. The sequences were aligned against the reference genome(s), and the highly conserved genes (coloured boxes) were extracted and used to construct the gene map. C) The virtual gene map is constructed by integrating the syntenic conserved genes of one or more reference genomes and the NGS survey sequences along a backbone built from a genetic marker map of the target organism or a very closely related one. In a first step, all conserved genes with a bi-directional BLAST hit to a gene-based marker are integrated into the zipper backbone. The remaining conserved genes (depicted in the clouds) are incorporated indirectly, using the order deduced from the corresponding reference genome. The NGS data (orange boxes) are anchored to the ordered map by their first-best BLAST hits.
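
The integration in step C can be illustrated with a minimal Python sketch. It is not the GenomeZipper implementation: it assumes a single reference genome, takes the bi-directional BLAST hits as already computed, and uses hypothetical names (`zip_genes`, `marker_order`, `marker_to_gene`, `reference_order`) and a simplified placement rule in which each unanchored gene follows the nearest preceding anchored gene in the reference order.

```python
from typing import Dict, List


def zip_genes(
    marker_order: List[str],
    marker_to_gene: Dict[str, str],
    reference_order: List[str],
) -> List[str]:
    """Build a virtual gene order from a marker backbone (simplified sketch).

    marker_order:    marker IDs in genetic-map order (the backbone).
    marker_to_gene:  marker ID -> reference gene with a bi-directional hit
                     (assumed to be precomputed; hypothetical input).
    reference_order: conserved genes in reference-genome order.
    """
    # Step 1: genes anchored directly by a marker, in backbone order.
    anchored: List[str] = []
    for marker in marker_order:
        gene = marker_to_gene.get(marker)
        if gene is not None and gene not in anchored:
            anchored.append(gene)
    anchored_set = set(anchored)

    # Step 2: walk the reference order and attach each unanchored gene
    # to the closest preceding anchored gene (assumed placement rule).
    followers: Dict[str, List[str]] = {gene: [] for gene in anchored}
    leading: List[str] = []  # unanchored genes before the first anchor
    current = None
    for gene in reference_order:
        if gene in anchored_set:
            current = gene
        elif current is None:
            leading.append(gene)
        else:
            followers[current].append(gene)

    # Step 3: emit anchors in backbone order, each followed by the genes
    # that were placed indirectly via the reference order.
    result: List[str] = list(leading)
    for gene in anchored:
        result.append(gene)
        result.extend(followers[gene])
    return result


if __name__ == "__main__":
    reference = ["gA", "gB", "gC", "gD", "gE"]
    # The marker map places gD before gA, i.e. a rearrangement
    # relative to the reference genome.
    print(zip_genes(["m1", "m2"], {"m1": "gD", "m2": "gA"}, reference))
```

In the sketch the marker map decides the order of the anchored genes, while the reference genome only decides where the remaining genes sit relative to their nearest anchor, so a rearrangement supported by the markers carries the neighbouring unanchored genes with it.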
