The DAWGPAWS pipeline for the annotation of genes and transposable elements in plant genomes

Estill, James C; Bennetzen, Jeffrey L

doi:10.1186/1746-4811-5-8

Plant Methods

Table 3 Additional helper scripts included in the DAWGPAWS package.

DAWGPAWS Script	Purpose
cnv_gff2game.pl	Converts GFF files to the game.xml format.
cnv_game2gff3.pl	Converts game.xml files to the GFF3 format.
batch_hardmask.pl	Given a directory of lowercase-masked sequence files, this program replaces lowercase residues with an N or X to indicate masking.
dir_merge.pl	Given annotation results scattered across multiple directories, this program can merge the results into subdirectories in a single parent directory.
vennseq.pl	Given GFF annotation results from multiple methods, this program generates an Euler diagram of these features using the VennMaster program [55].
batch_findgaps.pl	This program will annotate gaps in the query sequences in the input directory.
clust_write_shell.pl	This program writes shell scripts to run DAWGPAWS in a cluster environment running the Platform LSF queuing system.
cnv_seq2dir.pl	Given a FASTA file with multiple sequence records, this program generates a separate FASTA file for each sequence record. The sequence files produced are named using the sequence ID in the FASTA header in the input file.
fasta_merge.pl	This program merges all FASTA files in a directory into a single FASTA file.
fasta_shorten.pl	This program shortens the FASTA header by limiting the header length or by splitting the header at a delimiting character. Some annotation programs limit the length of the FASTA header they accept, and this program allows input files to meet this limitation.
fetch_tenest.pl	Fetches multiple results from the PlantGDB TEnest server and converts the results to GFF.
gff_seg.pl	Given a GFF file that contains point or segment data, this program extracts segments with score values that exceed a threshold value.
ltrstruc_prep.pl	Because the LTR_STRUC program runs only in the Windows environment, this program converts FASTA sequences from UNIX to DOS line endings and generates the file names and flist file required by LTR_STRUC.
seq_oligiocount.pl	This program generates a GFF file that reports the number of times an oligomer in the genomic contig occurs in a reference shotgun sequence database.
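
As an illustration of two of the simpler operations described in the table, the following minimal Python sketch shows hard masking (as described for batch_hardmask.pl) and header shortening (as described for fasta_shorten.pl). It is not the DAWGPAWS code, which is distributed as Perl scripts; the function names hard_mask, hard_mask_fasta, and shorten_header and their parameters are hypothetical and chosen only for this example.

```python
def hard_mask(sequence, mask_char="N"):
    """Replace every lowercase (soft-masked) residue with mask_char.

    Uppercase residues and non-letter characters such as newlines
    are left unchanged.
    """
    return "".join(mask_char if base.islower() else base for base in sequence)


def hard_mask_fasta(lines, mask_char="N"):
    """Yield FASTA lines with sequence lines hard-masked.

    Header lines (starting with '>') pass through untouched so that
    lowercase letters in sequence IDs are not masked.
    """
    for line in lines:
        if line.startswith(">"):
            yield line
        else:
            yield hard_mask(line, mask_char)


def shorten_header(header, max_len=None, delimiter=None):
    """Shorten a FASTA header line (hypothetical helper).

    If delimiter is given, keep only the text before its first
    occurrence; if max_len is given, truncate the result to that many
    characters. The leading '>' is not counted toward max_len.
    """
    body = header[1:] if header.startswith(">") else header
    if delimiter is not None:
        body = body.split(delimiter, 1)[0]
    if max_len is not None:
        body = body[:max_len]
    return ">" + body


if __name__ == "__main__":
    record = [">contig_0001 example record", "ACGTacgtACGT"]
    for out_line in hard_mask_fasta(record, mask_char="X"):
        print(out_line)
    print(shorten_header(">contig_0001 example record", delimiter=" "))
```

Choosing X rather than N as the mask character may matter for downstream tools that treat the two symbols differently, which is presumably why the script description mentions both.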

ISSN: 1746-4811
