
Table 3 Additional helper scripts included in the DAWGPAWS package.

From: The DAWGPAWS pipeline for the annotation of genes and transposable elements in plant genomes

DAWGPAWS Script Purpose
cnv_gff2game.pl Converts GFF files to the game.xml format.
cnv_game2gff3.pl Converts game.xml files to the GFF3 format.
batch_hardmask.pl Given a directory of lowercase-masked (soft-masked) sequence files, this program replaces lowercase residues with N or X to indicate masking.
dir_merge.pl Given annotation results scattered across multiple directories, this program can merge the results into subdirectories in a single parent directory.
vennseq.pl Given GFF annotation results from multiple methods, this program generates an Euler diagram of these features using the VennMaster program [55].
batch_findgaps.pl This program will annotate gaps in the query sequences in the input directory.
clust_write_shell.pl This program writes shell scripts to run DAWGPAWS in a cluster environment running the Platform LSF queuing system.
cnv_seq2dir.pl Given a FASTA file with multiple sequence records, this program generates a separate FASTA file for each record. Each output file is named using the sequence ID from the corresponding FASTA header in the input file.
fasta_merge.pl This program merges all FASTA files in a directory into a single FASTA file.
fasta_shorten.pl This program shortens the FASTA header by limiting the header length or by splitting the header at a delimiting character. Some annotation programs limit the length of the FASTA header they accept, and this program allows input files to meet that limit.
fetch_tenest.pl Fetches multiple results from the PlantGDB TEnest server and converts the results to GFF.
gff_seg.pl Given a GFF file that contains point or segment data, this program extracts segments with scores that exceed a threshold value.
ltrstruc_prep.pl Because the LTR_STRUC program runs only under Windows, this program converts FASTA files from UNIX to DOS line endings and generates the name and flist files required by LTR_STRUC.
seq_oligiocount.pl This program generates a GFF file that reports the number of times each oligomer in the genomic contig occurs in a reference shotgun sequence database.
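
Several of the helper tasks listed in this table are simple string operations on sequence data. The sketch below illustrates three of them in Python: hard masking (as described for batch_hardmask.pl), gap detection (batch_findgaps.pl), and header shortening (fasta_shorten.pl). It is not the DAWGPAWS code, which is written in Perl; the function names, parameters, and defaults are hypothetical, and treating a gap as a run of N characters is an assumption made for illustration.

```python
from typing import List, Optional, Tuple


def hard_mask(seq: str, mask_char: str = "N") -> str:
    """Replace every lowercase (soft-masked) residue with mask_char."""
    return "".join(mask_char if c.islower() else c for c in seq)


def find_gaps(seq: str, min_len: int = 1) -> List[Tuple[int, int]]:
    """Return 1-based, inclusive (start, end) coordinates of runs of N/n.

    Assumption: a gap is represented as a run of N characters.
    """
    gaps = []
    start = None
    for i, c in enumerate(seq):
        if c in "Nn":
            if start is None:
                start = i  # a new run of N begins here (0-based)
        elif start is not None:
            if i - start >= min_len:
                gaps.append((start + 1, i))  # convert to 1-based inclusive
            start = None
    # Close a run that extends to the end of the sequence.
    if start is not None and len(seq) - start >= min_len:
        gaps.append((start + 1, len(seq)))
    return gaps


def shorten_header(header: str, max_len: int = 20,
                   delim: Optional[str] = None) -> str:
    """Shorten a FASTA header line.

    If delim is given, keep only the text before its first occurrence;
    the result is then truncated to at most max_len characters.
    """
    text = header.lstrip(">").strip()
    if delim is not None:
        text = text.split(delim, 1)[0]
    return ">" + text[:max_len]
```

For example, `hard_mask("ACGTacgt", "X")` yields `ACGTXXXX`, and `find_gaps("ACGTNNNNACNNT")` reports the two N runs at positions 5 to 8 and 11 to 12. The 1-based inclusive coordinates match the convention used by GFF, so the gap tuples could be written directly into the start and end columns of a GFF record.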