Table 2 Details of the core genes used to assess the quality of de novo transcriptome assemblies

From: Next generation sequencing and de novo transcriptomics to study gene evolution

Gene name                                   GenBank ID    Length (amino acids)   Average coverage
Late embryogenesis abundant (LEA)           X59700.1      104                    174,360
Oleosin (OLE)                               X78679.1      183                    9,894
Aspartic proteinase (AP)                    AB025359.2    509                    1,561
Pathogenesis related (PR)                   AB091075.1    158                    503
Cysteine protease-1 (CP-1)                  AB109186.1    461                    126
Serine/threonine protein kinase (PK)        AB090881.1    439                    50
  1. The level of transcription in sunflower ranges from high (LEA) to low (PK). The GenBank ID is shown for the amino acid sequence used as the tBLASTn query against each de novo transcriptome. The average coverage of each gene in a transcriptome assembled with word size 60 and the paired method indicates the relative abundance of its transcripts. This control set was found to be appropriate for the Asteraceae. For reference, the most closely related sequences in Arabidopsis thaliana are LEA4-5 (At5g06760), an OLEOSIN family member (At3g01570), APA1 (At1g11910), MLP423 (At1g24020), RD21B (At5g43060), and a protein kinase (PK) superfamily member (At5g15080).
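
To make the abundance range in the table concrete, the coverage column can be expressed as a fold difference over the least abundant gene. The following is a minimal sketch for illustration only: the coverage values are copied from the table above, while the `core_genes` dictionary, the `fold_over_lowest` helper, and the choice of the lowest-coverage gene as the baseline are assumptions made here and are not part of the original analysis.

```python
# Average coverage values copied from Table 2
# (word size 60, paired method assembly).
core_genes = {
    "LEA": 174_360,
    "OLE": 9_894,
    "AP": 1_561,
    "PR": 503,
    "CP-1": 126,
    "PK": 50,
}


def fold_over_lowest(coverage):
    """Return each gene's coverage divided by the lowest coverage in the set.

    Hypothetical helper: the baseline choice is an illustrative assumption.
    """
    baseline = min(coverage.values())
    return {gene: value / baseline for gene, value in coverage.items()}


fold = fold_over_lowest(core_genes)

# Print genes from most to least abundant with their fold difference.
for gene, ratio in sorted(fold.items(), key=lambda item: item[1], reverse=True):
    print(f"{gene:5s} {core_genes[gene]:>8,d} {ratio:>9.1f}x")
```

On these values the control set spans roughly a 3,500-fold range in average coverage between LEA and PK, which is why it can probe assembly quality for both highly and weakly transcribed genes.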