Table 1 Assembly statistics

From: Next generation sequencing and de novo transcriptomics to study gene evolution

| Species | Read length | Raw reads | Clean reads | Assembler | N50 | Contig count |
|---|---|---|---|---|---|---|
| H. annuus | 101 | 2 x 21,404,702 | 40,742,686 | CLC (ws60, paired) | 482 | 59,530 |
| A. montana | 101 | 2 x 14,458,043 | 27,516,042 | CLC (ws60, paired) | 485 | 45,194 |
| Z. haageana | 101 | 2 x 38,382,090 | 64,649,107 | CLC (autows, non-paired) | 308 | 205,324 |
| Z. haageana | 101 | 2 x 38,382,090 | 64,649,107 | CLC (ws60, paired) | 435 | 80,460 |
| Z. haageana | 101 | 2 x 38,382,090 | 72,756,408 | CLC (ws60, paired) | 629 | 40,764 |
| H. helianthoides | 101 | 2 x 109,627,594 | 169,128,716 | CLC (autows, non-paired) | 305 | 443,800 |
| H. helianthoides | 101 | 2 x 109,627,594 | 169,128,716 | CLC (ws60, paired) | 497 | 151,272 |
| H. helianthoides | 101 | 2 x 109,627,594 | 200,130,791 | CLC (ws60, paired) | 496 | 162,563 |

  1. Clean reads were assembled using two methods: automatic word size (autows, 23) with non-paired reads, and word size 60 (ws60) with paired reads. For the Z. haageana and H. helianthoides datasets, clean read counts are shown for quality filtering at thresholds (q) of 30 and 22. N50 is the contig length such that contigs of this length or longer make up at least 50% of the total assembly length.
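
The N50 definition in the footnote can be illustrated with a short calculation. The following is a minimal sketch in Python, not the procedure used by the CLC assembler; the function name `n50` and the example contig lengths are hypothetical and chosen only for illustration.

```python
def n50(contig_lengths):
    """Return the N50 of an iterable of contig lengths (0 if empty).

    Illustrative helper, not part of any assembler: N50 is the length L
    such that contigs of length >= L together cover at least half of the
    total assembly length.
    """
    lengths = sorted(contig_lengths, reverse=True)  # longest contig first
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half_total:
            return length
    return 0


# Hypothetical contig lengths, not taken from the table.
example_lengths = [100, 200, 300, 400, 500]
print(n50(example_lengths))  # prints 400
```

In this example the total length is 1,500, so the two longest contigs (500 + 400 = 900) already cover more than half of the assembly, and the N50 is 400. Many short contigs pull the N50 down, which is consistent with the lower N50 values in the rows with the highest contig counts.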