Table 2 Summary of the positive and negative datasets

From: PredCRG: A computational method for recognition of plant circadian genes by employing support vector machine with Laplace kernel

| Sub-dataset | #Positive sequences | #Negative sequences | Length category |
|---|---|---|---|
| Q1 | 1588 | 2045 | P1, N1 |
| Q2 | 1596 | 2047 | P2, N2 |
| Q3 | 1593 | 2050 | P3, N3 |
| Q4 | 1365 | 1499 | P4, N4 |
| Total (Full dataset) | 6142 | 7641 | - |

  1. The full dataset of positive and negative classes was partitioned into four sub-datasets, i.e., Q1, Q2, Q3 and Q4. The partitioning was based on homogeneity of sequence length. For the Q1 sub-dataset, the length categories of the positive and negative classes are P1 and N1 respectively, where P1 corresponds to sequences of 39 to 221 amino acids and N1 to sequences of 43 to 407 amino acids. The length categories of the other sub-datasets are read in the same way.
  2. P1: 39 to 221 amino acids; P2: 221 to 363 amino acids; P3: 363 to 538 amino acids; P4: 538 to 1000 amino acids; N1: 43 to 407 amino acids; N2: 407 to 485 amino acids; N3: 485 to 607 amino acids; N4: 607 to 1000 amino acids
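
The length-based partitioning described in the footnotes can be illustrated with a minimal Python sketch. It is not the authors' code: the function name `assign_sub_dataset` and the constants are hypothetical, and only the length boundaries come from the footnotes. The footnotes list shared boundary values (for example, 221 appears in both P1 and P2) without saying which sub-dataset receives such a sequence, so the sketch assumes the boundary length belongs to the lower category.

```python
from bisect import bisect_left

# Upper length bounds (amino acids) of categories 1..4, from the footnotes.
POSITIVE_UPPER = [221, 363, 538, 1000]   # P1..P4
NEGATIVE_UPPER = [407, 485, 607, 1000]   # N1..N4
# Shortest lengths reported for each class (lower ends of P1 and N1).
POSITIVE_MIN = 39
NEGATIVE_MIN = 43


def assign_sub_dataset(length: int, positive: bool) -> str:
    """Return 'Q1'..'Q4' for a sequence of the given length (hypothetical helper)."""
    upper = POSITIVE_UPPER if positive else NEGATIVE_UPPER
    minimum = POSITIVE_MIN if positive else NEGATIVE_MIN
    if length < minimum or length > upper[-1]:
        raise ValueError(f"length {length} is outside the tabulated range")
    # ASSUMPTION: a length equal to a shared boundary (e.g. 221) is placed
    # in the lower category; the source does not specify this.
    return f"Q{bisect_left(upper, length) + 1}"


if __name__ == "__main__":
    print(assign_sub_dataset(150, positive=True))
    print(assign_sub_dataset(450, positive=False))
```

Under the stated boundary assumption, a positive sequence of 150 amino acids falls in Q1 and a negative sequence of 450 amino acids falls in Q2.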