
Table 2 Summary of the positive and negative datasets

From: PredCRG: A computational method for recognition of plant circadian genes by employing support vector machine with Laplace kernel

| Sub-dataset | No. of positive sequences | No. of negative sequences | Length category |
|---|---|---|---|
| Q1 | 1588 | 2045 | P1, N1 |
| Q2 | 1596 | 2047 | P2, N2 |
| Q3 | 1593 | 2050 | P3, N3 |
| Q4 | 1365 | 1499 | P4, N4 |
| Total (full dataset) | 6142 | 7641 | - |

  1. The full dataset of positive and negative classes is partitioned into four sub-datasets, i.e., Q1, Q2, Q3 and Q4. The partitioning is based on homogeneity of sequence length. For the Q1 sub-dataset, the sequence lengths of the positive and negative classes fall in categories P1 and N1 respectively, where P1 corresponds to sequences of 39 to 221 amino acids and N1 to sequences of 43 to 407 amino acids. The other sub-datasets are read in the same way
  2. P1: 39 to 221 amino acids; P2: 221 to 363 amino acids; P3: 363 to 538 amino acids; P4: 538 to 1000 amino acids; N1: 43 to 407 amino acids; N2: 407 to 485 amino acids; N3: 485 to 607 amino acids; N4: 607 to 1000 amino acids
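
The length-based partitioning described in the footnotes can be sketched as a simple lookup. This is a minimal illustration, not the authors' code: the function name `length_category` and the constants are hypothetical, and because adjacent categories share a boundary value in the footnote (e.g. 221 appears in both P1 and P2), the sketch assumes that a length on a shared boundary belongs to the lower category.

```python
from bisect import bisect_left

# Upper length limits (amino acids) of the four categories, from the footnote.
POSITIVE_BOUNDS = [221, 363, 538, 1000]  # P1..P4
NEGATIVE_BOUNDS = [407, 485, 607, 1000]  # N1..N4
# Smallest sequence length reported for each class.
MIN_LENGTH = {"P": 39, "N": 43}


def length_category(length, label):
    """Return the length category (e.g. 'P2' or 'N3') for a sequence.

    `label` is 'P' for the positive class and 'N' for the negative class.
    """
    if label not in MIN_LENGTH:
        raise ValueError("label must be 'P' or 'N'")
    bounds = POSITIVE_BOUNDS if label == "P" else NEGATIVE_BOUNDS
    if not MIN_LENGTH[label] <= length <= bounds[-1]:
        raise ValueError(f"length {length} is outside the reported range")
    # Assumption: a length equal to a shared boundary goes to the lower
    # category, which is what bisect_left yields on the upper limits.
    return f"{label}{bisect_left(bounds, length) + 1}"


print(length_category(300, "P"))  # a 300-residue positive sequence
print(length_category(450, "N"))  # a 450-residue negative sequence
```

Under this assumption, the category index also identifies the sub-dataset: a sequence in P2 or N2 belongs to Q2.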