PredCRG: A computational method for recognition of plant circadian genes by employing support vector machine with Laplace kernel

Table 2 Summary of the positive and negative datasets

Sub-dataset	#Positive sequences	#Negative sequences	Length category
Q1	1588	2045	P1, N1
Q2	1596	2047	P2, N2
Q3	1593	2050	P3, N3
Q4	1365	1499	P4, N4
Total (Full dataset)	6142	7641	-

The full dataset of positive and negative classes was partitioned into four sub-datasets, i.e., Q1, Q2, Q3 and Q4. The partitioning was based on homogeneity of sequence length. For the Q1 sub-dataset, the sequence lengths for the positive and negative classes are P1 and N1, respectively, where P1 corresponds to a sequence length of 39 to 221 amino acids and N1 to a sequence length of 43 to 407 amino acids. The length categories of the other sub-datasets are read in the same way.
P1: 39 to 221 amino acids; P2: 221 to 363 amino acids; P3: 363 to 538 amino acids; P4: 538 to 1000 amino acids; N1: 43 to 407 amino acids; N2: 407 to 485 amino acids; N3: 485 to 607 amino acids; N4: 607 to 1000 amino acids
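
As an illustration of the length-based partitioning described above, the following minimal Python sketch maps a sequence length to its length category and sub-dataset. It is not the authors' code: the function names `length_category` and `sub_dataset` are hypothetical, and because adjacent ranges in the footnote share a boundary value (e.g. 221 appears in both P1 and P2), the sketch assumes that a boundary length falls into the lower category.

```python
from bisect import bisect_left

# Upper length bounds (amino acids) of categories 1-4, copied from the footnote.
POSITIVE_UPPER = [221, 363, 538, 1000]
NEGATIVE_UPPER = [407, 485, 607, 1000]
# Shortest sequence length reported for each class.
POSITIVE_MIN, NEGATIVE_MIN = 39, 43


def length_category(length, positive):
    """Return 'P1'..'P4' or 'N1'..'N4', or None if the length is out of range."""
    uppers = POSITIVE_UPPER if positive else NEGATIVE_UPPER
    minimum = POSITIVE_MIN if positive else NEGATIVE_MIN
    if length < minimum or length > uppers[-1]:
        return None
    # Assumption: a length equal to a shared boundary goes to the lower category,
    # which is what bisect_left yields on the list of upper bounds.
    index = bisect_left(uppers, length)
    return f"{'P' if positive else 'N'}{index + 1}"


def sub_dataset(length, positive):
    """Return 'Q1'..'Q4' for a sequence, or None if the length is out of range."""
    category = length_category(length, positive)
    return None if category is None else "Q" + category[1]
```

For example, `sub_dataset(300, positive=True)` places a 300-residue positive sequence in Q2, while a negative sequence of the same length falls in Q1, because the two classes use different length ranges.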

ISSN: 1746-4811