Plant materials and treatments
Cowpea (V. unguiculata L. Walp) breeding lines IT96D-602 and Tvu7778 were provided by the Dr BB Singh of the International Institute of Tropical Agriculture (IITA) . Seeds were germinated and plants were grown in a glasshouse under 11 h day length, 28°C and 18°C day and night temperatures, respectively, and watering three times weekly. At six weeks, five replicate plants of each variety were divided into two groups. One group was subjected to drought stress by withholding water, and the other group was kept to the control watering scheme.
RNA was isolated from cowpea leaves using Tri-reagent (Sigma) and Polyvinyl pyrrolidone (PVP) (Ambion's Plant RNA isolation aid). Contaminating genomic DNA was removed with the Turbo DNA-free kit (Ambion) and the RNA cleaned up with the Plant RNeasy kit (Qiagen, Hilden, Germany).
Construction of cDNA library using SSH
Differential expression analysis by means of SSH  was employed to prepare a cDNA drought expression library for cowpea. Messenger RNA (mRNA) was isolated from 50 μg pools of stressed IT96D-602 RNA (9 and 12 days without water) (treated) and unstressed Tvu7778 RNA (9 and 12 days) (control) using an Oligotex mRNA purification kit (Qiagen). cDNA was synthesised from each mRNA using the cDNA synthesis system (Roche Diagnostics, Basel, Switzerland) to be used as unsubtracted testers and unsubtracted drivers in SSH . Subtractive hybridisation was performed on Rsa I (Roche Diagnostics) -digested tester and driver cDNA fragments using the PCR-Select cDNA subtraction kit (BD Biosciences Clontech, Palo Alto, CA), as previously described . A forward subtraction was performed by using the treated sample as tester (and control as driver), and a separate reverse subtraction was performed by using the control cDNA as tester (and treated cDNA as driver). After subtraction the products were amplified by a primary PCR and a nested secondary suppression PCR to generate differentially expressed cDNA fragments (termed ST (subtracted treated) and SC (subtracted control) for the forward and reverse libraries, respectively). Replicate PCR reactions were pooled, size fractionated and cloned into the pGEM-T Easy cloning vector and transformed into Escherichia coli JM109 following the manufacturers' instructions (Promega, Madison, WI). Transformed colonies were selected by blue-white selection on 100 μg/ml ampicillin LB-agar selection media (spread with X-Gal and IPTG) and stored as 25% glycerol stocks at -70°C in sterile 96-well culture plates (Corning, NY). In addition, unsubtracted PCR products from the treated cDNA (drought stressed IT96D-602) (termed UT) and control cDNA (control Tvu7778) (termed UC) were also prepared to be used for SSHscreen analysis as described in .
Fabrication of SSH library on glass slide array
Inserts of the cowpea drought expression cDNA library were amplified with PCR directly from overnight bacterial cultures in 96-well format (Thermo-Fast, ABGene, Epsom, UK) in 100 μl reactions with 1 U Biotaq DNA polymerase (Bioline) and the SP6 and T7 primers (Additional file 6). The PCR plate was sealed with a silicon mat (Corning). Reactions were incubated in a PTC-100 Thermocycler (MJ Research) at 94°C for 5 min; 30 cycles of (94°C for 30 s, 50°C for 30 s and 72°C for 1 min); and 72°C for 5 min.
The PCR products were purified with Montage PCR purification plates on a vacuum manifold (Microsep) and resuspended in 50 μl SDW. The suspensions were transferred to 96-well storage plates, covered with well caps (Nunc, Roskilde, Denmark) and stored at -20°C. The purified PCR products were dried down in a vacuum centrifuge at 45°C, resuspended in 50% dimethyl sulfoxide (DMSO), transferred to 384-well spotting plates and stored at -70°C until microarray spotting.
The control genes gfp (717 bp fragment in pGEM-T Easy, positions 1603-2319 of GenBank accession number AF078810), globin (human beta-globin; 474 bp fragment in pBluescriptSK, positions 50-523 of NM_000518) and nptII (812 bp fragment in pGEM-T Easy, positions 142-953 of V00618) were purchased from the Nottingham Arabidopsis stock centre [NASC, http://arabidopsis.org.uk]. They were transformed into E. coli JM109 (Promega). An its clone in pGEM-T Easy (193 bp fragment from the internal transcribed spacer 2 of the rRNA genes from Leptographium elegans) was also used as a control gene. It matches to positions 268-458 of AF343675.1. Plasmids were isolated from cultures using the Qiaspin miniprep plasmid isolation kit (Qiagen). PCR products of the four control genes were prepared using the T7 and SP6 primers (PCR product sizes: gfp (893 bp), nptII (988 bp), its (369 bp)) or the T7 and M13R primers for globin (677 bp). Montage purified PCR products of twelve 100 μl PCR reactions each were pooled, concentrated and transferred to 12 wells each of a 384-well spotting plate. An equal volume of DMSO was added so that the final concentrations in 50% DMSO ranged from 70-100 ng/μl. Five two-fold serial dilutions were also prepared for each PCR fragment (gfp: 180-11.25 ng/μl;globin: 100-6.25 ng/μl;nptII: 150-9.375 ng/μl; and its: 130-8.125 ng/μl), transferred to an additional 10 wells per fragment, an equal volume of DMSO added, and spotted on the glass slides.
Glass slides (Corning GAPS II) were spotted with the cowpea drought expression library (4160 clones in total from the forward and reverse libraries) and controls using the Array Spotter Generation III (Molecular Dynamics, Sunnyvale, CA) at the University of Pretoria, Pretoria, South Africa http://microarray.up.ac.za. Each sample spot was duplicated on the slide in replicate blocks on either side of the slide, and therefore replicates are not spatially close together. The slides were allowed to dry overnight in the protective atmosphere of the spotter, after which the DNA was cross-linked under ultraviolet (UV) light for 3 min. The slides were stored in a desiccator covered in foil at room temperature.
Screening SSH library on microarrays
SSH cDNA fragments (ST, SC, UT and UC), purified by PCR Minelute cleanup kit (Qiagen), were digested with RsaI (10 U per microgram DNA) in the appropriate buffer overnight at 37°C. The fragments were separated from the adaptor fragments by electrophoresis on a 1.5% low melting point agarose gel (Seaplaque, FMC Bioproducts) in 0.5× TAE and purified from the gel using the Qiaquick gel extraction kit (Qiagen).
The control fragments were excised from their plasmids using restriction digestion to exclude any T7 and SP6 primer binding sites (Kpn I/Xba I for globin (product of 548 bp); Nco I/Pst I for gfp (768 bp); Eco RI for nptII (830 bp) and its (211 bp)). Restriction fragments were purified with the Qiaquick gel extraction kit (Qiagen). Each target sample of SSH cDNA fragments (200 ng) were spiked with equal amounts of a control fragment pool made up of different quantities of four control fragments (45 ng globin, 45 ng its, 4.5 ng nptII and 0.45 ng gfp) for within-slide normalization. Spiking with equal amounts of fifteen- or three-fold dilutions of the control fragment pool were tested and also gave sufficient hybridization for within-slide normalization (data not shown).
Targets were labelled by direct Cy-dUTP incorporation by Klenow enzyme (Fermentas, Vilnius, Lithuania). Each SSH fragment sample was labelled with both dyes (Cy3 and Cy5) for a dye-swap experiment of each slide. The protocols and data analysis techniques described in  were followed, with some modifications. DNA to be labelled, in a volume of 12 μl, was denatured at 95°C for 5 min and placed on ice. The following were added to the pairs of denatured DNA samples to yield a total reaction volume of 20 μl: 2 μl of 10× Klenow buffer (Fermentas); 2 μl 10× Hexanucleotide mix (Roche Diagnostics); 2 μl Klenow enzyme (5 U/μl; Fermentas); 2 μl of a dNTP mix containing 1 nmol each of dATP, dCTP and dGTP, 0.74 nmol dTTP and 0.27 nmol of either Cy3-dUTP or Cy5-dUTP (Amersham Biosciences). The labelling reaction was incubated overnight (17-20 h) at 37°C. The labelled DNA was cleaned up from unincorporated dye using the Qiaquick PCR purification kit (Qiagen). Dye incorporation was measured using a NanoDrop ND-1000 UV-Vis Spectrophotometer (Nanodrop Technologies, Wilmington, DE).
Labelled SSH targets were combined in pairs using equal amounts of Cy3 and Cy5 dye incorporation for each target in each pair required for SSHscreen analysis. Each labelled target DNA mix was dried down in a vacuum centrifuge at 45°C and resuspended in 50 μl hybridisation solution (50% formamide, 25% 4× Microarray hybridisation buffer (Amersham Biosciences), 25% SDW). Labelled targets in hybridisation solution were denatured at 95°C for 2 min and placed on ice.
Glass slides arrayed with the SSH cDNA libraries were pre-treated in 1% bovine serum albumin (BSA; Roche Diagnostics), 3.5× SSC (525 mM sodium chloride and 52.5 mM sodium citrate) and 0.2% sodium dodecyl sulphate (SDS) at 60°C for 20 min. After rinsing in SDW at room temperature, the slide was dried by centrifugation in a 50 ml tube at 1000 × g for 4 min at room temperature in a swing-out rotor (Eppendorf 5810R centrifuge). The slide was placed in a locally manufactured hybridisation chamber (HybUP, NB Engineering, Pretoria, South Africa) with 20 μl SDW in the reservoirs on either side. Labelled and denatured target was applied to the slide and gently overlaid with a cover slip. The chamber was sealed and incubated in a water bath at 42°C for 16 h. Slides were washed for 4 min at 42°C with 1× SSC (150 mM NaCl, 15 mM sodium citrate)/0.2% SDS, twice with 0.1× SSC (15 mM NaCl, 1.5 mM sodium citrate)/0.2% SDS and three washes of 0.1× SSC for 1 min at room temperature. After dipping the slide in SDW at room temperature and centrifuged to dry, it was immediately scanned using a GenePix™ 4000B scanner (Axon Instruments, Foster City, CA).
GenePix Pro 5.1 software (Axon Instruments) was used to automatically locate all the spot positions from the scanner-generated TIFF images and associate them with each specific clone in a GenePix Array List (GAL file)(available at NCBI GEO Accession # GSE20273). The GAL file links the information from the arraying process to the analysis, since it provides identification information for each spot printed on the slide. Bad quality spots (irregularly shaped or with hybridisation artefacts; signal/noise ratio < 3) were flagged for exclusion during data analysis and the array of circles were manually adjusted for a better fit. GenePix Pro 5.1 was used to extract the dye intensity data of each spot and save the data for each slide in a GenePix Results file (gpr).
SSHscreen software analysis of microarray data
The SSHscreen 2.0.1 package, written as a single function in the R programming language, was used for analyzing the resulting microarray data to calculate ER3 and inverse ER2 values. ER3 values (log2(UT/UC) for the forward library clones; and log2(UC/UT) for the reverse library clones) quantify differential expression of transcripts that give rise to the clones in each library . Inverse ER2 values (log2(UT/ST) for the forward library clones; and log2(UC/SC) for the reverse library clones) reflect the relative abundance of transcripts for each clone in the unsubtracted samples . The original version of SSHscreen is described in . Improvements to the functionality were added to the original R code, the documentation was updated and the latest version was packaged as SSHscreen 2.0.1. SSHscreen can be downloaded at http://microarray.up.ac.za/SSHscreen/, together with a demo data set and an example R script. For details on how to use SSHscreen and to view a full description of all the possible argument options, type help(SSHscreen) at the R command line (after loading the SSHscreen library). The SSHscreen analysis implemented by the R script provided as Additional file 7 is described in the following sections:
SSHscreen analysis in R (version 2.8.1) required the libraries of limma (version 2.16.5) and SSHscreen 2.0.1 to be installed in R. The input data to SSHscreen 2.0.1 were the 12 Genepix results files from hybridization experiments to the cowpea SSH library arrays (gpr files deposited at NCBI GEO series accession number GSE20273), with the Targets file, Spot types file (Additional files 8 & 9, respectively) and the GAL file (available at GEO accession number GSE20273).
The first step was to weight all spots that had been flagged as poor quality (signal/noise < 3) by the GenePix Pro 5.1 image analysis programme http://www.axon.com so that these spots would not be used to calculate the normalization factors. Background correction used the normexp method in limma  with an offset of 50, which dampens the variation of the log-ratios for very low intensity spots towards zero. This approach is encouraged specifically when using empirical Bayes methods from the limma package .
Within array normalization was based on data from the four alien controls (globin, its, nptII and gfp) that had been spotted as dilution series on each array to make up a total of 176 control spots/array. Equal amounts of each control fragment had been added to pairs of target samples to be labelled with the Cy3 and Cy5 dyes, and thus hybridization signals from the control spots could be used for within-slide normalization. We used this data to apply the up-weighting print-tip loess within-array normalization method of limma in SSHscreen 2.0.1., which essentially applies full weight to all the control spots and zero weight to the spots of the probes from the SSH library. Thereby a loess curve is fitted through the control spots, which normalizes the data within each slide to remove the systematic errors due to the dye effects . Between-slide normalization was carried out using the Aquantile method, which is based on the assumption that the distribution of A values (eg. Average expression: 1/2(log2 [UT*UC]) is similar across all arrays. MA plots (Additional file 4) can be exported at each stage of pre-processing.
SSHscreen enrichment ratio analysis and outputs
ER3 and inverse ER2 values for each SSH library clone were calculated in SSHscreen using the functions of limma for differential gene expression analysis . ER3 values were based on testing the null hypothesis that there is no differential expression for a gene between the UT and UC samples, whereas inverse ER2 values test the null hypothesis that there is no difference in abundance of a clone between the unsubstracted sample (e.g. UT) and its corresponding subtracted sample (e.g. ST). This was achieved in SSHscreen 2.0.1 by implementing the limma function that fits gene-wise linear models through the normalized expression data that was the output of pre-processing. Thereafter, an empirical Bayes approach is used in limma to calculate a moderated t-statistic for each gene, in which the standard errors have been moderated across all the genes on the array. This approach of variance shrinkage improves inference about each gene in experiments in which there are a low number of replicates . We adjusted for multiple testing using Benjamini & Hochberg's method  for controlling the false discovery rate, which computes an adjusted p-value for the hypothesis test of each gene. A prior guess of 50% differentially expressed genes for the SSH libraries was implemented as the default in SSHscreen 2.0.1, and was used in calculating B-statistics for each gene using limma functions. The B-statistic [32, 33] can be interpreted as the log-odds that a specific gene is differentially expressed. Thus a positive B-statistic represents more than a 50-50 chance of differential expression. The outputs of SSHscreen were top tables which reported the enrichment ratios for each gene and associated statistics, namely the moderated t-statistic, associated p-value adjusted for multiple testing, and B-statistic (Additional file 5), MA-plots (Additional file 4) and a graphical representation of each clone on ER-plots (Figures 2a and 2b), which were used to select clones for sequencing.
Selected cowpea drought expression library clones were sequenced using the T7 Promoter primer by Inqaba Biotec (SA) or Macrogen (USA). Colonies were sent on LB-agar plates containing 100 μg/ml ampicillin. Non-redundant sequences have been deposited in dbEST at Genbank (Accession numbers GR942571 - GR942610).
Annotation and management of sequences using SSHdb
SSHdb (available at http://sshdb.bi.up.ac.za) was developed as a web-based tool for sequence management of clones in SSH libraries. The SSHdb interface was written using Turbogears , a Python web application framework. Currently, a central MySQL database is used to store sequence, top table and annotation information. SQLAlchemy , an object relational mapper for Python and toolkit for SQL, is implemented within SSHdb when a user queries the database.
For each input sequence in FASTA format, SSHdb removed the vector and adaptor fragments after BLASTN  searches were performed against the NCBI UniVec database http://www.ncbi.nlm.nih.gov/VecScreen/UniVec.html. Further BLASTN searches were carried out against all sequences already uploaded in the database, so that redundant partners in the library (using a BLASTN E-value cut-off value of 10e-10) could be identified. For each redundant partner group, the longest sequence in the group was selected by default as the representative clone. Multiple sequence alignments (generated by ClustalW ) for individual redundant partner groups could be viewed and downloaded from SSHdb. For each representative clone, SSHdb performed nucleotide-nucleotide and translated sequence comparisons using BLASTN and BLASTX searches against a local installation of the NCBI non-redundant nucleotide and peptide databases (nt/nr) . For cases where the E-value of the top BLASTX hit was less than 10e-10), this hit was automatically selected as the default priority annotation. The top 10 BLASTX and BLASTN hits were stored in the database. Blast2GO  has also been implemented in SSHdb, and the putative GO annotations for representative sequences are recorded in SSHdb. SSHdb provides two major views of the data, the SSH database view, which shows the annotated representative clones (see Additional file 1), or the SSH toptable view, which shows the enrichment ratio data for each clone in each library (see Table 1). SSHdb can be updated as additional clones are sequenced.
qPCR primer pairs (20-mers) were designed from selected cowpea sequences to amplify products between 120 and 250 bp in length from the SSH cDNA fragment pools (UT, UC, ST, and SC) (Additional file 6). qPCR reactions containing 1× Sensimix (Quantace, Celtic Molecular Diagnostics), SYBR Green, 2.5 mM MgCl2, the appropriate primer pair (200 nM each) and cDNA template in a total volume of 25 μl were set up and run on a Rotor-Gene (Corbett Research). The enzyme was activated by a hold at 95°C for 10 min followed by 45 cycles of 95°C for 15 s, annealing at 56°C for 30 s and extension at 72°C for 6 s. SYBR green fluorescence was measured after the extension step of every cycle. qPCR was performed on serial ten-fold dilutions of a mix of UT and UC cDNAs (templates ranging from 0.5 pg to 50 ng) to construct standard curves for each primer pair. The quantification cycle (Cq) values from the qPCR fluorescent profiles were converted to input nanograms of template using the standard curves. Average nanogram quantities for each gene was normalised relative to the data for the respective sample's reference gene content.
For ER3 verification, qPCR was performed in duplicate on 50 ng each of cDNA from the two cowpea cultivars before subtraction (UT and UC). Glyceraldehyde-3-phosphate dehydrogenase C-subunit (gapC) was used as a reference gene. A consensus sequence between the gapC genes of Medicago truncatula [GenBank: AC135505_Mt, exons only], G. max [GenBank: DQ192668_Gmax1 and DQ355800_Gmax2] and Pisum sativum [GenBank: PEAGAPCI] was used to design the gapC reference gene primers (Additional file 6). An expression ratio of log2(ng in UT/ng in UC) was calculated for each gene.
For ER2 verification, normalization of qPCR results between unsubtracted and subtracted cDNA samples (i.e. UT and ST; UC and SC) required the spiking of cDNA samples with equal amounts of an alien gene. qPCR was performed in duplicate on 10 ng of the UT, ST; UC and SC cDNA templates, each spiked with 50 pg of the human beta-globin fragment (prepared using the Globin forward and reverse primers, Additional file 6). Average input nanogram quantities were calculated for each gene and normalised with the globin spike content. Expression ratios of log2(ng in UT/ng in ST) and log2(ng in UC/ng in SC) were calculated for each gene.
RT-qPCR reactions were performed in duplicate using 100 ng of RNA isolated prior to SSH library construction from drought-treated IT96D-602 cultivar and control Tvu7778 at two time points, 9 and 12 days, separately. The Sensimix One-step RT-qPCR kit (Quantace) was used with a reverse transcription step at 49°C for 30 min inserted before the cycling profile described above. The quantification cycle (Cq) values from the RT-qPCR fluorescent profiles were converted to input nanograms of template using the standard curves.