Selection and evaluation of reference genes for qRT-PCR analysis in Euscaphis konishii Hayata based on transcriptome data

Background Quantitative real-time reverse transcription-polymerase chain reaction has been widely used in gene expression analysis, however, to have reliable and accurate results, reference genes are necessary to normalize gene expression under different experimental conditions. Several reliable reference genes have been reported in plants of Traditional Chinese Medicine, but none have been identified for Euscaphis konishii Hayata. Results In this study, 12 candidate reference genes, including 3 common housekeeping genes and 9 novel genes based on E. konishii Hayata transcriptome data were selected and analyzed in different tissues (root, branch, leaf, capsule and seed), capsule and seed development stages. Expression stability was calculated using geNorm and NormFinder, the minimal number of reference genes required for accurate normalization was calculated by Vn/Vn + 1 using geNorm. EkEEF-5A-1 and EkADF2 were the two most stable reference genes for all samples, while EkGSTU1 and EkGAPDH were the most stable reference genes for tissue samples. For seed development stages, EkGAPDH and EkEEF-5A-1 were the most stable genes, whereas EkGSTU1 and EkGAPDH were identified as the two most stable genes in the capsule development stages. Two reference genes were sufficient to normalize gene expression across all sample sets. Conclusion Results of this study revealed that suitable reference genes should be selected for different experimental samples, and not all the common reference genes are suitable for different tissue samples and/or experimental conditions. In this study, we present the first data of reference genes selection for E. konishii Hayata based on transcriptome data, our data will facilitate further studies in molecular biology and gene function on E. konishii Hayata and other closely related species. Electronic supplementary material The online version of this article (10.1186/s13007-018-0311-x) contains supplementary material, which is available to authorized users.


Background
Quantitative real-time reverse transcription-polymerase chain reaction (qRT-PCR) has become one of the most powerful tools to study gene expression due to its high sensitivity, accuracy and specificity [1]. However, to get accurate and reliable results, a reference gene is necessary to normalize gene expression and avoid errors caused by different experimental procedure, such as sample amounts, quality and quantity of RNA, efficiency of enzymatic reaction and PCR efficiency [2,3].
Most of the commonly used reference genes are housekeeping genes, such as actin (ACT ), tubulin (TUB), polyubiquitin (BUQ), elongation factor 1-α (EF1-α), glyceraldehyde-3-phosphate dehydrogenase (GAPDH) and ribosomal RNAs (18S rRNA or 28S rRNA). However, some data showed that expression levels of these housekeeping genes can vary considerably under different experimental conditions [4,5], and also, in nonmodel plant species, usually the used reference genes are identified by the orthologous sequence of common
Twelve genes (EkUBC, EkF-ACP, EkARP7, EkEF2, EkACT , EkGAPDH, EkEEF-5A-1, EkADF2, EkTUB, EkPLAC8, EkLPP, EkGSTU1) were selected as candidate genes according to transcriptome data from our lab (Liang et al., College of Forestry, Fujian Agriculture and Forestry University) (unpublished data), and their expression stability was evaluated by qRT-PCR across different experimental conditions: including five tissues (root, branch, seed, leaf and capsule), six different developmental stages of seed and six different development stages of capsule. Their expression stability was calculated using geNorm and NormFinder. Additionally, in order to validate our results, the expression levels of EkCAD1 in different tissues were normalized by the most and least stable genes.

Plant material
Euscaphis konishii Hayata tissues were collected from Fujian Agriculture and Forestry University, Fujian Province, China. Tissues (leaf, capsule, seed, root and branch) were collected on November 15th 2016, and six developmental stages of capsule and seed were collected once every 15 days after formation. All samples were harvested, washed and surface dried and then frozen in liquid nitrogen and immediately stored at − 80 °C until required for further analyzes. Three biological replicates for each sample were used for RNA extraction.

RNA isolation and cDNA synthesis
Total RNA was extracted from each sample using the RNAprepPure Plant Kit DP441 (Tiangen Biothch CO., LTD, Beijing, China), according to the manufacturer's instructions. RNA was treated with DNase I (Tiangen, Beijing, China) to eliminate DNA contamination. RNA quality was determined by 1.2% agarose gel electrophoresis. The concentration and purity of total RNA was determined using a NanoDrop 2000c Spectrophotometer (Thermo Scientific, US). The A 260 /A 280 ratio of total RNA between 1.90 and 2.10 was considered to meet the required quality for further experiments. First-strand of cDNA was synthesized using the First Strand cDNA Synthesis Kit (Roche, Switzerland) using 1.0 μg of total RNA in a 20 μL reaction volume according to the manufacturer's protocols.

Selection of candidate reference genes and primer design
Based on transcriptome sequencing data from our laboratory, 12 reference genes were selected to normalize and validate qRT-PCR experiments by screening for genes with relatively stable expression (based on their RPKM and fold change values), including nine novel genes and three common housekeeping genes. Their sequence/ alignment/phylogenetic data are shown in Additional files 1 and 2. Forward and reverse primers of all candidate reference genes were designed using Primer Premier 5.0 with the following parameters: Tm values ranging from 50 to 70 °C, GC percent of 45-50%, primer lengths of 18-25 bp and product length of 90-200 bp. All primers were synthesized by Sangon Biotech Co., Ltd (Shanghai, China). Primer details are shown in Table 1.
qRT-PCR analysis for each candidate reference gene was performed on a 7500 Fast ABI Real-time PCR system (Applied Biosystems, US) using FastStart Universal SYBR Green Master (Roche, Switzerland). A 20 μL reaction mixture contained: 10 μL 2 × SYBR Green Master, 0.4 μL forward primer (10 μM), 0.4 μL reverse primer (10 μM), 2 μL cDNA and 7.2 μL dd H 2 O in a 96-well plates. The amplification conditions were as follows: 50 °C for 2 min, 95 °C for 10 min, 40 cycles of 95 °C for 15 s and 60 °C for 30 s. Melting curve was analyzed to determine primer specificity.
All samples were analyzed in three biological and technical replicates. Serial tenfold dilutions of cDNA template were used to generate slope of the standard curve to calculate amplification efficiency and correlation coefficient of each candidate reference gene.

Data analysis
NormFinder and geNorm were used to analyze the stability of the 12 candidate reference genes under different conditions. Expression levels of each reference gene were shown by Cq values. Before using the two softwares, the raw Cq values was used to calculate relative quantities by the equation: Q = 2 −(sampleCq-mimCq) . The values of stability (M) and pairwise variation (V) between genes was generated by geNorm, the lower M value is the gene expression is more stable [8,34,35]. Furthermore, the normalization factor generated by computing the pairwise variation of the two normalization factor was used to determine the most suitable numbers of reference genes with a cut-off value of 0.15 [17]. NormFinder was used to evaluate the stability of candidate genes by intra-and inter-group variations. The more stable reference gene will have lower stability value and inter-and intra-group variation.

Validation of the candidate reference genes
In order to verify the results of our experiments, the most stable and unstable reference genes were selected to validate the expression of the E. konishii Cinnamyl alcohol dehydrogenase 1 (EkCAD1) gene in different tissue samples (root, branch, capsule, seed and leaf ). CAD1 belongs to CAD family, which catalyzes the reduction of p-coumaricaldehyde, coniferyl aldehyde and sinapyl aldehyde to their alcohol derivatives which are then polymerized into lignin [36], CAD is one of the most used genes to manipulate to obtain plants with low lignin content [37]. qRT-PCR experimental method was the same as described above, and the relative expression level was calculated by 2 −ΔΔct method [12]. Data from three biological replicates were analyzed using analysis of variance (ANOVA) followed by Student's t test (P < 0.05).

Primer specificity and PCR amplification efficiency
A total of 12 candidate reference genes, including three common housekeeping genes and nine novel genes from transcriptome sequencing data of E. konishii were selected for qRT-PCR normalization. The details of gene names, abbreviation, accession number, primer sequence, primers Tm, product length, amplification efficiency and correlation coefficient are shown in Table 1. The specificity for each primer set was validated by melting curve. For all primer sets the melting curve showed a single amplification peak (Additional file 3). qRT-PCR efficiency for all 12 candidate reference genes ranged from 97.89% for EkUBC to 103.21% for EkACT , and correlation coefficients varied from 0.9795 to 0.9999 (Table 1).

Cq values of candidate reference genes
Cq values for all 12 reference genes are shown in Fig. 1

Expression stability of candidate reference genes
Expression stability of the 12 reference genes was analyzed by geNorm and NormFinder. Samples were divided into three different experimental groups: (1) five tissues (root, leaf, branch, seed and capsule), (2) six seed developmental stages and (3) six capsule developmental stages.

geNorm analysis
Gene expression stability was determined by M-value in geNorm analysis, the lower the M value is, the more gene expression stability. For the tissue group the two most stable genes were EkGSTU1 and EkGAPDH with the lowest M value, and EkTUB was the most unstable gene. In the seed group EkEEF-5A-1 and EkGAPDH were the two most stable genes through all the different developmental stages, and EkLPP was the most unstable gene. Finally, in the capsule group EkGAPDH was the most stable gene, followed by EkGSTU1, and EkF-ACP and EkUBC were the least stable genes ( , which indicate that two reference genes are enough to normalize gene expression data (Fig. 3).

NormFinder analysis
Expression stability values analyzed by NormFinder are shown in Table 3. For tissue group, EkGSTU1 and EkGAPDH were the most stable reference genes, and   EkTUB was the least stable gene, same as shown by geNorm analysis. In the seed group EkEEF-5A-1 and EkGAPDH were the most stable reference genes, while EkARP7 was the least stable gene. In the capsule group, EkGAPDH and EkGSTU1 got the top rank, while EkUBC and EkF-ACP were ranked at the lowest. In general, the ranking was same as geNorm analysis (Table 3).

EkCAD1 expression and validation of EkGSTU1 and EkGAPDH
In order to verify the reliability of the selected reference genes, expression profiles of EkCAD1 gene was determined in different tissues. Relative expression levels were normalized using the two most stable reference genes (EkGSTU1 and EkGAPDH) and the least stable reference gene (EkTUB). EkCAD1 showed similar expression levels when single or a combination of reference genes (EkGSTU1 and EkGAPDH) were used to normalize the expression. EkCAD1 expression was up regulated in all the tissues except in seed. However, when EkTUB was used for normalization (unstable gene), relative expression profile of EkCAD1 was different when compared when the normalization expression was done using the two most stable reference genes identified in our study (EkG-STU1 and EkGAPDH) (Fig. 4). Our results suggest that the expression patterns of target genes are differed when normalized by different reference genes.

Discussion
qRT-PCR is one of the most commonly used technique to determine gene expression in plants. To ensure the accuracy and reliability of the results, a suitable reference gene is necessary for data normalization. Conventionally, some housekeeping genes such as ACT , GAPDH, TUB, have been used as reference genes, however, no single gene can be used for all plant species, experimental conditions and/or tissues. Therefore, it is required to select proper reference gene(s) for certain species under different conditions rather than using common reference genes.
The development of high-throughput sequencing technology provides a more efficient approach to study plant molecular biology, and it has been widely used in plant genomes [38][39][40][41][42][43], plant transcriptome [44][45][46][47], plant ncRNA [48][49][50], moreover, the generation of large-scale gene segments and gene expression data by sequencing provides a new resource for the identification of reference genes, especially in non-model species [51][52][53]. Therefore, transcriptome data on E. konishii Hayata, available in our laboratory can be used as a tool to identify candidate reference genes. Asystematic study of 12 candidate reference genes in three conditions was carried in this paper, and their expression stability was calculated using geNorm and NormFinder.
ACT and TUB, the most widely used reference genes, did not show a good expression stability in E. konishii Hayata across all sample sets (Tables 2, 3). The phenomenon that expression levels of common reference genes varied in a large range has been reported in several papers [54,55]. GAPDH, a common housekeeping gene also, has been widely used as reference gene in different species and experimental conditions [51,[56][57][58][59][60], in our experiments this gene was one of the two most stable genes in tissue sample set and capsule development stages, but it did not perform well in across all the sample and seed sets. The different performance of EkGAPDH in different experimental conditions in this study demonstrated that there is no single reference gene that can be used for all species or different experimental conditions [61][62][63][64][65].
In this study, EkGSTU1 (glutathione-S-transferase tau 1), which belongs to tau class of glutathione transferases (GSTs) [66], was the one of two most stable genes in tissues sample and capsule development stages. EkADF2 and EkEEF-5A-1 were the two most stable genes in total sample set, ADF (actin-depolymerizing factor) play important roles in several cellular processes that require cytoskeletal rearrangements, such as cell migration, chromosome introgression, cleavage plane orientation and furrow formation [67][68][69]. VvADF has been identified as candidate reference gene for grapevine during anthesis [6], rubber tree duration of latex flow [70] and TrADF3 was selected as reference gene in staminate and perfect flowers of T. rupestris [71].
It has been widely accepted that using combination of multiple reference genes to normalize gene expression can give more accurate and reliable expression patterns than using a single gene in qRT-PCR analysis [57]. Based on validation results of target gene expression, when EkGAPDH and EkGSTU1 were selected as reference genes for normalization either single or combination, the target gene EkCAD1 showed the similar expression pattern among different tissues, which indicated that the expression pattern of EkCAD1 was nearly identical when normalized with a single reference gene or two. Interestingly, in the tissue group, the combination of traditional housekeeping gene (EkGAPDH) and a novel identified reference gene (EkG-STU1) were identified as the most stable reference genes, suggesting that combination of traditional housekeeping genes and newly identified reference genes based on transcriptome data can be used as a good strategy for expression normalization of E. konishii Hayata genes.

Conclusion
In this study, we evaluated the expression stability of twelve candidate reference genes, including three traditional housekeeping genes and nine novel genes based on transcriptome data of E. konishii Hayata. Additionally, the expression pattern of target gene EkCAD1 was determined in different tissues to further verify the reliability of the identified stable reference genes. This study shows the first data for reference genes validation on E. konishii Hayata. Our study will contribute in future studies of gene expression in E. konishii Hayata and related species.