- Open Access
Comparative performance of the BGISEQ-500 and Illumina HiSeq4000 sequencing platforms for transcriptome analysis in plants
- Fu-Yuan Zhu†1, 2,
- Mo-Xian Chen†3, 4,
- Neng-Hui Ye†5,
- Wang-Min Qiao†6,
- Bei Gao4,
- Wai-Ki Law6,
- Yuan Tian1,
- Dong Zhang6,
- Di Zhang4,
- Tie-Yuan Liu4,
- Qi-Juan Hu3,
- Yun-Ying Cao7,
- Ze-Zhuo Su8,
- Jianhua Zhang3, 9Email author and
- Ying-Gao Liu1Email authorView ORCID ID profile
© The Author(s) 2018
- Received: 30 March 2018
- Accepted: 6 August 2018
- Published: 13 August 2018
The next-generation sequencing (NGS) technology has greatly facilitated genomic and transcriptomic studies, contributing significantly in expanding the current knowledge on genome and transcriptome. However, the continually evolving variety of sequencing platforms, protocols and analytical pipelines has led the research community to focus on cross-platform evaluation and standardization. As a NGS pioneer in China, the Beijing Genomics Institute (BGI) has announced its own NGS platform designated as BGISEQ-500, since 2016. The capability of this platform in large-scale DNA sequencing and small RNA analysis has been already evaluated. However, the comparative performance of BGISEQ-500 platform in transcriptome analysis remains yet to be elucidated. The Illumina series, a leading sequencing platform in China’s sequencing market, would be a preferable reference to evaluate new platforms.
To this end, we describe a cross-platform comparative study between BGISEQ-500 and Illumina HiSeq4000 for analysis of Arabidopsis thaliana WT (Col 0) transcriptome. The key parameters in RNA sequencing and transcriptomic data processing were assessed in biological replicate experiments, using aforesaid platforms.
The results from the two platforms BGISEQ-500 and Illumina HiSeq4000 shared high concordance in both inter- (correlation, 0.88–0.93) and intra-platform (correlation, 0.95–0.98) comparison for gene quantification, identification of differentially expressed genes and alternative splicing events. However, the two platforms yielded highly variable interpretation results for single nucleotide polymorphism and insertion–deletion analysis.
The present case study provides a comprehensive reference dataset to validate the capability of BGISEQ-500 enabling it to be established as a competitive and reliable platform in plant transcriptome analysis.
- Alternative splicing
- Differential expressed genes
- Illumina HiSeq4000
- Next-generation sequencing
The past three decades witnessed a rapid advance in functional genomics, where gene transcription has emerged as an important research indicator for the study of functional genomics. Recently, transcriptome analysis has been accepted as a popular large profiling technique to reveal gene regulatory networks in both animals and plants [1–3]. The collection of methods which comprehensively and systematically analyze the transcriptome has been steadily increasing, both in their throughput and application range, eventually leading to better quality of data. Initial attempts regarding transcriptome analysis are based on oligonucleotide hybridization and array technologies. The establishment and gradual progression of next generation sequencing devices has resulted in high throughput RNA sequencing (RNA-seq) technology, defined as next-generation sequencing (NGS) [1–3], a routine laboratory practice in transcriptome analysis. The NGS’s capability of profiling the entire transcriptome, in addition to whole genomes, exomes and targeted gene regions and its dynamic range to detect subtle changes in expression level, has made significant impact in academic research, diagnostics and industry [1–3]. Since the last decade, the majority of efforts focused on reducing the prime cost while increasing sequencing accuracy and throughput for NGS platforms. Different from array-based technology, RNA-seq expands our knowledge on pervasive transcription of eukaryotic transcriptomes [4, 5], which enables to uncover the unexpected complexity of genomic regions which were once considered silent or antisense genes. The NGS approach has facilitated convenient and detailed study of new and informative features of transcription, such as novel transcript assembly, the regulation of untranslated regions (UTR), alternative splicing variants and the generation of small or non-coding RNAs [6–9]. Furthermore, during recent years, the comparison between RNA-seq and microarrays for transcriptome analysis has been carried out by several research groups. Particularly, the superior performance of RNA-seq is mainly attributed to its better resolution, lower variation and higher dynamic range than microarray-based transcriptome analysis [10–13]. However, the potential and capacity of RNA-seq needs to be explored in depth, and should be carefully investigated based on case studies and appropriate bioinformatic tools.
At present, a number of sequencing platforms such as Illumina HiSeq series and Roche 454 platform and RS/SEQUEL series from Pacific Biosciences (PacBio) have been developed by leading sequencing service providers worldwide. However, each platform is differed in its instrumentation and sequencing protocols such as library preparation procedures, base-calling mechanisms and measurement technology [14–17]. Thus, comparative studies among different sequencing platforms have been conducted to assess the intra- and inter-platform repeatability and reproducibility by using targeted RNA samples [18–20]. One classic example is from a case study in model yeast Saccharomyces cerevisiae , which focused on detailed intra-platform comparison adopting Illumina HiSeq series. The above study assessed the robustness of different platforms in of gene quantification, three different alignment algorithms, two assembly strategies (reference genome-based and de novo assembly) and five statistical methods have been used in order to validate the consistency of identifying differentially expressed gene (DEG) within Illumina HiSeq platforms and in comparison to conventional microarray datasets. High correlation has been reported between Illumina HiSeq and array based analysis using different combination of aforementioned approaches. Another large-scale cross-platform comparison has been performed by the Association of Biomolecular Resource Facilities (ABRF) members . In total, replicate experiments from 15 laboratories, 4 library construction protocols, 3 size fractions of library and 5 sequencing platforms have been subjected for comprehensive evaluation. The outcome of this cross-site comparison study among 15 ABRF laboratories has generated an early standard for performing NGS analysis on animal samples.
Although single molecule long-read sequencing platform has been developed since several years ago, representative platforms such as PacBio series and Minion (Oxford Nanopore Technologies) are not widely used in plant transcriptome studies accounting to their high cost and low throughput . The above two platforms have yet to reach the performance and dynamic range similar to those of NGS platforms with further advancement of their sequencing technology. At present, a wide range of NGS platforms are available in the sequencing market of China. Among these platforms, the Illumina series had gained its position as one of the most widely used sequencing platforms and have generated a substantial part of transcriptome data in the past 5 years probably due to its stable performance, lower error rate and relatively low cost in transcriptome analysis. In 2016, BGI announced its own NGS platform designated as BGISEQ-500 . The general NGS workflow and stepwise sequencing procedures of the newly developed BGISEQ-500 are similar to those of Illumina series; yet the two templates have marked differences. The subsequent DNA nanoball technology specifically used for library preparation in BGISEQ-500 platform is different from the library construction protocol used in Illumina series . Initial tests in DNA sequencing confirmed the platform’s potential to generate high quality data in DNA-related NGS applications [24–26]. The performance of this platform has been subsequently validated in small RNA profiling in comparison to Illumina series . BGISEQ-500 utilizes both single end (SE) and paired end (PE) modes which are comparable to latest Illumina model, HiSeq4000. The announced data on throughput of BGISEQ-500 was relatively high, which is potentially suitable for high throughput transcriptome studies. Even though it has been already validated for small RNA profiling, a comprehensive evaluation of the performance of BGISEQ-500 in transcriptome analysis has not been recorded to date. To this end, we used two Arabidopsis thaliana (WT, Col-0) seedling samples exposed to dimethyl sulfoxide (DMSO) or abscisic acid (ABA) (three replicates of each), to explore the capability of this platform in transcriptomic profiling. For comparison, Illumina HiSeq4000 was used as a control platform along with BGISEQ-500. Key parameters in current plant transcriptome studies including DEGs, alternatively spliced (AS) events, single nucleotide polymorphism (SNP), and insertions–deletions (INDEL) were comprehensively validated between these two platforms. Results indicated that both platforms have high inter- and intra-platform repeatability in gene quantification, DEG and AS analysis, but present a relative low correlation in SNP and INDEL identification. We discuss the possible underlying causes for the above and put forward our suggestions for enhancement of transcriptome analysis with respect to the two platforms subjected for comparison.
RNA samples and sequencing protocols for inter-platform comparison
Base and raw data quality
Summary of basic parameters in three RNA sequencing datasets
Total raw reads (Mb)
Total clean reads (Mb)
Genome mapped reads (Mb)
Gene mapped reads (Mb)
Genome mapping rate (%)
Gene mapping rate (%)
Intra- and inter-platform comparison of gene detection and quantification
Detection of differentially expressed genes and alternative splicing
Identification of single nucleotide polymorphisms and insertion–deletion mutation
Summary of SNP identification
Summary of INDEL identification
High performance of BGISEQ-500 in transcriptome analysis
Up to date, this article represents the first cross-platform comparison to evaluate BGISEQ-500 in transcriptome analysis. The results from the present study provide reference datasets to examine key parameters utilized in the transcriptome analytical pipeline. The intra- and inter-platform correlations of gene quantification, DEG detection, AS identification, SNP and INDEL detection between BGISEQ-500 and Illumina HiSeq4000 sequencing platforms have been critically evaluated. Previous reports mentioned that BGISEQ-500 has similar throughput and turnaround time to that of HiSeq2500 platform in DNA sequencing of human genome . Furthermore, more even read distribution than HiSeq data has been observed in previous miRNA analysis with respect to BGISEQ-500 platform . Although the throughput was not tested in the present study, a similar number of starting reads (~ 70 Mb) was used to facilitate subsequent normalization and comparison. Likewise, both base/read quality and read distribution pattern of BGISEQ-500 was comparable to those of HiSeq4000 (Fig. 2 and Table 1). In addition, high correlations of gene detection and DEGs/AS identification (Figs. 3, 4, 5) suggest that BGISEQ-500 has the capability for efficient transcriptome analysis. Especially, the consistency in biological interpretation from DEGs was supported by five different DEG calling methods. By incorporating more methods in DEG analysis such as limma , may further increase the confidence of this conclusion. However, the data generated in this study was solely based on model dicot plant Arabidopsis, for which high quality and rigorously annotated genomic information is already available. Further evaluation is required to test the transcriptome compatibility of BGISEQ-500, especially with respect for those plant species with no reference genome sequences available, or other eukaryotic samples such as animals and fungi.
In comparison to Illumina series, BGISEQ-500 possesses fundamental differences in terms of library preparation and sequencing strategy . The success of this platform provides yet another set of reliable sequencing approaches for any experiments dealing with NGS analysis including research conducted using either DNA or RNA sequencing techniques. However, further improvements such as sequencing quality and standardized protocols, strand-specific library construction and bias correction, are needed to improve the performance of BGISEQ series for it to be applied in the other NGS sequencing applications (e.g. epigenomic and metagenomic sequencing; LncRNA analysis).
Platform-based variations in transcriptome analysis
Although previous studies proposed that comprehensive data mining projects can be applied to datasets generated by different platforms despite the intrinsic variations , in our present work, we express our particular concern on the phenomena of platform variations and their effect in identification of AS, SNP and INDEL during transcriptome analysis. Previous comparative studies have indicated that considerable variation can be detected between BGISEQ-500 and HiSeq platforms in miRNA identification . In our study, approximately 20% of AS was uniquely identified by each platform (Fig. 5), especially for AS events like AFE and ALE, which is shown large variations in both inter- and intra-platform comparisons (Additional file 1: Figures S11 and 15). Thus, the authenticity of such AS events is needed to be verified by parallel independent methods such as quantitative real-time PCR or RT-PCR. In comparison to the other two sequencing approaches, BGISEQ500-PE100 mode showed distinct AS identification in SKIP and IR-related events (Additional file 1: Figure S11B). However, the underlying causes for this variation remain to be investigated. We observed larger variations with respect to SNP and INDEL calling from our dataset (Additional file 1: Figures S16 and 17). Similarly, a tenfold variation for error rates in INDEL has been reported in a previous cross-platform comparison . Low concordance may result from the different PCR steps used in library construction protocols  or could be interfered by plant RNA editing processes. Furthermore, as a means of intra-platform comparison, both strand-specific (BGISEQ-500 PE100) and strand-non-specific libraries (BGISEQ-500 PE75) have been applied for comparison in transcriptome analysis. From our results, no apparent differences were detected between the above two types of libraries. The only variation observed between these two approaches was in the identification of certain AS events, suggesting that the application of strand-specific library could achieve similar results to that of non-strand-specific library in BGISEQ-500 platform.
It is obvious that the biological significance is crucial for omics-based analysis, where the outcome which can be achieved from the experimental design largely relies on the consistency of sequence data. Yet, the considerably large variations observed among different sequencing platforms may lead to false interpretations in transcriptome studies. Previous publications suggested that variations in RNA sequencing can be easily avoided by standardizing sequencing protocols, platforms and bioinformatic analytical pipelines for specific experiment . However, this could be effective for studies which rely on a single type of platform, but not for large-scale comparative studies, where the analysis has to be dealt with discrete data generated from different platforms. Therefore, approaches which ensure increased inter-platform consistency, differentiate real events from platform-specific bias and define the standard to manipulate the cross-platform data variations need to be extensively discussed within research community. Conducting cross-platform comparisons may help us further understand the signature of platform-specific variation. Furthermore, deepening the sequencing depth may increase the possibility to identify low abundance transcripts and splicing junctions , while nullifying possible inadequacies of the sequencing method. However, how this dynamic range is related to platform specific variation requires further investigation.
Potential applications in cross-platform comparison
Cross-platform comparative studies provide valuable and indispensable information on sample repeatability and reproducibility, platform preference, bias estimation and the potential application of new sequencing technologies. The datasets generated from cross-platform comparisons are necessary for platform improvement, bias correction and development of suitable analytical tools for omics-based approaches. In the field of transcriptome, this will benefit for parameters like gene detection and quantification, DEG and AS identification, SNP and INDEL observation. In addition to standard analysis, other valuable information of transcriptome features such as natural antisense transcript detection, gene fusions, post-transcriptional modifications (e.g. RNA editing) could be retrieved by cross-platform studies, and even degraded samples could be addressed [19, 32], providing pivotal information for genome annotation refinement and mechanistic studies. Given the fast advancement in NGS sequencing platform, the practical multi-platform evaluation and development of standard protocols needs to keep pace.
From the reference dataset generated by BGISEQ-500, we compared basic parameters in transcriptome analysis between this new sequencing platform and Illumina HiSeq4000, elucidating the capability of BGISEQ-500 as an alternative choice and yet another competent platform for plant transcriptome analysis. This case study may encourage more attempts to test their transcriptomic data using BGISEQ-500. We look forward to further explore the potential of this sequencing platform using a wider range of samples.
Plant material, growth conditions and abscisic acid treatment
In general, Arabidopsis thaliana WT seeds (Col-0 background) were surface-sterilized with 20% bleach for 30 min followed by four washes with distilled water. Subsequently, sterilized seeds were sown on Murashige and Skoog (MS) plates  supplemented with 0.8–1.0% (w/v) agar and 1.5% (w/v) sucrose. Plates were then incubated under 16 h light (23 °C)/8 h dark (21 °C) cycles following 2 days stratification. Twelve-day-old seedlings were treated with DMSO control or 50 μM ABA for 6 h and seedlings were harvested and subjected to further transcriptomic analysis.
Plant RNA extraction
Total RNA of Arabidopsis seedlings was extracted using the RNeasy Mini Kit (Qiagen, Germany) according to the manufacturer’s bench protocol. Two samples, DMSO- and ABA-treated seedlings, each with three biological replicates were subjected to RNA extraction. The extracted RNA was then quantified and assessed for integrity using the NanoDrop (Thermo, USA) and 2100 Agilent Bioanalyzer (Agilent, USA) prior to subsequent experiments.
Library construction and RNA sequencing in BGISEQ-500 platform
The strand non-specific library construction of PE75 mode was described as follows. In total, approximately 1 μg total RNA was initially used for BGISEQ-500 library construction. In general, DNase I was initially used to degrade double-stranded and single-stranded DNA contaminant in RNA samples. The mRNA molecules were then purified from total RNA using oligo(dT)-attached magnetic beads and fragmented into small pieces. First-strand cDNA was generated using random hexamer-primed reverse transcription, followed by a second-strand cDNA synthesis. The cDNA thus synthesized was subjected to end-repair and 3′ adenylation. Subsequently, adaptors were ligated to the ends of these 3′ adenylated cDNA fragments. The double stranded PCR products were heat denatured and circularized by the splint oligo sequence. The single stranded circular DNAs were formatted as the final library for Agilent Technologies 2100 bioanalyzer validation and subsequent PE75 sequencing.
For PE100 strand-specific library preparation, the first step in the workflow involved purifying the poly-A containing mRNA molecules using poly-T oligo-attached magnetic beads. Following purification, the mRNA was fragmented into small pieces using divalent cations under elevated temperature. The cleaved RNA fragments were copied into first strand cDNA using reverse transcriptase and random primers. This was followed by second strand cDNA synthesis using DNA Polymerase I and RNase H. This process removes the RNA template and synthesizes a replacement strand, incorporating dUTP in place of dTTP to generate dscDNA. The incorporation of dUTP quenched the second strand during amplification. These cDNA fragments were added with a single ‘A’ base and subsequently ligated to the adapter. The resultant product was purified and enriched with PCR amplification to yield the final cDNA library. The PCR yield was quantified and was subjected to single strand circularized DNA molecule (ssDNA circle) preparation for final library construction. DNA nanoballs (DNBs) were generated with the ssDNA circle by rolling circle replication (RCR) to intensify the fluorescent signals during the sequencing process. The DNBs were then loaded into the patterned nanoarrays and pair-end reads of 100 bp were read through on the BGISEQ-500 platform for subsequent data analysis.
Library construction and RNA sequencing in Illumina HiSeq4000 platform
The library construction in HiSeq series was carried out according to the bench manual of TruSeq RNA Sample Prep Kit v2 (Illumina). Briefly, approximately 1 µg of total RNA sample was purified using oligo-dT beads, followed by fragmentation with Elute, Prime, Fragment Mix. First-strand cDNA was generated by First Strand Master Mix and Super Script II (Invitrogen) reverse transcription (Reaction condition:25 °C for 10 min, 42 °C for 50 min, 70 °C for 15 min). The product was purified (Agencourt RNAClean XP Beads, Agencourt), prior to the addition of Second Strand Master Mix and dATP, dGTP, dCTP, dUTP mix, to proceed with the synthesis of second-strand cDNA (16 °C for 1 h). The purified fragmented cDNA was incubated at 30 °C for 30 min in presence of End Repair Mix. Subsequently, the end-repaired cDNA was purified with Ampure XP Beads (Agencourt). A-Tailing Mix was then added, mixed and incubated at 37 °C for 30 min. The 3′end adenylated cDNA, RNA index adapter and ligation mix were combined and mixed, then incubated at 30 °C for 10 min. The end-repaired cDNA thus produced was then purified with Ampure XP Beads (Agencourt). The uracil-N-glycosylase enzyme was added into the reaction mixture at the final purification step, incubated at 37 °C for 10 min and the resulting product was purified using Agencourt Ampure XP Beads. Several rounds of PCR amplification with PCR primer cocktail and PCR master mix were performed to enrich the cDNA fragments, prior to the purification of PCR products with Ampure XP Beads (Agencourt). The library quality was assessed by checking the distribution of the fragments size using the Agilent 2100 bioanalyzer (Agilent DNA 1000 Reagents), and the library was quantified by using qRT-PCR (TaqMan Probe). The resultant library was subjected to Illumina HiSeq sequencing.
Cross-platform RNA-sequencing data analysis
The bioinformatic pipeline was performed as described previously with minor modifications [27, 28]. Raw sequencing reads were filtered to get clean reads by using SOAPnuke (v1.5.2, parameters -l 15, -q 0.2, -n 0.05) (https://github.com/BGI-flexlab/SOAPnuke). For both BGISEQ-500 and HiSeq4000 derived sequencing data, HISAT pipeline  was applied to align reads against reference genome. StringTie  was then used for transcript reconstruction. Subsequently, Cuffcompare (Cufflinks tools)  was utilized to compare reconstructed transcripts and the reference annotation of Arabidopsis. Coding potential of novel transcripts were predicted by CPC . SNP and INDEL calling was carried out by using GATK (v 3.4-0, https://www.broadinstitute.org/gatk)  with parameters (call): -allowPotentiallyMisencodedQuals, -stand_call_conf 20.0, -stand_emit_conf 20.0 and parameters (filter): -window 35, -cluster 3, -filterName FS, -filter “FS > 30.0”, -filterName QD, -filter “QD < 2.0”. In addition, we have mapped clean reads to reference genes using Bowtie2 software . Expression values of candidate genes were then calculated by RSEM . The identification of DEGs was based on the negative binomial distribution of DEseq2 package , AudicS , Cuffdiff , DEGseq  and edgeR . The cutoff of DEGs was Fold Change ≥ 2 and adjusted P value ≤ 0.05. The subsequent GO and pathway analysis was followed by previous description [21, 46–48]. Alternative spliced events were identified according to previous description by using ASprofile software (http://ccb.jhu.edu/software/ASprofile) . In brief, as described previously , AS junctions supported with two or more reads were subsequently used for AS events identifications.
M-XC, JHZ, N-HY, and Y-GL designed experiments. M-XC, F-YZ, T-YL, DZ, and BG performed experiments. M-XC, W-MQ, W-KL, and DZ analysed data. M-XC wrote the manuscript. Q-JH, Y-YC, Z-ZS, F-YZ, N-HY, JHZ, and Y-GL critically commented and revised it. All authors read and approved the final manuscript.
We thank Ms. HHKAchala in providing language editing on this manuscript.
The authors declare that Wang-Min Qiao, Wai-Ki Law and Dong Zhang are employees of BGI.
Availability of data and materials
The raw data used in this study has been uploaded to Sequence Read Archive (https://www.ncbi.nlm.nih.gov/sra) under Bioproject PRJNA434180.
Consent for publication
All the authors are in their consent for publication of this article.
Ethics approval and consent to participate
No human subjects involved in this study.
This work was supported by the China Postdoctoral Science Foundation (2017M622801), the Natural Science Foundation of Guangdong Province (2018A030313030), the Natural Science Foundation of Shandong Province (BS2015NY002), Funds of Shandong “Double Top” Program, National Natural Science Foundation of China (NSFC31771701, 31701341, 81401561), the NJFU project funding (GXL2018005), Science and Technology Program of Nantong (MS12016044), National Basic Research Program of China (2012CB114300), Shenzhen Overseas Talents Innovation and Entrepreneurship Funding Scheme (The Peacock Scheme, KQTD201101) and Hong Kong Research Grant Council (AoE/M-05/12, AoE/M-403/16, CUHK14160516, 14177617, 12100318).
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Open AccessThis article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
- Snyder M. RNA-seq: a revolutionary tool for transcriptomics. Nat Rev Genet. 2009;10(1):57–63.View ArticlePubMedPubMed CentralGoogle Scholar
- Wang ET, Sandberg R, Luo S, Khrebtukova I, Zhang L, Mayr C, Kingsmore SF, Schroth GP, Burge CB. Alternative isoform regulation in human tissue transcriptomes. Nature. 2008;456(7221):470–6.View ArticlePubMedPubMed CentralGoogle Scholar
- Nagalakshmi U, Waern K, Snyder M. RNA-Seq: a method for comprehensive transcriptome analysis. Curr Protoc Mol Biol. 2010;chapter 4(Unit 4):4.11.11–14.11.13.Google Scholar
- Iasillo C, Schmid M, Yahia Y, Maqbool MA, Descostes N, Karadoulama E, Bertrand E, Andrau JC, Jensen TH. ARS2 is a general suppressor of pervasive transcription. Nucleic Acids Res. 2017;45(17):10229–41.View ArticlePubMedPubMed CentralGoogle Scholar
- Lee HG, Kahn TG, Simcox A, Schwartz YB, Pirrotta V. Genome-wide activities of Polycomb complexes control pervasive transcription. Genome Res. 2015;25(8):1170.View ArticlePubMedPubMed CentralGoogle Scholar
- Ozsolak F, Milos PM. RNA sequencing: advances, challenges and opportunities. Nat Rev Genet. 2011;12(2):87–98.View ArticlePubMedGoogle Scholar
- Tarazona S, García-Alcalde F, Dopazo J, Ferrer A, Conesa A. Differential expression in RNA-seq: a matter of depth. Genome Res. 2011;21(12):2213–23.View ArticlePubMedPubMed CentralGoogle Scholar
- Katz Y, Wang ET, Airoldi EM, Burge CB. Analysis and design of RNA sequencing experiments for identifying isoform regulation. Nat Methods. 2010;7(12):1009.View ArticlePubMedPubMed CentralGoogle Scholar
- Huang R, Jaritz M, Guenzl P, Vlatkovic I, Sommer A, Tamir IM, Marks H, Klampfl T, Kralovics R, Stunnenberg HG. An RNA-seq strategy to detect the complete coding and non-coding transcriptome including full-length imprinted macro ncRNAs. PLoS ONE. 2011;6(11):e27288.View ArticlePubMedPubMed CentralGoogle Scholar
- Wilhelm BT, Landry JR. RNA-seq-quantitative measurement of expression through massively parallel RNA-sequencing. Methods. 2009;48(3):249–57.View ArticlePubMedGoogle Scholar
- Liu S, Lin L, Jiang P, Wang D, Xing Y. A comparison of RNA-seq and high-density exon array for detecting differential gene expression between closely related species. Nucleic Acids Res. 2011;39(2):578–88.View ArticlePubMedGoogle Scholar
- Marioni JC, Mason CE, Mane SM, Stephens M, Gilad Y. RNA-seq: an assessment of technical reproducibility and comparison with gene expression arrays. Genome Res. 2008;18(9):1509–17.View ArticlePubMedPubMed CentralGoogle Scholar
- Mortazavi A, Williams BA, Mccue K, Schaeffer L, Wold B. Mapping and quantifying mammalian transcriptomes by RNA-seq. Nat Methods. 2008;5(7):621.View ArticlePubMedGoogle Scholar
- Liu L, Li Y, Li S, Hu N, He Y, Ray P, Lin D, Lu L, Maggie L. Comparison of next-generation sequencing systems. J Biomed Biotechnol. 2012;2012(7):251364.PubMedPubMed CentralGoogle Scholar
- Ratan A, Miller W, Guillory J, Stinson J, Seshagiri S, Schuster SC. Comparison of sequencing platforms for single nucleotide variant calls in a human sample. PLoS ONE. 2013;8(2):e55089.View ArticlePubMedPubMed CentralGoogle Scholar
- Quail MA, Smith M, Coupland P, Otto TD, Harris SR, Connor TR, Bertoni A, Swerdlow HP, Gu Y. A tale of three next generation sequencing platforms: comparison of Ion Torrent, Pacific Biosciences and Illumina MiSeq sequencers. BMC Genom. 2012;13(1):341.View ArticleGoogle Scholar
- Pa TH, Ariyurek Y, Thygesen HH, Vreugdenhil E, Vossen RH, de Menezes RX, Boer JM, van Ommen GJ, den Dunnen JT. Deep sequencing-based expression analysis shows major advances in robustness, resolution and inter-lab portability over five microarray platforms. Nucleic Acids Res. 2008;36(21):e141.View ArticleGoogle Scholar
- Nookaew I, Papini M, Pornputtapong N, Scalcinati G, Fagerberg L, Uhlén M, Nielsen J. A comprehensive comparison of RNA-seq-based transcriptome analysis from reads to differential gene expression and cross-comparison with microarrays: a case study in Saccharomyces cerevisiae. Nucleic Acids Res. 2012;40(20):10084–97.View ArticlePubMedPubMed CentralGoogle Scholar
- Li S, Tighe SW, Nicolet CM, Grove D, Levy S, Farmerie W, Viale A, Wright C, Schweitzer PA, Gao Y. Multi-platform assessment of transcriptome profiling using RNA-seq in the ABRF next-generation sequencing study. Nat Biotechnol. 2014;32(9):915–25.View ArticlePubMedPubMed CentralGoogle Scholar
- Fehlmann T, Reinheimer S, Geng C, Su X, Drmanac S, Alexeev A, Zhang C, Backes C, Ludwig N, Hart M. cPAS-based sequencing on the BGISEQ-500 to explore small non-coding RNAs. Clin Epigenetics. 2016;8(1):123.View ArticlePubMedPubMed CentralGoogle Scholar
- Zhu FY, Chen MX, Ye NH, Shi L, Ma KL, Yang JF, Cao YY, Zhang Y, Yoshida T, Fernie AR. Proteogenomic analysis reveals alternative splicing and translation as part of the abscisic acid response in Arabidopsis seedlings. Plant J. 2017;91(3):518–33.View ArticlePubMedGoogle Scholar
- Goodwin S, Mcpherson JD, Mccombie WR. Coming of age: ten years of next-generation sequencing technologies. Nat Rev Genet. 2016;17(6):333–51.View ArticlePubMedGoogle Scholar
- Drmanac R, Sparks AB, Callow MJ, Halpern AL, Burns NL, Kermani BG, Carnevali P, Nazarenko I, Nilsen GB, Yeung G. Human genome sequencing using unchained base reads on self-assembling DNA nanoarrays. Science. 2010;327(5961):78–81.View ArticlePubMedGoogle Scholar
- Mak SST, Gopalakrishnan S, Carøe C, Geng C, Liu S, Sinding MHS, Kuderna LFK, Zhang W, Fu S, Vieira FG. Comparative performance of the BGISEQ-500 vs Illumina HiSeq2500 sequencing platforms for palaeogenomic sequencing. Gigascience. 2017;6(8):1–13.View ArticlePubMedPubMed CentralGoogle Scholar
- Huang J, Liang X, Xuan Y, Geng C, Li Y, Lu H, Qu S, Mei X, Chen H, Yu T. A reference human genome dataset of the BGISEQ-500 sequencer. Gigascience. 2017;6(5):1–9.View ArticleGoogle Scholar
- Fu Y, Li L, Hao S, Guan R, Fan G, Shi C, Wan H, Chen W, Zhang H, Liu G. Draft genome sequence of the Tibetan medicinal herb, Rhodiola crenulata. Gigascience. 2017;6(6):1–5.View ArticleGoogle Scholar
- Chen H, Lin L, Ali F, Xie M, Zhang G, Su W. Using next-generation sequencing to detect differential expression genes in Bradysia odoriphaga after exposure to insecticides. Int J Mol Sci. 2017;18(11):2445.View ArticlePubMed CentralGoogle Scholar
- Shu XS, Zhao Y, Sun Y, Zhong L, Cheng Y, Zhang Y, Ning K, Tao Q, Wang Y, Ying Y. PBRM1 restricts the basal activity of innate immune system through repressing RIG-I-like receptor signaling and is a potential prognostic biomarker for colon cancer. J Pathol. 2017;244(1):36–48.View ArticlePubMedGoogle Scholar
- Hansen KD, Brenner SE, Dudoit S. Biases in Illumina transcriptome sequencing caused by random hexamer priming. Nucleic Acids Res. 2010;38(12):e131.View ArticlePubMedPubMed CentralGoogle Scholar
- Reyes A, Huber W. Alternative start and termination sites of transcription drive most transcript isoform differences across human tissues. Nucleic Acids Res. 2018;46(2):582–92.View ArticlePubMedGoogle Scholar
- Ritchie ME, Phipson B, Wu D, Hu Y, Law CW, Shi W, Smyth GK. limma powers differential expression analyses for RNA-sequencing and microarray studies. Nucleic Acids Res. 2015;43(7):e47.View ArticlePubMedPubMed CentralGoogle Scholar
- Xian A, Borgesrivera D, Satija R, Deluca DS, Busby MA, Berlin AM, Sivachenko A, Thompson DA, Wysoker A, Fennell T. Comprehensive comparative analysis of RNA sequencing methods for degraded or low input samples. Nat Methods. 2013;10(7):623.View ArticleGoogle Scholar
- Murashige T, Skoog F. A revised medium for rapid growth and bio assays with tobacco tissue cultures. Physiol Plant. 1962;15(3):473–97.View ArticleGoogle Scholar
- Kim D, Langmead B, Salzberg SL. HISAT: a fast spliced aligner with low memory requirements. Nat Methods. 2015;12(4):357–60.View ArticlePubMedPubMed CentralGoogle Scholar
- Pertea M, Pertea GM, Antonescu CM, Chang TC, Mendell JT, Salzberg SL. StringTie enables improved reconstruction of a transcriptome from RNA-seq reads. Nat Biotechnol. 2015;33(3):290–5.View ArticlePubMedPubMed CentralGoogle Scholar
- Trapnell C, Roberts A, Goff L, Pertea G, Kim D, Kelley DR, Pimentel H, Salzberg SL, Rinn JL, Pachter L. Differential gene and transcript expression analysis of RNA-seq experiments with TopHat and Cufflinks. Nat Protoc. 2012;7(3):562.View ArticlePubMedPubMed CentralGoogle Scholar
- Kong L, Zhang Y, Ye ZQ, Liu XQ, Zhao SQ, Wei L, Gao G. CPC: assess the protein-coding potential of transcripts using sequence features and support vector machine. Nucleic Acids Res. 2007;35:W345.View ArticlePubMedPubMed CentralGoogle Scholar
- Mckenna A, Hanna M, Banks E, Sivachenko A, Cibulskis K, Kernytsky A, Garimella K, Altshuler D, Gabriel S, Daly M. The genome analysis toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data. Genome Res. 2010;20(9):1297–303.View ArticlePubMedPubMed CentralGoogle Scholar
- Langmead B, Salzberg SL. Fast gapped-read alignment with Bowtie 2. Nat Methods. 2012;9(4):357–9.View ArticlePubMedPubMed CentralGoogle Scholar
- Li B, Dewey CN. RSEM: accurate transcript quantification from RNA-Seq data with or without a reference genome. BMC Bioinform. 2011;12(1):323.View ArticleGoogle Scholar
- Love MI, Huber W, Anders S. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol. 2014;15(12):550.View ArticlePubMedPubMed CentralGoogle Scholar
- Audic S, Claverie JM. The significance of digital gene expression profiles. Genome Res. 1997;7(10):986–95.View ArticlePubMedGoogle Scholar
- Trapnell C, Hendrickson DG, Sauvageau M, Goff L, Rinn JL, Pachter L. Differential analysis of gene regulation at transcript resolution with RNA-seq. Nat Biotechnol. 2013;31(1):46–53.View ArticlePubMedGoogle Scholar
- Wang L, Feng Z, Wang X, Wang X, Zhang X. DEGseq: an R package for identifying differentially expressed genes from RNA-seq data. Bioinformatics. 2010;26(1):136–8.View ArticlePubMedGoogle Scholar
- Robinson MD, McCarthy DJ, Smyth GK. edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics. 2010;26(1):139–40.View ArticlePubMedGoogle Scholar
- Alaimo S, Giugno R, Acunzo M, Veneziano D, Ferro A, Pulvirenti A. Post-transcriptional knowledge in pathway analysis increases the accuracy of phenotypes classification. Oncotarget. 2016;7(34):54572–82.View ArticlePubMedPubMed CentralGoogle Scholar
- Tarca AL, Draghici S, Khatri P, Hassan SS, Mittal P, Kim JS, Kim CJ, Kusanovic JP, Romero R. A novel signaling pathway impact analysis. Bioinformatics. 2009;25(1):75–82.View ArticlePubMedGoogle Scholar
- Subramanian A, Tamayo P, Mootha VK, Mukherjee S, Ebert BL, Gillette MA, Paulovich A, Pomeroy SL, Golub TR, Lander ES, et al. Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles. Proc Natl Acad Sci U S A. 2005;102(43):15545–50.View ArticlePubMedPubMed CentralGoogle Scholar
- Florea L, Song L, Salzberg SL. Thousands of exon skipping events differentiate among splicing patterns in sixteen human tissues. F1000research. 2013;2:188.PubMedPubMed CentralGoogle Scholar