SILEX: a fast and inexpensive high-quality DNA extraction method suitable for multiple sequencing platforms and recalcitrant plant species

The use of sequencing and genotyping platforms has undergone dramatic improvements, enabling the generation of a wealth of genomic information. Despite this progress, the availability of high-quality genomic DNA (gDNA) in sufficient concentrations is often a major limitation, especially for third-generation sequencing platforms. A variety of DNA extraction methods and commercial kits are available. However, many of these are costly and frequently give either low yields or low-quality DNA, unsuitable for next-generation sequencing (NGS) platforms. Here, we describe a fast and inexpensive DNA extraction method (SILEX) applicable to a wide range of plant species and tissues. SILEX is a high-throughput DNA extraction protocol, based on the standard CTAB method with DNA recovery on a silica matrix, which yields NGS-quality, high-molecular-weight genomic plant DNA free of inhibitory compounds. SILEX was compared with a standard CTAB extraction protocol and a common commercial extraction kit in a variety of species, including recalcitrant ones, from different families. In comparison with the other methods, SILEX yielded DNA in higher concentrations and of higher quality. Manual extraction of 48 samples can be done in 96 min by one person at a cost of 0.12 €/sample of reagents and consumables. Hundreds of tomato gDNA samples obtained with either SILEX or the commercial kit were successfully genotyped with Single Primer Enrichment Technology (SPET) on the Illumina HiSeq 2500 platform. Furthermore, DNA extracted from Solanum elaeagnifolium using this protocol was assessed by pulsed-field gel electrophoresis (PFGE), showing a size range suitable for most sequencing platforms that require high-molecular-weight DNA, such as Nanopore or PacBio. A high-throughput, fast and inexpensive DNA extraction protocol was developed and validated for a wide variety of plants and tissues.
SILEX offers an easy, scalable, efficient and inexpensive way to extract DNA for various next-generation sequencing applications including SPET and Nanopore among others.


Background
In the last decade, sequencing and genotyping technologies have become routine, allowing the generation of a wealth of genomic information even in non-model plant species and neglected crops [1,2]. Nowadays, genome sequencing, as well as the most common high-throughput genotyping strategies, such as Genotyping-by-Sequencing (GBS [3]), Restriction site Associated DNA Sequencing (RADseq [4]) and Single Primer Enrichment Technology (SPET [5,6]), are conducted using next-generation sequencing (NGS) platforms. However, despite these advances, DNA quality is still a major bottleneck, particularly for third-generation sequencing platforms, which require high-molecular-weight DNA free of contaminants [7]. Unlike bacteria and mammalian cells, fungi and plant cells are protected by rigid polysaccharide cell walls that hamper the extraction of unfragmented DNA [8]. Furthermore, plants produce a wide array of compounds and secondary metabolites (e.g., pigments, phenols, carbohydrates and waxes, among others) that tend to co-precipitate with the DNA and interfere with subsequent enzymatic reactions [9].
So far, the CTAB DNA extraction protocol developed by Doyle and Doyle [10] is one of the most widely used by plant researchers. Several modifications of this protocol have been implemented in order to minimize contamination by other compounds in specific tissues or species [7,11,12]. These modifications, apart from being species- or tissue-specific and frequently failing to remove interfering compounds completely, are time-consuming because of their many handling steps, and thus are not suitable for high-throughput applications [13,14].
Conversely, commercial kits based on silica matrices avoid many of these issues by optimizing the conditions so that only DNA binds to the silica surface. Contaminants such as polysaccharides, polyphenols and proteins can therefore be easily removed [15]. These kits also tend to be faster than the standard CTAB protocol, making them the preferred option for sequencing studies in which many samples must be evaluated [9,16]. Usually, commercial kits rely on the reversible interaction between DNA and a silica or silicate support, either in the form of a filter membrane or of silica-coated magnetic particles [17,18]. The adsorption of DNA to the silica surface is facilitated by buffers with low pH, high concentrations of chaotropic salts (such as guanidinium hydrochloride, guanidinium thiocyanate, or sodium iodide) and ethanol [19–22]. Under these conditions, the surface of the silica can interact with the negative surface of DNA via ionic interactions [23,24]. After several washes with high concentrations of ethanol to eliminate contaminants, DNA is generally eluted with water or TE at pH 8.0. At this higher pH value, the negatively charged silica surface and DNA repel each other, releasing the DNA [25–27].
However, commercial kits are usually expensive, with reagent costs commonly ranging between 2 and 9 US$ per sample [8,28], and often provide low yields that are insufficient for some NGS applications [8,29].
In this study, we present a novel, fast and inexpensive DNA extraction protocol that combines CTAB-based extraction with purification on a silica matrix. The new method was assessed on different species, including recalcitrant ones, and on different tissues. To test its suitability for different NGS applications, the method was compared with a commercial kit for Single Primer Enrichment Technology (SPET) genotyping [6]. The method was also used to extract high-molecular-weight DNA from a recalcitrant wild species (Solanum elaeagnifolium). The DNA obtained was successfully used to construct long insert size Nanopore libraries for a de novo genome assembly, which can be difficult for recalcitrant species [40], thus proving its suitability for third-generation sequencing platforms.
We demonstrate that this new method combines the advantages of commercial kits (high-quality DNA, speed and applicability to a broad range of species) with those of a CTAB-based method (high yield and low cost), making it suitable for routine DNA screening and NGS platforms.

Plant material
To test our proposed protocol (hereafter named SILEX, for SILica matrix EXtraction), leaf and fruit tissue from non-recalcitrant species and leaf tissue from recalcitrant species were sampled for four different trials. In a first trial, leaf tissue from a total of 1860 accessions of tomato (S. lycopersicum) and its wild relatives was extracted to compare the quality, quantity and integrity of DNA extracted using SILEX and the commercial sbeadex maxi plant kit (hereafter SMP kit; LGC Genomics, Teddington, UK) for SPET genotyping. Extractions were performed on different days over several months.
In a second trial, in order to evaluate the suitability of SILEX for different plant tissues, 50 mg of fresh and 30 mg of lyophilized fruit tissue of tomato, eggplant (S. melongena) and pepper (Capsicum annuum) were extracted. The fruit tissue was collected, immediately frozen in liquid N2 and lyophilized.
In a third trial, the suitability of SILEX for DNA extraction in recalcitrant species was assessed using leaf tissue of six species: cassava (Manihot esculenta), grapevine (Vitis vinifera), loquat (Eriobotrya japonica), banana (Musa × paradisiaca), naranjillo (Solanum bonariense), and strawberry (Fragaria × ananassa), selected to represent a wide range of recalcitrant species presenting different contaminants and secondary metabolites that interfere with DNA extraction. Extractions from recalcitrant plants made with SILEX were compared with those carried out using the standard CTAB protocol [10] and the commercial SMP kit following the manufacturer's instructions. Finally, the suitability of SILEX to extract clean, high-molecular-weight DNA for third-generation sequencing was assessed in the silverleaf nightshade (S. elaeagnifolium), a wild relative of eggplant [41], which we selected because of the difficulty of obtaining contaminant-free DNA due to its high phenolic content [42].

[...] should be the binding buffer and 60% should be absolute ethanol.
7. Add 20 µl of silica matrix buffer and mix gently for 5 min (by hand or using an orbital shaker).
8. Spin down the silica for 5 to 6 s and discard the supernatant by decantation.
NOTE: Longer centrifugation times will make it difficult to resuspend the silica in the subsequent steps.
9. Add 700 µl of washing buffer and shake gently by hand until a uniform dispersion of the silica is obtained.
10. Spin down the silica for 5 to 6 s, gently discard the supernatant by decantation and let dry at room temperature for 5 min.
NOTE: Make sure that all ethanol has completely evaporated.
11. Add 100 µl of elution buffer, shake gently by hand until the pellet is resuspended and incubate for 5 min at 65 °C.
12. Centrifuge at 14,000 rpm for 10 min at room temperature and transfer 90 µl of the supernatant to a new tube.

DNA concentration and quality
DNA integrity was checked by electrophoresis on a 0.8% agarose gel (Condalab, Madrid, Spain) in 1X TAE buffer (GenoChem World, Valencia, Spain) stained with GelRed® (Biotium, Fremont, CA, USA) at a constant voltage of 100 V for 50 min. A Gel Doc XR+ System transilluminator (Bio-Rad, Hercules, CA, USA) was used to visualize the agarose gels. For high-molecular-weight DNA, size and integrity were tested by pulsed-field gel electrophoresis (PFGE), run at 3.3 V/cm in 15-second cycles at an angle of 120° for 24 h at 4 °C on 0.8% agarose in TB buffer.
DNA yield and quality were measured spectrophotometrically using a NanoDrop™ ND-1000 (Thermo Scientific, Waltham, MA, USA). The A260/A280 and A260/A230 ratios were measured to determine protein and polysaccharide contamination, respectively. DNA was also quantified with a Qubit™ 2.0 Fluorometer (Thermo Scientific, Waltham, MA, USA). An aliquot of 2 µl of each sample was examined using the Qubit™ dsDNA BR Assay Kit (Thermo Scientific, Waltham, MA, USA) according to the manufacturer's instructions.
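As a rough illustration of how such spectrophotometric readings translate into a concentration and the two purity ratios, the following Python sketch may help. It assumes the commonly used conversion of 50 ng/µl per A260 unit for double-stranded DNA at a 1 cm path length. The function name `purity_report`, the `PurityReport` container and the example absorbances are hypothetical and are not taken from the instrument software or from this study.

```python
from dataclasses import dataclass

# Assumed conversion for double-stranded DNA: one A260 unit at a 1 cm
# path length corresponds to roughly 50 ng/ul.
DSDNA_NG_PER_UL_PER_A260 = 50.0


@dataclass
class PurityReport:
    conc_ng_per_ul: float  # estimated dsDNA concentration
    a260_a280: float       # low values suggest protein contamination
    a260_a230: float       # low values suggest salt/carbohydrate contamination


def purity_report(a230: float, a260: float, a280: float,
                  dilution: float = 1.0) -> PurityReport:
    """Summarise one spectrophotometric reading (hypothetical helper)."""
    if min(a230, a260, a280) <= 0:
        raise ValueError("absorbances must be positive")
    return PurityReport(
        conc_ng_per_ul=a260 * DSDNA_NG_PER_UL_PER_A260 * dilution,
        a260_a280=a260 / a280,
        a260_a230=a260 / a230,
    )


# Made-up absorbances, for illustration only.
report = purity_report(a230=1.0, a260=2.0, a280=1.0)
print(report)
```

Because any molecule absorbing at 260 nm contributes to the estimate, a calculation of this kind cannot distinguish dsDNA from RNA or free nucleotides, which is why a fluorometric measurement is used alongside it.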
In addition, the concentration of DNA obtained from the 1860 tomato samples was measured fluorometrically using the Quant-iT™ PicoGreen™ dsDNA Assay Kit (hereafter PicoGreen; Thermo Scientific, Waltham, MA, USA) and a VICTOR3 1420 96-well plate reader (PerkinElmer, Waltham, MA, USA) equipped with an F485 excitation filter and an F535 emission filter.
To check the suitability of the DNA extraction method for sequencing applications in which DNA is fragmented, approximately 1 µg of DNA was digested with the restriction enzyme EcoRI (New England Biolabs, Ipswich, MA, USA) for 1 h at 37 °C, followed by 20 min at 65 °C. The digestion was evaluated by 1% agarose gel electrophoresis as above.

High-throughput genotyping quality check
For the first trial, sequencing of tomato samples for genotyping by SPET was performed on an Illumina HiSeq 2500 platform (Illumina Inc., San Diego, CA, USA), following the manufacturer's protocol. Phred values were obtained using FastQC version 0.11.8 and plotted in R [43] using the package ggplot2 [44].

Suitability of extracted DNA for third-generation sequencing platforms
For the fourth trial, 5 µg of S. elaeagnifolium DNA from a single extraction were size-selected using the Circulomics SRE-XL-Kit (Circulomics Inc., Baltimore, MD, USA). For library preparation, 1 µg of the size-selected DNA was used to prepare each of three Nanopore LSK-109 libraries. Two of these libraries were sequenced on a MinION R9.4.1 (Oxford Nanopore, Oxford, UK) and the third was loaded on a PromethION PRO-002 (Oxford Nanopore, Oxford, UK). All three sequencing runs were basecalled with Oxford Nanopore's Guppy basecaller version 3.2.2 (Oxford Nanopore, Oxford, UK) using the high-accuracy basecalling models.

Tomato leaf samples
Total DNA yield obtained with the SMP kit and estimated by NanoDrop ranged from 14.5 ng/mg to 366.9 ng/mg, with a mean of 38.3 ng/mg and a standard deviation (SD) of 29.2 ng/mg. DNA extracted by SILEX gave higher yields, ranging from 86.1 ng/mg to 1698.1 ng/mg, with a mean of 382.9 ng/mg and an SD of 205.3 ng/mg (Table 1). Despite its higher SD, the coefficient of variation (CV) of SILEX (53.6%) was lower than that of the SMP kit (76.1%).
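The CV comparison above can be reproduced from the reported summary statistics. The short Python sketch below assumes the usual definition, CV = 100 × SD / mean; the helper name `coefficient_of_variation` is hypothetical, and the inputs are the rounded means and SDs quoted in the text rather than the raw per-sample data.

```python
def coefficient_of_variation(mean: float, sd: float) -> float:
    """Return the CV as a percentage: 100 * SD / mean."""
    if mean <= 0:
        raise ValueError("mean must be positive")
    return 100.0 * sd / mean


# Mean and SD of NanoDrop yields (ng/mg), as quoted in the text.
nanodrop_yield = {"SMP kit": (38.3, 29.2), "SILEX": (382.9, 205.3)}

for method, (mean, sd) in nanodrop_yield.items():
    cv = coefficient_of_variation(mean, sd)
    print(f"{method}: CV = {cv:.1f}%")
```

Recomputing from rounded values gives figures within about 0.1 percentage points of those reported, which illustrates why a larger absolute SD can still correspond to a smaller relative dispersion when the mean is roughly ten times higher.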
The A260/A280 ratio, which indicates protein contamination, was highly variable with the SMP kit protocol, ranging from 1.15 to 2.32, with an average of 1.76 and an SD of 0.33 (Fig. 1a). In contrast, SILEX showed a more consistent ratio with less variation (from 1.91 to 2.12), with an average value of 2.03 and an SD of 0.05. Similarly, for the A260/A230 ratio, which indicates salt and carbohydrate contamination, the SMP kit showed greater dispersion, with a ratio between 0.27 and 2.43, an average of 1.09 and an SD of 0.55, compared to SILEX, which ranged from 1.16 to 2.16 with an average of 1.66 and an SD of 0.25 (Fig. 1b; Table 2).
Since spectrophotometric measurements with NanoDrop tend to overestimate DNA yield, probably because of interference from proteins [45], those measurements were compared with the fluorometric ones performed with PicoGreen. Yields estimated by the latter ranged from 1.2 ng/mg to 134.8 ng/mg, with a mean of 41.7 ng/mg and an SD of 26.4 ng/mg, for DNA extracted with the SMP kit. SILEX gave higher yields, ranging from 37.9 ng/mg to 231.2 ng/mg, with a mean of 141.3 ng/mg and an SD of 36.8 ng/mg (Table 1). In addition, yields estimated by PicoGreen varied more between samples for DNA extracted with the SMP kit (CV of 63.4%) than for DNA extracted with SILEX (CV of 26.1%).

Table 2 Mean value, standard deviation (SD), range and coefficient of variation (CV) of NanoDrop absorbance ratios (A260/A280 and A260/A230) using SILEX and SMP kit
To assess the overestimation of DNA yield with the different protocols, we compared the yields estimated by NanoDrop with those estimated by PicoGreen. For DNA extracted with the SMP kit, the NanoDrop estimate was 0.9-fold that of PicoGreen, suggesting that for this commercial kit NanoDrop measurements were comparable with the PicoGreen ones. In contrast, NanoDrop measurements from SILEX tended to overestimate DNA yield 2.7-fold compared to PicoGreen, suggesting contamination with a molecule absorbing at 260 nm. One possible explanation for this overestimation is that remnants of degraded RNA were present in our samples, since NanoDrop is unable to discriminate among free nucleotides, RNA, single-stranded DNA, and double-stranded DNA. However, even allowing for this overestimation, the average yield obtained with SILEX (141.3 ng/mg with PicoGreen) was 3.4 times higher than with the SMP kit (41.7 ng/mg).
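The fold values in this comparison follow directly from the mean yields reported for the tomato leaf samples. The Python sketch below recomputes them from the rounded means quoted in the text; the dictionary layout and the helper `fold` are illustrative choices, not part of the original analysis.

```python
# Mean yields (ng/mg) reported in the text for the tomato leaf samples.
MEAN_YIELD = {
    "SMP kit": {"nanodrop": 38.3, "picogreen": 41.7},
    "SILEX": {"nanodrop": 382.9, "picogreen": 141.3},
}


def fold(numerator: float, denominator: float) -> float:
    """Fold difference between two mean yields."""
    if denominator <= 0:
        raise ValueError("denominator must be positive")
    return numerator / denominator


# NanoDrop estimate relative to the PicoGreen estimate, per method.
for method, y in MEAN_YIELD.items():
    ratio = fold(y["nanodrop"], y["picogreen"])
    print(f"{method}: NanoDrop/PicoGreen = {ratio:.1f}-fold")

# SILEX relative to the SMP kit, using the fluorometric (PicoGreen) means.
silex_vs_kit = fold(MEAN_YIELD["SILEX"]["picogreen"],
                    MEAN_YIELD["SMP kit"]["picogreen"])
print(f"SILEX vs SMP kit (PicoGreen) = {silex_vs_kit:.1f}-fold")
```

Using the fluorometric means for the between-method comparison keeps the 3.4-fold figure independent of any spectrophotometric overestimation.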

DNA extraction from dry and fresh fruit tissues
The amount of DNA obtained from dry and fresh fruit tissues was similar to that achieved using leaf tissue and ranged from 116.4 to 920.3 ng/mg. In general, higher DNA yields were obtained with lyophilized tissue (Table 3). Regardless of the tissue used, A260/A280 ratios were above 2.0, which indicates no protein contamination. In contrast, A260/A230 ratios were lower, suggesting the presence of some organic contaminants. Despite these ratios, the DNA obtained was successfully digested by the EcoRI restriction enzyme.
Table 3 Yield and quality control of DNA extracted from lyophilized and fresh fruit tissue of tomato, eggplant and pepper, measured using NanoDrop

DNA extraction from recalcitrant species
Overall, SILEX resulted in higher DNA yields, ranging (by fluorometric determination) from 46.4 ng/mg in strawberry to 318.0 ng/mg in grapevine, than those obtained with the standard CTAB method or the SMP kit (Table 4). In addition, A260/A280 ratios obtained with SILEX were above 2.0, which is considered indicative of protein-free DNA, except in strawberry, where values averaged 1.86 but were still higher than with the standard CTAB protocol and the SMP kit. Likewise, SILEX A260/A230 ratios were higher than in the two other protocols, from 1.71 in loquat to 2.16 in banana, except in strawberry, where the results were similar to those of the standard CTAB protocol and the SMP kit (Table 4). The differences were very noticeable in banana and grapevine, where the A260/A230 ratios were 2.7- and 4.3-fold lower for the standard CTAB protocol and 1.7- and 3.3-fold lower for the SMP kit, respectively (Table 4). The NanoDrop/Qubit ratio estimated with SILEX for recalcitrant species seemed to be species-dependent, ranging from 1.4-fold in grapevine to 6.6-fold in naranjillo, with a mean of 3.9-fold. Although the SMP kit provided lower NanoDrop/Qubit ratios, SILEX performed better than the standard CTAB protocol, which had an average NanoDrop/Qubit ratio of 18-fold.
In order to test whether the presence of contaminants could inhibit enzyme activity, DNA was digested with the restriction enzyme EcoRI. Agarose gels, such as the one shown in Fig. 2, indicated efficient endonuclease activity in all the DNA extracted from the six recalcitrant species, even though in some cases A260/A230 ratios were below 1.8 (strawberry and loquat). Also, strawberry samples showed yellow and brown coloration and high viscosity even after two washing steps.

SILEX timing and cost
The time needed to extract 48 samples (without taking into account the sampling of the plant material) is approximately 96 min (around 2 min/sample; Fig. 3). The estimated cost of all consumables required to extract high-molecular-weight gDNA using SILEX is approximately 0.12 € per sample (Additional file 1: Additional data S1).
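To give a feel for how these per-batch figures scale to a larger project, the Python sketch below extrapolates hands-on time and consumable cost. It rests on simplifying assumptions that are not part of the study: batches of 48 are processed back to back by one person, and a partial batch takes as long as a full one. The function name `estimate_batch_effort` is hypothetical.

```python
import math
from typing import Tuple

BATCH_SIZE = 48             # samples per manual batch (from the text)
BATCH_MINUTES = 96          # approximate time per batch (from the text)
COST_EUR_PER_SAMPLE = 0.12  # reagents and consumables (from the text)


def estimate_batch_effort(n_samples: int) -> Tuple[int, float]:
    """Return (minutes, cost in EUR) for n_samples.

    Assumption: a partial batch takes as long as a full batch of 48.
    """
    if n_samples < 0:
        raise ValueError("n_samples must be non-negative")
    batches = math.ceil(n_samples / BATCH_SIZE)
    minutes = batches * BATCH_MINUTES
    cost_eur = round(n_samples * COST_EUR_PER_SAMPLE, 2)
    return minutes, cost_eur


# Example: the 1380 tomato samples extracted with SILEX in the first trial.
minutes, cost = estimate_batch_effort(1380)
print(f"1380 samples: about {minutes / 60:.1f} h, {cost:.2f} EUR in consumables")
```

The estimate excludes tissue sampling and any equipment or labour costs, in line with how the per-sample figures are defined in the text.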

High-throughput genotyping platforms
In order to evaluate the suitability of the gDNA obtained with SILEX for high-throughput genotyping, 1380 tomato samples were genotyped using SPET [6]. The reads obtained showed excellent Phred-quality scores along the 150 bp, with a mean value of 33.1 (Fig. 4). Similar results were obtained with 480 samples extracted using the SMP kit with a mean value of 33.5. The mean Phred score along the 150 bp sequenced was always over 30, indicating good sequencing quality in both methods, with the SILEX method providing more DNA per equal amount of tissue.
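For readers less familiar with Phred scores, the following Python sketch converts a quality score into a per-base error probability and accuracy using the standard definition Q = -10 · log10(P). The function names are hypothetical. Note that converting a mean Phred score in this way is only an approximation of the mean per-base accuracy of a run.

```python
def phred_to_error_probability(q: float) -> float:
    """Standard Phred definition: Q = -10 * log10(P_error)."""
    return 10 ** (-q / 10)


def phred_to_accuracy_percent(q: float) -> float:
    """Base-call accuracy (%) implied by a Phred score."""
    return 100.0 * (1.0 - phred_to_error_probability(q))


# Q30 threshold and the mean scores reported for the two extraction methods.
for label, q in [("Q30 threshold", 30.0),
                 ("SILEX mean", 33.1),
                 ("SMP kit mean", 33.5)]:
    acc = phred_to_accuracy_percent(q)
    print(f"{label}: Q{q} -> {acc:.3f}% accuracy")
```

On this scale a score of 30 corresponds to one expected error per 1000 bases, so the mean scores above 33 obtained with both methods sit comfortably above that threshold.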

High molecular weight DNA extraction
To test the suitability of using SILEX for NGS platforms requiring high-molecular-weight DNA, S. elaeagnifolium DNA was size-selected using the Circulomics short read eliminator kit, recovering 3.5 µg, and analysed using pulsed-field gel electrophoresis (PFGE) (Fig. 5).

Discussion
One of the main advantages of the SILEX protocol for DNA extraction is its simplicity and its use of common, inexpensive reagents. No toxic salts such as guanidinium thiocyanate or sodium iodide at high concentrations are used. Several authors have reported that NaCl at concentrations higher than 2 M facilitates DNA binding to the silica surface [46–48]. It is also known that the addition of polyethylene glycol (PEG) to the binding solution increases adsorption because of the compact globular structure that DNA adopts under these conditions [49]. For these reasons, we use a binding buffer composed of NaCl and PEG, both non-toxic and inexpensive, to facilitate DNA binding to the silica surface. The total cost of reagents and consumables is only 0.12 € per sample and, for multiple simultaneous manual extractions, each sample requires about 2 min per person. In this respect, the silica matrix used for each SILEX extraction costs less than 0.001 € and the washing buffer is only water and ethanol, a common non-toxic reagent in most molecular biology laboratories.
The protocol presented here has been tested on many samples of different species with similarly satisfactory results, confirming its wide applicability. The quality and quantity parameters obtained also indicate that SILEX is at least as effective as commercial kits, even with recalcitrant species. In recalcitrant species, the presence of polysaccharides and phenols was markedly lower with SILEX than with the standard CTAB protocol, in which several samples showed yellow and brown coloration and high viscosity, indicating the presence of oxidized polyphenols and a high concentration of polysaccharides. One of the reasons for this difference could be the absence of a precipitation step in SILEX, as polysaccharides and polyphenols tend to co-precipitate with DNA when isopropanol or ethanol is added [50]. This is important since polysaccharides such as carrageenan, pectin and xylan are strong inhibitors of PCR [51,52]. We also observed that in species with very high polyphenol and polysaccharide contents, such as strawberry [53], a second washing step increases the A260/A230 ratio. Our DNA extraction protocol has been tested by other research groups and has been found to provide high-quality DNA in high concentrations in other plant species as different as silver fir (Abies alba), watermelon (Citrullus lanatus), melon (Cucumis melo), [...]
Although DNA is usually extracted from fresh leaf tissue, it is sometimes necessary to use other types of material such as fresh or freeze-dried fruit. Our protocol was flexible enough to extract high quantities of DNA from lyophilized and fresh fruit tissues, with A260/A280 ratios above 2.0.
Thousands of samples of tomato and wild relatives were successfully genotyped using SPET high-throughput genotyping, which relies on DNA fragmentation, target probe annealing, PCR amplification and NGS sequencing [6]. The reads produced had a mean Phred value over 30, which represents a base call accuracy of 99.9%. Also, hundreds of samples of grapevine and watermelon were genotyped using GBS, with similar results (C. Esteras, personal communication). This indicates that SILEX yields DNA of sufficient quality for use in different genotyping platforms. Furthermore, our DNA extraction method could be used in applications requiring high-molecular-weight genomic DNA, such as long-read single-molecule Nanopore sequencing [54], without any additional steps.