Biases underlying species detection using fluorescent amplified-fragment length polymorphisms yielded from roots

Background Roots of different plant species are typically morphologically indistinguishable. Of the DNA-based techniques, fluorescent amplified-fragment length polymorphisms (FAFLPs) are considered reliable, high throughput, inexpensive methods to identify roots from mixed species samples. False-negatives, however, are not uncommon and their underlying causes are poorly understood. We investigated several sources of potential biases originating in DNA extraction and amplification. Specifically, we examined the effects of sample storage, tissue, and species on DNA yield and purity, and the effects of DNA concentration and fragment size on amplification of three non-coding chloroplast regions (trnT-trnL intergenic spacer, trnL intron, and trnL-trnF intergenic spacer). Results We found that sample condition, tissue and species all affected DNA yield. A single freeze–thaw reduces DNA yield, DNA yield is less for roots than shoots, and species vary in the amount of DNA yielded from extractions. The effects of template DNA concentration, species identity, and their interaction on amplicon yield differed across the three chloroplast regions tested. We found that the effect of species identity on amplicon production was generally more pronounced than that of DNA concentration. Though these factors influenced DNA yield, they likely do not have a pronounced effect on detection success of fragments and only underscore the restriction on the use of FAFLPs for measuring species presence rather than their abundance. However, for two of the regions tested—the trnT-trnL intergenic spacer and the trnL intron—size-based fragment competition occurred and the likelihood of detection was higher for smaller than larger fragments. This result reveals a methodological bias when using FAFLPs. 
Conclusions To avoid potential bias with the use of FAFLPs, we recommend users check for the disproportionate absence of species detected belowground versus aboveground as a function of fragment size, and explore other regions, aside from the trnT-trnL intergenic spacer and trnL intron, for amplification. Electronic supplementary material The online version of this article (doi:10.1186/s13007-015-0079-1) contains supplementary material, which is available to authorized users.


Background
The analysis of plant communities in relation to abiotic and biotic factors has mostly emerged from studying aboveground responses. Roots, however, are the primary organ of water and nutrient acquisition, account for a major portion of primary production, and may mediate aboveground coexistence and diversity from their interactions with other plants and organisms occurring in soils [1,2]. How roots are organized belowground compared with the spatial organization of aboveground organs has only recently been investigated [3][4][5][6]. Our understanding of ecological factors governing root placement, foraging and function lags far behind that of aboveground plant organs owing to practical difficulties with sampling and identifying roots to the level of species. Roots of different species are typically morphologically indistinguishable. A variety of methods (e.g., morphological, chemical, spectroscopic and fluorescent) exist to identify roots, however, DNA-based molecular markers hold the most promise because they do not vary depending on abiotic and biotic conditions [7]. Specifically, molecular identification using fluorescent amplified-fragment length polymorphisms (FAFLPs) is considered a reliable and inexpensive method to determine species identity of roots in soils [8,9]. This method has been advocated for use in species-rich plant communities where high sample throughput is required for analysis, and multiple species co-occur within a sample (i.e., mixed template). Sequence-based markers are used to differentiate species using size differences in fluorescently labelled PCR amplicons. Size profiles derived from roots are then compared to those developed from leaves of identified species occurring within a species pool defined by the user. For instance, using two non-coding regions of cpDNA in roots, 80% of 95 plants in a fescue grassland were identified [10]. 
Despite this success, often species recorded aboveground are not detected belowground (i.e., false-negatives), and the reasons for this are poorly understood [11].
Aside from actual differences between above and belowground plant richness [5,12], the molecular methods used for FAFLPs may give rise to false-negatives. In particular, false-negatives may originate in DNA extraction, amplification, and quantification of amplified fragments. Of course, differences in the abundance of roots occurring in soils affect the amount of root DNA, and this relationship has been investigated towards developing quantitative real-time PCR methods and detection thresholds [13,14]. Regarding DNA extraction, chemicals present in plant material vary by age, tissue and species, and these differences in underlying chemistry can affect DNA yields [11,15-17]. In addition to chemistry, the 'freshness' of material and differences in sample storage can also influence DNA extraction. Though sample condition has been tested on roots of herbaceous species [18], woody tree roots often require mechanical treatment to disrupt cell walls; thus they may differ from herbaceous species in their sensitivity to extraction and storage conditions. Of the limited research investigating how plant material and DNA concentration affect DNA yield from roots, most has been with species from a particular ecosystem (e.g. grassland or forest) and not for use in FAFLPs. With global change and shifts in human land-use, many ecosystems are in a state of conversion [19]; consequently, comparisons between life forms (e.g. trees versus grasses) are increasingly necessary. While other molecular methods are available to identify mixed template, in particular next generation sequencing, these methods are expensive compared with FAFLPs, informatics-intensive, and not necessarily the desired tool when sample numbers are high, as is the usual case for community studies.
Following DNA extraction, subsequent biases can arise in amplification. For example, differences across species and in root diameter influenced the recovery efficiency of species present in clone libraries [20]. Several studies have reported that the complexity and intensity of fragments decreases with DNA concentration, indicating that fragments are not randomly distributed at low DNA concentrations [21,22]. One explanation for this finding is that high species richness of DNA template may induce competition among primers, thus selecting for short fragments. Taggart et al. [10] tested for primer competition using trials of mixed samples containing between 4 and 16 grassland species. In that study, the increase in false-negatives in samples of higher species diversity was attributed to a higher probability of 'difficult' species present in samples of higher diversity rather than a result of primer competition. However, neither fragment size nor template concentration was considered in that study. False-negatives may also be present post-amplification as fluorescently labeled amplicons are injected into capillaries; shorter fragments have greater mobility into the capillaries and therefore may be more detectable than longer fragments. This bias was tested by combining amplicons yielded from seven boreal tree species (and thus of different sizes) post-amplification in trials of one, two, four, and six species; no evidence of fragment competition was discerned [8].
In short, plant species vary in abundance, either through the amount of roots present in a soil sample or through the DNA they yield. Users employ different strategies to conserve samples, potentially affecting DNA quality and quantity. Species vary in sizes of target DNA, which may be preferentially amplified based on size. Collectively, no single study has assessed these sources of bias with use of FAFLPs. We focussed our work on two stages: DNA extraction and amplification. To identify biases arising with DNA extraction, we tested the effects of species, tissue and sample storage on DNA yields and purity with species from forests prone to invasion by grassland species. Using these same species combined with others from a similar ecosystem, we tested biases arising with amplification. Specifically, we tested how species and DNA concentration affected amplicon yield, and as a result, detection thresholds. We also tested how DNA concentration and fragment size affected fragment presence by creating known mixtures of plant species and manipulating the abundance of a given component of the mixture. Taken together, this study highlights methodological issues affecting false-negatives in the species identification of roots using FAFLPs.

DNA extraction: testing for differences in DNA yield and purity by tissue, species and sample condition
Species, tissue, and sample condition all significantly affected DNA yield (Additional file 1: Table S3). Overall DNA yield was higher for Bromus inermis (199 ± 15 ng mg⁻¹ dry tissue) than Populus tremuloides (74 ± 7 ng mg⁻¹ dry tissue). Across both species, roots yielded less DNA than leaves (Figure 1). The difference in yield between leaves and roots was proportionally less for Bromus inermis than Populus tremuloides. DNA yield of roots was 66% and 25% that of leaves for Bromus inermis and Populus tremuloides, respectively. Thawing frozen samples generally decreased DNA yields; however, the number of freeze-thaw cycles did not have a pronounced effect on DNA quantity (Figure 1). Regardless of species, there was no effect of sample condition on DNA purity measured as A260/A280 absorbance ratios (Additional file 1: Table S4). For Populus tremuloides, the mean A260/A280 absorbance ratio was higher for DNA extracted from roots (1.86 ± 0.026) than from leaves (1.74 ± 0.0121). Absorbance ratios were similar for DNA extracted from leaves and roots of Bromus inermis (1.81 ± 0.020 and 1.83 ± 0.007, respectively).

Amplification: testing the effect of species and DNA concentration on amplicon yield
Between root and leaf tissue, fragment sizes varied within one base pair, i.e., fragments yielded from root and leaf tissue of a single individual were effectively sized the same. Across the three regions, fragments differed in size between Populus tremuloides and Bromus inermis (Table 1), and some intraspecific variation was present in fragments of the trnL intron for Populus tremuloides. We found that for the trnL-trnF intergenic spacer, individuals of Bromus inermis expressed two fragments (Table 1). In DNA extracted from leaves, fragments sized 394-395 base pairs (bp) were eight times less abundant than fragments sized 443 bp [t(10) = −6.28, P < 0.001]. However, in DNA extracted from roots, fragments sized 394-395 bp occurred in equal abundance to fragments sized 443 bp [t(10) = −0.46, P = 0.66]. Amplification of DNA extracted from frozen-thawed samples produced amplicons of the same fragment lengths as those from fresh samples (data not shown) for all three chloroplast non-coding regions.
The effects of DNA concentration, species, and their interaction on amplicon yield differed by region. Amplicon yield of the trnT-trnL intergenic spacer was affected by the interaction between DNA concentration and species (Additional file 1: Table S5). Amplicon yield of Bromus inermis increased with DNA concentration, whereas that of Melilotus officinalis and Populus tremuloides did not (Figure 2). Species and DNA concentration independently affected amplicon yield of the trnL intron (Additional file 1: Table S6). There was a weak, positive relationship between DNA concentration and amplicon yield (R² < 0.1; data not shown). Mean amplicon yield differed by an order of magnitude across species; Melilotus officinalis had the highest yield (62,672 ± 329 rfu), followed by Bromus inermis (11,987 ± 957 rfu), and Populus tremuloides (2,739 ± 361 rfu). As with amplicons of the trnL intron, species identity had the most pronounced effect on amplicon yield of the trnL-trnF intergenic spacer (Additional file 1: three chloroplast regions, fragment sizes differed across known plant species (Table 1).
Regarding species pools assembled with DNA extracted from roots of unidentified species spiked with DNA extracted from leaves of known species, for the trnT-trnL intergenic spacer, no fragments were detected at the lowest concentration, 4.5 ng μL⁻¹ (Table 3). Increasing the concentration of DNA to 45.5 ng μL⁻¹ resulted in nearly 100% successful detections. For the trnL intron, at DNA concentrations of 0.9 ng μL⁻¹, detections of fragments were 100% successful (Table 3). Detection success was unaffected by the concentration gradient for most fragments of the trnL intron, though some fragments disappeared with a decrease in DNA concentration. Detections were 100% successful for the trnL-trnF intergenic spacer; however, it is possible that there were species present in the mixed root samples that went undetected across all trials.
When data were pooled across all created communities for trnT-trnL intergenic spacer, the logistic regression model was statistically significant [χ²(3) = 60.73, P < 0.001, R² = 0.50], and both fragment size (P = 0.022) and its interaction with DNA concentration (P = 0.012) influenced detection success. At low concentrations of DNA, smaller fragments were more likely to be detected than larger fragments. When data were pooled across all created communities for the trnL intron chloroplast region, the logistic regression model was statistically significant [χ²(3) = 43.8, P < 0.001, R² = 0.40]; fragment size (P = 0.001) alone affected detection success. The likelihood of detection was higher for small than large fragments. Due to the relative homogeneity in detection success, we did not perform a logistic regression on data pooled for trnL-trnF intergenic spacer.

Table 1 Fragment sizes of three regions of chloroplast DNA
Fragment size was measured in base pairs for plant species (n = 6) common in western Canada.
n/a unsuccessful amplifications. a Unpublished values determined by the authors in previous trials using same conditions of extraction, amplification and fragment analysis.
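The logistic regression models reported above can be written out explicitly. The form below is a sketch that assumes the three model degrees of freedom correspond to fragment size, DNA concentration and their interaction; the symbols S, C and the coefficients are notation introduced here for illustration and do not appear in the original analysis.

```latex
\[
\operatorname{logit}\bigl(\Pr(\text{detected})\bigr)
  = \ln\frac{\Pr(\text{detected})}{1-\Pr(\text{detected})}
  = \beta_0 + \beta_1 S + \beta_2 C + \beta_3\,(S \times C)
\]
% S: fragment size (bp); C: DNA template concentration (ng per microlitre).
% A negative \beta_1 corresponds to a lower likelihood of detection for larger fragments;
% a non-zero \beta_3 means the size effect changes with template concentration.
```

Under this reading, the trnT-trnL result corresponds to both the size term and the interaction term contributing, and the trnL intron result to the size term alone.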

Discussion
To identify methodological sources of discrepancies between plant species occurring above and belowground (false-negatives), we investigated the effects of species, DNA template concentration and fragment size on biases associated with amplification, in addition to determining the effects of sample storage, and species differences in extraction of DNA. Below we discuss how each step of the work stream may affect species detections using FAFLPs (Table 4), and provide recommendations.

Biases with DNA extraction
We found evidence for several possible causes of false-negatives at the extraction step; however, none stand out as being particularly troublesome. First, storage of samples affected DNA yield; specifically, freezing and thawing reduced yield. Sample preservation is well known to affect total concentration of extracted DNA. Second, species differed in the amount of DNA yielded from extractions. Specifically, DNA yield was higher for Bromus inermis than Populus tremuloides. DNA yields may differ between species due to differences in gene composition, genome size, the presence of inhibitory substances (e.g. phenolics and polysaccharides), age of tissue, size of roots, and number of cells present in tissue. Ours is not the first report of variation among species in DNA yield from roots [e.g. 13,16,20]. In studies on mixed root samples, discrepancies between above and belowground species detections may be in part due to differences in extraction efficiency between species. For instance, we found that the DNA yield of Populus tremuloides was 74 ng mg⁻¹ (mean from roots and shoots combined), approximately 40% that of Bromus inermis. This difference in extraction efficiency between the two species effectively increases the amount of tissue material required to equalize DNA concentrations, and the effective increase compounds with amplification requirements (Figure 3). For DNA markers which require high template concentrations, i.e., the trnT-trnL intergenic spacer, interspecific differences in extraction efficiencies may give rise to higher rates of false-negatives than markers which require lower template concentrations. Template inhibition, where single-stranded template molecules hybridize with each other rather than binding with primers, likely sets an upper limit on the concentration of DNA template permitting amplification [23]. For instance, Fisk et al. [20] found that roots making up a small fraction of mass in mixtures had disproportionately higher sequence representation in clone libraries relative to those making up larger fractions. Species with an initially low concentration of DNA may be selected for amplification, resulting in bias in the final product against the initially abundant species.
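The consequence of unequal extraction efficiency for the amount of tissue needed can be illustrated with a short calculation. This is a minimal sketch: the yields are the mean values reported in this study, while the 1000 ng target and the function name are hypothetical choices made for illustration.

```python
# Mean DNA yields reported in this study (ng DNA per mg dry tissue).
MEAN_YIELD_NG_PER_MG = {
    "Bromus inermis": 199.0,
    "Populus tremuloides": 74.0,
}


def tissue_mass_mg(target_dna_ng, yield_ng_per_mg):
    """Dry tissue mass (mg) needed to recover target_dna_ng of DNA."""
    if yield_ng_per_mg <= 0:
        raise ValueError("yield must be positive")
    return target_dna_ng / yield_ng_per_mg


# Hypothetical target (an assumption, not a value from the study).
TARGET_NG = 1000.0

masses = {
    species: tissue_mass_mg(TARGET_NG, y)
    for species, y in MEAN_YIELD_NG_PER_MG.items()
}
fold_more_populus = masses["Populus tremuloides"] / masses["Bromus inermis"]

for species, mass in masses.items():
    print(f"{species}: {mass:.1f} mg dry tissue")
print(f"Fold difference in tissue required: {fold_more_populus:.1f}")
```

With these mean yields, roughly 2.7 times as much Populus tremuloides tissue as Bromus inermis tissue would be needed to recover the same amount of DNA.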

Biases with DNA amplification
Another cause which might underlie false-negatives is that for a given DNA template concentration, amplicon yield differed across species (and thus fragment sizes), and the difference in yield varied by chloroplast region. This finding suggests that copy numbers of the cpDNA regions vary among species and/or fidelity between primers and DNA is inconsistent across the three regions. However, this result only underscores the restriction on the use of FAFLPs: this method should not be used for quantifying abundance of species; rather, it should be used solely for detecting their presence. Although species vary in amplicon production with PCR, so long as the amplified fragments are present in quantities above detection limits of the capillary sequencer, presence of these species ought to be detected using FAFLPs.
Of the methodological factors tested, the one most likely biasing detection of fragments is size-based fragment competition. For each of the chloroplast regions, we pooled data across the created communities which included DNA from roots and leaves of various species. As a consequence, the source and amount of DNA across trials was highly variable, not unlike what we would expect from mixed template derived from field samples. We found that at low concentrations of DNA, larger fragments of the trnT-trnL intergenic spacer were less likely to be detected than smaller fragments, the likelihood of detection was higher for smaller than larger fragments yielded from the trnL intron, and detection success was overall high for trnL-trnF intergenic spacer. There were, however, exceptions to these trends, suggesting other factors may affect detection in addition to fragment length. Other sources of amplification bias in community analysis include sequence composition (e.g. GC content) [24], interference from DNA flanking the template region [25], and the type of polymerase used [26].
We found it difficult to extract DNA in sufficient quantity to successfully amplify the trnT-trnL intergenic spacer. Notably, amplifications for three of eight species (Table 1) were unsuccessful in our study, matching rates reported by Taggart et al. [10]. In other reports, amplification of the trnT-trnL intergenic spacer has been met with less success than the trnL intron and trnL-trnF intergenic spacer [8,10,27]. For these reasons, in addition to evidence suggesting size-based fragment competition with amplification, we do not recommend use of this region for FAFLPs until extraction yields can be improved, which may also alleviate fragment competition. Studies often report this region contains greater sequence variation than the other two regions used in our study [28]; thus it would be of interest to improve PCR-amplification of this region.
Larger fragments of the trnL intron were less likely to be detected than smaller fragments regardless of template concentrations. This result is driven by the unsuccessful detection of two species, Chamerion angustifolium and Populus tremuloides, which also yield the largest fragment sizes tested for this region, 603 and 706 bp, respectively (Table 3). This result reveals a difficulty in tests for size-based fragment competition, that is, fragment size is confounded with species identity. DNA extracted from some species is notoriously difficult to amplify [29]. As such, the unsuccessful detection of these two species begs the question whether it was due to size-based fragment competition or species-specific traits inhibiting amplification? For instance, Taggart et al. [10] stated that the rate of false-negatives increases with the number of 'difficult' species added to a mixture. The species identified as difficult in their study had fragment sizes of the trnL intron greater than 616 bp long. In our study, when amplified in isolation of other species both Populus tremuloides and Chamerion angustifolium were detected successfully, however, when amplified in mixture, they were not. Examining results between the studies, we suggest rather than species-specific traits inhibiting amplification, it is selection against fragment sizes approximately >600 bp that underlies these outcomes. The best test of size-based fragment competition on detection success would include fragments differing in size but present in similar quantities from a single species compared with those from several species to unequivocally rule out species-specific influences on amplification. Of course, it is difficult to create such a test because the majority of plants produce fragments of a single size for each chloroplast region [8,10]. 
Why size-based fragment competition occurred in mixtures of fragments amplified from the trnT-trnL intergenic spacer and the trnL intron, but not the trnL-trnF intergenic spacer is unclear.

Other origins of false-negatives and false-positives: intraspecific variation and the importance of the chosen species pool
Most of the species used in our study showed intraspecific variation, i.e., there was a continuous range of fragment sizes which emerged across individuals. Some species, for example Bromus inermis and Picea glauca, yielded two fragments of discrete sizes from single individuals. The presence of intraspecific variation highlights several essentials when employing FAFLPs; ignoring these procedures may affect rates of false-negatives and false-positives. First, as is widely established, users must sample multiple individuals of a single species when building reference keys for assigning species identities to fragments produced by FAFLPs. Second, users must rely on multiple DNA markers to resolve species identities; for example, overlap in fragment sizes as observed in the trnL-trnF intergenic spacer precludes its use in isolation of other markers for determining species identities of roots. In another study, Randall et al. [8] were unable to differentiate between two species of Picea based on fragment lengths of the trnT-trnL intergenic spacer, the trnL intron, and the trnL-trnF intergenic spacer. Testing other non-coding regions of plastids for high variability will have applications in systematics, evolutionary biology and plant community ecology, where molecular barcoding has recently been put into use [28]. The presence of fragments of discrete sizes produced by a single individual is indicative of either polymorphisms within a particular chloroplast region, the presence of multiple primer binding sites within a genome, or primer infidelity. We ruled out primer infidelity as both fragments produced by DNA extractions from roots occurred in equal abundance. We cannot discern between polymorphisms or multiple primer binding sites underlying the presence of multiple fragments; however, the latter is possible. For all primers aside from "F" used in our study, multiple binding sites exist across the genome of Brachypodium distachyon (L.) P. Beauv., a model monocot. Specifically, we found three binding sites of primer A, three of primer B, three of primer C, four of primer D, four of primer E and one of primer F. Regardless of its cause, the presence of multiple fragments increases the number of identifiers for a given species; however, it may also increase the rate of false-positives. This result underscores the necessity of developing a reference key for each plant community of interest.
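The role of a reference key, and the way overlapping fragment sizes create ambiguous assignments, can be sketched as a simple lookup. This is an illustrative sketch only: the Bromus inermis sizes are those reported for the trnL-trnF intergenic spacer in this study, whereas "Hypothetical species X", its fragment size, the function name and the 1 bp tolerance are assumptions introduced here.

```python
def match_fragments(observed_bp, reference_key, tolerance_bp=1):
    """Map each observed fragment size to the species whose reference
    fragment sizes fall within tolerance_bp of it."""
    matches = {}
    for size in observed_bp:
        hits = sorted(
            species
            for species, ref_sizes in reference_key.items()
            if any(abs(size - ref) <= tolerance_bp for ref in ref_sizes)
        )
        matches[size] = hits
    return matches


# Reference key for one marker. Only the Bromus inermis sizes come from
# this study; species X is hypothetical and added to show an overlap.
REFERENCE_KEY = {
    "Bromus inermis": [394, 395, 443],
    "Hypothetical species X": [444],
}

result = match_fragments([395, 443, 500], REFERENCE_KEY)
for size, species in result.items():
    label = ", ".join(species) if species else "no match in species pool"
    print(f"{size} bp: {label}")
```

In this sketch the 443 bp fragment matches two species, which is the situation in which a second marker is needed, and the 500 bp fragment matches none, which is what a species missing from the chosen species pool would look like.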

Conclusions
In this study, we highlighted some methodological issues affecting false-negatives in the species identification of roots using FAFLPs. In particular, we focused on the consequences of uneven root abundances of co-occurring species and the presence of size-based fragment competition during amplification. We do not recommend use of the trnT-trnL intergenic spacer for FAFLPs until extraction yields can be improved, which may alleviate size-based fragment competition. Size-based fragment competition was detected in FAFLPs of both the trnT-trnL intergenic spacer and the trnL intron. For any DNA marker, we recommend users check for the disproportionate absence of species detected belowground versus aboveground as a function of fragment size. Detection of fragments yielded from amplification of the trnL-trnF intergenic spacer was the most successful, indicating its reliability; however, this region should not be used alone in light of increased rates of false-positives with reliance on any single region.
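The recommended check for size-dependent absences could be carried out along the following lines. This is a minimal sketch under stated assumptions: the records are invented for illustration, the function name is hypothetical, and the 600 bp split simply follows the approximate threshold discussed above rather than any validated cut-off.

```python
def missed_fraction_by_size(records, threshold_bp=600):
    """Fraction of aboveground species not detected belowground,
    split by whether their reference fragment exceeds threshold_bp.

    records: iterable of (fragment_size_bp, detected_belowground).
    """
    counts = {"short": [0, 0], "long": [0, 0]}  # [missed, total]
    for size_bp, detected in records:
        group = "long" if size_bp > threshold_bp else "short"
        counts[group][1] += 1
        if not detected:
            counts[group][0] += 1
    return {
        group: (missed / total if total else None)
        for group, (missed, total) in counts.items()
    }


# Invented example data: one record per species recorded aboveground.
EXAMPLE_RECORDS = [
    (310, True),
    (450, True),
    (520, False),
    (603, False),
    (640, True),
    (706, False),
]

fractions = missed_fraction_by_size(EXAMPLE_RECORDS)
print(fractions)
```

A markedly higher missed fraction in the long group than in the short group would be a prompt to examine size-based fragment competition before interpreting the absences ecologically.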

DNA extraction: testing for differences in DNA yield and purity by tissue, species and sample condition
Appropriate permission was obtained for all collections of plant material used in this study. To test factors affecting DNA yield and purity, comparisons were made between and among tissues of a tree (Populus tremuloides Michx.) and a grass (Bromus inermis Leyss.). Invasion by Bromus inermis has been reported in both grasslands and disturbed forests [30][31][32][33], of which Populus tremuloides can be a dominant species. Approximately 20 g of leaves and roots were separately sampled for each of Populus tremuloides and Bromus inermis. Samples were collected from six Populus tremuloides plants, three of which were germinated from seeds collected from trees growing in Edmonton, Alberta, Canada and grown in the University of Alberta (UAlberta) greenhouse for 16 weeks. The other three samples were from mature trees occurring near the UAlberta campus, each separated by approximately 300 m. Leaf and root samples of Bromus inermis were located from six plants separated by approximately 300 m growing on the north perimeter of UAlberta campus. For field collections of both Populus tremuloides and Bromus inermis, care was taken to ensure root systems were attached to the focal plant.
Tissue samples were subjected to different numbers of freeze-thaw cycles: none (i.e., samples were fresh), one, two or three. Freeze-thaw cycles involved freezing samples at −20°C for 3 days, then thawing to room temperature for 2-3 h. After samples were subjected to the different levels of freeze-thaw cycles, they were lyophilized. Prior to this step, adhering debris was removed from roots by washing thoroughly under a gentle stream of tap water, followed by a rinse with deionized water and air-drying for 20 min. Samples were placed into perforated tin foil packets and lyophilized using a benchtop freeze dryer (Labconco FreeZone 2.5, Kansas City, MO, USA) for three days. Once dried, plant tissue was ground using a TissueLyser II (Qiagen Inc, Mississauga, ON, Canada). Fragmented tissue was placed in a 20 mL stainless steel milling jar with a 20 mm diameter grinding ball, and shaken at 30 Hz for 30 s. Twenty milligrams of ground leaf tissue was used to extract total cellular DNA using a DNeasy Plant Mini Kit (Qiagen Inc, Mississauga, ON, Canada) following the manufacturer's directions. Leaf DNA was eluted in 50 μL of Buffer AE. We used PowerSoil® DNA Isolation Kits (MO BIO Laboratories, Inc., Carlsbad, CA, USA) to extract genomic DNA from 15 to 50 mg of ground root tissue following the manufacturer's directions. Root DNA was eluted in 50 μL of elution buffer. Both leaf and root DNA extracts were then further purified by ethanol precipitation. In a 1.5 mL micro-centrifuge tube, 20 μL of DNA extract was mixed with 2 μL of NaOAc-EDTA buffer (3 M sodium acetate with 125 mM ethylenediaminetetraacetic acid in water, pH 8.0), followed by the addition of 50 μL of ice-cold 95% ethanol and gently vortexed. The solution was kept at 4°C for 15 min, and then centrifuged at 10,000g for 15 min at 4°C. Supernatant was then aspirated and 70 μL of ice-cold 70% ethanol was added to the remaining DNA pellet and gently vortexed.
The tube was again centrifuged at 10,000g for 5 min at 4°C and supernatant was aspirated. The purified DNA pellet was dried in a SpeedVac Concentrator (Savant Instruments, Inc., Farmingdale, NY, USA) for fifteen minutes, and then reconstituted in 20 μL of water. DNA yield and purity were quantified using a Nanodrop 2000 (Thermo Fisher Scientific, Wilmington, DE, USA). Purity of the extracted DNA was based on the ratio of absorbance at 260 and 280 nm, with pure DNA having a ratio between 1.8 and 2.0. Extracts were subsequently stored at −20°C for downstream activities.
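The purity criterion described above can be expressed as a simple range check. This is a minimal sketch: the ratios are the mean values reported in the results, the 1.8 to 2.0 range is the one stated above, and the function and variable names are hypothetical.

```python
def within_purity_range(ratio, low=1.8, high=2.0):
    """True if an A260/A280 ratio falls in the range taken to indicate pure DNA."""
    return low <= ratio <= high


# Mean A260/A280 ratios reported in the results of this study.
MEAN_RATIOS = {
    ("Populus tremuloides", "roots"): 1.86,
    ("Populus tremuloides", "leaves"): 1.74,
    ("Bromus inermis", "leaves"): 1.81,
    ("Bromus inermis", "roots"): 1.83,
}

purity_flags = {
    sample: within_purity_range(ratio) for sample, ratio in MEAN_RATIOS.items()
}

for (species, tissue), ok in purity_flags.items():
    status = "within range" if ok else "outside range"
    print(f"{species} {tissue}: {status}")
```

By this check, only the mean ratio for Populus tremuloides leaves falls below the stated range.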

Amplification: testing the effect of species and DNA concentration on amplicon yield
Using the same purified DNA extract from freshly collected leaf samples of Populus tremuloides and Bromus inermis, we tested whether fragment yield of three cpDNA regions varied by DNA template concentration and species. We based our test on three individuals of each species, and added three individuals of a third species, Melilotus officinalis (L.) Pall., collected near Fort McMurray, Alberta, Canada, from which DNA had been extracted a year earlier using methods described above, and stored at −20°C. We manipulated DNA template concentrations by first bringing all the DNA extracts to their highest concentration possible via ethanol precipitation. In a separate test, we found that the three regions varied in the optimal concentration of DNA required for amplification. Specifically, amplifications were successful using 20-290 ng μL⁻¹, 0.6-1.6 ng μL⁻¹, and 0.6-1.6 ng μL⁻¹ of DNA for the trnT-trnL intergenic spacer, trnL intron, and trnL-trnF intergenic spacer, respectively. Serial dilutions were made in steps of 0.5×, resulting in template concentrations of 512, 256, 128, 64, 32, and 16 ng μL⁻¹ for amplification of trnT-trnL intergenic spacer and 4, 2, 1, 0.5 and 0.25 ng μL⁻¹ for amplification of the trnL intron and trnL-trnF intergenic spacer.
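The two dilution series described above follow directly from halving the starting concentration at each step, as this short sketch shows. The function name is hypothetical; the starting concentrations and numbers of steps are those given in the text.

```python
def two_fold_series(start_ng_per_ul, steps):
    """Template concentrations from repeated 0.5x dilutions,
    beginning with the undiluted starting concentration."""
    return [start_ng_per_ul / (2 ** i) for i in range(steps)]


# trnT-trnL intergenic spacer: six concentrations starting at 512 ng/uL.
trnT_trnL_series = two_fold_series(512, 6)

# trnL intron and trnL-trnF intergenic spacer: five concentrations from 4 ng/uL.
low_template_series = two_fold_series(4, 5)

print(trnT_trnL_series)
print(low_template_series)
```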
We amplified the three non-coding regions of chloroplast DNA using universal primers [34]: (1) the trnT-trnL intergenic spacer with primers A (5′-CATTACAAATGCGATGCTCT-3′) and B (5′-TCTACCGATTTCGCCATATC-3′), (2) the trnL intron with primers C (5′-CGAAATCGGTAGACGCTACG-3′) and D (5′-GGGGATAGAGGGACTTGAAC-3′) and (3) the trnL-trnF intergenic spacer with primers E (5′-GGTTCAAGTCCCTCTATCCC-3′) and F (5′-ATTTGAACTGGTGACACGAG-3′). Different coloured fluorescently labelled primers were used in PCR for each primer pair (primer A: FAM; primer C: VIC; primer E: NED; Integrated DNA Technologies, Coralville, Iowa, USA). Polymerase chain reactions were conducted in volumes totaling 25 μL: 12.5 μL of EconoTaq PLUS 2X Master Mix (Lucigen Corp., Middleton, WI, USA), 2.5 μL of each forward and reverse primer at 10 μM, 6.5 μL autoclaved deionized water, and 1 μL of 0.6-290 ng μL⁻¹ DNA template. Amplifications were performed using an Eppendorf Mastercycler Pro S gradient thermal cycler (Model 6321; Eppendorf Canada, Mississauga, ON, Canada). Each region had unique thermal cycler conditions: (1) trnT-trnL intergenic spacer: 94°C for 5 min, 2 cycles of 94°C for 45 s, 56°C for 60 s, 72°C for 80 s, followed by 33 cycles of 94°C for 45 s, 61.5°C minus 0.3°C per cycle for 60 s, 72°C for 80 s, and a final extension of 72°C for 30 min; (2) trnL intron: 94°C for 5 min, 2 cycles of 94°C for 60 s, 60°C for 60 s, 72°C for 80 s, followed by 33 cycles of 94°C for 60 s, 59.6°C minus 0.4°C per cycle for 60 s, 72°C for 80 s, and a final extension of 72°C for 30 min; (3) trnL-trnF intergenic spacer: 94°C for 5 min, 2 cycles of 94°C for 60 s, 60°C for 60 s, 72°C for 80 s, followed by 33 cycles of 94°C for 60 s, 63°C minus 0.4°C per cycle for 60 s, 72°C for 80 s, and a final extension of 72°C for 30 min [10]. Later in the study, it was found that the PCR conditions of trnL intron and trnL-trnF intergenic spacer could be used interchangeably with no reduction in amplification.
We therefore used the PCR conditions of the trnL-trnF intergenic spacer for both regions for the rest of the study. Products were visualized by gel electrophoresis (1% agarose gel). Amplified products from the three loci were diluted 200× by combining 199 μL of milliQ H₂O and 1 μL of PCR product. From this dilution, 2 μL were added to 8 μL of Hi-Di formamide and 0.3 μL of GeneScan 1200 LIZ size standard (Applied Biosystems, Foster City, CA, USA). Note that because the primers carried differently coloured fluorescent labels, products from the three regions could have been pooled; however, we chose to run the regions separately to increase the precision of fragment sizing. The final mixture was denatured at 94°C for 2 min and snap-cooled to maintain single-stranded fluorescently labelled DNA. PCR amplicons were first resolved by capillary electrophoresis (ABI 3730 DNA analyzer; Applied Biosystems, Foster City, CA, USA) and then sized in GeneMapper (Applied Biosystems, Foster City, CA, USA) against the GeneScan 1200 LIZ size standard. Fragment sizes read by the capillary sequencer were rounded to the nearest base pair. As part of this test, we confirmed that fragment sizes did not differ between roots and leaves, or by storage condition, for both Populus tremuloides and Bromus inermis (n = 6). Amplicon yield was represented by the height of peaks (relative fluorescent units: rfu) detected by the capillary sequencer. Relative fluorescent units are the emission intensity of the fluorophores in a sample, registered as electrical voltage. This emission intensity is proportional to the molar concentration of the fluorophores, which equals the molar concentration of the amplicons because each fluorescently labelled amplicon carries one fluorophore.
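The touchdown phases in the thermal cycler conditions above, in which the annealing temperature drops by a fixed amount each cycle, can be expanded into a per-cycle schedule. The sketch below is illustrative only. It assumes that the first of the 33 cycles anneals at the stated starting temperature and that the decrement applies from the second cycle onward, which the text does not specify; `touchdown_schedule` is a hypothetical helper name.

```python
def touchdown_schedule(start_c, decrement_c, n_cycles):
    """Annealing temperature (deg C) for each cycle of a touchdown phase.

    Assumption: cycle 1 anneals at start_c; each later cycle is
    decrement_c cooler than the one before it.
    """
    return [round(start_c - decrement_c * cycle, 1) for cycle in range(n_cycles)]

# Touchdown phases for the three regions, as (start, decrement) in deg C.
regions = {
    "trnT-trnL intergenic spacer": (61.5, 0.3),
    "trnL intron": (59.6, 0.4),
    "trnL-trnF intergenic spacer": (63.0, 0.4),
}

schedules = {name: touchdown_schedule(start, dec, 33)
             for name, (start, dec) in regions.items()}

for name, temps in schedules.items():
    print(f"{name}: {temps[0]} -> {temps[-1]} deg C over {len(temps)} cycles")
```

Under the alternative reading, in which the decrement already applies to the first cycle, every value would simply shift down by one decrement.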

Amplification: testing the effect of DNA concentration and fragment size on fragment detection
To test whether size-based fragment competition and DNA concentration affect detection of amplified fragments, we created known communities containing DNA extracted from leaves of Picea glauca (Moench) Voss, Melilotus officinalis, Sonchus arvensis L., Populus tremuloides, Bromus inermis, Chamerion angustifolium (L.) Holub., Trifolium hybridum and Rubus idaeus L. These species are widespread across western Canada [35]. Melilotus officinalis, Sonchus arvensis and Bromus inermis are typical invaders of boreal forests. The communities were constructed with the goal of capturing a range of fragment sizes for a given chloroplast region (see Table 1 for fragments sized in this study and previous unpublished trials). Although this test cannot distinguish whether underrepresentation of long fragments results from less efficient PCR of longer fragments during amplification or from bias in the electrophoresis of different fragment lengths during quantification, previous research shows the latter is unlikely to occur [8]. For the trnT-trnL intergenic spacer region, we mixed DNA extract of four species, Picea glauca, Populus tremuloides, Bromus inermis and Melilotus officinalis, in equal and unequal proportions, replicated three times (Additional file 1: Table S1). We used a similar approach for the other two chloroplast regions, using Melilotus officinalis, Sonchus arvensis, Chamerion angustifolium and Populus tremuloides for the trnL intron, and Trifolium hybridum, Chamerion angustifolium, Bromus inermis and Rubus idaeus for the trnL-trnF intergenic spacer. DNA template concentration was 50 ng μL⁻¹ for amplification of the trnT-trnL intergenic spacer, and 1 ng μL⁻¹ for amplification of both the trnL intron and trnL-trnF intergenic spacer. Thus, we held the total template concentration constant across trials, but varied the concentration of the individual species comprising the DNA template (Additional file 1: Table S1). Sample preparation for Populus tremuloides and Bromus inermis is described above.
DNA of the remaining species had been extracted a year earlier using the same methods outlined above, and the extracts were stored at −20°C. These latter samples originated from single plants occurring in the Fort McMurray region, Alberta, Canada.
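The mixture design described above, in which the template concentration is held fixed while species proportions vary, can be sketched as follows. The proportions shown are hypothetical placeholders, not the values in Additional file 1: Table S1, and `species_concentrations` is a name introduced here for illustration.

```python
def species_concentrations(total_ng_per_ul, proportions):
    """Split a fixed total template concentration among species.

    `proportions` maps species name to a relative weight; weights are
    normalised so the per-species concentrations sum to the total.
    """
    weight_sum = sum(proportions.values())
    return {species: total_ng_per_ul * weight / weight_sum
            for species, weight in proportions.items()}

# Hypothetical trnT-trnL mixtures at the 50 ng per microlitre total used above.
equal_mix = species_concentrations(50, {
    "Picea glauca": 1, "Populus tremuloides": 1,
    "Bromus inermis": 1, "Melilotus officinalis": 1,
})
unequal_mix = species_concentrations(50, {
    "Picea glauca": 4, "Populus tremuloides": 2,
    "Bromus inermis": 1, "Melilotus officinalis": 1,
})

print(equal_mix)
print(unequal_mix)
```

Normalising the weights keeps the total fixed however the proportions are chosen, which is the property the design relies on when separating the effect of a species' share of the template from the effect of overall template concentration.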
We also ran a similar trial on fragments obtained from roots of undetermined species identity, spiked with DNA extracted from leaves of a known species (Additional file 1: Table S2). The roots had been sieved from 500 g soil samples collected from five locations in reclaimed boreal forest in the Fort McMurray region of Alberta, Canada. Common understory species on reclaimed areas included Chamerion angustifolium, Sonchus arvensis, Salix bebbiana Sarg., Melilotus officinalis, Trifolium hybridum, and Rubus idaeus, under a canopy of Picea glauca and Populus tremuloides. We extracted DNA from roots using methods described under 'DNA extraction: Testing for differences in DNA yield and purity by tissue, species and sample condition'; these samples had been stored for 7 months at −20°C. In these trials, we made separate combinations of DNA extracted from roots of two of the five soils mixed with DNA extracted from leaves of a single known species (Additional file 1: Table S2). Thus, we held the template concentration constant across trials, but varied the concentration of the known species in the DNA template and, in the case of roots, the concentration of mixed DNA template arising from the putative presence of multiple species (Additional file 1: Table S2).

Data analysis
To test for differences in DNA yield and purity by tissue, species and sample condition, we used general linear models with species, tissue, sample condition, and their interactions as fixed explanatory effects. To predict fragment yield (relative fluorescent units), we used a linear mixed-effects model with DNA concentration, species and individual (n = 3) as fixed factors, and individual nested within species as a random factor. Relative fluorescent units were summed across fragments of different sizes. For each chloroplast region separately, we analyzed the combined data from the created species mixtures (roots and leaves) using logistic regression to test the effects of fragment size, DNA concentration and their interaction on detection success. All analyses were performed in IBM SPSS Statistics for Windows, Version 21.0 (IBM Corp., Armonk, NY, USA).
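For readers less familiar with the detection model, the logistic regression described above relates the probability of detecting a fragment to fragment size, DNA concentration and their interaction through a logit link. The sketch below only evaluates that model form. The coefficient values are invented for illustration and are not estimates from this study, which fitted its models in SPSS; `detection_probability` is a hypothetical name.

```python
import math

def detection_probability(size_bp, conc_ng_per_ul, coef):
    """Logistic model: P(detect) = 1 / (1 + exp(-eta)), where
    eta = b0 + b_size*size + b_conc*conc + b_int*size*conc."""
    eta = (coef["intercept"]
           + coef["size"] * size_bp
           + coef["conc"] * conc_ng_per_ul
           + coef["interaction"] * size_bp * conc_ng_per_ul)
    return 1.0 / (1.0 + math.exp(-eta))

# Hypothetical coefficients: a negative size effect means longer
# fragments are less likely to be detected at a given concentration.
example_coef = {"intercept": 4.0, "size": -0.005, "conc": 0.02, "interaction": 0.0}

p_short = detection_probability(300, 12.5, example_coef)
p_long = detection_probability(900, 12.5, example_coef)
print(round(p_short, 3), round(p_long, 3))
```

A non-zero interaction coefficient would let the size effect strengthen or weaken as template concentration changes, which is what the interaction term in the analysis tests.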