Principle of the IMPLANT procedure
We designed IMPLANT as an efficient and straightforward end-point PCR system to determine transgene copy numbers (Fig. 1). It is based on a cPCR reaction, but with the integration of both the gene of interest and the competitor (based on and competing with an endogenous gene for amplification) into the plant genome. Because the competitor and the endogene must be amplified with similar efficiencies using only one primer pair, the GC content and length of both amplicons have to be similar when designing the competitor sequence. At the same time, a size difference of at most 10% is required to discriminate between both amplicons when using gel electrophoresis for amplicon quantification. Because both DNA species are integrated into the plant genome, DNA standards containing the competitor are not necessary, and due to the competitive nature of the reaction, end-point PCR can be used without any danger of PCR cycle artifacts. For the calculation of the copy number, the signal intensity of the competitor is expressed relative to the intensity of the endogenous sequence. Because of differences in amplification efficiency between the endogenous sequence and the competitor, this value needs to be multiplied by an empirically derived correction factor. The resulting value is then rounded to the nearest integer to obtain the final copy number estimate (see Materials and methods). When setting up the assay for a given plant species, competitor, and PCR conditions, it is advisable to confirm the results with another technique such as ddPCR or Southern blot analysis. After the assay is set up, the same correction factor can be used for subsequent experiments.
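The calculation described above can be illustrated with a minimal Python sketch. It only restates the three steps given in the text (ratio of competitor to endogene signal, multiplication by the correction factor, rounding to the nearest integer); the function name, the argument names, and the numerical values in the usage lines are hypothetical and are not taken from the Materials and methods.

```python
import math


def estimate_copy_number(competitor_signal, endogene_signal, correction_factor):
    """Return (corrected_ratio, copy_number) for one sample.

    competitor_signal and endogene_signal are band or peak intensities in
    arbitrary units; correction_factor is the empirically derived factor
    for the given species, competitor, and PCR conditions.
    All names here are illustrative assumptions.
    """
    if endogene_signal <= 0:
        raise ValueError("endogenous signal must be positive")
    if competitor_signal < 0:
        raise ValueError("competitor signal cannot be negative")

    # Step 1: competitor signal relative to the endogenous signal.
    ratio = competitor_signal / endogene_signal
    # Step 2: apply the empirically derived correction factor.
    corrected = ratio * correction_factor
    # Step 3: round to the nearest integer (halves round up; Python's
    # built-in round() would round halves to the nearest even number).
    copy_number = math.floor(corrected + 0.5)
    return corrected, copy_number


# Hypothetical intensities and correction factor, for illustration only.
corrected, copies = estimate_copy_number(50.0, 100.0, 2.0)
print(corrected, copies)
```

Returning the corrected ratio alongside the rounded value keeps the distance to the nearest integer visible, which is how the individual lines are discussed in the sections that follow.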
IMPLANT is proposed here in combination with Agrobacterium-mediated transformation, and tested in both Arabidopsis and rice, but it is equally applicable to other transformation methods that give rise to stably transformed plants, or even to other organisms. For the Arabidopsis experiments, we used a quick-and-dirty DNA extraction protocol [25], but for most plant species this is unfortunately not possible. For the rice experiments, we therefore used spin column-purified DNA as the PCR template, although some relatively quick methods such as the Edwards’ protocol [26] might also give reasonably good results. For high-throughput screening of single T-DNA copy plants, even these quick methods are not only labor intensive, but they also require several pipetting steps that increase the chance of sample mix-ups. Therefore, we also used IMPLANT in a direct PCR format, which will increase the ease of use.
IMPLANT can accurately determine transgene copy numbers in Arabidopsis
To provide proof of concept that IMPLANT can accurately determine copy numbers, we first tested it in transgenic Arabidopsis thaliana lines. An endogenous amplicon of 370 bp (GC: 47.3%) was selected within the SCHLEPPERLESS (AT2G28000) gene (Additional file 1: Figure S1A) and the same primer-binding sites were cloned inside pICSL11059 [27], a binary vector that contains a hygromycin plant resistance marker. The forward primer was cloned upstream of the start codon and the reverse primer inside the catalase intron of the hygromycin resistance gene (Fig. 2A, Additional file 1: Figure S1B), ensuring functionality for hygromycin selection. As such, a competitor amplicon of 408 bp with a similar GC content (51.5%) was created, which can be distinguished from the endogenous 370-bp SCHLEPPERLESS amplicon by size after amplification with the same primers. Note that our modified vector, pICSL11059_AtIMPLANT, differs from the parental vector by only 43 bp. After transformation into Arabidopsis, we used IMPLANT for copy number estimation (Fig. 2B).
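The design constraints on the two amplicons (similar GC content, a small but resolvable size difference) can be checked with a few lines of Python. This is a sketch under stated assumptions: the text does not say which length the 10% limit refers to, so the size difference is expressed here relative to the longer amplicon, and the helper names are hypothetical. The lengths and GC percentages are the Arabidopsis values quoted above.

```python
def gc_content(sequence):
    """GC content of a DNA sequence, in percent."""
    sequence = sequence.upper()
    if not sequence:
        raise ValueError("sequence is empty")
    gc = sequence.count("G") + sequence.count("C")
    return 100.0 * gc / len(sequence)


def relative_size_difference(length_a, length_b):
    """Size difference in percent of the longer amplicon.

    Using the longer amplicon as the reference is an assumption made
    for this sketch; the text does not specify the reference length.
    """
    return 100.0 * abs(length_a - length_b) / max(length_a, length_b)


# Arabidopsis amplicons from the text: endogene 370 bp (GC 47.3%),
# competitor 408 bp (GC 51.5%).
size_diff = relative_size_difference(370, 408)
gc_diff = abs(51.5 - 47.3)

print(f"size difference: {size_diff:.1f}%")
print(f"GC difference: {gc_diff:.1f} percentage points")
```

When a candidate competitor sequence is available as a string, `gc_content` can be applied to it directly before cloning.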
A segregation analysis was also done on the same lines. As can be seen in Fig. 2C, most of the generated lines contain a single T-DNA locus according to segregation analysis. This could also be concluded from the IMPLANT protocol, which generated results that deviated by at most 0.2 (all values between 0.8 and 1.17) from the expected ratio of 1 for these lines. Note, however, that some of the lines that segregated in a 3:1 ratio contained multiple T-DNA copies (lines #1, #4, #9 and #15), most likely tandem insertions that cannot be picked up by segregation analysis alone. IMPLANT is able to discriminate between such lines and single inserts, pointing to a distinct advantage of using IMPLANT as opposed to segregation analysis.
The three lines in this study that segregated in a ratio different from 3:1 (#7, #10, #11) were also identified by our IMPLANT protocol, and contained either two or three copies of the transgene.
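For readers less familiar with segregation analysis, the 3:1 criterion used above is commonly evaluated with a chi-square goodness-of-fit test. The sketch below is one such test and is not necessarily the procedure used in this study; it assumes that progeny are scored as resistant or sensitive on selection, and the seedling counts in the usage lines are hypothetical.

```python
# Chi-square critical value for 1 degree of freedom at alpha = 0.05.
CHI2_CRITICAL_DF1 = 3.841


def chi_square_3_to_1(resistant, sensitive):
    """Chi-square statistic for a 3:1 (resistant:sensitive) expectation."""
    total = resistant + sensitive
    if total <= 0:
        raise ValueError("at least one seedling must be scored")
    expected_resistant = 0.75 * total
    expected_sensitive = 0.25 * total
    return ((resistant - expected_resistant) ** 2 / expected_resistant
            + (sensitive - expected_sensitive) ** 2 / expected_sensitive)


def fits_single_locus(resistant, sensitive):
    """True if the counts do not deviate significantly from 3:1."""
    return chi_square_3_to_1(resistant, sensitive) < CHI2_CRITICAL_DF1


# Hypothetical counts, for illustration only.
print(fits_single_locus(75, 25))  # counts exactly at 3:1
print(fits_single_locus(95, 5))   # far more resistant seedlings than 3:1
```

A line that passes this test carries a single segregating locus, but, as noted above, that locus may still hold several tandem T-DNA copies, which is where the IMPLANT ratio adds information.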
To further validate our IMPLANT protocol, we performed a follow-up experiment with the T1 progeny of independent T0 lines. We performed copy number analysis with IMPLANT and, instead of segregation analysis, used ddPCR to benchmark the obtained copy number estimates. In this experiment, we again found that IMPLANT could reliably distinguish between plants containing one and two transgene copies, as benchmarked with ddPCR (Additional file 1: Figure S4).
IMPLANT also works in the monocot species Oryza sativa
Agriculturally important crops frequently do not make a large number of seeds, which often makes segregation analysis impossible. Therefore, we examined if IMPLANT is also compatible with rice (Oryza sativa), one of the world’s most important food crops. We again used pICSL11059 as the backbone vector and this time cloned a complete 341-bp competitor inside of the catalase intron (Fig. 3A; Additional file 1: Figure S2C). This competitor was a randomly chosen sequence on chromosome 1 (GC: 39.9%) that was then flanked with identical primer-binding sites derived from a different 377-bp endogenous sequence that also resides on chromosome 1 and with a similar GC content (40.6%). Because of this difference in size, the endogenous and competitor amplicons can be distinguished after the PCR reaction.
Rice transformants were generated, and gDNA was extracted using a commercial silica spin column kit (DNeasy plant mini kit, Qiagen) and used in the PCR reaction. According to the IMPLANT analysis (Fig. 3B), most of the lines contained two T-DNA copies, with values ranging between 1.88 and 2.40. Three of the lines (#1, #6, #9) were determined to be single insertion lines, with values ranging between 0.43 and 0.89, of which line #1 deviated strongly from the expected number of 1. Finally, line #2 was estimated to contain three copies of the T-DNA.
To verify the accuracy of our method, we compared the results with those obtained by ddPCR (Fig. 3C). We did not use a restriction enzyme to digest the DNA during the ddPCR run, but the values obtained very closely matched the expected ratios. We found that all results were identical except for line #9. With IMPLANT we could detect only one copy, whereas ddPCR detected two. It must be noted, however, that the probe for the ddPCR analysis lies closer to the T-DNA right border than the PCR amplicon of the IMPLANT reaction (Additional file 1: Figure S2C). Therefore, partial T-DNA copies that are truncated between the ddPCR reverse primer and the IMPLANT forward primer may be detected with ddPCR but not with IMPLANT. To clarify this further, we subjected line #9 to Next Generation Sequencing (NGS). The NGS analysis indeed indicated that there is a complex T-DNA integration in this line, including a tandem insertion of the T-DNA together with integration of vector backbone (data not shown). Based on the reads, however, we were not able to delineate all the junctions with the rice genome. For such complex integrations, it is not unexpected that the two methods give different results, given that different sequences are amplified in each method.
Taken together, these data show that IMPLANT is highly accurate in O. sativa. The variation in the IMPLANT-inferred values is somewhat higher than that seen in the Arabidopsis data, but single-copy insertions can clearly be distinguished from multiple-copy insertions, allowing researchers to select single-copy insertion lines for further research.
IMPLANT is compatible with direct PCR in O. sativa
To further improve the user-friendliness of our IMPLANT protocol, we investigated if it would be possible to combine it with direct PCR, in which a small amount of plant tissue is directly used as the template without prior DNA extraction. Several methods are available to do so, which either use very small amounts of tissue to avoid the introduction of inhibitors or use inhibitor-resistant polymerases [28]. We opted to use the latter system. As can be seen in Fig. 4A, the amount of plant material needed for copy number determination can be drastically reduced by using this direct PCR.
We tested the same 10 plants as in the previous experiment, in which we used silica column-purified DNA, but now we used leaf sections of approximately 0.25 mm² directly as the template in PCR reactions with the Phire Plant Direct PCR kit. As seen in Fig. 4B, this method gives the same results as column-purified DNA (Fig. 3B) for all lines except line #2. The range of values obtained for plants containing one copy was between 0.89 and 1.20, and for plants containing two copies between 1.54 and 1.72, which makes them clearly distinguishable from the single-copy T-DNA lines. The only line that gave a different inferred copy number from the previous experiments is line #2, which gives an apparent copy number of four, whereas the copy number as determined by IMPLANT with purified DNA and by ddPCR was three.
We also loaded direct PCR IMPLANT reactions on a conventional agarose gel and calculated the band intensities (Additional file 1: Figure S3), giving the same inferred copy numbers as when using the silica column-purified DNA (Fig. 3B). The values for single-copy insertions ranged between 1.02 and 1.12; for plants with two copies, the range was between 1.47 and 2.13, with two of the six lines having a ratio slightly lower than 1.5. Nevertheless, here too the single-copy lines could be clearly distinguished from the lines with higher T-DNA copy numbers, making agarose electrophoresis a possible alternative to capillary gel electrophoresis.
In conclusion, direct PCR IMPLANT results corresponded perfectly to the results obtained with the conventional IMPLANT method, except for the line with the highest copy number. The high relative signal intensities of this high copy number line increase the error on the copy number calculation, and therefore make it difficult to accurately distinguish plants containing three or more T-DNAs. However, the most interesting plants, i.e., the individuals with a single copy number, can be perfectly distinguished.