From: A new method to identify flanking sequence tags in chlamydomonas using 3’-RACE
transformant | # of trials | valid FSTs | chromosome | FST position | locus | annotation | type of location | orientation/gene | marker resection (nt) | notes | confirmation by genomic PCR |
---|---|---|---|---|---|---|---|---|---|---|---|
#2.1 | 3 | 2 | 16 | 304054-303727 | Cre16.g649785/Cre16.g649752 | no predicted function/ no predicted function | intergenic | not applicable | 9 | | unsure, PCR failed |
2 | 19 | 810984-811036 | Cre19.g757350 | NAC domain protein | 3' UTR | + | 8 | A 37 nt fragment of the PCR product is inserted between the active cassette and the flanking DNA; poly-A site not recorded before for the host gene | unsure, PCR failed |
#2.2 | 2 | 1 | 1 | 8429714-8432270 (spliced) | Cre01.g060850 | PSBS3, Chloroplast PSII-associated 22 kDa protein | exon 3; FST extends up to exon 10, with splicing | + | 0 | expression of interrupted gene is not documented by 454 or Illumina data | confirmed |
#2.3 | 2 | 1 | 12 | 589974-589850 | Cre12.g488050 | FFT5, Fructan fructosyltransferase | intron 2 | - | 0 | | confirmed |
#2.4 | 1 | 1 | 14 | 3326333-3326254 | Cre14.g630200 | no predicted function | 3' UTR | + | 0 | not the endogenous poly-A site; part of the FST is hidden below an early poly-A tail | unsure, PCR failed |
#2.5 | 3 | 2 | 12 | 3269654-3269776 and divergent 3268980-3269003 | Cre12.g512250 | protein with HRDC domain | intron 4 | + | 0 | two staggered polyadenylation sites | not tested |
#2.6 | 1 | 1 | ? | ? | ? | transposable elements TOC1 and DNA-2-7_CR | not applicable | not applicable | 0 | UNMAPPED: could be in an unsequenced region, or due to a rearrangement | not tested |
#2.7 | 2 | 0 | - | | | | | | | FAILED: amplifies a sequence from Cre10.g429850 exon 7 but the junction with the cassette cannot be read | not tested |
#2.8 | 1 | 1 | 27 | 75383-75179 | Cre27.g774700 | SGNH hydrolase | 5'-UTR | + | ? | poly-A tail at end of cassette masks junction with flanking DNA | confirmed |
#2.9 | 1 | 1 | 9 | 2341565-2341083 | Cre09.g400950 | Major Facilitator Superfamily | 3' UTR | + | 3 | uses endogenous poly-A site | confirmed |
#2.11 | 2 | 1 | 17 | 1621018-1620917 | Cre17.g707950/Cre17.g708000 | HEP1 escort protein/ PAS domain protein | intergenic | not applicable | 0 | there are several other good matches, but this is the only perfect one | confirmed |
#3.1 | 3 | 0 | | | | | | | | FAILED: no marker DNA in sequence | - |
#3.2 | 2 | 3 | 12 | 9166015-9166885 | Cre12.g560350 | CNK2, NimA-related protein kinase 2 | intron 1 (splits 5'-UTR) | - | 0 (filled in) | at least two staggered polyadenylation sites | confirmed by Lynn Quarmby (pers. comm.) |
#3.3 | 1 | 2 | 7 | 1015855-1015936 and 1015996-1016192 | Cre07.g319550 | FIST C domain protein | intron 1 | + | 0 (filled in) | two staggered polyadenylation sites | confirmed |
#3.4 | 3 | 0 | | | | | | | 0 (filled in) | FAILED: FST too short (1 nt) | - |
#3.5 | 1 | 1 | ? | Chr_7:2564723–2565098 and other locations | genes similar to Cre07.g332350 | unknown function | usually intron 1 | + | 9 (incl. overhang) | UNMAPPED: maps equally well in several homologous genes | - |
#3.6 | 2 | 1 | 3 | 2440121-2440156 | Cre03.g166950 | PGM5, phosphoglycerate mutase | intron 6 | + | 0 (filled in) + additional G | | confirmed |
#3.7 | 2 | 0 | | | | | | | | FAILED: amplifies a sequence from Cre14.g632700 exon 20 but the junction with the cassette cannot be read | disproved (gene intact) |
#3.8 | 3 | 1 | ? | ? | ? | | | | 0 (uncut) | UNMAPPED: the 35 nt FST maps to several locations | - |
#3.11 | 2 | 2 | 17 | 2083314-2083570 | Cre17.g712100 | MDAR1 | intron 7 | + | 0 (filled in) | one FST suggests artifactual splicing between end of marker and exon 8 | unsure, PCR failed |
#3.12 | 2 | 1 | 10 | 1630598-1630551 | Cre10.g429850 | protein of unknown function conserved in Chlorophyceae | intron 6 | + | 0 (uncut) | 2 insertions? Underneath the main sequence, a short FST corresponding to a repeated sequence can also be read | unsure, PCR failed |
#4.1 | 1 | 2 | 5 | 377208-377154 and −377020 | Cre05.g231500 | Zn-finger protein | intron 6 | + | 0 (uncut) | two staggered polyadenylation sites | confirmed |
#4.2 | 1 | 1 | 8 | v5:4490830-4491010 | Augustus_11.2\|g9033.t1 | unknown function | 3' UTR | + | 0 (uncut) | a good FST, not found in version 4 genome but found in three unpublished genome assemblies | not tested |
#4.3 | 2 | 0 | | | | | | | | FAILED: no good sequence | - |
#4.4 | 2 | 1 | 2 | 9598215-9596562 | Cre02.g115000 | Ribosome-binding factor A | intron 4 | + | 0 (uncut) | two staggered polyadenylation sites; intron 5 is retained in the chimeric mRNA, but intron 6 is spliced out | confirmed |
#4.5 | 3 | 0 | | | | | | | | FAILED: no good sequence | - |
#4.6 | 2 | 2 | ? | ? | ? | | | | | UNMAPPED: a 35 nt FST mapping to several locations, and a long one not mapping at all | - |
#4.8 | 1 | 2 | 7 | 698506-698423 and −698252 | Cre07.g317300/Cre07.g317350 | MAPKKKK1 and a protein of unknown function | intergenic | not applicable | 0 (uncut) | | confirmed |
#4.9 | 1 | 0 | | | | | | | | FAILED: polyadenylation starts within the marker | - |
#4.10 | 1 | 1 | 7 | 4591244-4590811 | Cre07.g346000 | unknown function | end of 3' UTR | + | | poly-A site downstream of gene model | confirmed |
#4.12 | 1 | 0 | | | | | | | | FAILED: no good sequence | |
#11.1 | 1 | | 16 | 2391524-2391584 | Cre16.g666300 | protein kinase | upstream, intergenic | not applicable | ? | the PCR product cannot be read in the FST (primer too close to end) | disproved (gene intact) |
#11.3 | 1 | 0 | | | | | | | | FAILED: no marker DNA in sequence | - |
#11.4 | 1 | 1 | 4 | 704462-705281 | Cre04.g215800 | no annotation | last exon | + | 2 | | unsure, PCR failed |
#11.5 | 1 | 0 | | | | | | | | FAILED: no match to genome | - |
#14.1 | 3 | 1 | 14 | 2909515-2909394 and 2908783-2908718 | Cre14.g627600 | Dynein heavy chain | intron 6 and exon 8 | + | 0 | evidence for genome rearrangement or aberrant splicing | disproved (gene intact) |
#14.2 | 1 | 2 | 12 and other | > a dozen locations (with splicing) | many | Chlamydomonas-specific kinase family | usually exon 4 | + | 0 | UNMAPPED: too many good hits | - |
#14.3 | 3 | 1 | 14 | 3326333-3326254 | Cre14.g630150/Cre14.g630200 | TRAF-type zinc finger protein; protein of unknown function conserved in Volvox only | intergenic | not applicable | 0 | | unsure, PCR failed |
#14.4 | 3 | 0 | | | | | | | | FAILED: FST too short (2 nt) | - |