Materials and Methods
All cDNAs were cloned from C57BL/6J mouse mRNA. Approximately 160
tissues at different stages were used; the list is reported elsewhere1
(P. Carninci et al., submitted). RNA was extracted by the CTAB method.
Our original system for constructing full-length cDNA libraries combines
several technologies: the cap-trapper method2,3, thermoactivation of
reverse transcriptase by trehalose4, normalization and subtraction5,
the oligolinker method (SSLMP, Biotechniques), and the use of a new
vector
The strategy for building a non-redundant set of full-length clones from end sequences is described elsewhere (H. Konno et al., Genome Research, in press; P. Carninci et al., submitted). The clones were clustered on the basis of their 3'-end sequences.
Rearray of clones
A single clone, taken from the library of the best quality, was selected as the representative of each cluster. The representative clones were picked with a Q-bot (GENETIX LIMITED) and arrayed onto 384-well plates. To make the rearray plates, E. coli was cultivated at 30°C for 18-24 hrs in 50 µl of LB medium containing 100 µg/ml ampicillin and 50 µg/ml kanamycin for the PS/DH10B vector/host system, or 100 µg/ml ampicillin and 25 µg/ml streptomycin for the ZAP/SOLR vector/host system.
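The selection rule above can be sketched in Python as follows. This is only an illustration: the `library_quality` score and the clone records are hypothetical, since the text does not state how library quality was measured.

```python
from typing import NamedTuple


class Clone(NamedTuple):
    clone_id: str
    cluster_id: str
    library_quality: float  # hypothetical score; higher means a better library


def pick_representatives(clones):
    """Return one clone per cluster, taken from the best-scoring library."""
    best = {}
    for clone in clones:
        current = best.get(clone.cluster_id)
        if current is None or clone.library_quality > current.library_quality:
            best[clone.cluster_id] = clone
    return best


# Made-up example records, not data from the study.
clones = [
    Clone("c1", "cluster_A", 0.6),
    Clone("c2", "cluster_A", 0.9),
    Clone("c3", "cluster_B", 0.7),
]
representatives = pick_representatives(clones)
```

When two clones of a cluster come from equally good libraries, this sketch keeps the first one encountered; any other tie-breaking rule would serve equally well.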
Plasmid extraction and insert sizing
Each clone was cultivated in 1.3 ml of HT medium containing 100 µg/ml ampicillin at 37°C for 21 hrs, and the plasmid DNA was purified using QIAprep 96 Turbo (QIAGEN). To check the size of the cDNA insert, 1/30 of the plasmid DNA was digested with PvuII and subjected to electrophoresis on a 1% agarose gel.
Three types of sequencers were used for the full-sequencing analysis. Depending on the insert size, the cDNAs were classified into two categories: those shorter than 2.5 kb and those longer than 2.5 kb. The short clones of the first category were sequenced from both ends using the Licor DNA4200 (long-read sequencer) with the Thermosequenase Primer Cycle Sequencing Kit (Amersham Pharmacia Biotech). For forward and reverse end sequencing we used the primer sets CACGACGTTGTAAAACGAC/GGATAACAATTTCACACAGG for ZAP and CACGACGTTGTAAAACGAC/GGATAACAATTTCACACAGG for PS. The remaining gaps were filled by primer walking, using the ABI Prism377 and/or ABI Prism3700 (Applied Biosystems Inc.) with the BigDye terminator kit and the Cycle Sequencing FS Ready Reaction Kit (Applied Biosystems Inc.).
The long clones of the second category were sequenced by a shotgun strategy on the Shimadzu RISA 384 with the DYEnamic ET terminator cycle sequencing kit (Amersham Pharmacia Biotech). To make a shotgun library, 48 PCR-amplified DNA fragments from 48 independent representative clones, whose identities had been confirmed by end sequencing, were pooled and concatenated, then sheared using the Double Stroke Shearing Device (Fiore Inc.) as described elsewhere (M. Yoshino et al., in preparation). The ends of the sheared fragments were blunted with T4 DNA polymerase. The fragments were cloned into pUC18 and transformed into DH10B. Shotgun sequencing was carried out to 12- to 15-fold redundancy. The remaining gaps were filled by primer walking as described above.
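The size-based classification and the redundancy figure can be illustrated with a short Python sketch. The 2.5 kb cutoff and the 12- to 15-fold redundancy come from the text; the pool size of 48 clones of 3 kb each and the 500-base read length are assumed values chosen only for the arithmetic, and assigning a clone of exactly 2.5 kb to the shotgun category is likewise an assumption.

```python
import math

SIZE_CUTOFF_BP = 2500  # 2.5 kb boundary stated in the text


def sequencing_strategy(insert_bp):
    """Classify a clone by insert size into one of the two categories."""
    if insert_bp < SIZE_CUTOFF_BP:
        return "end sequencing + primer walking"
    # Assumption: a clone of exactly 2.5 kb is treated as a long clone.
    return "shotgun"


def reads_needed(total_bp, redundancy, read_bp):
    """Smallest read count with reads * read_bp >= redundancy * total_bp."""
    return math.ceil(redundancy * total_bp / read_bp)


# Hypothetical pool: 48 clones of 3 kb each, 500-base reads (assumed values).
pool_bp = 48 * 3000
low = reads_needed(pool_bp, 12, 500)
high = reads_needed(pool_bp, 15, 500)
```

Under these assumed numbers the pool would need between `low` and `high` reads; real read lengths and insert sizes would change the result proportionally.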
Assembling and gap-closing
All electropherograms from the four sequencers (RISA, Licor DNA4200, ABI Prism377 and ABI Prism3700) were base-called by Phred6,7. The electropherograms of the Licor DNA4200 were converted for Phred by BaseImagIR version 3.1. The electropherograms of the RISA were also adapted to Phred (J. Adachi et al., in preparation).
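For readers unfamiliar with Phred output, each called base carries a quality value Q that is conventionally related to the estimated error probability P by Q = -10 log10(P). The Python sketch below shows only this conversion; the function names are illustrative and are not part of the Phred program.

```python
import math


def phred_to_error_prob(q):
    """Error probability implied by a Phred quality value."""
    return 10 ** (-q / 10)


def error_prob_to_phred(p):
    """Phred quality value implied by an error probability."""
    return -10 * math.log10(p)
```

For example, a quality value of 20 corresponds to an estimated error rate of 1 in 100 bases, and a value of 30 to 1 in 1000.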
Editing of the sequence data consisted of three steps. The first step is a computer-assisted assembly of the raw sequence data. The second step is gap closing using public EST databases, such as the GenBank mouse EST database. Where gaps remained after the second step, we proceeded to the final step of primer walking and resequencing.
In the first step, the Phrap assembler and Consed8 were used for
assembling the sequences and for editing them (including primer design),
respectively. Prior to assembly, the data were processed with the
cross_match program of the Phrap package to mask vector sequences.
Phrap requires a base-quality indicator, the Phred score, for each base
of each cDNA in the assembling step. Since Phred scores are not given
in public EST databases, we assigned a putative Phred score
to each base of each EST of a public database. Usually,
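Assigning a uniform putative score to a public EST can be sketched in Python as below, assuming the FASTA-style quality layout (a header line followed by space-separated integers, one per base) that Phrap reads alongside the sequence file. The score of 15 is a placeholder only, because the value actually used is not given in this section, and the function name and EST name are hypothetical.

```python
PUTATIVE_QUALITY = 15  # placeholder only; the value used in the study is not given here


def qual_record(name, sequence, quality=PUTATIVE_QUALITY, per_line=25):
    """Format one FASTA-style quality record with the same score for every base."""
    scores = [str(quality)] * len(sequence)
    lines = [
        " ".join(scores[i:i + per_line])
        for i in range(0, len(scores), per_line)
    ]
    return ">" + name + "\n" + "\n".join(lines) + "\n"


# Hypothetical EST name and sequence, for illustration only.
record = qual_record("EST_example", "ACGTACGTAC")
```

A low uniform score of this kind lets the assembler use EST reads to bridge gaps while giving more weight to bases with measured quality.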
Primer walking was employed to close the remaining gaps. The primers were designed with Consed. The additional sequence data produced by primer walking were joined to the original gap-containing sequence until the gaps were closed. ABI Prism377 and/or ABI Prism3700 sequencers were used for primer walking.
After the entire sequence of each clone had been determined, we checked that the sequence corresponded to the intended clone. Because of mistakes, the 3'- and 5'-end sequences produced in Phase II sometimes differ from the original Phase I 3'- and 5'-end sequences. Some of the errors, for example the reversed placement of a 384-well plate, could be clearly corrected by checking the whole Phase II procedure. Phase II sequences whose corresponding clones could not be identified were eliminated from the subsequent analyses.
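A reversed 384-well plate is a correctable error because a plate turned by 180 degrees maps every well to a predictable position. The Python sketch below shows that mapping, assuming the usual labelling of rows A to P and columns 1 to 24; the function name is illustrative and the sketch does not reproduce the checking procedure actually used.

```python
ROWS = "ABCDEFGHIJKLMNOP"  # 16 rows of a 384-well plate
N_COLS = 24                # 24 columns


def rotated_well(well):
    """Map a well label such as 'A1' to its position on a plate turned 180 degrees."""
    row = ROWS.index(well[0].upper())  # raises ValueError for an unknown row
    col = int(well[1:])
    if not 1 <= col <= N_COLS:
        raise ValueError(f"column out of range: {well}")
    return f"{ROWS[len(ROWS) - 1 - row]}{N_COLS + 1 - col}"
```

If the end sequences of a Phase II plate match the Phase I sequences of the rotated wells rather than the nominal wells, the plate was probably loaded in reverse and the clone assignments can be remapped.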
1. Carninci, P. & Hayashizaki, Y. High-efficiency full-length cDNA cloning. Methods Enzymol 303, 19-44 (1999).
2. Carninci, P. et al. High-efficiency full-length cDNA cloning by biotinylated CAP trapper. Genomics 37, 327-36 (1996).
3. Carninci, P. et al. High efficiency selection of full-length cDNA by improved biotinylated cap trapper. DNA Res 4, 61-6 (1997).
4. Carninci, P. et al. Thermostabilization and thermoactivation of thermolabile enzymes by trehalose and its application for the synthesis
5. Carninci, P. et al. Normalization and Subtraction of Cap-Trapper-Selected cDNAs to Prepare Full-Length cDNA Libraries for Rapid
6. Ewing, B., Hillier, L., Wendl, M. C. & Green, P. Base-calling of automated sequencer traces using phred. I. Accuracy assessment.
7. Ewing, B. & Green, P. Base-calling of automated sequencer traces using phred. II. Error probabilities. Genome Res 8, 186-94 (1998).
8. Gordon, D., Abajian, C. & Green, P. Consed: a graphical tool for sequence finishing. Genome Res 8, 195-202 (1998).