
Directionally cloned random cDNA expression vector libraries, compositions and methods of use. Patent Number 6,808,906, from the United States Patent and Trademark Office (PTO)


Title: Directionally cloned random cDNA expression vector libraries, compositions and methods of use

Abstract: The present invention provides random cDNA expression vector libraries, comprising expression vectors which comprise random cDNAs positioned in sense orientation. Also provided are random cDNA expression vector libraries, comprising expression vectors which comprise random cDNAs positioned in antisense orientation. Methods for producing these libraries through directional cloning of random cDNAs are also provided. Also provided herein are methods of using these libraries to screen for agents capable of modulating cell phenotype in desirable ways.

Patent Number: 6,808,906 Issued on 10/26/2004 to Shen et al.


Inventors: Shen; Mary (Newark, CA); Yu; Simon (Newark, CA); Wu; Xian (Redwood City, CA); Payan; Donald (Hillsborough, CA)
Assignee: Rigel Pharmaceuticals, Inc. (South San Francisco, CA)
Appl. No.: 10/142,662
Filed: May 8, 2002


Current U.S. Class: 435/91.41 ; 435/252.33; 435/320.1; 435/325; 435/489; 435/91.51
Field of Search: 435/91.41,320.1,325,252.33,489,91.51


References Cited

U.S. Patent Documents
6344541 February 2002 Bass et al.

Other References

Gossen, et al. "Tight control of gene expression in mammalian cells by tetracycline-responsive promoters", Proc. Natl. Acad. Sci. USA, (1992) vol. 89: 5547-5551.
Hofmann, et al. "Rapid retroviral delivery of tetracycline-inducible genes in a single autoregulatory cassette", Proc. Natl. Acad. Sci. USA, (1996) vol. 93: 5185-5190.
Lorens, et al. "Retroviral delivery of peptide modulators of cellular functions", Molecular Therapy, (2000) vol. 1(5): 438-447.
Matz, et al. "Fluorescent proteins from nonbioluminescent Anthozoa species", Nature Biotechnology, (1999) vol. 17: 969-973.
Xu, et al. "Dominant effector genetics in mammalian cells", Nature Genetics, (2001) vol. 27: 23-29.

Primary Examiner: Ketter; James
Attorney, Agent or Firm: Keddie, James S.; Francis, Carol L.; Diehl, James J.

Claims



We claim:

1. A method for producing a vector, comprising: a) contacting an mRNA with a random primer under conditions suitable for production of a double-stranded cDNA, said random primer having at its 5' terminus a partial sequence of a restriction site for an infrequently cutting restriction endonuclease; b) ligating double-stranded adaptors to the ends of said double-stranded cDNA to produce an adaptor-modified cDNA having a complete restriction site sequence for said endonuclease at one end of said double-stranded cDNA; c) contacting said adaptor-modified cDNA with said restriction endonuclease to produce digested cDNA; and d) ligating the digested cDNA into a vector.

2. The method of claim 1, wherein said vector is an expression vector.

3. The method of claim 1, wherein said vector comprises a transcriptional regulatory sequence.

4. The method of claim 3, wherein said cDNA is ligated in a sense orientation with respect to said transcriptional regulatory sequence.

5. The method of claim 3, wherein said cDNA is ligated in an anti-sense orientation with respect to said transcriptional regulatory sequence.

6. The method of claim 1, wherein said vector is a retroviral vector.

7. The method of claim 1, wherein said adaptors contain a 3' overhang ligatable to said vector.

8. A vector comprising a cDNA insert flanked on both sides by different restriction sites for an infrequently cutting restriction endonuclease.

9. A cell containing the vector of claim 8.

10. A library of vectors made by the method of claim 1.

11. A library of cells containing the library of vectors of claim 10.

12. The method of claim 1, wherein said infrequently cutting restriction endonuclease is Sfi1, BstAP1, PfiM1, Mwo1 or AlwN1.

13. The method of claim 1, wherein said infrequently cutting restriction endonuclease is Sfi1.

14. The method of claim 8, wherein said infrequently cutting restriction endonuclease is Sfi1, BstAP1, PfiM1, Mwo1 or AlwN1.

15. The method of claim 8, wherein said infrequently cutting restriction endonuclease is Sfi1.
Description



FIELD OF THE INVENTION

The present invention relates generally to the field of molecular biology and in particular to the creation and use of gene libraries containing cloned cDNAs that encode expressed genes.

BACKGROUND OF THE INVENTION

A common practice in molecular biology is to create "gene libraries," which are collections of cloned fragments of DNA that represent genetic information in an organism, tissue or cell type. To construct a library, desired DNA fragments are prepared and inserted by molecular techniques into self-replicating units generally called cloning vectors. Each DNA fragment is therefore represented as part of an individual molecule, which can be reproduced in a single bacterial colony or bacteriophage plaque. Individual clones of interest can be identified by various screening methods, and then grown and purified in large quantities to allow study of gene organization, structure and function.

Only a small fraction of the genetic information for an organism is actually used in an individual cell or tissue at a particular time. A cDNA library is a type of gene library in which only DNA for actively expressed genes is cloned. These active genes can be selectively cloned over silent genes because the DNA for active genes is transcribed into messenger RNA (mRNA) as part of the pathway by which proteins are made. RNA molecules are polar in nature, i.e. the constituent nucleoside bases are linked via phosphodiester bonds between the 3' ribosyl position of one nucleoside and the 5' ribosyl position on the following nucleoside. RNA is synthesized in the 5' to 3' direction, and mRNAs are read by ribosomes in the same direction, such that proteins are synthesized from N-terminus to C-terminus. Over the past decade, cDNA libraries have become the standard source from which thousands of genes have been isolated for further study.

cDNA libraries may be expression libraries, whereby the cDNAs are transcribed and translated, resulting in the production of polypeptides corresponding to mRNA-encoded proteins. The activity of cDNA expression products may be assayed, and the function of corresponding mRNAs and proteins encoded thereby may be determined.

Full length cDNA, which comprises the entire open reading frame (ORF) of an mRNA, is desirable for many applications. Alternatively, partial cDNA and cDNA fragments are useful in some applications, for example, identifying functional domains within proteins. Interestingly, microdomains can exert unique biological effects compared to the parental molecules from which they are derived (Lorens et al., Mol. Therapy, 1:438-447, 2000). The ability to express protein microdomains can be a powerful means to subtly perturb cellular physiology in manners that reveal new paths for therapeutic intervention.

The use of retroviruses is desirable for the stable transduction of genetic material into host cells, particularly host cells which are poorly transfectable, such as myoblasts and lymphocytes.

One object of the present invention is to provide methods and compositions for stably expressing genetic effectors, comprising random cDNAs, in host cells.

An additional object of the invention is to provide methods and compositions to screen for genetic effectors, comprising random cDNAs, that alter cell phenotype in a desirable way.

SUMMARY OF THE INVENTION

The present invention provides methods and compositions for producing directional random cDNA libraries. Directional random cDNA libraries comprising pluralities of directional random cDNA expression vectors, and methods of using these libraries, are also provided.

In one aspect of the invention, directional random cDNA expression vector libraries are provided. Each library comprises a plurality of directional random cDNA expression vectors. In a preferred embodiment, libraries comprising expression vectors with random cDNA in sense orientation are provided. In another embodiment, libraries comprising expression vectors with random cDNA in antisense orientation are provided. In another embodiment, libraries comprising a mixture of expression vectors with random cDNAs in sense orientation and antisense orientation are provided. As discussed below, the methods provided herein for making random cDNA libraries involve the directional cloning of random cDNAs into expression vectors. Accordingly, the orientation of a random cDNA in each vector is predetermined, facilitating construction of sense libraries, antisense libraries, and mixtures thereof. Such a scheme provides for the expression of antisense nucleic acid and nucleic acid corresponding in sequence to mRNA, as desired.

It will be understood that the cDNA libraries of the present invention comprise vectors, which comprise random cDNAs, which random cDNAs are directionally positioned in expression vectors in sense orientation, or antisense orientation. These libraries are sometimes referred to herein as directional random cDNA libraries. For the ease of description, the terms "directional" and "random" will often be omitted when referring herein to these libraries and methods of making the same.

In a preferred embodiment, the present invention provides cDNA expression vector libraries, each comprising a plurality of expression vectors, each vector comprising a) a first nucleic acid comprising a cDNA; b) a second nucleic acid which is a fusion partner; and c) a transcriptional regulatory sequence recognized by a host cell, wherein the first and second nucleic acids form a fusion nucleic acid which is operably linked to the transcriptional regulatory region (sometimes referred to herein as a transcriptional regulatory sequence). In some embodiments, the vectors also comprise a translational regulatory region (sometimes referred to herein as a translational regulatory sequence or start site) which forms part of the fusion nucleic acid and initiates translation of the fusion nucleic acid.

Preferred cDNAs for use in the present invention comprise sequences complementary to complete or near complete 5' mRNA ends, including native translational start sites, which facilitate translation of cDNA encoded transcript in a host cell.

Other cDNAs may be used however, as will be appreciated by those in the art. For example, cDNAs lacking native translation start sequences, and comprising sequences complementary to 3' mRNA ends also find use in some embodiments of the present invention.

In a preferred embodiment, the fusion partner encodes a detectable protein. In a preferred embodiment, the detectable protein is an autofluorescent protein. In a further preferred embodiment, the autofluorescent protein is a green fluorescent protein (GFP). In a further preferred embodiment, the autofluorescent protein is a GFP from Aequorea, or one of the well known variants thereof including red fluorescent protein (RFP), blue fluorescent protein (BFP), and yellow fluorescent protein (YFP). In another further preferred embodiment, the autofluorescent protein is a GFP from Renilla. In another further preferred embodiment, the autofluorescent protein is a GFP from Ptilosarcus. In another preferred embodiment, the autofluorescent protein is a GFP homologue from Anthozoa species (Matz et al., Nat. Biotech., 17:969-973, 1999).

In a preferred embodiment, the first nucleic acid is fused to the 5' end of the second nucleic acid. The expression products of such a vector include a fusion nucleic acid wherein cDNA encoded sequence is located at the 5' end and nucleic acid sequence encoding detectable protein is located at the 3' end. Expression products also include a fusion protein that comprises an N-terminal polypeptide encoded by cDNA and a C-terminal polypeptide which is a detectable protein moiety. In embodiments where cDNA is inserted in antisense orientation, the expression products include a fusion nucleic acid wherein antisense nucleic acid is located at the 5' end and nucleic acid sequence encoding detectable protein is located at the 3' end.

In a preferred embodiment, the expression vector does not comprise a heterologous translation start site for the initiation of cDNA transcript translation.

In another embodiment, the expression vector comprises a heterologous translation start site for initiating translation of a cDNA transcript. In embodiments where cDNA is in antisense orientation, the heterologous translation start site provides for the translation of antisense cDNA transcripts. In embodiments where cDNA is in sense orientation, cDNA transcripts may be translated in frame or out of frame, depending on the positioning of the cDNA relative to the heterologous translation start site. cDNAs translated out of frame, and cDNA antisense transcripts, encode what are herein referred to as "random peptides".

Translation of cDNA transcripts out of frame may present internal "stop" codons (TAA, TGA, TAG), interrupting or inhibiting cDNA translation. Stop codons may also be encountered in antisense transcripts. For clarity of description, the occurrence of internal translational "stop" codons within cDNA antisense transcripts and cDNAs translated out of frame is not treated in every relevant embodiment discussed herein, though it is understood that such "stop" codons may occur.
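
The effect of reading frame and orientation on internal stop codons can be illustrated with a minimal sketch. It assumes a short hypothetical DNA sequence (not taken from this patent) and uses the DNA-form stop codons listed above (TAA, TGA, TAG); the function names are illustrative only.

```python
from typing import Dict, Optional, Tuple

# DNA-form stop codons, as listed in the text above.
STOP_CODONS = {"TAA", "TGA", "TAG"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the antisense strand, written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def first_stop(seq: str, frame: int) -> Optional[int]:
    """Return the 0-based index of the first stop codon in the given
    reading frame (0, 1 or 2), or None if the frame stays open."""
    seq = seq.upper()
    for i in range(frame, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            return i
    return None


def scan_frames(seq: str) -> Dict[Tuple[str, int], Optional[int]]:
    """Report the first stop codon for all three frames of both orientations."""
    results = {}
    for label, strand in (("sense", seq), ("antisense", reverse_complement(seq))):
        for frame in range(3):
            results[(label, frame)] = first_stop(strand, frame)
    return results


if __name__ == "__main__":
    # Hypothetical 15-nt insert: open in frame 0, interrupted in frame 2.
    example = "ATGGCTAAACCGTGG"
    for key, position in scan_frames(example).items():
        print(key, position)
```

In this hypothetical example the sense strand is open in frame 0 but meets TAA in frame 2, and the antisense strand meets TAG in frame 2, which is the kind of interruption the paragraph above describes for out-of-frame and antisense transcripts.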

In one embodiment, the first nucleic acid is fused to the 3' end of the second nucleic acid. The expression products of such a vector include a fusion nucleic acid wherein cDNA encoded sequence is located at the 3' end and nucleic acid sequence encoding detectable protein is located at the 5' end. Expression products may also include a fusion protein that comprises a C-terminal polypeptide encoded by cDNA and an N-terminal polypeptide which is a detectable protein moiety. Some cDNAs will be translated in frame while others will translate out of frame, encoding what are herein referred to as "random peptides". In embodiments where cDNA is in antisense orientation, the expression products include a fusion nucleic acid wherein antisense nucleic acid is located at the 3' end and nucleic acid sequence encoding detectable protein is located at the 5' end. In addition, antisense transcripts may be translated yielding fusion proteins comprising an N-terminus polypeptide which is a detectable protein moiety and a C-terminus peptide which is encoded by antisense cDNA transcript.

In another embodiment, the first nucleic acid is positioned within the second nucleic acid (e.g., the second nucleic acid comprises the first nucleic acid). Expression products of such vectors include fusion nucleic acids wherein cDNA-encoded sequence is located within nucleic acid sequence encoding detectable protein. Expression products also include fusion proteins that comprise cDNA-encoded peptides within detectable proteins, preferably in the surface exposed loop region of a detectable protein, as described herein. Some cDNAs will be translated in frame while others will translate out of frame, encoding what are referred to herein as random peptides. In embodiments where cDNA is inserted in antisense orientation, the expression products include fusion nucleic acids wherein antisense nucleic acid is located within nucleic acid sequence encoding detectable protein. In addition, antisense nucleic acids may be translated if stop codons are not encountered, yielding fusion proteins that comprise antisense encoded peptide within detectable protein.

In a preferred embodiment, expression vectors additionally comprise a third nucleic acid sequence, referred to herein as a linker, which is interposed between the first and second nucleic acids. In this embodiment, the linker may encode a linking peptide that joins cDNA encoded peptide to the detectable protein moiety in a fusion protein. Alternatively, as outlined, the linker may be a separation sequence that provides for the expression of separate cDNA encoded peptide and detectable protein moieties.

In a preferred embodiment, the linker connecting the first and second nucleic acids comprises an internal ribosome entry site (IRES). Such a linker may be used to fuse the first nucleic acid to the 5' end or the 3' end of the second nucleic acid. The expression products of such a vector include a fusion nucleic acid and two separate polypeptides translated from a fusion nucleic acid, particularly a first polypeptide which is encoded by a cDNA, and a second polypeptide which is a detectable protein.

In another embodiment, the linker connecting the first and second nucleic acids comprises a cleavage site. Such a linker may fuse the first nucleic acid to the 5' end or the 3' end of the second nucleic acid. The expression products of such a vector include a fusion nucleic acid, and a fusion protein wherein the cDNA-encoded polypeptide moiety and the detectable protein moiety are separated by an intervening cleavage site which is a polypeptide sequence that is recognized by a protease. This site provides for cleavage of the covalent peptide linkage which fuses the cDNA-encoded polypeptide moiety to the detectable protein moiety in the fusion protein and thereby provides for the expression of two separate polypeptides.

In another embodiment, the linker comprises a 2a sequence. Such a linker may fuse the first nucleic acid to the 5' end or the 3' end of the second nucleic acid. The expression products of such a vector include a fusion nucleic acid and two separate polypeptides translated from a fusion nucleic acid, particularly a first polypeptide which is encoded by a cDNA, and a second polypeptide which is a detectable protein.

In a preferred embodiment, cDNA expression vectors comprise a fusion partner, in addition to the second nucleic acid encoding a detectable protein. The fusion partner may be fused or linked to the first or second nucleic acid, or both.

In some embodiments, the second nucleic acid is a fusion partner other than a fusion partner encoding a detectable protein.

In some especially preferred embodiments, the cDNA expression vectors provided are retroviral vectors. Accordingly, retroviral cDNA expression vectors and libraries comprising the same are provided herein. In a preferred embodiment, retroviral vectors comprising random cDNAs which are operably linked to transcriptional regulatory sequence in sense orientation are provided. In another embodiment, retroviral vectors comprising random cDNAs which are operably linked to transcriptional regulatory sequence in antisense orientation are provided. In another embodiment, libraries comprising a mixture of retroviral vectors with random cDNAs in sense orientation and antisense orientation are provided.

In a preferred embodiment, the present invention provides retroviral expression vector libraries, each comprising a plurality of retroviral expression vectors, each vector comprising a) a first nucleic acid comprising a cDNA; b) a second nucleic acid which is a fusion partner; and c) a transcriptional regulatory sequence recognized by a host cell, wherein the first and second nucleic acids form a fusion nucleic acid which is operably linked to the transcriptional regulatory region. In some embodiments, the vectors also comprise a translational regulatory region which forms part of the fusion nucleic acid and initiates translation of the fusion nucleic acid.

In a preferred embodiment, the retroviral cDNA expression vectors provided herein comprise a self-inactivating 3' long terminal repeat (LTR) region which is located 3' of the first and second nucleic acids. These vectors are sometimes referred to as SIN vectors.

In a preferred embodiment, the retroviral cDNA expression vectors provided herein comprise a tetracycline-inducible (tet-inducible) promoter with an orientation opposite to the LTR and are SIN vectors. Preferred tet-inducible promoters comprise multiple copies of the tet operon operably linked to a minimal human cytomegalovirus (CMV) promoter (for example, see Gossen et al., PNAS 89:5547-5551, 1992).

In one aspect of the present invention, methods for producing random cDNA expression vectors, and libraries comprising the same, are provided. The methods involve the directional cloning of random cDNAs into expression vectors using particular adaptors and cloning sites, described below. In a preferred embodiment, the expression vectors are retroviral expression vectors. Accordingly, in a preferred embodiment, methods for producing retroviral random cDNA expression vectors, and libraries comprising the same, are provided.

In one aspect of the present invention, methods of screening for a bioactive agent capable of altering the phenotype of a cell in a desirable way are provided. In a preferred embodiment, the methods comprise the steps of a) introducing a cDNA expression vector library into a plurality of cells; b) screening the plurality of cells for a cell exhibiting a phenotype which is altered in a desirable way, wherein the altered phenotype is due to the expression of a cDNA. The methods may also comprise any of the steps of c) isolating at least one cell exhibiting an altered phenotype; d) isolating a nucleic acid comprising the cDNA from the cell exhibiting an altered phenotype; e) identifying the bioactive agent; and f) identifying and/or isolating the molecule(s) to which the agent binds. Additionally, in some preferred embodiments, the methods involve stimulating the plurality of cells in a manner known to produce a disease-like response or a phenotype of the disease process. In an especially preferred embodiment, retroviral cDNA libraries provided herein are used.

In another preferred embodiment of this aspect of the invention, the methods comprise the steps of a) introducing a cDNA expression vector library into a first plurality of cells; b) contacting the first plurality of cells with a second plurality of cells; and c) screening the second plurality of cells for a cell exhibiting a phenotype which is altered in a desirable way, wherein the altered phenotype is due to contact with the first plurality of cells and expression of cDNA in the first plurality of cells. The method may also comprise any of the steps of d) isolating a cell from the first plurality of cells which is contacted with at least one cell in the second plurality of cells exhibiting an altered phenotype; e) isolating a nucleic acid comprising the cDNA from the cell isolated from the first plurality of cells; f) identifying the bioactive agent; and g) identifying and/or isolating the molecule(s) to which the agent binds. In an especially preferred embodiment, retroviral cDNA libraries provided herein are used.

In preferred embodiments of this aspect of the invention, methods of screening for bioactive agents capable of modulating the following physiological processes or biochemical activities are provided: IgE production in B cells; mast cell activation by IgE binding; mast cell degranulation; B cell activation and antibody secretion in response to antigen receptor stimulation; T cell activation in response to antigen receptor stimulation; epithelial cell activation; E3 ubiquitin ligase activity; inflammation induced by E3 ubiquitin ligase activity; inflammation induced by TNF activity; apoptosis in activated T cells; angiogenesis; uncontrolled cell proliferation; uncontrolled cell proliferation mediated by E3 ubiquitin ligase activity; and translation of Hepatitis C-encoded proteins.

Bioactive agents interact with target molecules to modulate cell phenotype. Provided herein are methods for isolating and identifying a target molecule using either the cDNA insert of a cDNA expression vector or an expression product thereof, including nucleic acids and polypeptides. Target molecules may be used to characterize signaling pathways, provide lead compounds for pharmaceutical development, and to screen for bioactive agents, including small molecule chemical compounds, capable of modulating target molecule activity.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 (SEQ ID NOS: 1-9) is a schematic diagram showing the preferred scheme for modifying random cDNA with adaptors, and for directionally cloning adaptor-modified cDNA into an expression vector. The sequence of preferred SfiI adaptors for use in the present invention is given. Additionally, the figure shows preferred vector cloning site sequences, comprising SfiI recognition sequence, for directionally cloning adaptor-modified cDNAs following digestion with SfiI.

FIG. 2 (SEQ ID NOS: 10-12) is a schematic diagram showing the vector P·96.7·C2sf, a preferred vector for directionally cloning random cDNA modified with preferred adaptors comprising the SfiI site. The vector comprises the composite CRU5 promoter, which is located upstream of the SfiI-a and SfiI-b cloning sites.

FIG. 3 (SEQ ID NOS: 7, 13-16) is a schematic diagram showing the recognition sequences and cleavage patterns of restriction endonucleases SfiI, BstAP1, PfiM1, Mwo1 and AlwN1.

FIG. 4 shows cDNA inserts present in 12 samplings from a directionally cloned random cDNA library generated from Jurkat T cell RNA (method described in Example 1).

FIG. 5 depicts a schematic diagram of a preferred vector.

DETAILED DESCRIPTION OF THE INVENTION

The present invention provides methods and compositions for producing directional random cDNA libraries. Directional random cDNA libraries comprising pluralities of directional random cDNA expression vectors, and methods of using these libraries, are also provided.

As used herein, the term "cDNA" means DNA that corresponds to or is complementary to at least a portion of messenger RNA (mRNA) sequence and is generally synthesized from an mRNA preparation using reverse transcriptase or other methods. cDNA as used herein includes full length cDNA, corresponding to or complementary in sequence to full length mRNA sequences, partial cDNA, corresponding to or complementary in sequence to portions of mRNA sequences, and cDNA fragments, also corresponding to or complementary to portions of mRNA sequences. It should be understood that references to a particular "number" of cDNAs or other nucleic acids actually refer to the number of clones, cDNA sequences or species, rather than the number of physical copies of substantially identical sequences present. Moreover, the term is often used to refer to cDNA sequences incorporated into a plasmid or viral vector which can, in turn, be present in a bacterial cell, mammalian packaging cell line, or host cell.

By "cDNA fragment" is meant a portion of a cDNA that is derived by fragmentation of a larger cDNA. cDNA fragments may be derived from partial or full length cDNAs. As will be appreciated, a number of methods may be used to generate cDNA fragments. For example, cDNA may be subjected to shearing forces in solution that can break the covalent bonds of the backbone of the cDNA. In a preferred embodiment, cDNA fragments are generated by digesting cDNA with restriction endonuclease(s). Other methods are well known in the art.
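
Fragmentation by restriction digestion can be sketched in silico as follows. This is a minimal illustration, not the procedure of the examples: the test sequence is hypothetical, only the top strand is modeled, and the SfiI recognition sequence and cut position (GGCCNNNN^NGGCC) are taken from general enzyme references rather than from the text of this section. The spacing estimate assumes uniformly random sequence.

```python
import re
from typing import List

# Minimal IUPAC mapping: only the symbols needed for this sketch.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "[ACGT]"}


def site_regex(recognition: str):
    """Compile a recognition sequence into a regex.
    A lookahead is used so that overlapping sites are all reported."""
    body = "".join(IUPAC[base] for base in recognition.upper())
    return re.compile("(?=(" + body + "))")


def digest(seq: str, recognition: str, cut_offset: int) -> List[str]:
    """Split the top strand cut_offset bases into every recognition site."""
    seq = seq.upper()
    cuts = [m.start() + cut_offset for m in site_regex(recognition).finditer(seq)]
    fragments, previous = [], 0
    for cut in cuts:
        fragments.append(seq[previous:cut])
        previous = cut
    fragments.append(seq[previous:])
    return fragments


def expected_spacing(recognition: str) -> int:
    """Average distance between sites in uniformly random sequence:
    4 raised to the number of fully specified (non-N) positions."""
    return 4 ** sum(1 for base in recognition.upper() if base != "N")


if __name__ == "__main__":
    SFII_SITE = "GGCCNNNNNGGCC"   # assumed from general references
    SFII_CUT = 8                  # top-strand cut: GGCCNNNN^NGGCC
    hypothetical = "AAAA" + "GGCCATTCAGGCC" + "TTTT"
    print(digest(hypothetical, SFII_SITE, SFII_CUT))
    print(expected_spacing(SFII_SITE))
```

Under the random-sequence assumption, a site with eight specified bases is expected about once every 65,536 base pairs, which gives a rough sense of why such an enzyme rarely cuts within a typical cDNA insert.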

"Partial cDNA" refers to cDNA that comprises part of the nucleic acid sequence which corresponds to or is complementary to the open reading frame (ORF) of the corresponding mRNA.

"Full length cDNA" refers to cDNA that comprises the complete sequence which is complementary to or corresponds to the ORF of the corresponding mRNA. In some instances, which are clear, full length cDNA refers to cDNA that comprises sequence complementary to or corresponding to the 5' untranslated region (UTR) of the corresponding mRNA, in addition to sequence which is complementary to or corresponds to the complete ORF.

A corresponding mRNA comprises the nucleotide sequence of the mRNA used as template for synthesis of a particular cDNA, or is the template mRNA used for synthesis of a particular cDNA.

The occurrence of alternatively spliced mRNAs in an mRNA pool used to make cDNA may lead to the synthesis of a cDNA which has sequence corresponding to more than one mRNA type. In addition, the cDNA may comprise a nucleotide sequence that is identical to only a segment of an alternatively spliced mRNA.

By "libraries" is meant a plurality. In a preferred embodiment, the cDNA expression vector libraries provided herein comprise between about 10^3 and about 10^9 independent clones, with from about 10^5 to about 10^8 being preferred, and about 10^5 to about 10^6 being especially preferred.

In one aspect, provided herein are methods for producing cDNA expression vector libraries. In a preferred embodiment, methods for producing retroviral cDNA expression vector libraries are provided. The methods involve the directional cloning of random cDNA into expression vectors, using adaptors and vector cloning sites described herein. Directional cloning of random cDNA refers to the insertion of a random cDNA into a vector in a single determined orientation, which is facilitated by the non-equivalent nature of adaptor-modified cDNA ends and complementary vector cloning site sequences. In contrast, bi-directional, or non-directional cloning, involves the insertion of cDNA in either of the two possible orientations, whereby half of the cDNA is inserted in sense orientation and half of the cDNA is inserted in antisense orientation. Non-directional cloning can be achieved through the use of identical adaptor-modified cDNA ends and complementary vector cloning site sequences.
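
The contrast between directional and non-directional cloning can be illustrated with a simplified model of sticky-end compatibility. In this sketch each end is represented only by its single-stranded overhang, written 5' to 3', and two ends are assumed to ligate when one overhang is the reverse complement of the other. The three-base overhang sequences are hypothetical and are not the adaptor or vector sequences of FIG. 1.

```python
from typing import List, Tuple

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    return seq.upper().translate(COMPLEMENT)[::-1]


def anneals(overhang_a: str, overhang_b: str) -> bool:
    """Two overhangs (each written 5' to 3') can base-pair when one is
    the reverse complement of the other."""
    return overhang_a.upper() == reverse_complement(overhang_b)


def allowed_orientations(insert_ends: Tuple[str, str],
                         vector_ends: Tuple[str, str]) -> List[str]:
    """Return the orientations in which both insert ends find a
    compatible vector end."""
    ins_left, ins_right = insert_ends
    vec_left, vec_right = vector_ends
    orientations = []
    if anneals(ins_left, vec_left) and anneals(ins_right, vec_right):
        orientations.append("forward")
    if anneals(ins_right, vec_left) and anneals(ins_left, vec_right):
        orientations.append("reverse")
    return orientations


if __name__ == "__main__":
    # Non-equivalent ends (hypothetical): only one orientation is possible.
    print(allowed_orientations(("GCA", "TTC"), ("TGC", "GAA")))
    # Identical ends (hypothetical): the insert can enter either way.
    print(allowed_orientations(("GCA", "GCA"), ("TGC", "TGC")))
```

With non-equivalent ends the model returns a single orientation, and with identical ends it returns both, consistent with the roughly even split of sense and antisense inserts described above for non-directional cloning.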

General methods for producing cDNA libraries are known in the art (Blumberg et al. Science 253:194-196 (1991); Cho et al. Cell 67:1111-1120 (1991); Hawley et al. Genes Dev. 9:2923-2935 (1995)).

Methods for constructing cDNA libraries from mRNA isolated from a cellular source are well known in the art. General protocols are, for example, disclosed in Current Protocols in Molecular Biology, John Wiley & Sons, Ausubel et al. eds., 1988, updated October 2001, Chapter 5, Construction of Recombinant DNA Libraries, particularly Section III, Preparation of Insert DNA from Messenger RNA, expressly incorporated herein by reference. Additionally, two commonly used methods of producing cDNA from mRNA are described in Okayama and Berg, Mol. Cell Biol. 2:161-170 (1982) and Gubler and Hoffman, Gene 25:263-269 (1983).

In a typical procedure, poly(A)+ mRNAs are isolated from cells. However, isolated RNA that is not poly(A)+ enriched may also be used.

Methods for isolating RNA from eukaryotic and prokaryotic cells are well known in the art. For example, see Current Protocols in Molecular Biology, John Wiley & Sons, Ausubel et al. eds., 1988, updated October 2001, Chapter 4, Preparation of RNA from Eukaryotic and Prokaryotic Cells, expressly incorporated herein by reference; Molecular Cloning: A Laboratory Manual, 3rd Edition, Sambrook et al. eds., Cold Spring Harbor Press, Cold Spring Harbor, N.Y., 2001, ISBN 0-87969-577-3. Poly(A)+ RNA, which is greatly enriched in mRNA, can be separated from the remainder of total RNA, which is largely ribosomal RNA (rRNA) and transfer RNA (tRNA), for example, by binding to oligo(dT) cellulose (e.g., latex beads) while the remainder washes through. The poly(A)+ mRNA can be eluted from the beads following known procedures, such as the protocol described in Ausubel et al., supra, Unit 4.5. Some other protocols use poly(U) Sephadex instead of oligo(dT). See, e.g., Moore and Sharp, Cell 36:581-591 (1984). A preferred method is that of Chomczynski and Sacchi, Anal. Biochem. 162:156-159 (1987). The RNA can be from any organism.

The initial mRNA may be present in a variety of different samples, where the sample will typically be derived from a physiological source. The physiological source may be derived from a variety of eukaryotic and prokaryotic sources. In addition, viral RNA may be used to serve as template for cDNA synthesis. Physiological sources of interest include sources derived from single celled organisms such as yeast and multicellular organisms, including plants and animals, particularly mammals, preferably humans, primates and rodents, where the physiological sources from multicellular organisms may be derived from particular organs or tissues of the multicellular organism, or from isolated cells derived therefrom. In obtaining the sample of RNAs from the physiological source from which it is derived, the physiological source may be subjected to a number of different processing steps, where such processing steps might include tissue homogenization, cell isolation and cytoplasmic extraction, nucleic acid extraction and the like, where such processing steps are known to those of skill in the art. Eukaryotic and prokaryotic sources include, but are not limited to, bacteria, plant, fungi, insect and mammalian sources, which include, but are not limited to algae, Arabidopsis thaliana, Aspergillus, Axolotl, baboon, bovine, barley, canine, carp, chicken, corn, Drosophila melanogaster, feline, firefly, frog, Fugu fish, hamster, human, lobster, monkey, mouse, nematode, opossum, pea, porcine, rabbit, rat, rice, sea urchin, sheep, soybean, spinach, tobacco, tomato, wheat, Xenopus laevis, yeast, and zebrafish. Preferred sources of RNA for use in the present invention are human, rodent, and primate.
Tissue and cell sources for RNA include, but are not limited to, adipose, adrenal, adult brain, adult liver, adult ovary, amygdala, aorta, B-cell, T-cell, mast cell, bladder, blood, bone marrow, brain tumor, breast, breast tumor, capillary endothelial cells, carcinoma, cerebellum, cervix, chondrocyte, colon, colon tumor, colorectal adenocarcinoma, embryo, embryonic brain, embryonic adrenal, embryonic eye, embryonic gut, embryonic liver, embryonic lung, embryonic muscle, embryonic spleen, endothelial, epidermis, epithelial cell, erythroleukemia, esophageal tumor, esophagus, eye, fetus, fetal brain, fetal adrenal, fetal eye, fetal gut, fetal liver, fetal lung, fetal muscle, fetal spleen, fibroblast, fibrosarcoma, glioblastoma, glioma, heart, adult heart, HeLa, hepatocarcinoma, hepatoma, hippocampus, hypothalamus, intestine, small intestine, keratinocyte, kidney, kidney tumor, liver, liver tumor, lung, lung tumor, lymph node, lymphocyte, lymphoblast, lymphoma, macrophage, microglia, mammary gland, mucus-producing gland, muscle, myoblast, monocyte, nasal mucosa, neuronal, NIH 3T3, stomach, thyroid, uterus, oocyte, pancreas, ovarian tumor, pituitary, prostate, rectal tumor, rectum, retina, salivary gland, spinal cord, spleen, submucosa, stem cell, and tonsil. Viral nucleic acids may also be used.

Once isolated, mRNAs are then used as template for the synthesis of double stranded cDNA (dscDNA) using the enzyme reverse transcriptase. Synthesis of cDNA may be done in vitro or in vivo, as is known (for example, see U.S. Pat. No. 5,891,637, issued 6 Apr. 1999 to Ruppert et al., incorporated herein by reference).

Reverse transcriptases have been traditionally purified from retroviruses, such as avian myeloblastosis virus (AMV) and Moloney murine leukemia virus (M-MuLV), which use them to make DNA copies of their own RNA genomes. The M-MuLV reverse transcriptase has also been purified from overproducing E. coli cells containing the cloned gene. Tanese et al. in PNAS USA 82, 4944-4948 (1985) and Roth et al. in J. Biol. Chem. 260(16), 9326-9335 (1985) report on the expression, isolation and characterization of a reverse transcriptase isolated from Moloney murine leukemia virus (M-MuLV). This reverse transcriptase is encoded by the viral pol gene and is a monomer having a molecular weight of about 80 kD. See also U.S. Pat. No. 4,943,531.

In the process of converting mRNA into double stranded cDNA in vitro, a first cDNA strand is synthesized by the reverse transcriptase. A DNA polymerase, such as E. coli DNA polymerase, then uses the first cDNA strand as a template for the synthesis of the second cDNA strand, thereby producing a population of dscDNA molecules from the original poly(A)+ mRNA. The dscDNA is ligated to adaptors, and adaptor-modified cDNA is subsequently directionally cloned into expression vectors.

First strand cDNA synthesis is performed using any convenient protocol. In preparing the first strand cDNA, a primer is contacted with the mRNA, a reverse transcriptase, and other reagents necessary for primer extension under conditions sufficient for first strand cDNA synthesis to occur. In a preferred embodiment, the primers used for cDNA synthesis comprise a random polynucleotide from about 6 to about 12, more preferably from about 6 to about 10, more preferably from about 6 to about 9, most preferably about 8 nucleotides in length, and further comprise a 5' terminal nucleotide comprising the base cytosine, and a nucleotide immediately 3' to the 5' terminal nucleotide, comprising the base cytosine. Preferred primers may be generally described by the nucleic acid sequence 5'-CCN.sub.x -3', wherein N is any nucleotide, preferably a nucleotide selected from the group consisting of dAMP, dTMP, dGMP, dCMP, or analogs thereof which are known in the art, and where x indicates a number of N nucleotides from about 4 to about 10, more preferably about 4 to about 8, with about 6 being most preferred. Thus, an especially preferred primer has the general sequence 5'-CCNNNNNN-3'.

These primers are sometimes referred to herein as random primers, with the "CC" portion being considered an overhang to the random primer.

By "random primers" is meant random sequence oligonucleotide primers, in which each of the nucleotide positions is occupied by a nucleotide selected at random from among a complete set of possibilities, but commonly limited to the four nucleotides, dAMP, dCMP, dGMP, or dTMP.

The use of primers in cDNA synthesis is well known in the art, see for example, Sambrook et al., supra.
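
By way of illustration only, the especially preferred primer family 5'-CCNNNNNN-3' can be enumerated with a short Python sketch. The sketch assumes that each N is drawn from the four standard bases; the function name `ccn_primers` is hypothetical and is not part of the disclosure.

```python
from itertools import product

BASES = "ACGT"  # assumption: only the four standard deoxynucleotides are used


def ccn_primers(x=6, overhang="CC"):
    """Yield every primer of the form 5'-CC(N)x-3', written 5'->3'."""
    for combo in product(BASES, repeat=x):
        yield overhang + "".join(combo)


primers = list(ccn_primers())
print(len(primers))  # 4**6 = 4096 distinct primer sequences
print(primers[0])    # CCAAAAAA
```

In practice such a pool would be obtained as a single degenerate oligonucleotide synthesis rather than as individual sequences; the enumeration only makes the size of the pool explicit.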

Additional reagents that may be present include: dNTPs; buffering agents, e.g. Tris-Cl; cationic sources, both monovalent and divalent, e.g. KCl, MgCl.sub.2; sulfhydryl reagents, e.g. dithiothreitol; and the like. A variety of enzymes, usually DNA polymerases, possessing reverse transcriptase activity can be used for the first strand cDNA synthesis step. Examples of suitable DNA polymerases are described above. Preferably, the DNA polymerase will be selected from the group consisting of Moloney murine leukemia virus (M-MLV) as described in U.S. Pat. No. 4,943,531 and M-MLV reverse transcriptase lacking RNaseH activity as described in U.S. Pat. No. 5,405,776 (the disclosures of which patents are herein incorporated by reference), human T-cell leukemia virus type I (HTLV-I), bovine leukemia virus (BLV), Rous sarcoma virus (RSV), human immunodeficiency virus (HIV) and Thermus aquaticus (Taq) or Thermus thermophilus (Tth) as described in U.S. Pat. No. 5,322,770, the disclosure of which is herein incorporated by reference, avian reverse transcriptase, and the like. Suitable DNA polymerases possessing reverse transcriptase activity may be isolated from an organism, obtained commercially, or obtained from cells which express high levels of cloned genes encoding the polymerases by methods known to those of skill in the art, where the particular manner of obtaining the polymerase will be chosen based primarily on factors such as convenience, cost, availability and the like. Of particular interest because of their commercial availability and well characterized properties are avian reverse transcriptase and M-MLV.

The order in which the reagents are combined may be modified as desired. One protocol that may be used is as follows.

Primers are mixed with the total RNA or poly(A)+ RNA and processed under suitable conditions to promote first strand cDNA synthesis. Initially, the mixture of primers and RNA is, for a sufficient time, brought to a temperature sufficiently high to denature double-stranded portions of the nucleic acids. A denaturing step at 70.degree. C. for 10 minutes is generally suitable. While reaction components are added, the mixture is kept chilled to prevent renaturation or priming. Reaction components are added to bring the mixture to a suitable buffered pH and ionic strength, to allow RNA-dependent DNA synthesis to proceed. Also added to the reaction are deoxynucleotide triphosphates for incorporation into the first cDNA strand and an RNA-dependent DNA polymerase as described above. A preferred reverse transcriptase is the Moloney murine leukemia virus reverse transcriptase.

When the first strand synthesis reaction components have been added, the mixture is incubated for a sufficient time and at a temperature appropriate for RNA-dependent DNA polymerization. Incubation at 37.degree. C. for 60 minutes is generally suitable. When first strand synthesis is complete, the reaction is heated to a sufficiently high temperature for an adequate length of time to inactivate the RNA-dependent DNA polymerase (e.g., 70.degree. C. for 10 minutes).

In a preferred method, following first strand cDNA synthesis, the resultant duplex mRNA/cDNA (e.g., hybrid) is contacted with an RNase capable of degrading single stranded RNA but not RNA complexed to DNA under conditions sufficient for any single stranded RNA to be degraded. A variety of different RNases may be employed, where known suitable RNases include: RNase T1 from Aspergillus oryzae, RNase I, RNase A and the like. The exact conditions and duration of incubation during this step will vary depending on the specific nuclease employed. However, the temperature is generally between about 20 to 37.degree. C., and usually between about 25 to 37.degree. C. Incubation usually lasts for a period of time ranging from about 10 to 60 min, usually from about 15 to 60 min. Nuclease treatment results in the production of blunt-ended mRNA/cDNA duplexes or hybrids. In the resultant mixture, those mRNA/cDNA hybrids that include a full length cDNA will have the 5' cap structure of the template mRNA.

Second strand cDNA synthesis can proceed in the same reaction vessel as the first strand synthesis reaction. The reaction mixture is adjusted to buffering conditions appropriate for DNA polymerization using a DNA-dependent DNA polymerase. Also added to the second strand synthesis reaction are nucleotides for incorporation into a nascent second strand. Finally, an agent for introducing nicks into the RNA strand is added to the second strand reaction. By introducing nicks into the RNA strand, the DNA-dependent DNA polymerase can utilize the nicked RNA strands as primers for second strand DNA synthesis. During second strand synthesis, remaining RNA residues are displaced from the first strand by the growing second strand. A suitable nicking agent is RNase H (Okayama, H. and Berg, P. (1982) Mol. Cell. Biol. 2,161; Gubler, U. and Hoffman, B. (1983) Gene 25, 263). When the reaction components have been added, the second strand synthesis reaction is allowed to proceed for a suitable length of time at a temperature adequate to support DNA-dependent DNA polymerization. A generally suitable incubation condition is 15.degree. C. for 90 minutes. When second strand synthesis is complete, the double-stranded cDNA molecules thus formed are purified from the reaction components. Proteins can be inactivated and removed from the mixture by phenol:chloroform:isoamyl alcohol extraction. The double stranded cDNA is then precipitated with alcohol, centrifuged, and resuspended in water.

Alternatively, the first cDNA strand may be separated from mRNA using methods known in the art, and oligonucleotide primers may be used to prime synthesis of the second cDNA strand.

Secondary structure in mRNA, which can decrease the efficiency of the synthesis of cDNA, can be reduced with the use of methylmercury hydroxide to destroy base pairing as is known in the art. However, cDNA yields are reduced with the use thereof (see Krug and Berger, Methods Enzymol., 152:313-325, 1987, incorporated herein by reference).

As is known in the art, by altering the ratio of primers to mRNA in the synthesis of cDNA, the average cDNA size is modified. Decreasing the ratio of primer to mRNA increases the average cDNA length, while increasing the ratio of primer to mRNA decreases the average cDNA length. For some applications, shorter cDNA length may be desirable, for example, screening for functional domains of proteins, or screening for protein fragments with dominant negative activity. Additionally, shorter cDNA may be desired when cDNA is fused to a fusion partner that better accommodates smaller cDNA as opposed to longer cDNA, as described below. For other applications, longer cDNA sequences may be desired.

cDNAs greater than about 0.5 kb in length, preferably from between about 0.5 kb and about 5.0 kb in length, and comprising native translation start sites are particularly preferred for use in the present methods of producing expression vectors.

By native translation start site is meant the translation start site sequence found in the corresponding mRNA.

Following second strand synthesis, 3' single stranded protrusions or overhangs commonly remain on the cDNA due to dissociation of short primers near the termini. Therefore, it is desirable to remove any overhanging bases in the cDNA molecules thus formed. An appropriate enzyme for "trimming" 3' extensions and/or adding terminal nucleotides to fill in 5' overhang ends is T4 DNA polymerase.

Conditions for using T4 DNA polymerase to make double stranded DNA blunt ended are well known, for example, see Sambrook et al., supra.

It will be appreciated that the preferred primers used for cDNA synthesis in the present methods provide for the synthesis of a double stranded cDNA wherein the sense strand comprises a 3' terminus GG. As is known in the art, by convention, mRNA is a sense strand.
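
The relationship between the 5' CC of the primer and the 3' GG of the sense strand follows from base pairing, and may be sketched as below. The primer and the extension bases are made-up examples chosen only to show the reasoning; `reverse_complement` is an illustrative helper.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the complementary strand, written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


# Hypothetical member of the 5'-CCN6-3' primer family.
primer = "CCATGCAG"
# First (antisense) strand, 5'->3': the primer followed by made-up extension bases.
first_strand = primer + "TTGACCGTA"
# After second strand synthesis and blunting, the sense strand is its reverse complement.
sense_strand = reverse_complement(first_strand)

print(sense_strand)                 # TACGGTCAACTGCATGG
print(sense_strand.endswith("GG"))  # True: the primer's 5' CC pairs with a 3' GG
```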

Alternatively, in one embodiment, following second strand synthesis, dscDNA is cleaved with selected restriction endonucleases to generate restriction fragments. These restriction fragments are then blunted with T4 DNA polymerase and used in place of uncut dscDNA. In this way, cDNA fragments are produced. Fragments useful in the present invention are those comprising the 3' terminus sequence GG or the 5' terminus sequence CC, but not both. That is, restriction enzyme digestion and blunting produces a dscDNA product having the 3' terminus sequence GG, or the 5' terminus sequence CC. Any restriction endonuclease that satisfies these requirements may be used. Preferred enzymes are those which do not cut DNA frequently (i.e., those with longer recognition sequences). Many such restriction endonucleases are known, see Sambrook et al., supra. When restriction endonuclease digestion is used to generate one of these termini, it will be appreciated that random primers or poly dT primers, rather than the preferred primers described above which comprise a 5' terminus CC, may be used in the cDNA synthesis step.

Particularly preferred are those fragments additionally comprising a translational start site.

The next step in the method is to ligate the cDNA molecule to a pair of adaptors, generating adaptor modified cDNA.

cDNA synthesis by prior art methods typically involves methylation of cDNA in order to avoid digestion in subsequent steps, for example, during cleavage of adaptors. dCTP can be replaced in the reaction mix with 5-methyl dCTP. Incorporation of 5-methyl dCTP into the growing first strand protects the synthetic DNA from cleavage by restriction endonucleases. dCTP can, if desired, be replaced with 5-methyl dCTP during synthesis of the second cDNA strand as well so that the second strand will also be methylated, and thereby protected from cleavage by restriction endonucleases. Hemi-methylated and fully-methylated DNA are protected from cleavage by most restriction endonucleases. Another acceptable method for protecting against digestion at internal sequences is to treat the cDNA fragments with a specific DNA methylase prior to adaptor ligation.

However, an advantage of the present invention is that cDNA need not be methylated during or after synthesis to protect from digestion, as the adaptors provided for directional cloning of random cDNAs are cut with an infrequently cutting restriction enzyme, particularly SfiI. While SfiI will cut adaptors linked to cDNAs (at one end), as described below, it will not cut cDNAs internally at a high frequency, thus obviating the need to protect cDNA with methylation.
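
A rough estimate of why SfiI rarely cuts within a cDNA can be sketched as follows. The calculation assumes a random sequence with equal base frequencies, which real cDNA only approximates, and the helper names are illustrative. The SfiI recognition sequence 5'-GGCCNNNNNGGCC-3' specifies eight bases, compared with six for a typical six-base cutter.

```python
import re

# SfiI recognition sequence: 5'-GGCCNNNNNGGCC-3' (N = any base).
# A lookahead is used so that overlapping sites are also reported.
SFII = re.compile(r"(?=(GGCC[ACGT]{5}GGCC))")


def sfii_sites(seq):
    """Return 0-based start positions of SfiI sites on the given strand."""
    return [m.start() for m in SFII.finditer(seq.upper())]


def expected_spacing(n_specified_bases):
    """Average distance between sites in random DNA with equal base frequencies."""
    return 4 ** n_specified_bases


print(expected_spacing(8))  # 65536 bp for SfiI (eight specified bases)
print(expected_spacing(6))  # 4096 bp for a six-base cutter
print(sfii_sites("ATGGCCGCCTCGGCCTTT"))  # [2]
```

Under these assumptions, a cDNA of 0.5 kb to 5.0 kb would be expected to contain an internal SfiI site only rarely.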

The present methods have an additional advantage over prior art methods for directionally cloning cDNA. In a preferred embodiment, cDNA synthesis is done using the preferred primer 5'-CCN.sub.6 -3'. Other directional cloning strategies typically use primers having long overhangs (12-20 nucleotides). These long overhangs are used to introduce restriction sites which provide for directional cloning of cDNA, but the primers are difficult to use and do not anneal to mRNA as stably as primers having short overhangs.

The primers used in the present methods comprise two nucleotides of the SfiI recognition sequence, which recognition sequence is generated in full at one cDNA end when cDNA produced with these primers is ligated to the present adaptors, as discussed below. Thus, primers with long primer overhangs comprising full restriction sites are not used in the present methods.

Adaptors are ligated to cDNA using T4 DNA ligase. The same adaptors are ligated to the 5' and 3' end of the cDNA. Preferred adaptors are generally described by the following sequence:

5'-p-C C N.sub.1 N.sub.2 N.sub.3 N.sub.4 N.sub.5 G G C C N.sub.x G G C C N.sub.6 N.sub.7 N.sub.8 N.sub.9-3' (SEQ ID NO:17)
  3'-G G N'.sub.1 N'.sub.2 N'.sub.3 N'.sub.4 N'.sub.5 C C G G N'.sub.x C C G G N'.sub.6-p-5';

wherein N.sub.1 through N.sub.9 are each any nucleotide, preferably a nucleotide selected from the group consisting of dAMP, dTMP, dGMP, dCMP, or analogs thereof which are known in the art, and where N.sub.x indicates a number of nucleotides, which may be any nucleotide, from about 1 to about 9 nucleotides, with 3 being most preferred, and wherein N' denotes a nucleotide which is complementary to N.

When the preferred adaptors are ligated to cDNA as described above, an SfiI recognition site is generated at one end of the adaptor-modified cDNA molecule. Once cut (at one end) with SfiI, the adaptor-modified cDNA has distinct, non-complementary 3' overhangs; one being N.sub.2 N.sub.3 N.sub.4, the other being N.sub.7 N.sub.8 N.sub.9. The adaptor-modified cDNA can be directionally cloned into a vector comprising distinct overhangs complementary to those of the adaptor-modified cDNA, as described below.

In addition, the preferred adaptors are designed such that unwanted blunt end ligated adaptor dimers are also cut with SfiI.

In an especially preferred embodiment, the adaptors have the following sequence:

5'-p-C C G C C T C G G C C A G T G G C C G T A A-3' (SEQ ID NO:1)
  3'-G G C G G A G C C G G T C A C C G G C-p-5';
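
The behavior of the especially preferred adaptor can be checked with a minimal Python sketch. The sketch assumes blunt-end ligation of one adaptor to each cDNA end and cleavage by SfiI at 5'-GGCCNNNN^NGGCC-3' on each strand; the cDNA sequence and helper names are hypothetical.

```python
import re

ADAPTOR_TOP = "CCGCCTCGGCCAGTGGCCGTAA"  # SEQ ID NO:1, upper strand, 5'->3'
ADAPTOR_BOTTOM = "CGGCCACTGGCCGAGGCGG"  # lower strand, rewritten 5'->3'
SFII = re.compile(r"(?=(GGCC[ACGT]{5}GGCC))")


def sfii_sites(seq):
    """Return 0-based start positions of SfiI sites (5'-GGCCNNNNNGGCC-3')."""
    return [m.start() for m in SFII.finditer(seq)]


def adaptor_modified(sense_cdna):
    """Sense strand of a blunt cDNA carrying one adaptor at each end."""
    return ADAPTOR_BOTTOM + sense_cdna + ADAPTOR_TOP


sense = "ATGACCGATTACAGG"  # hypothetical sense strand with a 3' terminus GG
construct = adaptor_modified(sense)
sites = sfii_sites(construct)
print(sites)  # [32]: a single site, spanning the cDNA 3' end and the adaptor

start = sites[0]
cut_end_overhang = construct[start + 5:start + 8]  # 3' overhang left on the cDNA
uncut_end_overhang = ADAPTOR_TOP[-3:]              # 3' overhang of the intact adaptor
print(cut_end_overhang, uncut_end_overhang)        # CCT TAA

print(sfii_sites(ADAPTOR_TOP))                         # []: the adaptor alone has no site
print(bool(sfii_sites(ADAPTOR_BOTTOM + ADAPTOR_TOP)))  # True: a blunt-end dimer is cut
```

The internal GGCCAGTGGCC of the adaptor is not cleaved in this model because only three nucleotides separate the two GGCC elements, whereas SfiI requires five.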

Excess adaptors and small cDNAs may be removed in a gel filtration step. Preferred cDNAs are from about 0.5 kb to about 5.0 kb in size.

Adaptor modified cDNA is inserted between 5' and 3' sites in an expression vector. The vector sites when cut provide distinct 3' overhangs which are complementary to the 3' overhangs of adaptor-modified cDNA which has been cut with SfiI, providing for the directional cloning of cDNA.

The preferred vector sites are as follows:

a) a 5' SfiI-a site comprising the sequence 5'-GGCCNN'.sub.9 N'.sub.8 N'.sub.7 NGGCC-3' (SEQ ID NO: 7), and an SfiI-b site located 3' of this SfiI-a site, comprising the sequence 5'-GGCCNN.sub.2 N.sub.3 N.sub.4 NGGCC-3' (SEQ ID NO: 7) as read on the same strand; or

b) a 5' SfiI-b site comprising the sequence 5'-GGCCNN'.sub.4 N'.sub.3 N'.sub.2 NGGCC-3' (SEQ ID NO: 7) and an SfiI-a site located 3' of this SfiI-b site, comprising the sequence 5'-GGCCNN.sub.7 N.sub.8 N.sub.9 NGGCC-3' (SEQ ID NO: 7) as read on the same strand;

wherein N.sub.2 N.sub.3 N.sub.4 and N.sub.7 N.sub.8 N.sub.9 are the same nucleotides denoted for adaptors, wherein N is any nucleotide, preferably a nucleotide selected from the group consisting of dAMP, dTMP, dGMP, dCMP, or analogs thereof which are known in the art, and wherein N' denotes a nucleotide which is complementary to N.

Especially preferred vector sites for use with the especially preferred adaptors described above are as follows:

a) the 5' SfiI-a site 5'-GGCCATTACGGCC-3' (SEQ ID NO:8) and the 3' SfiI-b site 5'-GGCCGCCTCGGCC-3' (SEQ ID NO:9);

b) the 5' SfiI-b site GGCCGAGGCGGCC (SEQ ID NO:18) and the 3' SfiI-a site GGCCGTAATGGCC (SEQ ID NO:19).

These sites comprise the SfiI recognition sequence, and the vector is engineered such that these are the only SfiI sites present in the expression vector.

By cleaving adaptor-modified cDNA with SfiI, distinct, non-complementary ends are produced. By cleaving vector with SfiI, ends complementary to those of the cleaved adaptor-modified cDNA are produced. The cleaved adaptor-modified cDNA can then be directionally cloned into the expression vector. Further, when cDNA modified with preferred adaptors is cloned into these preferred vector sites, SfiI sites remain flanking the cDNA insert, and SfiI may be used to excise the cDNA from the expression vector.
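
The complementarity between the especially preferred vector sites and the insert ends can be verified with the following sketch, which assumes SfiI cleavage at 5'-GGCCNNNN^NGGCC-3' on each strand and takes the insert overhangs TAA and CCT from the especially preferred adaptor. The function names are illustrative.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]


def sfii_overhangs(site):
    """Return the 3' overhangs (each 5'->3') left on the upstream and
    downstream fragments when a 13-nucleotide SfiI site is cut."""
    if len(site) != 13 or not (site.startswith("GGCC") and site.endswith("GGCC")):
        raise ValueError("not an SfiI site")
    core = site[5:8]  # the three single-stranded bases on the upper strand
    return core, reverse_complement(core)


def anneals(a, b):
    """Two 3' overhangs pair when one is the reverse complement of the other."""
    return a == reverse_complement(b)


SFII_A = "GGCCATTACGGCC"  # SEQ ID NO:8, 5' site of vector option a)
SFII_B = "GGCCGCCTCGGCC"  # SEQ ID NO:9, 3' site of vector option a)

upstream_arm, _ = sfii_overhangs(SFII_A)    # TTA
_, downstream_arm = sfii_overhangs(SFII_B)  # AGG

insert_uncut_end = "TAA"  # 3' overhang of the intact adaptor
insert_cut_end = "CCT"    # 3' overhang left where SfiI cut the adaptor at the GG end

print(anneals(upstream_arm, insert_uncut_end))  # True
print(anneals(downstream_arm, insert_cut_end))  # True
print(anneals(upstream_arm, insert_cut_end))    # False: the reverse orientation is excluded
```

Because the GG terminus of the sense strand can pair only with the downstream arm in option a), the insert enters in one orientation.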

In one embodiment, cDNA used in the methods is a cDNA restriction fragment having a 3' terminus GG, or a 5' terminus CC, but not both, as described above. As will be appreciated, those cDNA fragments which have both a 3' terminus GG and a 5' terminus CC (3' antisense GG) will not be directionally cloned, as both ends of the adaptor-modified cDNA will be cut with SfiI, generating identical 3' overhangs.
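
The selection rule for restriction fragments can be expressed as a simple check on the sense-strand termini. This is only a sketch of the rule stated above; the function name and the example sequences are hypothetical, and the treatment of fragments with neither terminus is an inference from the text rather than an express statement of it.

```python
def cloning_outcome(sense_fragment):
    """Classify a blunt-ended cDNA fragment by its sense-strand termini."""
    three_prime_gg = sense_fragment.endswith("GG")   # SfiI site forms at the 3' end
    five_prime_cc = sense_fragment.startswith("CC")  # antisense 3' GG: site forms at the 5' end
    if three_prime_gg and five_prime_cc:
        return "not directional: both adaptors are cut"
    if three_prime_gg or five_prime_cc:
        return "directional: one adaptor is cut"
    return "not directional: neither adaptor is cut"


print(cloning_outcome("ATGACCGATTACAGG"))    # directional: one adaptor is cut
print(cloning_outcome("CCATGACCGATTACAGG"))  # not directional: both adaptors are cut
print(cloning_outcome("ATGACCGATTACA"))      # not directional: neither adaptor is cut
```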

Especially preferred cDNA fragments are those fragments cut once with restriction endonuclease and comprising a native translation start site.

Additional vector sites may be used, but are less preferred because insertion of adaptor-modified cDNA at these sites does not generate an SfiI site, which is desirable for excision. In addition, SfiI is an infrequently cutting restriction endonuclease, which is desirable. The use of restriction endonucleases which cut DNA with a higher frequency than SfiI increases the chance of cutting cDNA internally upon excision from the expression vector. However, flanking sites may be engineered into a vector, and cDNA may be removed using these flanking sites, which may be SfiI sites. Additionally, cDNA may be obtained by means other than excision, for example, by PCR.

Accordingly, a vector can comprise cloning sites other than SfiI-a and SfiI-b sites, and still provide for directional cloning of cDNA that is adaptor modified as described herein. Useful vector sites are those that when cut with the corresponding restriction enzymes generate distinct 3' overhangs which are complementary to those of the adaptor-modified cDNA. Useful sites include, but are not limited to, the recognition sequences for BstAPI, PflMI, MwoI and AlwNI. When these sequences are cut with corresponding restriction enzymes, 3' overhangs 3 nucleotides in length are generated. The sequence of the overhangs is determined by the sequence of the recognition site. The consensus recognition sequences of the enzymes listed above are similar to that of SfiI, in that the core region of the sequence, which comprises the overhang sequence generated following digestion, may comprise any nucleotide sequence. For example, the MwoI recognition sequence is as follows:

5'-GCNNNNNNNGC-3' (SEQ ID NO:15)
3'-CGNNNNNNNCG-5'

where N is any nucleotide.

When cut with MwoI, the following 3' overhang is generated:

5'-GCNNNNN-3'
3'-CGNN-5'.

Accordingly, an MwoI site can be engineered in the vector to provide a specific 3' overhang sequence, which by design will be complementary to one of the adaptor modified cDNA ends.
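
The overhang produced by these enzymes can be derived from the cleavage position, as in the sketch below. The cleavage positions in the table are an assumption taken from commonly published enzyme data rather than from this disclosure, and the engineered MwoI site shown is hypothetical.

```python
# name: (recognition sequence, number of upper-strand bases 5' of the cut)
# Cut positions are assumed from commonly published enzyme data.
ENZYMES = {
    "SfiI": ("GGCCNNNNNGGCC", 8),
    "MwoI": ("GCNNNNNNNGC", 7),
    "BstAPI": ("GCANNNNNTGC", 7),
    "PflMI": ("CCANNNNNTGG", 7),
    "AlwNI": ("CAGNNNCTG", 6),
}


def overhang_length(site_len, top_cut):
    """Length of the 3' overhang for a symmetric site, where the lower
    strand is cut at the mirror-image position."""
    bottom_cut = site_len - top_cut
    return top_cut - bottom_cut


def upstream_overhang(concrete_site, top_cut):
    """3' overhang (5'->3') left on the upstream fragment of a concrete site."""
    bottom_cut = len(concrete_site) - top_cut
    return concrete_site[bottom_cut:top_cut]


for name, (pattern, cut) in ENZYMES.items():
    print(name, overhang_length(len(pattern), cut))  # each enzyme gives 3

# Hypothetical engineered MwoI site (GCNNNNNNNGC) with a chosen core.
print(upstream_overhang("GCAACCTTTGC", ENZYMES["MwoI"][1]))  # CCT
```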

Additional adaptors may also be used, but are less preferred because they do not generate SfiI recognition sites at both ends following insertion into an expression vector comprising SfiI-a and SfiI-b sites.

These adaptors are generally described by the following sequence:

5'-p-C C N.sub.1 N.sub.2 N.sub.3 N.sub.4 N.sub.5 G G C C N.sub.x N.sub.6 N.sub.7 N.sub.8-3' (SEQ ID NO:20)
  3'-G G N'.sub.1 N'.sub.2 N'.sub.3 N'.sub.4 N'.sub.5 C C G G N'.sub.x-p-5';

wherein N.sub.1 through N.sub.8 are each any nucleotide, preferably a nucleotide selected from the group consisting of dAMP, dTMP, dGMP, dCMP, or analogs thereof which are known in the art, and where N.sub.x indicates a number of nucleotides, which may be any nucleotide, from about 4 to about 14 nucleotides, more preferably about 6 to about 14, more preferably about 8 to about 14, and wherein N' denotes a nucleotide which is complementary to N.

When adaptors are ligated to cDNA as described above, an SfiI recognition site is generated at one end of the molecule. Once cut with SfiI, the adaptor-modified cDNA has distinct non-complementary 3' overhangs; one being N.sub.2 N.sub.3 N.sub.4, the other being N.sub.6 N.sub.7 N.sub.8. The adaptor-modified cDNA can be directionally cloned into an appropriate vector, which may comprise the preferred SfiI-a/b sites, or other sites capable of generating distinct overhangs complementary to those of the cDNA.

The requirement of at least four nucleotides (N.sub.x) following the 3' end of the SfiI recognition sequence (as formed at one end of the cDNA following ligation thereto) is to ensure cleavage of the sequence by SfiI, which requires some sequence following the end of the recognition sequence for effective cleavage.

Enriching for full-length cDNAs is useful in the present invention. Clones having cDNAs that comprise the 5' UTR and which are operably linked to transcription control sequences in the vector allow initiation from proper transcription initiation sites. In addition, full length cDNAs comprise native translation start sites, providing for translation of a native ORF. Further, full length cDNAs provide 5' sequence which often encodes important N-terminal functional moieties, including targeting signals.

Enriching for full length cDNAs can be done by the oligo-capping method (Maruyama and Sugano, Gene 138:171-174 (1994)). This method has been used to obtain libraries with more than 80% full-length clones (Suzuki et al., Gene 200:149-156 (1997)). Regarding the capping method, see also Kato et al. Gene 25, 243-250 (1994). Kits for performing the oligo-capping method are commercially available and may be used in the present methods. For example, see Ambion, FirstChoice.TM. RLM-RACE kit, catalog #1700, Ambion Inc., Austin, Tex., USA.

The capping method is briefly described as follows. A combination of enzymes may be used to select full length poly(A)+ mRNAs and tag their ultimate 5' ends. Starting from a population of poly(A)+ mRNAs including sequences that are not full length, a phosphatase (such as HK thermolabile phosphatase) can be used to remove the phosphate moiety from mRNAs that are not full length, leaving 5'-OH ends at those mRNAs. Full length poly(A)+ mRNAs are protected due to the 7-methyl-Gppp cap. Tobacco Acid Pyrophosphatase is then used to digest the 7-methyl-Gppp cap, leaving a 5' phosphate moiety at the 5' end of the full length mRNA. T4 RNA ligase is then used to tag the full length poly(A)+ mRNAs at their 5' ends with "oligo-caps". The oligo caps have a 3'-OH end and thus can be ligated only to poly(A)+ mRNAs displaying a 5' phosphate moiety. Thus, at the end of this procedure, the full-length mRNAs are tagged at the 5' end by an oligonucleotide and naturally at the 3' end by poly(A). Conveniently, the oligonucleotide cap is an RNA oligonucleotide, made by in vitro transcription or made by using an oligonucleotide synthesizer, or a hybrid RNA/DNA oligonucleotide made in an oligonucleotide synthesizer. The oligonucleotide cap can be engineered to include other sequences, including linker sequences for linking first and second nucleic acids, as described herein.
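
The logic of the three enzymatic steps may be followed with a small state model of the RNA 5' ends. This is a simplified sketch that assumes each enzyme acts to completion; the state labels and the function name are illustrative.

```python
def oligo_cap(five_prime_ends):
    """Follow the 5' end of each RNA through the three enzymatic steps."""
    ends = list(five_prime_ends)
    # Step 1: phosphatase removes exposed 5' phosphates; capped ends are protected.
    ends = ["OH" if e == "phosphate" else e for e in ends]
    # Step 2: tobacco acid pyrophosphatase removes the cap, leaving a 5' phosphate.
    ends = ["phosphate" if e == "cap" else e for e in ends]
    # Step 3: T4 RNA ligase joins the oligo-cap only to ends bearing a 5' phosphate.
    ends = ["oligo-capped" if e == "phosphate" else e for e in ends]
    return ends


# Hypothetical pool: two full length (capped) and two truncated (5' phosphate) RNAs.
pool = ["cap", "phosphate", "cap", "phosphate"]
print(oligo_cap(pool))  # ['oligo-capped', 'OH', 'oligo-capped', 'OH']
```

Only the RNAs that entered the procedure with a cap emerge tagged, which is the basis for the enrichment of full length cDNAs.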

In a preferred embodiment, the oligonucleotide cap is engineered to provide a 5' terminus CC (encoding a first strand cDNA 3' terminus GG). A preferred oligonucleotide CAP comprises a 5' terminus CC. Alternatively, the CAP may comprise an internal CC sequence, and the 5' terminus CC required for the method may be generated subsequent to capping with an inner PCR reaction (for example, see Ambion, FirstChoice.TM. RLM-RACE kit, catalog #1700, Ambion Inc., Austin, Tex., USA).

The 5' CAP attached to the mRNA is transcribed into cDNA. An oligonucleotide comprising the CAP oligo sequence can be annealed to the cDNA CAP and used as a primer for synthesizing the second cDNA strand.

In one embodiment, cDNA synthesis biased towards the 5' end, as opposed to the 3' end bias that typically occurs with poly dT primer, is performed. Random primers are used in combination with CAP oligos and primers. This method provides for capturing important 5' encoded functional moieties, described above, without synthesizing full length cDNA.

Normalizing cDNA synthesis may also be done. Normalizing is useful because it generally increases the diversity of isolated mRNAs. Normalizing reduces the number of abundant mRNAs while increasing the frequency of rare mRNAs in a sample. For example, abundant mRNAs can be reduced between 100- to 1000-fold, while rare mRNAs can be increased up to 100-fold. Normalized libraries are well known in the art (Soares et al., Proc. Nat'l Acad. Sci. USA 91:9228-9232 (1994); Bonaldo et al., Genome Res. 6:791-806 (1996), Komiya et al., Anal. Biochem. 254:23-30 (1997)).

Typically, normalization is carried out prior to capping and comprises the following steps:

(i) binding the poly(A)+ mRNAs to oligo d(T) coated substrate;

(ii) synthesizing cD

