Nucleotide sequence of the Haemophilus influenzae Rd genome, fragments thereof, and uses thereof. Number: 6,846,651, from the United States Patent and Trademark Office (PTO)


Title: Nucleotide sequence of the Haemophilus influenzae Rd genome, fragments thereof, and uses thereof

Abstract: The present invention provides the sequencing of the entire genome of Haemophilus influenzae Rd, SEQ ID NO:1. The present invention further provides the sequence information stored on computer readable media, and computer-based systems and methods which facilitate its use. In addition to the entire genomic sequence, the present invention identifies over 1700 protein encoding fragments of the genome and identifies, by position relative to a unique Not I restriction endonuclease site, any regulatory elements which modulate the expression of the protein encoding fragments of the Haemophilus genome.

Patent Number: 6,846,651 Issued on 01/25/2005 to Fleischmann,   et al.


Inventors: Fleischmann; Robert D. (Gaithersburg, MD); Adams; Mark D. (Rockville, MD); White; Owen (Gaithersburg, MD); Smith; Hamilton O. (Reistertown, MD); Venter; J. Craig (Queenstown, MD)
Assignee: Human Genome Sciences, Inc. (Rockville, MD); Johns Hopkins University (Baltimore, MD)
Appl. No.: 158865
Filed: June 3, 2002

Current U.S. Class: 435/69.1; 435/252.3; 435/320.1; 536/23.7
Intern'l Class: C12N 015/63; C12N001/21; C12N015/31
Field of Search: 435/69.1,320.1,252.3 536/23.7


References Cited [Referenced By]

U.S. Patent Documents
6,528,289    Mar., 2003    Fleischmann et al.    435/91.


Other References

Schein Production of soluble recombinant proteins in bacteria Biotechnology vol. 7, pp. 1141-1147 (1989).

Primary Examiner: Brusca; John S.
Attorney, Agent or Firm: Human Genome Sciences, Inc.

Government Interests



STATEMENT REGARDING FEDERALLY SPONSORED RESEARCH AND DEVELOPMENT

Part of the work performed during development of this invention utilized U.S. Government funds. The government may have certain rights in this invention. NIH-5R01GM48251
Parent Case Text



This application is a divisional of Ser. No. 09/557,884, filed Apr. 25, 2000, now U.S. Pat. No. 6,506,881, which is a continuation of Ser. No. 08/476,102, filed Jun. 7, 1995, now U.S. Pat. No. 6,355,450, which is a continuation-in-part of Ser. No. 08/426,787, filed Apr. 21, 1995, abandoned.
Claims



What is claimed is:

1. An isolated polynucleotide comprising a nucleic acid sequence encoding an amino acid sequence encoded by ORF HI0270, represented by nucleotides 301245-302267 of SEQ ID NO:1.

2. The isolated polynucleotide of claim 1, wherein said polynucleotide comprises a heterologous polynucleotide sequence.

3. The isolated polynucleotide of claim 2, wherein said heterologous polynucleotide sequence encodes a heterologous polypeptide.

4. A nucleic acid sequence complementary to the polynucleotide of claim 1.

5. A method for making a recombinant vector comprising inserting the isolated polynucleotide of claim 1 into a vector.

6. A recombinant vector comprising the isolated polynucleotide of claim 1.

7. The recombinant vector of claim 6, wherein said polynucleotide is operably associated with a heterologous regulatory sequence that controls gene expression.

8. A recombinant host cell comprising the isolated polynucleotide of claim 1.

9. The recombinant host cell of claim 8, wherein said polynucleotide is operably associated with a heterologous regulatory sequence that controls gene expression.

10. A method for producing a polypeptide, comprising:

(a) culturing a cell under conditions suitable to produce a polypeptide encoded by the polynucleotide of claim 1; and

(b) recovering the polypeptide.

11. An isolated polynucleotide comprising a nucleic acid sequence encoding a fragment of the amino acid sequence encoded by ORF HI0270, represented by nucleotides 301245-302267 of SEQ ID NO:1, wherein said fragment specifically binds an antibody which specifically binds a polypeptide consisting of the amino acid sequence of HI0270.

12. The isolated polynucleotide of claim 11, wherein said polynucleotide comprises a heterologous polynucleotide sequence.

13. The isolated polynucleotide of claim 12, wherein said heterologous polynucleotide sequence encodes a heterologous polypeptide.

14. An isolated polynucleotide complementary to the polynucleotide of claim 11.

15. A method for making a recombinant vector comprising inserting the isolated polynucleotide of claim 11 into a vector.

16. A recombinant vector comprising the isolated polynucleotide of claim 11.

17. The recombinant vector of claim 16, wherein said polynucleotide is operably associated with a heterologous regulatory sequence that controls gene expression.

18. A recombinant host cell comprising the isolated polynucleotide of claim 11.

19. The recombinant host cell of claim 18, wherein said polynucleotide is operably associated with a heterologous regulatory sequence that controls gene expression.

20. A method for producing a polypeptide, comprising:

(a) culturing a host cell under conditions suitable to produce a polypeptide encoded by the polynucleotide of claim 11; and

(b) recovering the polypeptide from the cell culture.

21. An isolated polynucleotide fragment comprising a nucleic acid sequence which hybridizes under hybridization conditions, comprising hybridization in 5×SSC and 50% formamide at 50-65° C. and washing in a wash buffer consisting of 0.5×SSC at 50-65° C., to the complementary strand of ORF HI0270, represented by nucleotides 301245-302267 of SEQ ID NO:1.

22. The isolated polynucleotide of claim 21, wherein said polynucleotide comprises a heterologous polynucleotide sequence.

23. The isolated polynucleotide of claim 22, wherein said heterologous polynucleotide sequence encodes a heterologous polypeptide.

24. An isolated polynucleotide complementary to the polynucleotide of claim 21.

25. A method for making a recombinant vector comprising inserting the isolated polynucleotide of claim 21 into a vector.

26. A recombinant vector comprising the isolated polynucleotide of claim 21.

27. The recombinant vector of claim 26, wherein said polynucleotide is operably associated with a heterologous regulatory sequence that controls gene expression.

28. A recombinant host cell comprising the isolated polynucleotide of claim 21.

29. The recombinant host cell of claim 28, wherein said polynucleotide is operably associated with a heterologous regulatory sequence that controls gene expression.

30. A method for producing a polypeptide, comprising:

(a) culturing a host cell under conditions suitable to produce a polypeptide encoded by the polynucleotide of claim 21; and

(b) recovering the polypeptide from the cell culture.

31. An isolated polynucleotide comprising a nucleic acid sequence encoding a polypeptide fragment consisting of at least 10 contiguous amino acid residues and no more than 100 amino acid residues of the amino acid sequence encoded by ORF HI0270, represented by nucleotides 301245-302267 of SEQ ID NO:1.

32. The isolated polynucleotide of claim 31, wherein said polynucleotide comprises a heterologous polynucleotide sequence.

33. The isolated polynucleotide of claim 32, wherein said heterologous polynucleotide sequence encodes a heterologous polypeptide.

34. An isolated polynucleotide complementary to the polynucleotide of claim 31.

35. A method for making a recombinant vector comprising inserting the isolated polynucleotide of claim 31 into a vector.

36. A recombinant vector comprising the isolated polynucleotide of claim 31.

37. The recombinant vector of claim 36, wherein said polynucleotide is operably associated with a heterologous regulatory sequence that controls gene expression.

38. A recombinant host cell comprising the isolated polynucleotide of claim 31.

39. The recombinant host cell of claim 38, wherein said polynucleotide is operably associated with a heterologous regulatory sequence that controls gene expression.

40. A method for producing a polypeptide, comprising:

(a) culturing a host cell under conditions suitable to produce a polypeptide encoded by the polynucleotide of claim 31; and

(b) recovering the polypeptide from the cell culture.

41. An isolated polynucleotide fragment comprising a nucleic acid sequence consisting of at least 30 contiguous nucleotide residues and no more than 300 contiguous nucleotide residues of an ORF HI0270, represented by nucleotides 301245-302267 of SEQ ID NO:1.

42. The isolated polynucleotide of claim 41, wherein said polynucleotide comprises a heterologous polynucleotide sequence.

43. The isolated polynucleotide of claim 42, wherein said heterologous polynucleotide sequence encodes a heterologous polypeptide.

44. An isolated polynucleotide complementary to the polynucleotide of claim 41.

45. A method for making a recombinant vector comprising inserting the isolated polynucleotide of claim 41 into a vector.

46. A recombinant vector comprising the isolated polynucleotide of claim 41.

47. The recombinant vector of claim 46, wherein said polynucleotide is operably associated with a heterologous regulatory sequence that controls gene expression.

48. A recombinant host cell comprising the isolated polynucleotide of claim 41.

49. The recombinant host cell of claim 48, wherein said polynucleotide is operably associated with a heterologous regulatory sequence that controls gene expression.

50. A method for producing a polypeptide, comprising:

(a) culturing a host cell under conditions suitable to produce a polypeptide encoded by the polynucleotide of claim 41; and

(b) recovering the polypeptide from the cell culture.
Description



REFERENCE TO SEQUENCE LISTING, A TABLE, OR A COMPUTER LISTING APPENDIX

This application refers to a "Sequence Listing" listed below, which is provided as an electronic document on two identical compact discs (CD-R), labeled "Copy 1" and "Copy 2." These compact discs each contain the file "PB186P2C1D1.ST25.txt" (2,385,030 bytes, created on May 31, 2002), which is hereby incorporated in its entirety herein.

1. Field of the Invention

The present invention relates to the field of molecular biology. The present invention discloses compositions comprising the nucleotide sequence of Haemophilus influenzae, fragments thereof, and their use in industrial fermentation and pharmaceutical development.

2. Background of the Invention

The complete genome sequence from a free-living cellular organism has never been determined. The first mycobacterium sequence should be completed by 1996, while E. coli and S. cerevisiae are expected to be completed before 1998. These are being done by random and/or directed sequencing of overlapping cosmid clones. No one has attempted to determine sequences of the order of a megabase or more by a random shotgun approach.

H. influenzae is a small (approximately 0.4×1 micron), non-motile, non-spore-forming, gram-negative bacterium whose only natural host is humans. It is a resident of the upper respiratory mucosa of children and adults and causes otitis media and respiratory tract infections, mostly in children. The most serious complication is meningitis, which produces neurological sequelae in up to 50% of affected children. Six H. influenzae serotypes (a through f) have been identified based on immunologically distinct capsular polysaccharide antigens. A number of non-typeable strains are also known. Serotype b accounts for the majority of human disease.

Interest in the medically important aspects of H. influenzae biology has focused particularly on those genes which determine virulence characteristics of the organism. A number of the genes responsible for the capsular polysaccharide have been mapped and sequenced (Kroll et al., Mol. Microbiol. 5(6):1549-1560 (1991)). Several outer membrane protein (OMP) genes have been identified and sequenced (Langford et al., J. Gen. Microbiol. 138:155-159 (1992)). The lipooligosaccharide (LOS) component of the outer membrane and the genes of its synthetic pathway are under intensive study (Weiser et al., J. Bacteriol. 172:3304-3309 (1990)). While a vaccine has been available since 1984, the study of outer membrane components is motivated to some extent by the need for improved vaccines. Recently, the catalase gene was characterized and sequenced as a possible virulence-related gene (Bishni et al., in press). Elucidation of the H. influenzae genome will enhance the understanding of how H. influenzae causes invasive disease and how best to combat infection.

H. influenzae possesses a highly efficient natural DNA transformation system which has been intensively studied in the non-encapsulated (R), serotype d strain (Kahn and Smith, J. Membrane Biology 81:89-103 (1984)). At least 16 transformation-specific genes have been identified and sequenced. Of these, four are regulatory (Redfield, J. Bacteriol. 173:5612-5618 (1991), and Chandler, Proc. Natl. Acad. Sci. USA 89:1626-1630 (1992)), at least two are involved in recombination processes (Barouki and Smith, J. Bacteriol 163(2):629-634 (1985)), and at least seven are targeted to the membranes and periplasmic space (Tomb et al., Gene 104:1-10 (1991), and Tomb, Proc. Natl. Acad. Sci. USA 89:10252-10256 (1992)), where they appear to function as structural components or in the assembly of the DNA transport machinery. H. influenzae Rd transformation shows a number of interesting features including sequence-specific DNA uptake, rapid uptake of several double-stranded DNA molecules per competent cell into a membrane compartment called the transformasome, linear translocation of a single strand of the donor DNA into the cytoplasm, and synapsis and recombination of the strand with the chromosome by a single-strand displacement mechanism. The H. influenzae Rd transformation system is the most thoroughly studied of the gram-negative systems and distinct in a number of ways from the gram-positive systems.

The size of the H. influenzae Rd genome has been determined by pulsed-field agarose gel electrophoresis of restriction digests to be approximately 1.9 Mb, making its genome approximately 40% the size of that of E. coli (Lee and Smith, J. Bacteriol. 170:4402-4405 (1988)). The restriction map of H. influenzae is circular (Lee et al., J. Bacteriol. 171:3016-3024 (1989), and Redfield and Lee, "Haemophilus influenzae Rd", pp. 2110-2112, In O'Brien, S. J. (ed), Genetic Maps: Locus Maps of Complex Genomes, Cold Spring Harbor Press, New York). Various genes have been mapped to restriction fragments by Southern hybridization probing of restriction digest DNA bands. This map will be valuable in verification of the assembly of a complete genome sequence from randomly sequenced fragments. GenBank currently contains about 100 kb of non-redundant H. influenzae DNA sequences. About half are from serotype b and half from Rd.

SUMMARY OF THE INVENTION

The present invention is based on the sequencing of the Haemophilus influenzae Rd genome. The primary nucleotide sequence which was generated is provided in SEQ ID NO:1.

The present invention provides the generated nucleotide sequence of the Haemophilus influenzae Rd genome, or a representative fragment thereof, in a form which can be readily used, analyzed, and interpreted by a skilled artisan. In one embodiment, the present invention is provided as a contiguous string of primary sequence information corresponding to the nucleotide sequence depicted in SEQ ID NO:1.

The present invention further provides nucleotide sequences which are at least 99.9% identical to the nucleotide sequence of SEQ ID NO:1.
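As a rough illustration of what a 99.9% identity threshold means in practice, the sketch below compares two sequences position by position. The text above does not state how identity is to be computed, so the ungapped, equal-length comparison and the function names `percent_identity` and `meets_threshold` are assumptions made only for this example.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percentage of positions at which two pre-aligned sequences match.

    Assumption: both sequences are already aligned to the same length
    with no gaps; a real comparison would first need an alignment step.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be pre-aligned to equal length")
    if not seq_a:
        raise ValueError("sequences must be non-empty")
    matches = sum(1 for a, b in zip(seq_a.upper(), seq_b.upper()) if a == b)
    return 100.0 * matches / len(seq_a)


def meets_threshold(seq_a: str, seq_b: str, threshold: float = 99.9) -> bool:
    """True when the two sequences are at least `threshold` percent identical."""
    return percent_identity(seq_a, seq_b) >= threshold
```

Under this reading, a 2,000-nucleotide fragment could differ from the reference at no more than two positions and still satisfy the 99.9% criterion.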

The nucleotide sequence of SEQ ID NO:1, a representative fragment thereof, or a nucleotide sequence which is at least 99.9% identical to the nucleotide sequence of SEQ ID NO:1 may be provided in a variety of media to facilitate its use. In one application of this embodiment, the sequences of the present invention are recorded on computer readable media. Such media include, but are not limited to: magnetic storage media, such as floppy discs, hard disc storage media, and magnetic tape; optical storage media such as CD-ROM; electrical storage media such as RAM and ROM; and hybrids of these categories such as magnetic/optical storage media.

The present invention further provides systems, particularly computer-based systems which contain the sequence information herein described stored in a data storage means. Such systems are designed to identify commercially important fragments of the Haemophilus influenzae Rd genome.

Another embodiment of the present invention is directed to isolated fragments of the Haemophilus influenzae Rd genome. The fragments of the Haemophilus influenzae Rd genome of the present invention include, but are not limited to, fragments which encode peptides, hereinafter open reading frames (ORFs), fragments which modulate the expression of an operably linked ORF, hereinafter expression modulating fragments (EMFs), fragments which mediate the uptake of a linked DNA fragment into a cell, hereinafter uptake modulating fragments (UMFs), and fragments which can be used to diagnose the presence of Haemophilus influenzae Rd in a sample, hereinafter diagnostic fragments (DFs).

Each of the ORF fragments of the Haemophilus influenzae Rd genome disclosed in Tables 1(a) and 2, and the EMF found 5' to the ORF, can be used in numerous ways as polynucleotide reagents. The sequences can be used as diagnostic probes or diagnostic amplification primers for the presence of a specific microbe in a sample, for the production of commercially important pharmaceutical agents, and to selectively control gene expression.
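To make the notion of an ORF concrete, the following is a minimal sketch of a naive ORF scan that looks for an ATG start codon followed in frame by a stop codon. It is not the method used to produce the tables referred to above; the function name `find_orfs`, the `min_codons` parameter, and the restriction to the forward strand are all assumptions chosen to keep the example short.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq: str, min_codons: int = 2) -> list[tuple[int, int]]:
    """Return 0-based, half-open (start, end) spans of ATG...stop ORFs.

    Only the three forward-strand reading frames are scanned; a fuller
    scan would repeat this on the reverse complement.
    `min_codons` counts the codons before the stop codon.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first in-frame start codon opens an ORF
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))  # span includes the stop
                start = None
    return sorted(orfs)
```

A region immediately 5' to a reported span, for example `seq[max(0, start - 100):start]`, is where one would look for a candidate EMF in this simplified picture; the 100-nucleotide window is an arbitrary illustrative value.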

The present invention further includes recombinant constructs comprising one or more fragments of the Haemophilus influenzae Rd genome of the present invention. The recombinant constructs of the present invention comprise vectors, such as a plasmid or viral vector, into which a fragment of the Haemophilus influenzae Rd genome has been inserted.

The present invention further provides host cells containing any one of the isolated fragments of the Haemophilus influenzae Rd genome of the present invention. The host cells can be a higher eukaryotic host such as a mammalian cell, a lower eukaryotic cell such as a yeast cell, or can be a procaryotic cell such as a bacterial cell.

The present invention is further directed to isolated proteins encoded by the ORFs of the present invention. A variety of methodologies known in the art can be utilized to obtain any one of the proteins of the present invention. At the simplest level, the amino acid sequence can be synthesized using commercially available peptide synthesizers. In an alternative method, the protein is purified from bacterial cells which naturally produce the protein. Lastly, the proteins of the present invention can alternatively be purified from cells which have been altered to express the desired protein.

The invention further provides methods of obtaining homologs of the fragments of the Haemophilus influenzae Rd genome of the present invention and homologs of the proteins encoded by the ORFs of the present invention. Specifically, by using the nucleotide and amino acid sequences disclosed herein as a probe or as primers, and techniques such as PCR cloning and colony/plaque hybridization, one skilled in the art can obtain homologs.

The invention further provides antibodies which selectively bind one of the proteins of the present invention. Such antibodies include both monoclonal and polyclonal antibodies.

The invention further provides hybridomas which produce the above-described antibodies. A hybridoma is an immortalized cell line which is capable of secreting a specific monoclonal antibody.

The present invention further provides methods of identifying test samples derived from cells which express one of the ORFs of the present invention, or a homolog thereof. Such methods comprise incubating a test sample with one or more of the antibodies of the present invention, or one or more of the DFs of the present invention, under conditions which allow a skilled artisan to determine if the sample contains the ORF or product produced therefrom.

In another embodiment of the present invention, kits are provided which contain the necessary reagents to carry out the above-described assays.

Specifically, the invention provides a compartmentalized kit to receive, in close confinement, one or more containers which comprises: (a) a first container comprising one of the antibodies, or one of the DFs of the present invention; and (b) one or more other containers comprising one or more of the following: wash reagents and reagents capable of detecting the presence of bound antibodies or hybridized DFs.

Using the isolated proteins of the present invention, the present invention further provides methods of obtaining and identifying agents capable of binding to a protein encoded by one of the ORFs of the present invention. Specifically, such agents include antibodies (described above), peptides, carbohydrates, pharmaceutical agents and the like. Such methods comprise the steps of:

(a) contacting an agent with an isolated protein encoded by one of the ORFs of the present invention; and

(b) determining whether the agent binds to said protein.

The complete genomic sequence of H. influenzae will be of great value to all laboratories working with this organism and for a variety of commercial purposes. Many fragments of the Haemophilus influenzae Rd genome will be immediately identified by similarity searches against GenBank or protein databases and will be of immediate value to Haemophilus researchers, as well as of immediate commercial value for the production of proteins or the control of gene expression. A specific example concerns PHA synthase. It has been reported that polyhydroxybutyrate is present in the membranes of H. influenzae Rd and that the amount correlates with the level of competence for transformation. The PHA synthase that synthesizes this polymer has been identified and sequenced in a number of bacteria, none of which are evolutionarily close to H. influenzae. This gene has yet to be isolated from H. influenzae by use of hybridization probes or PCR techniques. However, the genomic sequence of the present invention allows the identification of the gene by utilizing search means described below.

Developing the methodology and technology for elucidating the entire genomic sequence of bacterial and other small genomes has greatly enhanced, and will continue to enhance, the ability to analyze and understand chromosomal organization. In particular, sequenced genomes will provide the models for developing tools for the analysis of chromosome structure and function, including the ability to identify genes within large segments of genomic DNA, the structure, position, and spacing of regulatory elements, the identification of genes with potential industrial applications, and the ability to do comparative genomics and molecular phylogeny.

DESCRIPTION OF THE FIGURES

FIG. 1--Restriction map of the Haemophilus influenzae Rd genome.

FIG. 2--Block diagram of a computer system 102 that can be used to implement the computer-based systems of the present invention.

FIG. 3--A comparison of experimental coverage of up to approximately 4000 random sequence fragments assembled with AutoAssembler (squares) as compared to the Lander-Waterman prediction for a 2.5 Mb genome (triangles) and a 1.6 Mb genome (circles) with a 460 bp average sequence length and a 25 bp overlap.
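The Lander-Waterman prediction named in FIG. 3 can be illustrated with a minimal sketch. It assumes the standard Poisson form of the model: for N random fragments of mean length L drawn from a genome of size G, the redundancy is c = N*L/G, the expected fraction of the genome covered is 1 - exp(-c), and the expected number of contigs is N*exp(-c*sigma), where sigma = 1 - T/L and T is the minimum detectable overlap. The function name `lander_waterman` is hypothetical, and the exact variant of the model plotted in FIG. 3 is not stated in the text.

```python
import math


def lander_waterman(n_fragments, genome_size, read_length=460, min_overlap=25):
    """Poisson-model estimates for a random shotgun project (illustrative only).

    Defaults use the 460 bp average sequence length and 25 bp overlap
    quoted for FIG. 3; everything else is an assumption of this sketch.
    """
    coverage = n_fragments * read_length / genome_size  # redundancy c
    sigma = 1.0 - min_overlap / read_length             # usable fraction of a read
    return {
        "coverage": coverage,
        "fraction_covered": 1.0 - math.exp(-coverage),
        "expected_contigs": n_fragments * math.exp(-coverage * sigma),
    }


if __name__ == "__main__":
    for genome_size in (1_600_000, 2_500_000):
        estimate = lander_waterman(4000, genome_size)
        print(genome_size, estimate)
```

For a fixed number of fragments, the smaller genome reaches a higher predicted fraction covered, which is the qualitative difference between the two predicted curves described for FIG. 3.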

FIG. 4--Data flow and computer programs used to manage, assemble, edit, and annotate the H. influenzae genome. Both Macintosh and Unix platforms are used to handle the AB 373 sequence data files (Kerlavage et al., Proceedings of the Twenty-Sixth Annual Hawaii International Conference on System Sciences, IEEE Computer Society Press, Washington D.C., 585 (1993)). Factura (AB) is a Macintosh program designed for automatic vector sequence removal and end trimming of sequence files. The program esp runs on a Macintosh platform and parses the feature data extracted from the sequence files by Factura to the Unix based H. influenzae relational database. Assembly is accomplished by retrieving a specific set of sequence files and their associated features using stp, an X-windows graphical interface and control program which can retrieve sequences from the H. influenzae database using user-defined or standard SQL queries. The sequence files were assembled using TIGR Assembler, an assembly engine designed at TIGR for rapid and accurate assembly of thousands of sequence fragments. TIGR Editor is a graphical interface which can parse the aligned sequence files from TIGR Assembler output and display the alignment and associated electropherograms for contig editing. Identification of putative coding regions was performed with Genemark (Borodovsky and McIninch, Computers Chem. 17(2):123 (1993)), a Markov and Bayes modeled program for predicting gene locations, trained on an H. influenzae sequence data set. Peptide searches were performed against the three reading frames of each Genemark predicted coding region using blaze (Brutlag et al., Computers Chem. 17:203 (1993)) run on a Maspar MP-2 massively parallel computer with 4096 microprocessors. Results from each frame were combined into a single output file by mblzt. Optimal protein alignments were obtained using the program praze which extends alignments across potential frameshifts. The output was inspected using a custom graphic viewing program, gbyob, that interacts directly with the H. influenzae database. The alignments were further used to identify potential frameshift errors, which were targeted for additional editing.

FIG. 5--A circular representation of the H. influenzae Rd chromosome illustrating the location of each predicted coding region containing a database match as well as selected global features of the genome. Outer perimeter: The location of the unique NotI restriction site (designated as nucleotide 1), the RsrII sites, and the SmaI sites. Outer concentric circle: The location of each identified coding region for which a gene identification was made. Second concentric circle: Regions of high G/C content and high A/T content. High G/C content regions are specifically associated with the 6 ribosomal operons and the mu-like prophage. Third concentric circle: Coverage by lambda clones. Over 300 lambda clones were sequenced from each end to confirm the overall structure of the genome and identify the 6 ribosomal operons. Fourth concentric circle: The locations of the 6 ribosomal operons, the tRNAs and the cryptic mu-like prophage. Fifth concentric circle: Simple tandem repeats. The locations of the following repeats are shown: CTGGCT, GTCT, ATT, AATGGC, TTGA, TTGG, TTTA, TTATC, TGAC, TCGTC, AACC, TTGC, CAAT, CCAA. The putative origin of replication is illustrated by the outward pointing arrows originating near base 603,000. Two potential termination sequences are shown near the opposite midpoint of the circle.

FIGS. 6(A) to 6(AN)--Complete map of the H. influenzae Rd genome. Predicted coding regions are shown on each strand. rRNA and tRNA genes are shown as lines and triangles, respectively. GeneID numbers correspond to those in Tables 1(a), 1(b) and 2. Where possible, three-letter designations are also provided.

FIG. 7--A comparison of the region of the H. influenzae chromosome containing the 8 genes of the fimbrial gene cluster present in H. influenzae type b and the same region in H. influenzae Rd. The region is flanked by the pepN and purE genes in both organisms. However, in the non-infectious Rd strain the 8 genes of the fimbrial gene cluster have been excised. A 172 bp spacer region is located in this region in the Rd strain and continues to be flanked by the pepN and purE genes.

FIG. 8--Hydrophobicity analysis of five predicted channel proteins. The amino acid sequences of five predicted coding regions that do not display homology with known peptide sequences (GenBank release 87) each exhibit multiple hydrophobic domains that are characteristic of channel-forming proteins. The predicted coding region sequences were analyzed by the Kyte-Doolittle algorithm (Kyte and Doolittle, J. Mol. Biol. 157:105 (1982)) (with a range of 11 residues) using the GeneWorks software package (Intelligenetics).
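The kind of analysis described for FIG. 8 can be sketched as a sliding-window average of Kyte-Doolittle hydropathy values over an 11-residue range. This is a minimal illustration, not a reproduction of the GeneWorks implementation, whose internal details are not given in the text; the function name `hydropathy_profile` is hypothetical.

```python
# Kyte-Doolittle hydropathy index for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(protein, window=11):
    """Return the mean hydropathy of every full window along the sequence.

    Raises KeyError for residues outside the 20 standard amino acids.
    Returns an empty list when the sequence is shorter than the window.
    """
    scores = [KYTE_DOOLITTLE[residue] for residue in protein.upper()]
    if window < 1 or window > len(scores):
        return []
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


if __name__ == "__main__":
    # Hypothetical peptide: a hydrophobic stretch flanked by charged residues.
    print(hydropathy_profile("KKDEELLIVVALLIVVAGKKDEE"))
```

Stretches where the windowed average stays strongly positive are the candidate hydrophobic domains; any threshold used to call a domain would be a further assumption.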

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS

The present invention is based on the sequencing of the Haemophilus influenzae Rd genome. The primary nucleotide sequence which was generated is provided in SEQ ID NO:1. As used herein, the "primary sequence" refers to the nucleotide sequence represented by the IUPAC nomenclature system.

The sequence provided in SEQ ID NO:1 is oriented relative to a unique Not I restriction endonuclease site found in the Haemophilus influenzae Rd genome. A skilled artisan will readily recognize that this start/stop point was chosen for convenience and does not reflect a structural significance.

The present invention provides the nucleotide sequence of SEQ ID NO:1, or a representative fragment thereof, in a form which can be readily used, analyzed, and interpreted by a skilled artisan. In one embodiment, the sequence is provided as a contiguous string of primary sequence information corresponding to the nucleotide sequence provided in SEQ ID NO:1.

As used herein, a "representative fragment of the nucleotide sequence depicted in SEQ ID NO:1" refers to any portion of SEQ ID NO:1 which is not presently represented within a publicly available database. Preferred representative fragments of the present invention are Haemophilus influenzae open reading frames, expression modulating fragments, uptake modulating fragments, and fragments which can be used to diagnose the presence of Haemophilus influenzae Rd in a sample. A non-limiting identification of such preferred representative fragments is provided in Tables 1(a) and 2.

The nucleotide sequence information provided in SEQ ID NO:1 was obtained by sequencing the Haemophilus influenzae Rd genome using a megabase shotgun sequencing method. Using three parameters of accuracy discussed in the Examples below, the present inventors have calculated that the sequence in SEQ ID NO:1 has a maximum accuracy of 99.98%. Thus, the nucleotide sequence provided in SEQ ID NO:1 is a highly accurate, although not necessarily a 100% perfect, representation of the nucleotide sequence of the Haemophilus influenzae Rd genome.

As discussed in detail below, using the information provided in SEQ ID NO:1 and in Tables 1(a) and 2 together with routine cloning and sequencing methods, one of ordinary skill in the art will be able to clone and sequence all "representative fragments" of interest including open reading frames (ORFs) encoding a large variety of Haemophilus influenzae proteins. In very rare instances, this may reveal a nucleotide sequence error present in the nucleotide sequence disclosed in SEQ ID NO:1. Thus, once the present invention is made available (i.e., once the information in SEQ ID NO:1 and Tables 1(a) and 2 has been made available), resolving a rare sequencing error in SEQ ID NO:1 will be well within the skill of the art. Nucleotide sequence editing software is publicly available. For example, Applied Biosystems' (AB) AutoAssembler™ can be used as an aid during visual inspection of nucleotide sequences.

Even if all of the very rare sequencing errors in SEQ ID NO:1 were corrected, the resulting nucleotide sequence would still be at least 99.9% identical to the nucleotide sequence in SEQ ID NO:1.

The nucleotide sequences of the genomes from different strains of Haemophilus influenzae differ slightly. However, the nucleotide sequence of the genomes of all Haemophilus influenzae strains will be at least 99.9% identical to the nucleotide sequence provided in SEQ ID NO:1.

Thus, the present invention further provides nucleotide sequences which are at least 99.9% identical to the nucleotide sequence of SEQ ID NO:1 in a form which can be readily used, analyzed and interpreted by the skilled artisan. Methods for determining whether a nucleotide sequence is at least 99.9% identical to the nucleotide sequence of SEQ ID NO:1 are routine and readily available to the skilled artisan. For example, the well known FASTA algorithm (Pearson and Lipman, Proc. Natl. Acad. Sci. USA 85:2444 (1988)) can be used to generate the percent identity of nucleotide sequences.

Computer Related Embodiments

The nucleotide sequence provided in SEQ ID NO:1, a representative fragment thereof, or a nucleotide sequence at least 99.9% identical to SEQ ID NO:1 may be "provided" in a variety of mediums to facilitate use thereof. As used herein, "provided" refers to a manufacture, other than an isolated nucleic acid molecule, which contains a nucleotide sequence of the present invention, i.e., the nucleotide sequence provided in SEQ ID NO:1, a representative fragment thereof, or a nucleotide sequence at least 99.9% identical to SEQ ID NO:1. Such a manufacture provides the Haemophilus influenzae Rd genome or a subset thereof (e.g., a Haemophilus influenzae Rd open reading frame (ORF)) in a form which allows a skilled artisan to examine the manufacture using means not directly applicable to examining the Haemophilus influenzae Rd genome or a subset thereof as it exists in nature or in purified form.

In one application of this embodiment, a nucleotide sequence of the present invention can be recorded on computer readable media. As used herein, "computer readable media" refers to any medium which can be read and accessed directly by a computer. Such media include, but are not limited to: magnetic storage media, such as floppy discs, hard disc storage medium, and magnetic tape; optical storage media such as CD-ROM; electrical storage media such as RAM and ROM; and hybrids of these categories such as magnetic/optical storage media. A skilled artisan can readily appreciate how any of the presently known computer readable mediums can be used to create a manufacture comprising computer readable medium having recorded thereon a nucleotide sequence of the present invention.

As used herein, "recorded" refers to a process for storing information on computer readable medium. A skilled artisan can readily adopt any of the presently known methods for recording information on computer readable medium to generate manufactures comprising the nucleotide sequence information of the present invention.

A variety of data storage structures are available to a skilled artisan for creating a computer readable medium having recorded thereon a nucleotide sequence of the present invention. The choice of the data storage structure will generally be based on the means chosen to access the stored information. In addition, a variety of data processor programs and formats can be used to store the nucleotide sequence information of the present invention on computer readable medium. The sequence information can be represented in a word processing text file, formatted in commercially-available software such as WordPerfect and Microsoft Word, represented in the form of an ASCII file, or stored in a database application, such as DB2, Sybase, Oracle, or the like. A skilled artisan can readily adapt any number of data processor structuring formats (e.g. text file or database) in order to obtain computer readable medium having recorded thereon the nucleotide sequence information of the present invention.
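The two storage options named above, an ASCII text file and a database application, can be illustrated with a minimal sketch. It uses Python's built-in sqlite3 module purely as a stand-in for the database products listed in the text; the table name, column names, sequence identifier, and function names are all hypothetical.

```python
import sqlite3


def to_fasta(seq_id, sequence, width=60):
    """Format a sequence as plain ASCII text with a header line (FASTA style)."""
    lines = [f">{seq_id}"]
    lines += [sequence[i:i + width] for i in range(0, len(sequence), width)]
    return "\n".join(lines) + "\n"


def record_sequence(conn, seq_id, sequence):
    """Store a sequence in a relational table (hypothetical schema)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sequences "
        "(seq_id TEXT PRIMARY KEY, residues TEXT NOT NULL)"
    )
    conn.execute(
        "INSERT OR REPLACE INTO sequences VALUES (?, ?)", (seq_id, sequence)
    )
    conn.commit()


def fetch_region(conn, seq_id, start, end):
    """Return residues start..end (1-based, inclusive), or None if absent."""
    row = conn.execute(
        "SELECT substr(residues, ?, ?) FROM sequences WHERE seq_id = ?",
        (start, end - start + 1, seq_id),
    ).fetchone()
    return row[0] if row else None


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    record_sequence(conn, "demo", "ACGTACGTAC")  # toy sequence, not SEQ ID NO:1
    print(fetch_region(conn, "demo", 3, 6))
    print(to_fasta("demo", "ACGTACGTAC", width=4), end="")
```

Which structure is preferable depends, as the text notes, on how the stored information will be accessed: the flat file suits whole-sequence tools, while the table supports coordinate-based retrieval.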

By providing the nucleotide sequence of SEQ ID NO: 1, a representative fragment thereof, or a nucleotide sequence at least 99.9% identical to SEQ ID NO:1 in computer readable form, a skilled artisan can routinely access the sequence information for a variety of purposes. Computer software is publicly available which allows a skilled artisan to access sequence information provided in a computer readable medium. The examples which follow demonstrate how software which implements the BLAST (Altschul et al., J. Mol. Biol. 215:403-410 (1990)) and BLAZE (Brutlag et al., Comp. Chem. 17:203-207 (1993)) search algorithms on a Sybase system was used to identify open reading frames (ORFs) within the Haemophilus influenzae Rd genome which contain homology to ORFs or proteins from other organisms. Such ORFs are protein encoding fragments within the Haemophilus influenzae Rd genome and are useful in producing commercially important proteins such as enzymes used in fermentation reactions and in the production of commercially useful metabolites.

The present invention further provides systems, particularly computer-based systems, which contain the sequence information described herein. Such systems are designed to identify commercially important fragments of the Haemophilus influenzae Rd genome.

As used herein, "a computer-based system" refers to the hardware means, software means, and data storage means used to analyze the nucleotide sequence information of the present invention. The minimum hardware means of the computer-based systems of the present invention comprises a central processing unit (CPU), input means, output means, and data storage means. A skilled artisan can readily appreciate that any one of the currently available computer-based systems is suitable for use in the present invention.

As stated above, the computer-based systems of the present invention comprise a data storage means having stored therein a nucleotide sequence of the present invention and the necessary hardware means and software means for supporting and implementing a search means. As used herein, "data storage means" refers to memory which can store nucleotide sequence information of the present invention, or a memory access means which can access manufactures having recorded thereon the nucleotide sequence information of the present invention.

As used herein, "search means" refers to one or more programs which are implemented on the computer-based system to compare a target sequence or target structural motif with the sequence information stored within the data storage means. Search means are used to identify fragments or regions of the Haemophilus influenzae Rd genome which match a particular target sequence or target motif. A variety of known algorithms are disclosed publicly, and a variety of commercially available software for conducting search means is available and can be used in the computer-based systems of the present invention. Examples of such software include, but are not limited to, MacPattern (EMBL), BLASTN and BLASTX (NCBI). A skilled artisan can readily recognize that any one of the available algorithms or implementing software packages for conducting homology searches can be adapted for use in the present computer-based systems.
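As a minimal sketch of a search means, the following compares a target sequence with a stored sequence and reports every exact match on either strand. It is deliberately far simpler than the homology search programs named above, which score inexact matches; the function names and the toy sequence are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def find_target(stored_sequence, target):
    """Return sorted (position, strand) pairs for exact matches of the target.

    Positions are 1-based and always refer to the forward strand. A target
    that equals its own reverse complement is reported on both strands.
    """
    if not target:
        raise ValueError("target sequence must not be empty")
    stored_sequence = stored_sequence.upper()
    hits = []
    for strand, query in (("+", target.upper()), ("-", reverse_complement(target))):
        pos = stored_sequence.find(query)
        while pos != -1:
            hits.append((pos + 1, strand))
            pos = stored_sequence.find(query, pos + 1)  # allow overlapping hits
    return sorted(hits)


if __name__ == "__main__":
    print(find_target("ATGCCCGGGCAT", "ATG"))
```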

As used herein, a "target sequence" can be any DNA or amino acid sequence of six or more nucleotides or two or more amino acids. A skilled artisan can readily recognize that the longer a target sequence is, the less likely a target sequence will be present as a random occurrence in the database. The most preferred sequence length of a target sequence is from about 10 to 100 amino acids or from about 30 to 300 nucleotide residues. However, it is well recognized that searches for commercially important fragments of the Haemophilus influenzae Rd genome, such as sequence fragments involved in gene expression and protein processing, may be of shorter length.
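The relationship between target length and chance occurrence can be made concrete with a small calculation. The sketch below assumes a database of independent, equally frequent residues, which real genomes only approximate; under that assumption a target of length k is expected to occur about (n - k + 1) / a^k times in a database of n residues over an alphabet of size a, doubled when both DNA strands are searched. The function name and the 1.8 Mb database size are illustrative assumptions.

```python
def expected_random_hits(target_length, database_length,
                         alphabet_size=4, both_strands=True):
    """Expected number of chance exact matches under a uniform random model."""
    positions = max(0, database_length - target_length + 1)
    expected = positions * (1.0 / alphabet_size) ** target_length
    return 2.0 * expected if both_strands else expected


if __name__ == "__main__":
    database_length = 1_800_000  # hypothetical database size, in nucleotides
    for k in (6, 12, 30):
        print(k, expected_random_hits(k, database_length))
    # Amino acid targets: 20-letter alphabet, no strand doubling.
    print(expected_random_hits(10, 600_000, alphabet_size=20, both_strands=False))
```

Under these assumptions a six-nucleotide target is expected to turn up several hundred times by chance in a database of this size, while a 30-nucleotide target is expected essentially never, which is consistent with the preferred lengths stated above.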

As used herein, "a target structural motif," or "target motif," refers to any rationally selected sequence or combination of sequences in which the sequence(s) are chosen based on a three-dimensional configuration which is formed upon the folding of the target motif. There are a variety of target motifs known in the art. Protein target motifs include, but are not limited to, enzymic active sites and signal sequences. Nucleic acid target motifs include, but are not limited to, promoter sequences, hairpin structures and inducible expression elements (protein binding sequences).

A variety of structural formats for the input and output means can be used to input and output the information in the computer-based systems of the present invention. A preferred format for an output means ranks fragments of the Haemophilus influenzae Rd genome possessing varying degrees of homology to the target sequence or target motif. Such presentation provides a skilled artisan with a ranking of sequences which contain various amounts of the target sequence or target motif and identifies the degree of homology contained in the identified fragment.

A variety of comparing means can be used to compare a target sequence or target motif with the data storage means to identify sequence fragments of the Haemophilus influenzae Rd genome. In the present examples, software implementing the BLAST and BLAZE algorithms (Altschul et al., J. Mol. Biol. 215:403-410 (1990)) was used to identify open reading frames within the Haemophilus influenzae Rd genome. A skilled artisan can readily recognize that any one of the publicly available homology search programs can be used as the search means for the computer-based systems of the present invention.

One application of this embodiment is provided in FIG. 2. FIG. 2 provides a block diagram of a computer system 102 that can be used to implement the present invention. The computer system 102 includes a processor 106 connected to a bus 104. Also connected to the bus 104 are a main memory 108 (preferably implemented as random access memory, RAM) and a variety of secondary storage devices 110, such as a hard drive 112 and a removable medium storage device 114. The removable medium storage device 114 may represent, for example, a floppy disk drive, a CD-ROM drive, a magnetic tape drive, etc. A removable storage medium 116 (such as a floppy disk, a compact disk, a magnetic tape, etc.) containing control logic and/or data recorded therein may be inserted into the removable medium storage device 114. The computer system 102 includes appropriate software for reading the control logic and/or the data from the removable medium storage device 114 once inserted in the removable medium storage device 114.

A nucleotide sequence of the present invention may be stored in a well known manner in the main memory 108, any of the secondary storage devices 110, and/or a removable storage medium 116. Software for accessing and processing the genomic sequence (such as search tools, comparing tools, etc.) resides in main memory 108 during execution.

Biochemical Embodiments

Another embodiment of the present invention is directed to isolated fragments of the Haemophilus influenzae Rd genome. The fragments of the Haemophilus influenzae Rd genome of the present invention include, but are not limited to fragments which encode peptides, hereinafter open reading frames (ORFs), fragments which modulate the expression of an operably linked ORF, hereinafter expression modulating fragments (EMFs), fragments which mediate the uptake of a linked DNA fragment into a cell, hereinafter uptake modulating fragments (UMFs), and fragments which can be used to diagnose the presence of Haemophilus influenzae Rd in a sample, hereinafter diagnostic fragments (DFs).

As used herein, an "isolated nucleic acid molecule" or an "isolated fragment of the Haemophilus influenzae Rd genome" refers to a nucleic acid molecule possessing a specific nucleotide sequence which has been subjected to purification means to reduce, from the composition, the number of compounds which are normally associated with the composition. A variety of purification means can be used to generate the isolated fragments of the present invention. These include, but are not limited to, methods which separate constituents of a solution based on charge, solubility, or size.

In one embodiment, Haemophilus influenzae Rd DNA can be mechanically sheared to produce fragments of 15-20 kb in length. These fragments can then be used to generate a Haemophilus influenzae Rd library by inserting them into lambda clones as described in the Examples below. Primers flanking, for example, an ORF provided in Table 1(a) can then be generated using nucleotide sequence information provided in SEQ ID NO:1. PCR cloning can then be used to isolate the ORF from the lambda DNA library. PCR cloning is well known in the art. Thus, given the availability of SEQ ID NO:1, Table 1(a) and Table 2, it would be routine to isolate any ORF or other nucleic acid fragment of the present invention.

The isolated nucleic acid molecules of the present invention include, but are not limited to single stranded and double stranded DNA, and single stranded RNA.

As used herein, an "open reading frame," ORF, means a series of triplets coding for amino acids without any termination codons and is a sequence translatable into protein. Tables 1(a), 1(b) and 2 identify ORFs in the Haemophilus influenzae Rd genome. In particular, Table 1(a) indicates the location of ORFs within the Haemophilus influenzae genome which encode the recited protein based on homology matching with protein sequences from the organism appearing in parentheticals (see the fourth column of Table 1(a)).
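The definition of an open reading frame given above, a run of triplets with no termination codon, can be sketched directly. The following scans the three forward reading frames of a sequence and reports stop-free runs of at least a chosen number of codons. It is an illustration of the definition only, not of the trained gene-prediction program used in the Examples; the function name and the minimum-length parameter are hypothetical, and the opposite strand would be scanned by passing the reverse complement.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def open_frames(seq, min_codons=2):
    """Return (start, end, frame) for stop-free triplet runs on one strand.

    Coordinates are 1-based and inclusive; frame is 1, 2 or 3. The stop
    codon that ends a run is not included in the reported interval.
    """
    seq = seq.upper()
    results = []
    for frame in range(3):
        run_start = frame
        for i in range(frame, len(seq) - 2, 3):
            if seq[i:i + 3] in STOP_CODONS:
                if (i - run_start) // 3 >= min_codons:
                    results.append((run_start + 1, i, frame + 1))
                run_start = i + 3
        # Close the final run at the last complete codon in this frame.
        end = frame + ((len(seq) - frame) // 3) * 3
        if (end - run_start) // 3 >= min_codons:
            results.append((run_start + 1, end, frame + 1))
    return results


if __name__ == "__main__":
    print(open_frames("ATGAAATAGCCC", min_codons=2))
```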

The first column of Table 1(a) provides the "GeneID" of a particular ORF. This information is useful for two reasons. First, the complete map of the Haemophilus influenzae Rd genome provided in FIGS. 6(A) to 6(AN) refers to the ORFs according to their GeneID numbers. Second, Table 1(b) uses the GeneID numbers to indicate which ORFs were provided previously in a public database.

The second and third columns in Table 1(a) indicate an ORF's position in the nucleotide sequence provided in SEQ ID NO:1. One of ordinary skill will recognize that ORFs may be oriented in opposite directions in the Haemophilus influenzae genome. This is reflected in columns 2 and 3.

The fifth column of Table 1(a) indicates the percent identity of the protein encoded for by an ORF to the corresponding protein from the organism appearing in parentheticals in the fourth column.

The sixth column of Table 1(a) indicates the percent similarity of the protein encoded for by an ORF to the corresponding protein from the organism appearing in parentheticals in the fourth column. The concepts of percent identity and percent similarity of two polypeptide sequences are well understood in the art. For example, two polypeptides 10 amino acids in length which differ at three amino acid positions (e.g., at positions 1, 3 and 5) are said to have a percent identity of 70%. However, the same two polypeptides would be deemed to have a percent similarity of 80% if, for example at position 5, the amino acid moieties, although not identical, were "similar" (i.e., possessed similar biochemical characteristics).
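The worked example above can be reproduced with a short sketch that scores two already-aligned polypeptides of equal length. The two 10-residue sequences and the set of residue pairs treated as "similar" are hypothetical, chosen only so that the sequences differ at positions 1, 3 and 5 and are similar at position 5.

```python
def percent_identity_similarity(seq_a, seq_b, similar_pairs):
    """Return (percent identity, percent similarity) for aligned sequences.

    similar_pairs is a set of frozensets, each holding two residues that
    are treated as biochemically similar. No gaps are handled here.
    """
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be non-empty and of equal length")
    identical = sum(a == b for a, b in zip(seq_a, seq_b))
    similar = sum(
        a != b and frozenset((a, b)) in similar_pairs
        for a, b in zip(seq_a, seq_b)
    )
    length = len(seq_a)
    return 100.0 * identical / length, 100.0 * (identical + similar) / length


if __name__ == "__main__":
    first = "AKDLIGTWSE"
    second = "GKNLVGTWSE"          # differs at positions 1, 3 and 5
    similar = {frozenset("IV")}    # assumption: only I/V counts as similar
    print(percent_identity_similarity(first, second, similar))  # (70.0, 80.0)
```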

The seventh column in Table 1(a) indicates the length of the amino acid homology match.

Table 2 provides ORFs of the Haemophilus influenzae Rd genome which encode polypeptide sequences which did not elicit a "homology match" with a known protein sequence from another organism. Further details concerning the algorithms and criteria used for homology searches are provided in the Examples below.

A skilled artisan can readily identify ORFs in the Haemophilus influenzae Rd genome other than those listed in Tables 1(a), 1(b) and 2, such as ORFs which are overlapping or encoded by the opposite strand of an identified ORF in addition to those ascertainable using the computer-based systems of the present invention.

As used herein, an "expression modulating fragment," EMF, means a series of nucleotide molecules which modulates the expression of an operably linked ORF or EMF.

As used herein, a sequence is said to "modulate the expression of an operably linked sequence" when the expression of the sequence is altered by the presence of the EMF. EMFs include, but are not limited to, promoters, and promoter modulating sequences (inducible elements). One class of EMFs are fragments which induce the expression of an operably linked ORF in response to a specific regulatory factor or physiological event. Known EMFs from Haemophilus are reviewed by Tomb et al., Gene 104:1-10 (1991), and Chandler, M. S., Proc. Natl. Acad. Sci. USA 89:1626-1630 (1992).

EMF sequences can be identified within the Haemophilus influenzae Rd genome by their proximity to the ORFs provided in Tables 1(a), 1(b) and 2. An intergenic segment, or a fragment of the intergenic segment, from about 10 to 200 nucleotides in length, taken 5' from any one of the ORFs of Tables 1(a), 1(b), or 2 will modulate the expression of an operably linked 3' ORF in a fashion similar to that found with the naturally linked ORF sequence. As used herein, an "intergenic segment" refers to the fragments of the Haemophilus genome which are between two ORF(s) herein described. Alternatively, EMFs can be identified using known EMFs as a target sequence or target motif in the computer-based systems of the present invention.
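Taking a candidate segment 5' of an ORF can be sketched as a coordinate calculation. The example assumes a hypothetical convention in which an ORF on the reverse strand is listed with its start coordinate greater than its end coordinate, and it does not check where the neighboring ORF begins, so the returned stretch may extend beyond the true intergenic segment. Function names and the toy sequence are illustrative.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def upstream_segment(genome, orf_start, orf_end, max_len=200):
    """Return up to max_len nucleotides lying 5' of an ORF, read 5' to 3'.

    Coordinates are 1-based. orf_start > orf_end marks a reverse-strand
    ORF (an assumption of this sketch).
    """
    genome = genome.upper()
    if orf_start <= orf_end:
        # Forward strand: the 5' flank sits at lower coordinates.
        low = max(0, orf_start - 1 - max_len)
        return genome[low:orf_start - 1]
    # Reverse strand: the 5' flank sits at higher coordinates.
    high = min(len(genome), orf_start + max_len)
    return reverse_complement(genome[orf_start:high])


if __name__ == "__main__":
    toy = "ACGTTTATGCCC"
    print(upstream_segment(toy, 7, 12, max_len=4))  # forward-strand ORF
    print(upstream_segment(toy, 6, 1, max_len=4))   # reverse-strand ORF
```

A segment obtained this way is only a candidate; as the text goes on to describe, its activity would still need to be confirmed experimentally.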

The presence and activity of an EMF can be confirmed using an EMF trap vector. An EMF trap vector contains a cloning site 5' to a marker sequence. A marker sequence encodes an identifiable phenotype, such as antibiotic resistance or a complementing nutrition auxotrophic factor, which can be identified or assayed when the EMF trap vector is placed within an appropriate host under appropriate conditions. As described above, an EMF will modulate the expression of an operably linked marker sequence. A more detailed discussion of various marker sequences is provided below.

A sequence which is suspected of being an EMF is cloned in all three reading frames in one or more restriction sites upstream from the marker sequence in the EMF trap vector. The vector is then transformed into an appropriate host using known procedures and the phenotype of the transformed host is examined under appropriate conditions. As described above, an EMF will modulate the expression of an operably linked marker sequence.

As used herein, an "uptake modulating fragment," UMF, means a series of nucleotide molecules which mediate the uptake of a linked DNA fragment into a cell. UMFs can be readily identified using known UMFs as a target sequence or target motif with the computer-based systems described above.

The presence and activity of a UMF can be confirmed by attaching the suspected UMF to a marker sequence. The resulting nucleic acid molecule is then incubated with an appropriate host under appropriate conditions and the uptake of the marker sequence is determined. As described above, a UMF will increase the frequency of uptake of a linked marker sequence. A review of DNA uptake in Haemophilus is provided by Goodgall, S. H., et al., J. Bact. 172:5924-5928 (1990).

As used herein, a "diagnostic fragment," DF, means a series of nucleotide molecules which selectively hybridize to Haemophilus influenzae sequences. DFs can be readily identified by identifying unique sequences within the Haemophilus influenzae Rd genome, or by generating and testing probes or amplification primers consisting of the DF sequence in an appropriate diagnostic format which determines amplification or hybridization selectivity.

The sequences falling within the scope of the present invention are not limited to the specific sequences herein described, but also include allelic and species variations thereof. Allelic and species variations can be routinely determined by comparing the sequence provided in SEQ ID NO:1, a representative fragment thereof, or a nucleotide sequence at least 99.9% identical to SEQ ID NO:1 with a sequence from another isolate of the same species. Furthermore, to accommodate codon variability, the invention includes nucleic acid molecules coding for the same amino acid sequences as do the specific ORFs disclosed herein. In other words, in the coding region of an ORF, substitution of one codon for another which encodes the same amino acid is expressly contemplated.
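The codon-substitution point can be checked mechanically: two coding sequences that differ only by synonymous codons translate to the same amino acid sequence. The sketch below builds the standard genetic code table and compares translations; the function names and the short example sequences are hypothetical.

```python
from itertools import product

# Standard genetic code, codons ordered T, C, A, G at each position.
_BASES = "TCAG"
_AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(_BASES, repeat=3), _AMINO_ACIDS)
}


def translate(coding_sequence):
    """Translate complete codons in frame 1; '*' marks a termination codon."""
    seq = coding_sequence.upper()
    return "".join(
        CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq) - 2, 3)
    )


def encodes_same_protein(seq_a, seq_b):
    """True when both coding sequences translate to the same residues."""
    return translate(seq_a) == translate(seq_b)


if __name__ == "__main__":
    print(translate("ATGAAATAA"))
    print(encodes_same_protein("ATGAAATAA", "ATGAAGTAG"))  # synonymous changes
    print(encodes_same_protein("ATGAAATAA", "ATGAACTAA"))  # AAA -> AAC changes K to N
```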

Any specific sequence disclosed herein can be readily screened for errors by resequencing a particular fragment, such as an ORF, in both directions (i.e., sequence both strands). Alternatively, error screening can be performed by sequencing corresponding polynucleotides of Haemophilus influenzae origin isolated by using part or all of the fragments in question as a probe or primer. Each of the ORFs of the Haemophilus influenzae Rd genome disclosed in Tables 1(a), 1(b) and 2, and the EMF found 5' to the ORF, can be used in numerous ways as polynucleotide reagents. The sequences can be used as diagnostic probes or diagnostic amplification primers to detect the presence of a specific microbe, such as Haemophilus influenzae Rd, in a sample. This is especially the case with the fragments or ORFs of Table 2, which will be highly selective for Haemophilus influenzae.

In addition, the fragments of the present invention, as broadly described, can be used to control gene expression through triple helix formation or antisense DNA or RNA, both of which methods are based on the binding of a polynucleotide sequence to DNA or RNA. Polynucleotides suitable for use in these methods are usually 20 to 40 bases in length and are designed to be complementary to a region of the gene involved in transcription (triple helix--see Lee et al., Nucl. Acids Res. 6:3073 (1979); Cooney et al., Science 241:456 (1988); and Dervan et al., Science 251:1360 (1991)) or to the mRNA itself (antisense--Okano, J. Neurochem. 56:560 (1991); Oligodeoxynucleotides as Antisense Inhibitors of Gene Expression, CRC Press, Boca Raton, Fla. (1988)). Triple helix formation optimally results in a shut-off of RNA transcription from DNA, while antisense RNA hybridization blocks translation of an mRNA molecule into polypeptide. Both techniques have been demonstrated to be effective in model systems. Information contained in the sequences of the present invention is necessary for the design of an antisense or triple helix oligonucleotide.
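The sequence-level step of antisense design, deriving an oligonucleotide complementary to a chosen region of an mRNA, can be sketched as a reverse-complement operation. The default length of 20 bases follows the 20 to 40 base range mentioned above; the function name and the toy transcript are hypothetical, and the sketch says nothing about which region of a transcript would make an effective target.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def antisense_oligo(mrna, start, length=20):
    """Return a DNA oligo (written 5' to 3') complementary to an mRNA region.

    start is 1-based. The transcript may be given in RNA (U) or DNA (T)
    letters. Raises ValueError if the region runs past the transcript end.
    """
    target = mrna.upper().replace("U", "T")[start - 1:start - 1 + length]
    if len(target) < length:
        raise ValueError("target region runs past the end of the transcript")
    return target.translate(_COMPLEMENT)[::-1]


if __name__ == "__main__":
    toy_transcript = "AUGGCCAAAGGGUUUCCCAUGGCC"  # illustrative only
    print(antisense_oligo(toy_transcript, 1, 20))
```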

The present invention further provides recombinant constructs comprising one or more fragments of the Haemophilus influenzae Rd genome of the present invention. The recombinant constructs of the present invention comprise a vector, such as a plasmid or viral vector, into which a fragment of the Haemophilus influenzae Rd genome has been inserted, in a forward or reverse orientation. In the case of a vector comprising one of the ORFs of the present invention, the vector may further comprise regulatory sequences, including for example, a promoter, operably linked to the ORF. For vectors comprising the EMFs and UMFs of the present invention, the vector may further comprise a marker sequence or heterologous ORF operably linked to the EMF or UMF. Large numbers of suitable vectors and promoters are known to those of skill in the art and are commercially available for generating the recombinant constructs of the present invention. The following vectors are provided by way of example. Bacterial: pBs, phagescript, PsiX174, pBluescript SK, pBs KS, pNH8a, pNH16a, pNH18a, pNH46a (Stratagene); pTrc99A, pKK223-3, pKK233-3, pDR540, pRIT5 (Pharmacia). Eukaryotic: pWLneo, pSV2cat, pOG44, pXT1, pSG (Stratagene); pSVK3, pBPV, pMSG, pSVL (Pharmacia).

Promoter regions can be selected from any desired gene using CAT (chloramphenicol transferase) vectors or other vectors with selectable markers. Two appropriate vectors are pKK232-8 and pCM7. Particular named bacterial promoters include lacI, lacZ, T3, T7, gpt, lambda P_R, and trc. Eukaryotic

