
Method, system, and computer software for the presentation and storage of analysis results. Number: 7,031,846, from the United States Patent and Trademark Office (PTO)

Title: Method, system, and computer software for the presentation and storage of analysis results

Abstract: A computer program product, and related systems and methods, are described that process emission intensity data corresponding to probes of a biological probe array. The computer program includes a genotype and statistical analysis manager that determines absolute or relative expression values based, at least in part, on a statistical measure of the emission intensity data and at least one user-selectable statistical parameter. The analysis manager may also determine genotype calls for one or more probes based, at least in part, on the emission intensity data. The analysis manager may further display the absolute or relative expression values based, at least in part, on at least one user-selectable display parameter and/or a measure of normalized change between genotype calls. The measure of normalized change may be based, at least in part, on a comparison of genotype calls and a reference value.

Patent Number: 7,031,846 Issued on 04/18/2006 to Kaushikkar, et al.


Inventors: Kaushikkar; Shantanu (San Jose, CA); Webster; Teresa (Loma Mar, CA); Mei; Rui (Santa Clara, CA); McAllister; Linda (San Francisco, CA)
Assignee: Affymetrix, Inc. (Santa Clara, CA)
Appl. No.: 219882
Filed: August 15, 2002


Current U.S. Class: 702/19 ; 435/6; 436/501; 536/23.1; 702/20; 707/102
Current International Class: G01N 33/48 (20060101); C12Q 1/68 (20060101); G06F 7/00 (20060101)
Field of Search: 702/19,20 707/102 435/6 536/23.1


References Cited

U.S. Patent Documents
6300078 October 2001 Friend et al.
6453241 September 2002 Bassett, Jr. et al.
2002/0016680 February 2002 Wang et al.
2002/0029113 March 2002 Wang et al.
2002/0059326 May 2002 Bernhart et al.
2002/0103604 August 2002 Liu et al.
2002/0165674 November 2002 Bassett, Jr. et al.
2003/0009292 January 2003 Mei et al.
2004/0138821 July 2004 Chiles et al.
Foreign Patent Documents
WO 01/21839 Mar., 2001 WO
WO 02/095659 Nov., 2002 WO

Other References

Fan, J-B et al. Parallel Genotyping of Human SNP's Using Generic High-Density Oligonucleotide Tag Arrays. Genome Research, Jun. 2000, vol. 10, pp. 853-860. cited by examiner.

Primary Examiner: Zeman; Mary K.
Attorney, Agent or Firm: McCarthy, III; William R. McGarrigle; Philip L. Sherr; Alan B.

Parent Case Text



RELATED APPLICATIONS

The present application claims priority to U.S. Provisional Patent Application No. 60/312,906, titled "METHODS AND SYSTEMS FOR EVALUATING ALLELIC IMBALANCE AND PERFORMING OTHER GENOMIC ANALYSIS FUNCTIONS" filed Aug. 16, 2001, which is hereby incorporated by reference herein in its entirety for all purposes.
Claims



What is claimed is:

1. A method of presenting a diagnostic call to a user associated with genotype call data from two samples from the same individual, comprising the steps of: (a) receiving a first set of genotype call data derived from a first sample from an individual, a second set of genotype call data derived from a second sample from the individual, and a set of reference data comprising one or more reference values, wherein each of the first and second sets of genotype call data comprise one or more quantitative genotype representations derived from one or more detected intensity values from at least one probe of a corresponding probe set disposed on a probe array; (b) calculating a difference value using a quantitative genotype representation from the first set of genotype call data corresponding to a first probe set and a quantitative genotype representation from the second set of genotype call data corresponding to the first probe set; (c) normalizing the difference value using a first reference value to produce a normalized difference value, wherein the first reference value comprises a measure of variation specific to the first probe set; (d) comparing the normalized difference value with one or more disease data profiles to produce a diagnostic call; and (e) presenting the diagnostic call to a user.

2. The method of claim 1, wherein: the first reference value includes a standard deviation value.

3. The method of claim 1, wherein: the step of presenting further comprises displaying the normalized difference value to the user in a graphical user interface.

4. The method of claim 3, wherein: the graphical user interface includes data displayed in a text format.

5. The method of claim 3, wherein: the graphical user interface includes data displayed in a graphical format.

6. The method of claim 3, wherein: the graphical user interface displays a relationship of one or more identification data with the normalized difference value.

7. The method of claim 6, wherein: the identification data is any one or combination of a probe set identifier, one or more SNP locations, one or more genotype calls, or one or more relative allele signals.

8. The method of claim 7, wherein: the one or more SNP locations include a chromosome number or an estimated genetic distance from a location associated with a chromosome.

9. The method of claim 8, wherein: the location comprises the top of the short arm of the chromosome.

10. The method of claim 8, wherein: the estimated genetic distance is expressed in centimorgans.

11. The method of claim 6, wherein: the relationship is displayed as a geometric association.

12. The method of claim 11, wherein: the geometric association includes a column or row of graphical or textual elements.

13. The method of claim 6, wherein: the relationship is displayed in a color, shade, or intensity associated with the normalized difference value.

14. The method of claim 1, further comprising: (f) generating report data comprising the normalized difference value for the first probe set.

15. The method of claim 14, further comprising: (g) storing the report data in one or more databases.

16. The method of claim 1, wherein: the quantitative genotype representations from the first and second sets of genotype call data and the reference values from the set of reference data correspond to the same probe array type.

17. The method of claim 1, wherein: the reference values are experimentally derived.

18. The method of claim 1, wherein: the reference values are modifiable or replaceable by a user selection.

19. The method of claim 1, wherein: the first probe set interrogates a SNP.

20. The method of claim 1, wherein: the difference value comprises an absolute value of the difference between the quantitative genotype representation from the first set of genotype call data and the quantitative genotype representation from the second set of genotype call data.

21. The method of claim 1, wherein: the normalized difference value indicates a change in a genotype call if greater than a threshold value.

22. The method of claim 21, wherein: the threshold value comprises a value of 0.2.

23. A computer system for presenting a diagnostic call to a user associated with genotype call data from two samples from the same individual, comprising: a computer having executable code stored thereon, the executable code performing the method, comprising: (a) receiving a first set of genotype call data derived from a first sample from an individual, a second set of genotype call data derived from a second sample from the individual, and a set of reference data comprising one or more reference values, wherein each of the first and second sets of genotype call data comprise one or more quantitative genotype representations derived from one or more detected intensity values from at least one probe of a corresponding probe set disposed on a probe array; (b) calculating a difference value using a quantitative genotype representation from the first set of genotype call data corresponding to a first probe set and a quantitative genotype representation from the second set of genotype call data corresponding to the first probe set; (c) normalizing the difference value using a first reference value to produce a normalized difference value, wherein the first reference value comprises a measure of variation specific to the first probe set; (d) comparing the normalized difference value with one or more disease data profiles to produce a diagnostic call; and (e) presenting the diagnostic call to a user.

24. A method of producing a diagnostic call from normalized difference values associated with genotype call data made from two samples from the same individual, comprising the steps of: (a) receiving a first set of genotype call data derived from a first sample from an individual, a second set of genotype call data derived from a second sample from the individual, and a set of reference data comprising one or more reference values, wherein each of the first and second sets of genotype call data comprise one or more quantitative genotype representations derived from one or more detected intensity values from at least one probe of a corresponding probe set disposed on a probe array; (b) calculating a difference value using a quantitative genotype representation from the first set of genotype call data corresponding to a first probe set and a quantitative genotype representation from the second set of genotype call data corresponding to the first probe set; (c) normalizing the difference value using a first reference value from the set of reference data to produce a normalized difference value, wherein the first reference value comprises a measure of variation specific to the first probe set; (d) repeating steps (b) and (c) for the quantitative genotype representations from the first and second sets of genotype call data and the reference values from the set of reference data corresponding to a plurality of probe sets; and (e) comparing the normalized difference values for the probe sets with one or more disease data profiles to produce a diagnostic call.

25. The method of claim 24, further comprising: (f) generating report data comprising the normalized difference values for the plurality of probe sets.

26. The method of claim 25, further comprising: (g) storing the report data in one or more databases.

27. The method of claim 24, wherein: the probe sets interrogate one or more SNPs.
Description



COPYRIGHT STATEMENT

A portion of the disclosure of this patent document contains material that is subject to copyright protection. The copyright owner has no objection to the facsimile reproduction by anyone of the patent document or the patent disclosure as it appears in the Patent and Trademark Office patent file or records, but otherwise reserves all copyright rights whatsoever.

BACKGROUND

1. Field of the Invention

The present invention relates to the field of bioinformatics. In particular, the present invention relates to computer systems, methods, and products for the storage and presentation of data resulting from the analysis of microarrays of biological materials.

2. Related Art

Research in molecular biology, biochemistry, and many related health fields increasingly requires organization and analysis of complex data generated by new experimental techniques. The rapidly evolving field of bioinformatics addresses these tasks. See, e.g., H. Rashidi and K. Buehler, Bioinformatics Basics: Applications in Biological Science and Medicine (CRC Press, London, 2000); Bioinformatics: A Practical Guide to the Analysis of Genes and Proteins (B. F. Ouellette and A. D. Baxevanis, eds., Wiley & Sons, Inc.; 2d ed., 2001), both of which are hereby incorporated herein by reference in their entireties. Broadly, one area of bioinformatics applies computational techniques to large genomic databases, often distributed over and accessed through networks such as the Internet, for the purpose of illuminating relationships among gene structure and/or location, protein function, and metabolic processes.

The expanding use of microarray technology is one of the forces driving the development of bioinformatics. Spotted arrays, such as those made using the Affymetrix® 417™ or 427™ Arrayer from Affymetrix, Inc. of Santa Clara, Calif., are used to generate information about biological systems. Also, synthesized probe arrays, such as Affymetrix® GeneChip® arrays, have been widely used to generate unprecedented amounts of information about biological systems. For example, the GeneChip® Human Genome U133 Set (HG-U133A and HG-U133B) is made up of two microarrays containing over 1,000,000 unique oligonucleotide features covering more than 39,000 transcript variants that represent more than 33,000 human genes. Experimenters can quickly design follow-on experiments with respect to genes, EST's, or other biological materials of interest by, for example, producing in their own laboratories microscope slides containing dense arrays of probes using the Affymetrix® 417™ or 427™ Arrayer, or other spotting device.

Analysis of data from experiments with synthesized and/or spotted probe arrays may lead to the development of new drugs and new diagnostic tools. In some applications, this analysis begins with the capture of fluorescent signals indicating hybridization of labeled target samples with probes on synthesized or spotted probe arrays. The devices used to capture these signals often are referred to as scanners, an example of which is the Affymetrix® 428™ Scanner.

There is a great demand in the art for methods for organizing, accessing and analyzing the vast amount of information collected by scanning microarrays. Computer-based systems and methods have been developed to assist a user to obtain, analyze, and visualize the vast amounts of information generated by the scanners. These commercial and academic software applications typically provide such information as intensities of hybridization reactions or comparisons of hybridization reactions. This information may be displayed to a user in graphical form. In particular, data representing detected emissions conventionally are stored in a memory device of a computer for processing. The processed images may be presented to a user on a video monitor or other device, and/or operated upon by various data processing products or systems.

In particular, microarrays and associated instrumentation and computer systems have been developed for rapid and large-scale collection of data about the expression of genes or expressed sequence tags (EST's) in tissue samples. The data may be used, among other things, to study genetic characteristics and to detect mutations relevant to genetic and other diseases or conditions. More specifically, the data gained through microarray experiments is valuable to researchers because, among other reasons, many disease states can potentially be characterized by differences in the expression levels of various genes, either through changes in the copy number of the genetic DNA or through changes in levels of transcription (e.g., through control of initiation, provision of RNA precursors, or RNA processing) of particular genes. Thus, for example, researchers use microarrays to answer questions such as: Which genes are expressed in cells of a malignant tumor but not expressed in either healthy tissue or tissue treated according to a particular regime? Which genes or EST's are expressed in particular organs but not in others? Which genes or EST's are expressed in particular species but not in others? How do the environment, drugs, or other factors influence gene expression? Data collection is only an initial step, however, in answering these and other questions. Researchers are increasingly challenged to extract biologically meaningful information from the vast amounts of data generated by microarray technologies, and to design follow-on experiments. A need exists to provide researchers with improved tools and information to perform these tasks.

SUMMARY OF THE INVENTION

Systems, methods, and computer program products are described herein to address these and other needs. In accordance with one embodiment, a method is described that includes receiving first emission intensity data and second emission intensity data corresponding to probes of a probe array; determining first and second genotype calls for one or more probe sets, each having one or more probes, based, at least in part, on the first and second emission intensity data; comparing a first of the first genotype calls with a corresponding first of the second genotype calls and with a reference value; and displaying a measure of normalized change between the first and second genotype calls based, at least in part, on the comparison of first and second genotype calls and reference value. The emission intensity data may include a statistical measure of pixel values corresponding to the probes. The probe array may include a synthesized probe array or a spotted probe array. The genotype call may include a biallelic call, which may include combinations of two alleles. Also, the biallelic call may include a relative allele signal that includes a numerical value between a range, wherein calls near one extreme of the range correspond to one type of homozygous call, calls near the opposing extreme of the range correspond to a second type of homozygous call, and intermediate calls in an intermediate sub-range within the range correspond to a heterozygous call. The reference value may include a standard deviation value.
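
As an illustration only, the relative allele signal and the measure of normalized change described above might be sketched in Python as follows. The 0-to-1 signal range, the cutoffs of 0.25 and 0.75, the call labels, and the function names are assumptions made for this sketch and do not come from the specification. Dividing the absolute difference by a probe-set-specific standard deviation is one plausible reading of "normalized change", not a statement of the patented implementation.

```python
# Assumed cutoffs on an assumed 0..1 relative allele signal (RAS) range.
HOM_A_MIN = 0.75  # RAS at or above this is treated as one homozygous call (AA)
HOM_B_MAX = 0.25  # RAS at or below this is treated as the other homozygous call (BB)


def call_from_ras(ras: float) -> str:
    """Map a relative allele signal to a biallelic call: AA, AB, or BB."""
    if not 0.0 <= ras <= 1.0:
        raise ValueError("RAS must lie in the assumed range 0..1")
    if ras >= HOM_A_MIN:
        return "AA"
    if ras <= HOM_B_MAX:
        return "BB"
    return "AB"  # intermediate sub-range: heterozygous


def normalized_change(ras_first: float, ras_second: float, std_dev: float) -> float:
    """Absolute RAS difference scaled by a probe-set-specific standard deviation."""
    if std_dev <= 0:
        raise ValueError("standard deviation must be positive")
    return abs(ras_first - ras_second) / std_dev
```

Under these assumptions, a probe set whose signal moves from 0.75 in the first sample to 0.25 in the second changes call from AA to BB, and with a reference standard deviation of 0.5 the normalized change is 1.0.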

In this and other embodiments, the step of displaying a measure of normalized change may include displaying a graphical user interface, which may display information in text and/or graphical formats. In some implementations, the graphical user interface includes one or more associations of identification data with the measure of normalized change. The identification data may include probe set identifiers, one or more SNP locations, one or more genotype calls, one or more relative allele signals, or any combination thereof. The one or more SNP locations may include chromosome number and/or estimated genetic distance. For example, the estimated genetic distance may be a relative measure of a distance from a SNP location to the top of the short arm of a chromosome, such as may be expressed in centimorgans. The identification data may be displayed in a geometric association with the measure of normalized change, such as by columns or rows of graphical or textual elements. The identification data may also, or in the alternative, be displayed in a color, shade, or intensity association with the measure of normalized change.
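
The geometric and shade associations described above can be pictured with a small text-only sketch. The row fields, the column layout, the shade bins at 1.0 and 2.0, and the probe set identifiers used in any call are hypothetical; the specification does not prescribe a particular layout or binning.

```python
from typing import List, NamedTuple


class SnpRow(NamedTuple):
    probe_set_id: str          # hypothetical probe set identifier
    chromosome: int            # chromosome number of the SNP location
    distance_cm: float         # estimated genetic distance in centimorgans
    normalized_change: float   # measure of normalized change for the probe set


def shade(value: float) -> str:
    """Pick a coarse text 'shade' for a normalized change (assumed bins)."""
    if value >= 2.0:
        return "###"
    if value >= 1.0:
        return "##"
    return "#"


def format_report(rows: List[SnpRow]) -> str:
    """Lay out one row per probe set, ordered by chromosome and genetic distance."""
    lines = [f"{'Probe set':<12}{'Chr':>4}{'cM':>8}{'Change':>8}  Shade"]
    for row in sorted(rows, key=lambda r: (r.chromosome, r.distance_cm)):
        lines.append(
            f"{row.probe_set_id:<12}{row.chromosome:>4}{row.distance_cm:>8.1f}"
            f"{row.normalized_change:>8.2f}  {shade(row.normalized_change)}"
        )
    return "\n".join(lines)
```

Each row places the identification data in columns beside the normalized change, and the shade column stands in for the color, shade, or intensity association that a graphical interface would render.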

In accordance with a further embodiment, a method is described that includes receiving first emission intensity data and second emission intensity data corresponding to probes of a probe array, wherein the first and second emission intensity data include a statistical measure of pixel values corresponding to the probes; determining first and second genotype calls for one or more probe sets, each having one or more probes, based, at least in part, on the first and second emission intensity data; comparing a first of the first genotype calls with a corresponding first of the second genotype calls; and displaying a measure of normalized change between the first and second genotype calls. The measure of normalized change may be based, at least in part, on the comparison of first and second genotype calls and reference value.

A computer program product is described in accordance with another embodiment. The product includes an input manager that receives first emission intensity data and second emission intensity data corresponding to probes of a probe array; a genotype analysis determiner that determines first and second genotype calls for one or more probe sets, each having one or more probes based, at least in part, on the first and second emission intensity data; a genotype comparator that compares a first of the first genotype calls with a corresponding first of the second genotype calls and with a reference value; and an output manager that displays a measure of normalized change between the first and second genotype calls. The measure of normalized change may be based, at least in part, on the comparison of first and second genotype calls and reference value.

In accordance with yet another embodiment, a method is described that includes receiving one or more sets of emission intensity data corresponding to probes of a biological probe array; determining absolute or relative expression values based, at least in part, on a statistical measure of the emission intensity data and at least one user-selectable statistical parameter; and displaying the absolute or relative expression values based, at least in part, on at least one user-selectable display parameter.
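
A minimal sketch of how a user-selectable statistical parameter and a user-selectable display parameter might enter such a method is given below. The choice between mean and median, the ratio used for the relative value, and the `decimals` display setting are all assumptions for illustration; the specification does not name these particular options.

```python
import statistics
from typing import Callable, Dict, Sequence

# Hypothetical menu of statistics a user could select.
STATISTICS: Dict[str, Callable[[Sequence[float]], float]] = {
    "mean": statistics.mean,
    "median": statistics.median,
}


def absolute_expression(pixel_values: Sequence[float], statistic: str = "median") -> float:
    """Summarize the pixel intensities of a probe with the selected statistic."""
    if statistic not in STATISTICS:
        raise ValueError(f"unknown statistic: {statistic!r}")
    if not pixel_values:
        raise ValueError("no pixel values supplied")
    return float(STATISTICS[statistic](pixel_values))


def relative_expression(
    experiment: Sequence[float], baseline: Sequence[float], statistic: str = "median"
) -> float:
    """Ratio of experiment to baseline summaries (an assumed relative measure)."""
    base = absolute_expression(baseline, statistic)
    if base == 0:
        raise ValueError("baseline summary is zero")
    return absolute_expression(experiment, statistic) / base


def display_value(value: float, decimals: int = 2) -> str:
    """Render a value using a user-selected number of decimal places."""
    return f"{value:.{decimals}f}"
```

Keeping the statistic in a lookup table means a further option, such as a trimmed mean, could be offered to the user without changing the functions that call it.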

A computer program product is described in accordance with a further embodiment. The product includes an input manager that receives one or more sets of emission intensity data corresponding to probes of a biological probe array; a statistical analysis determiner that determines absolute or relative expression values based, at least in part, on a statistical measure of the emission intensity data and at least one user-selectable statistical parameter; and an output manager that displays the absolute or relative expression values based, at least in part, on at least one user-selectable display parameter. In accordance with yet a further embodiment, a computer program product includes an input manager that receives one or more sets of emission intensity data corresponding to probes of a biological probe array, and a genotype and statistical analysis manager. The manager determines absolute or relative expression values based, at least in part, on a statistical measure of the emission intensity data and at least one user-selectable statistical parameter, and is further constructed and arranged, when the one or more sets of emission intensity data include first and second sets of emission intensity data, to determine first and second genotype calls for one or more probe sets, each having one or more probes, based, at least in part, on the first and second sets of emission intensity data, and is yet further constructed and arranged to display the absolute or relative expression values based, at least in part, on at least one user-selectable display parameter and a measure of normalized change between the first and second genotype calls based, at least in part, on the comparison of first and second genotype calls and reference value.

In accordance with another embodiment, a system is described that includes a scanner constructed and arranged to provide emission intensity data corresponding to probes of a biological probe array. The system also has a computer constructed and arranged to execute a computer program product including an input manager that receives one or more sets of the emission intensity data. The computer program product also has a genotype and statistical analysis manager constructed and arranged to determine absolute or relative expression values based, at least in part, on a statistical measure of the emission intensity data and at least one user-selectable statistical parameter. The manager is further constructed and arranged, when the one or more sets of emission intensity data include first and second sets of emission intensity data, to determine first and second genotype calls for one or more probe sets, each having one or more probes based, at least in part, on the first and second sets of emission intensity data. The manager is yet further constructed and arranged to display the absolute or relative expression values based, at least in part, on at least one user-selectable display parameter and a measure of normalized change between the first and second genotype calls based, at least in part, on the comparison of first and second genotype calls and reference value.

The above implementations are not necessarily inclusive or exclusive of each other and may be combined in any manner that is non-conflicting and otherwise possible, whether they be presented in association with a same, or a different, aspect or implementation. The description of one implementation is not intended to be limiting with respect to other implementations. Also, any one or more function, step, operation, or technique described elsewhere in this specification may, in alternative implementations, be combined with any one or more function, step, operation, or technique described in the summary. Thus, the above implementations are illustrative rather than limiting.

BRIEF DESCRIPTION OF THE DRAWINGS

In the drawings, like reference numerals indicate like structures or method steps and the leftmost digit of a reference numeral indicates the number of the figure in which the referenced element first appears (for example, the element 120 appears first in FIG. 1). In functional block diagrams, rectangles generally indicate functional elements, parallelograms generally indicate data, and rectangles with a pair of double borders generally indicate predefined functional elements. These conventions, however, are intended to be typical or illustrative, rather than limiting.

FIG. 1 is a functional block diagram of one implementation of a laboratory information management system that is connected to a plurality of user computers via a network;

FIG. 2 is a functional block diagram of the laboratory information management system and user computers of FIG. 1 including illustrative embodiments of a scanner and hybridized probe arrays;

FIG. 3 is a functional block diagram of one implementation of a user computer system of FIGS. 1 and 2 including illustrative embodiments of probe-array analysis executables and display/output devices including graphical user interfaces;

FIG. 4 is a functional block diagram of the probe-array analysis executables of FIG. 3 including one implementation of a genotype and statistical analysis manager;

FIGS. 5A and 5B are graphical illustrations of particular implementations of the report data file of FIG. 4; and

FIG. 6 is a graphical illustration of a particular implementation of the analysis output data file of FIG. 4.

DETAILED DESCRIPTION

Systems, methods, and computer products are now described with reference to an illustrative embodiment referred to as genotype and statistical analysis manager 400. Manager 400 is shown in a computer system environment in FIG. 4 with examples of graphical user interface output presented in FIGS. 5A, 5B and 6. In a typical implementation, manager 400 may be used to provide a user with information related to results from experiments with probe arrays. More specifically, manager 400 determines absolute or relative expression values based, at least in part, on a statistical measure of the emission intensity data and at least one user-selectable statistical parameter. Also, when the one or more sets of emission intensity data include first and second sets of emission intensity data, manager 400 determines first and second genotype calls for one or more probe sets, each having one or more probes based, at least in part, on the first and second sets of emission intensity data. Further, manager 400 may display the absolute or relative expression values based, at least in part, on at least one user-selectable display parameter and a measure of normalized change between the first and second genotype calls based, at least in part, on the comparison of first and second genotype calls and a reference value. The experiments often involve the use of scanning equipment to detect hybridization of probe-target pairs, and the analysis of detected hybridization by various software applications. Illustrative systems and software applications suitable for implementation of the present invention are now described in relation to FIGS. 1 through 3.

FIG. 1 is a simplified schematic diagram of illustrative systems for generating, sharing, and processing data derived from experiments using probe arrays, such as illustrative hybridized spotted arrays 172A and hybridized synthesized arrays 172B (generally and collectively referred to as probe arrays 172). In this example, illustrative scanner systems 100A and 100B (generally and collectively referred to as scanner system 100) are used to scan probe arrays 172. A scanner system 100 typically may include a user computer (e.g., user computers 150A and 150B, generally and collectively referred to as user computer 150) and a scanner (e.g., scanners 170A and 170B, generally and collectively referred to as scanner 170). In this example, data may be communicated between user computer 150 and Laboratory Information Management System (LIMS) server 120 over network 125. LIMS server 120 and associated software generally provide data capturing, tracking, and analysis functions from a centralized infrastructure. Aspects of illustrative LIMS are described in U.S. patent applications Ser. Nos. 09/683,912 and 09/682,098; and in U.S. Provisional Patent Applications Nos. 60/220,587 and 60/273,231, all of which are hereby incorporated by reference herein for all purposes. LIMS server 120 and network 125 are optional, and the systems in other implementations may include a scanner for spotted arrays and not synthesized arrays, or vice versa. Also, rather than employing separate user computers 150A and 150B, a single computer may be used in other implementations. Further, user computer 150, or any functional components thereof, may also or alternatively be integral to scanner 170 in some implementations so that, for example, it is located within the same housing as the scanner.

More generally, a large variety of computer and/or network architectures and designs may be employed, and it will be understood by those of ordinary skill in the relevant art that many components of typical computer network systems are not shown in FIG. 1. Components of illustrative computers are described in greater detail below in relation to FIGS. 2 and 3.

Probe Arrays 172: Various techniques and technologies may be used for synthesizing dense arrays of biological materials on or in a substrate or support. For example, Affymetrix® GeneChip® arrays are synthesized in accordance with techniques sometimes referred to as VLSIPS™ (Very Large Scale Immobilized Polymer Synthesis) technologies. Some aspects of VLSIPS™ and other microarray manufacturing technologies are described in U.S. Pat. Nos. 5,424,186; 5,143,854; 5,445,934; 5,744,305; 5,831,070; 5,837,832; 6,022,963; 6,083,697; 6,291,183; 6,309,831; and 6,310,189, all of which are hereby incorporated by reference in their entireties for all purposes. The probes of these arrays in some implementations consist of nucleic acids that are synthesized by methods including the steps of activating regions of a substrate and then contacting the substrate with a selected monomer solution. As used herein, nucleic acids may include any polymer or oligomer of nucleosides or nucleotides (polynucleotides or oligonucleotides) that include pyrimidine and/or purine bases, preferably cytosine, thymine, and uracil, and adenine and guanine, respectively. Nucleic acids may include any deoxyribonucleotide, ribonucleotide, and/or peptide nucleic acid component, and/or any chemical variants thereof such as methylated, hydroxymethylated or glucosylated forms of these bases, and the like. The polymers or oligomers may be heterogeneous or homogeneous in composition, and may be isolated from naturally-occurring sources or may be artificially or synthetically produced. In addition, the nucleic acids may be DNA or RNA, or a mixture thereof, and may exist permanently or transitionally in single-stranded or double-stranded form, including homoduplex, heteroduplex, and hybrid states. Probes of other biological materials, such as peptides or polysaccharides as non-limiting examples, may also be formed. For more details regarding possible implementations, see U.S. Pat. No. 6,156,501, which is hereby incorporated by reference herein in its entirety for all purposes.

A system and method for efficiently synthesizing probe arrays using masks is described in U.S. patent application, Ser. No. 09/824,931; a system and method for a rapid and flexible microarray manufacturing and online ordering system is described in U.S. Provisional Patent Application, Ser. No. 60/265,103; and systems and methods for optical photolithography without masks are described in U.S. Pat. No. 6,271,957 and in U.S. patent application Ser. No. 09/683,374, all of which are hereby incorporated by reference herein in their entireties for all purposes.

The probes of synthesized probe arrays typically are used in conjunction with biological target molecules of interest, such as cells, proteins, genes or EST's, other DNA sequences, or other biological elements. More specifically, the biological molecule of interest may be a ligand, receptor, peptide, nucleic acid (oligonucleotide or polynucleotide of RNA or DNA), or any other of the biological molecules listed in U.S. Pat. No. 5,445,934 (incorporated by reference above) at column 5, line 66 to column 7, line 51. For example, if transcripts of genes are the interest of an experiment, the target molecules would be the transcripts. Other examples include protein fragments, small molecules, etc. Target nucleic acid refers to a nucleic acid (often derived from a biological sample) of interest. Frequently, a target molecule is detected using one or more probes. As used herein, a probe is a molecule for detecting a target molecule. A probe may be any of the molecules in the same classes as the target referred to above. As non-limiting examples, a probe may refer to a nucleic acid, such as an oligonucleotide, capable of binding to a target nucleic acid of complementary sequence through one or more types of chemical bonds, usually through complementary base pairing, usually through hydrogen bond formation. As noted above, a probe may include natural (i.e., A, G, U, C, or T) or modified bases (7-deazaguanosine, inosine, etc.). In addition, the bases in probes may be joined by a linkage other than a phosphodiester bond, so long as the bond does not interfere with hybridization. Thus, probes may be peptide nucleic acids in which the constituent bases are joined by peptide bonds rather than phosphodiester linkages. Other examples of probes include antibodies used to detect peptides or other molecules, and any ligand used to detect its binding partners.
When referring to targets or probes as nucleic acids, it should be understood that these are illustrative embodiments that are not to limit the invention in any way.
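As a purely illustrative aid, the idea of a nucleic acid probe binding a target "of complementary sequence" can be sketched in a few lines of Python. The sketch assumes DNA written with the letters A, C, G, and T and treats only a perfect match as binding. Actual hybridization depends on conditions and tolerates some mismatches, which this sketch does not model. The function names are hypothetical and do not come from the described system.

```python
# Hypothetical sketch: perfect-match complementarity only (an assumption);
# real hybridization is not an exact string match.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def probe_matches_target(probe: str, target: str) -> bool:
    """True if the probe is perfectly complementary to some stretch of the target."""
    return reverse_complement(probe) in target.upper()
```

For instance, under these assumptions a probe `GCAT` would pair with the stretch `ATGC` of a target strand, whereas a probe `GGGG` would find no partner in a target lacking `CCCC`.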

The samples or target molecules of interest (hereafter, simply targets) are processed so that, typically, they are spatially associated with certain probes in the probe array. For example, one or more tagged targets are distributed over the probe array. In accordance with some implementations, some targets hybridize with probes and remain at the probe locations, while non-hybridized targets are washed away. These hybridized targets, with their tags or labels, are thus spatially associated with the probes. The hybridized probe and target may sometimes be referred to as a probe-target pair. Detection of these pairs can serve a variety of purposes, such as to determine whether a target nucleic acid has a nucleotide sequence identical to or different from a specific reference sequence. See, for example, U.S. Pat. No. 5,837,832, referred to and incorporated above. Other uses include gene expression monitoring and evaluation (see, e.g., U.S. Pat. Nos. 5,800,992 and 6,040,138, and International Application No. PCT/US98/15151, published as WO99/05323), genotyping (U.S. Pat. No. 5,856,092), or other detection of nucleic acids, all of which are hereby incorporated by reference herein in their entireties for all purposes.

Other techniques exist for depositing probes on a substrate or support. For example, "spotted arrays" are commercially fabricated, typically on microscope slides. These arrays consist of liquid spots containing biological material of potentially varying compositions and concentrations. For instance, a spot in the array may include a few strands of short oligonucleotides in a water solution, or it may include a high concentration of long strands of complex proteins. The Affymetrix.RTM. 417.TM. Arrayer and 427.TM. Arrayer are devices that deposit densely packed arrays of biological materials on microscope slides in accordance with these techniques. Aspects of these, and other, spot arrayers are described in U.S. Pat. Nos. 6,040,193 and 6,136,269; in U.S. patent application Ser. No. 09/683,298, in U.S. Provisional Patent Application No. 60/288,403; and in PCT Application No. PCT/US99/00730 (International Publication Number WO 99/36760), all of which are hereby incorporated by reference in their entireties for all purposes. Other techniques for generating spotted arrays also exist. For example, U.S. Pat. No. 6,040,193 to Winkler, et al. is directed to processes for dispensing drops to generate spotted arrays. The '193 patent, and U.S. Pat. No. 5,885,837 to Winkler, also describe the use of micro-channels or micro-grooves on a substrate, or on a block placed on a substrate, to synthesize arrays of biological materials. These patents further describe separating reactive regions of a substrate from each other by inert regions and spotting on the reactive regions. The '193 and '837 patents are hereby incorporated by reference in their entireties. Another technique is based on ejecting jets of biological material to form a spotted array. Other implementations of the jetting technique may use devices such as syringes or piezo electric pumps to propel the biological material. 
It will be understood that the foregoing are non-limiting examples of techniques for synthesizing, depositing, or positioning biological material onto or within a substrate. For example, although a planar array surface is preferred in some implementations of the foregoing, a probe array may be fabricated on a surface of virtually any shape or even a multiplicity of surfaces. Arrays may comprise probes synthesized or deposited on beads, fibers such as fiber optics, glass, or any other appropriate substrate; see U.S. Pat. Nos. 6,361,947, 5,770,358, 5,789,162, 5,708,153 and 5,800,992, all of which are hereby incorporated in their entireties for all purposes. Arrays may be packaged in such a manner as to allow for diagnostics or other manipulation in an all-inclusive device; see, for example, U.S. Pat. Nos. 5,856,174 and 5,922,591, incorporated in their entireties by reference for all purposes.

To ensure proper interpretation of the term "probe" as used herein, it is noted that contradictory conventions exist in the relevant literature. The word "probe" is used in some contexts to refer not to the biological material that is synthesized on a substrate or deposited on a slide, as described above, but to what has been referred to herein as the "target." To avoid confusion, the term "probe" is used herein to refer to probes such as those synthesized according to the VLSIPS.TM. technology; the biological materials deposited so as to create spotted arrays; and materials synthesized, deposited, or positioned to form arrays according to other current or future technologies. Thus, microarrays formed in accordance with any of these technologies may be referred to generally and collectively hereafter for convenience as "probe arrays." Moreover, the term "probe" is not limited to probes immobilized in array format. Rather, the functions and methods described herein may also be employed with respect to other parallel assay devices. For example, these functions and methods may be applied with respect to probe-set identifiers that identify probes immobilized on or in beads, optical fibers, or other substrates or media.

Probes typically are able to detect the expression of corresponding genes or EST's by detecting the presence or abundance of mRNA transcripts present in the target. This detection may, in turn, be accomplished in some implementations by detecting labeled cRNA that is derived from cDNA derived from the mRNA in the target. In general, a group of probes, sometimes referred to as a probe set, contains sub-sequences in unique regions of the transcripts and does not correspond to a full gene sequence. Further details regarding the design and use of probes and probe sets are provided in U.S. Pat. No. 6,188,783; in PCT Application Serial No. PCT/US 01/02316, filed Jan. 24, 2001; and in U.S. patent applications Ser. Nos. 09/721,042, 09/718,295, 09/745,965, and 09/764,324, all of which are hereby incorporated herein by reference in their entireties for all purposes.

Probe Set Identifiers: Probe-set identifiers typically come to the attention of a user, represented by user 275 of FIGS. 2 and 3, as a result of experiments conducted on probe arrays. For example, user 275 may select probe-set identifiers that identify microarray probe sets capable of enabling detection of the expression of mRNA transcripts from corresponding genes or EST's of particular interest. As is well known in the relevant art, an EST is a fragment of a gene sequence that may not be fully characterized, whereas a gene sequence generally is complete and fully characterized. The word "gene" is used generally herein to refer both to full size genes of known sequence and to computationally predicted genes. In some implementations, the specific sequences detected by the arrays that represent these genes or EST's may be referred to as "sequence information fragments (SIF's)" and may be recorded in what may be referred to as a "SIF file." In particular implementations, a SIF is a portion of a consensus sequence that has been deemed to best represent the mRNA transcript from a given gene or EST. The consensus sequence may have been derived by comparing and clustering EST's, and possibly also by comparing the EST's to genomic sequence information. A SIF is a portion of the consensus sequence for which probes on the array are specifically designed. With respect to the operations of genotype and statistical analysis manager 400 of the particular implementation described herein, it is assumed with respect to some aspects that some microarray probe sets may be designed to detect the expression of genes based upon sequences of EST's.

As was described above, the term "probe set" refers in some implementations to one or more probes from an array of probes on a microarray. For example, in an Affymetrix.RTM. GeneChip.RTM. probe array, in which probes are synthesized on a substrate, a probe set may consist of 30 or 40 probes, half of which typically are controls. These probes collectively, or in various combinations of some or all of them, are deemed to be indicative of the expression of a gene or EST. In a spotted probe array, one or more spots may similarly constitute a "probe set."

The term "probe-set identifiers" is used broadly herein in that a number of types of such identifiers are possible and may be included within the meaning of this term in various implementations. One type of probe-set identifier is a name, number, or other symbol that is assigned for the purpose of identifying a probe set. This name, number, or symbol may be arbitrarily assigned to the probe set by, for example, the manufacturer of the probe array. A user may select this type of probe-set identifier by, for example, highlighting or typing the name. Another type of probe-set identifier as intended herein is a graphical representation of a probe set. For example, dots may be displayed on a scatter plot or other diagram wherein each dot represents a probe set, as described for example in U.S. Pat. No. 6,420,108, which is hereby incorporated herein in its entirety for all purposes. Typically, the dot's placement on the plot represents the intensity of the signal from hybridized, tagged, targets (as described in greater detail below) in one or more experiments. In these cases, a user may select a probe-set identifier by clicking on, drawing a loop around, or otherwise selecting one or more of the dots. In another example, user 275 may select a probe-set identifier by selecting a row or column in a table or spreadsheet that correlates probe sets with accession numbers and other genomic information.
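The graphical selection described above, in which a user draws a loop around dots on a scatter plot, can be illustrated with a standard point-in-polygon test. The following Python sketch is only one way such a selection might be computed and is not taken from the described software. It assumes that the loop is available as a list of (x, y) vertices and that each probe set has a single plotted coordinate. All names are hypothetical.

```python
# Hypothetical sketch: ray-casting point-in-polygon test used to decide
# which plotted probe sets fall inside a user-drawn loop.
def point_in_loop(x, y, loop):
    """True if (x, y) lies inside the closed polygon given by `loop` vertices."""
    inside = False
    n = len(loop)
    for i in range(n):
        x1, y1 = loop[i]
        x2, y2 = loop[(i + 1) % n]
        # Consider only edges that straddle the horizontal line through y.
        if (y1 > y) != (y2 > y):
            # x-coordinate where the edge crosses that horizontal line.
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside  # each crossing to the right flips the state
    return inside


def select_probe_sets(points, loop):
    """Return the sorted names of probe sets whose dots lie inside the loop."""
    return sorted(
        name for name, (x, y) in points.items() if point_in_loop(x, y, loop)
    )
```

With a square loop from (0, 0) to (10, 10), for example, a dot at (5, 5) would be selected and a dot at (15, 5) would not.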

Yet another type of probe-set identifier, as that term is used herein, includes a nucleotide or amino acid sequence. For example, it is illustratively assumed that a particular SIF is a unique sequence of 500 bases that is a portion of a consensus sequence or exemplar sequence gleaned from EST and/or genomic sequence information. It further is assumed that one or more probe sets are designed to represent the SIF. A user who specifies all or part of the 500-base sequence thus may be considered to have specified all or some of the corresponding probe sets.

As a further example with respect to a particular implementation, a user may specify a portion of the 500-base sequence noted above, which may be unique to that SIF, or, alternatively, may also identify another SIF, EST, cluster of EST's, consensus sequence, and/or gene or protein. The user thus specifies a probe-set identifier for one or more genes or EST's. In another variation, it is illustratively assumed that a particular SIF is a portion of a particular consensus sequence. It is further assumed that a user specifies a portion of the consensus sequence that is not included in the SIF but that is unique to the consensus sequence or the gene or EST's the consensus sequence is intended to represent. In that case, the sequence specified by the user is a probe-set identifier that identifies the probe set corresponding to the SIF, even though the user-specified sequence is not included in the SIF. Parallel cases are possible with respect to user specifications of partial sequences of EST's and genes or EST's, as those skilled in the relevant art will now appreciate.
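The sequence-based cases above can be summarized in a small sketch. It assumes, for illustration only, that each consensus sequence is stored together with the location of its SIF and the names of the probe sets designed for that SIF, and that a user-specified partial sequence is resolved by simple substring matching. The record layout, names, and toy sequences are hypothetical and far shorter than the 500-base example.

```python
# Hypothetical sketch: resolving a user-specified partial sequence to probe sets.
from dataclasses import dataclass


@dataclass(frozen=True)
class SifRecord:
    consensus: str     # full consensus sequence
    sif_start: int     # start index of the SIF within the consensus
    sif_end: int       # end index of the SIF (exclusive)
    probe_sets: tuple  # names of probe sets designed to represent the SIF

    @property
    def sif(self) -> str:
        return self.consensus[self.sif_start:self.sif_end]


def probe_sets_for_sequence(query, records):
    """Return probe sets of every record whose consensus contains the query.

    The query need not lie inside the SIF itself: a portion of the consensus
    outside the SIF still identifies the probe set corresponding to that SIF.
    A query shared by several consensus sequences identifies all of them.
    """
    hits = set()
    for rec in records:
        if query in rec.consensus:
            hits.update(rec.probe_sets)
    return sorted(hits)


records = [
    SifRecord("AAACCCGGGTTTACGT", 3, 9, ("PS_0001",)),
    SifRecord("TTTACGTACGTCCCAAA", 4, 10, ("PS_0002", "PS_0003")),
]
```

Here a query of `AAACC`, which lies outside the SIF of the first record, still resolves to `PS_0001`, while `TTTACGT` occurs in both consensus sequences and so resolves to all three probe sets.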

A further example of a probe-set identifier is an accession number of a gene or EST. Gene and EST accession numbers are publicly available. A probe set may therefore be identified by the accession number or numbers of one or more EST's and/or genes corresponding to the probe set. The correspondence between a probe set and EST's or genes may be maintained in a suitable database from which the correspondence may be provided to the user. Similarly, gene fragments or sequences other than EST's may be mapped (e.g., by reference to a suitable database) to corresponding genes or EST's for the purpose of using their publicly available accession numbers as probe-set identifiers. For example, a user may be interested in product or genomic information related to a particular SIF that is derived from EST-1 and EST-2. The user may be provided with the correspondence between that SIF (or part or all of the sequence of the SIF) and EST-1 or EST-2, or both. To obtain product or genomic data related to the SIF, or a partial sequence of it, the user may select the accession numbers of EST-1, EST-2, or both.
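The correspondence between probe sets and accession numbers described above is, in essence, a two-way lookup. The following Python sketch shows one way such a correspondence might be held in memory. It assumes a plain dictionary in place of the "suitable database", and the probe-set names and accession labels (EST-1, EST-2, GENE-9) are placeholders rather than real accession numbers.

```python
# Hypothetical sketch: inverted index from accession numbers to probe sets.
from collections import defaultdict


def build_accession_index(probe_set_to_accessions):
    """Invert a {probe set: [accessions]} mapping into {accession: {probe sets}}."""
    index = defaultdict(set)
    for probe_set, accessions in probe_set_to_accessions.items():
        for accession in accessions:
            index[accession].add(probe_set)
    return index


def probe_sets_for_accessions(index, accessions):
    """Return the sorted probe sets identified by any of the given accessions."""
    found = set()
    for accession in accessions:
        found |= index.get(accession, set())
    return sorted(found)


# Placeholder data: PS_A represents a SIF derived from both EST-1 and EST-2.
correspondence = {
    "PS_A": ["EST-1", "EST-2"],
    "PS_B": ["EST-2"],
    "PS_C": ["GENE-9"],
}
index = build_accession_index(correspondence)
```

Selecting `EST-1` alone would then identify `PS_A`, whereas selecting `EST-2` would identify both `PS_A` and `PS_B`.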

Additional examples of probe-set identifiers include one or more terms that may be associated with the annotation of one or more gene or EST sequences, where the gene or EST sequences may be associated with one or more probe sets. For convenience, such terms may hereafter be referred to as "annotation terms" and will be understood to potentially include, in various implementations, one or more words, graphical elements, characters, or other representational forms that provide information that typically is biologically relevant to or related to the gene or EST sequence. Associations between the probe-set identifier terms and gene or EST sequences may be stored in a database such as a local genomic database, or they may be transferred from one or more remote databases. Examples of such terms associated with annotations include those of molecular function (e.g. transcription initiation), cellular location (e.g. nuclear membrane), biological process (e.g. immune response), tissue type (e.g. kidney), or other annotation terms known to those in the relevant art.

LIMS Server 120: FIG. 2 shows in greater detail a typical configuration of a server computer, such as server 120 of FIG. 1, coupled to a workstation computer via a network. For convenience, the server computer is referred to herein as LIMS server 120, although this computer may carry out a variety of functions in addition to those described below with respect to LIMS and LIMS-SDK software applications. Moreover, in some implementations any function ascribed to LIMS server 120 may be carried out by one or more other computers, and/or the functions may be performed in parallel by a group of computers. Network 125 may include a local area network, a wide area network, the Internet, another network, any combination thereof, or another computer system and network configuration.

Typically, LIMS server 120 is a network-server class of computer designed for servicing a number of workstations or other computer platforms over a network. However, server 120 may be any of a variety of types of general-purpose computers such as a personal computer, workstation, main frame computer, or other computer platform now or later developed. Server 120 typically includes known components such as a processor 205, an operating system 210, a system memory 220, memory storage devices 225, and input-output controllers 230. It will be understood by those skilled in the relevant art that there are many possible configurations of the components of server 120 and that some components that may typically be included are not shown, such as cache memory, a data backup unit, and many other devices. Similarly, many hardware and associated software or firmware components that may be implemented in a network server are not shown in FIG. 2. For example, components to implement one or more firewalls to protect data and applications, uninterruptable power supplies, LAN switches, web-server routing software, and many other components are not shown. Those of ordinary skill in the art will readily appreciate how these and other conventional components may be implemented.

Processor 205 may include multiple processors; e.g., multiple Intel Xeon.RTM. 700 MHz processors. As further examples, processor 205 may include one or more of a variety of other commercially available processors such as Pentium.RTM. processors from Intel, SPARC.RTM. processors made by Sun Microsystems, or other processors that are or will become available. Processor 205 executes operating system 210, which may be, for example, a Windows.RTM.-type operating system (such as Windows.RTM. 2000 with SP 1 or Windows NT.RTM. 4.0 with SP6a) from the Microsoft Corporation; the Solaris operating system from Sun Microsystems; the Tru64 Unix from Compaq; other Unix.RTM. or Linux-type operating systems available from many vendors; another or a future operating system; or some combination thereof. Operating system 210 interfaces with firmware and hardware in a well-known manner, and facilitates processor 205 in coordinating and executing the functions of various computer programs that may be written in a variety of programming languages. Operating system 210, typically in cooperation with processor 205, coordinates and executes functions of the other components of server 120. Operating system 210 also provides scheduling, input-output control, file and data management, memory management, and communication control and related services, all in accordance with known techniques.

System memory 220 may be any of a variety of known or future memory storage devices. Examples include any commonly available random access memory (RAM), magnetic medium such as a resident hard disk or tape, an optical medium such as a read and write compact disc, or other memory storage device. Memory storage device 225 may be any of a variety of known or future devices, including a compact disk drive, a tape drive, a removable hard disk drive, or a diskette drive. Such types of memory storage device 225 typically read from, and/or write to, a program storage medium (not shown) such as, respectively, a compact disk, magnetic tape, removable hard disk, or floppy diskette. Any of these program storage media, or others now in use or that may later be developed, may be considered a computer program product. As will be appreciated, these program storage media typically store a computer software program and/or data. Computer software programs, also called computer control logic, typically are stored in system memory 220 and/or the program storage device used in conjunction with memory storage device 225.

In some embodiments, a computer program product is described comprising a computer usable medium having control logic (computer software program, including program code) stored therein. The control logic, when executed by processor 205, causes processor 205 to perform functions described herein. In other embodiments, some functions are implemented primarily in hardware using, for example, a hardware state machine. Implementation of the hardware state machine so as to perform the functions described herein will be apparent to those skilled in the relevant arts.

Input-output controllers 230 could include any of a variety of known devices for accepting and processing information from a user, whether a human or a machine, whether local or remote. Such devices include, for example, modem cards, network interface cards, sound cards, or other types of controllers for any of a variety of known input or output devices. In the illustrated embodiment, the functional elements of server 120 communicate with each other via system bus 204. Some of these communications may be accomplished in alternative embodiments using network or other types of remote communications.

As will be evident to those skilled in the relevant art, LIMS server application 280, as well as LIMS Objects 290 including LIMS servers 292 and LIMS API's 294 (described below), if implemented in software, may be loaded into system memory 220 and/or memory storage device 225 through one of input devices 202. LIMS server application 280 as loaded into system memory 220 is shown in FIG. 2 as LIMS server application executables 280A. Similarly, objects 290 are shown as LIMS server executables 292A and LIMS API object type libraries 294A after they have been loaded into system memory 220. All or portions of these loaded elements may also reside in a read-only memory or similar device of memory storage device 225, such devices not requiring that the elements first be loaded through input devices 202. It will be understood by those skilled in the relevant art that any of the loaded elements, or portions of them, may be loaded by processor 205 in a known manner into system memory 220, or cache memory (not shown), or both, as advantageous for execution.

LIMS Server Application 280: Details regarding the operations of illustrative implementations of application 280 are provided in U.S. patent applications Ser. No. 09/682,098 (hereby incorporated by reference herein in its entirety for all purposes) and No. 60/220,587, incorporated by reference above. It will be understood that the particular LIMS implementation described in this patent application is illustrative only, and that many other implementations may be used with LIMS objects 290 and other aspects of the present or alternative embodiments.

Application 280, and other software applications referred to herein, may be implemented using Microsoft Visual C++ or any of a variety of other programming languages. For example, applications may also be written in Java, C++, Visual Basic, any other high-level or low-level programming language, or any combination thereof.

As noted, certain implementations may be illustrated herein with respect to a particular, non-limiting, implementation of application 280, sometimes referred to as Affymetrix.RTM. LIMS. Full database functionality is intended to provide a data streaming solution and a single infrastructure to manage information from probe array experiments. Application 280 provides all the functionality of a database storage and retrieval system for accessing and manipulating all system data. A database server provides an automated and integrated data management environment for the end user. All process data, raw data and derived data are stored as elements of the database, providing an alternative to a file-based storage mechanism. A database back end also provides integration of application 280 into a customer's overall information system infrastructure. Data is accessible through standard interfaces and can be tracked, queried, archived, exported, imported and administered.

Application 280 of the illustrated implementation supports process tracking for a genetic assay, adds enhanced administration functionality for managing GeneChip, spotted array, and AADM data (GeneChip data that has been published to the Affymetrix.RTM. Analysis Data Model standard), provides a full Oracle.RTM. database management software or SQL Server solution, supports publishing of genotype and sequence data, and provides a high level of security for the LIMS system. Aspects of illustrative publishing operations are described in U.S. Pat. No. 6,804,679, which is hereby incorporated herein in its entirety for all purposes.

Application 280 of the illustrated example provides the following functionality:

The Generic Assay, supported by process tracking from enhancements to data management. The processes include but are not limited to the following: sample definition, experiment setup, hybridization, scanning, grid alignment, cell intensity analysis, probe array analysis, and publishing. The generic assay supports multiple experiments per sample definition via a re-queuing process, multiple hybridization and scan operations for a single experiment, data re-analysis, and publishing to more than one database.

The Process Database, either an Oracle or SQL Server DBMS (database management system) solution, fully supported by enhancements to CasoAffy (COM communication layer to the process database).

The GeneInfo Database, where enhancements provide additional support for storing chromosome and probe sequence information about the biological item on the probe array.

The AADM Database, a database that stores the published GeneChip data, where enhancements provide full support for either an Oracle or SQL Server DBMS. Additional tables in AADM provide support for genotype data, and modifications to the publishing components include data load performance improvements as well as bi-directional communication with GeneChip during publishing operations.

The Security Database, a LIMS security database that provides a role-based security level integrated with the Windows NT.RTM. user authentication security. The security database supports role definition, functional access within a role, and assigning NT groups and users to those roles. A role is a collection of users who have a common set of access rights to GeneChip data. Roles are defined per server/database, and a role member can be a member of multiple roles, where the software determines a user's access rights. A function is a predetermined action that is common to all roles. Each role is defined by the functions it can and cannot perform. Functions explicitly describe the type of action that a member of the role can perform. The functions supported by a newly created role include but are not limited to the following: read process data, delete process

