Title: Disk mirror architecture for database appliance

Abstract: A disk is segmented into a first data segment and a secondary data segment. The secondary data segment stores a logical mirror of the first data segment of another disk. Fast access to data stored on the disk is provided by partitioning the disk such that the first data segment includes the fast tracks of the disk and the secondary data segment includes the slow tracks of the disk and forwarding all data requests to the first data segment. Upon detecting a failure, the logical mirror of data stored in the first data segment of the failed disk is accessible from the secondary data segment of a non-failed disk. The first data segment can be rebuilt quickly on another disk from the logical mirror stored in the secondary data segment.

Patent Number: 7,089,448 Issued on 08/08/2006 to Hinshaw, et al.


Inventors: Hinshaw; Foster D. (Somerville, MA), Femia; Vincent F. (Northboro, MA), Harris; Craig S. (Acton, MA), Metzger; John K. (Westborough, MA), Meyers; David L. (Shrewsbury, MA), Zane; Barry M. (Wayland, MA)
Assignee: Netezza Corporation (Framingham, MA)
Appl. No.: 10/667,127
Filed: September 18, 2003


Current U.S. Class: 714/6; 711/114; 714/42; 714/5; 714/7; 714/8
Current International Class: G06F 11/16 (20060101)
Field of Search: 714/5,6,7,8,42 711/114


References Cited

U.S. Patent Documents
4077059 February 1978 Cordi et al.
4558413 December 1985 Schmidt et al.
4631673 December 1986 Haas et al.
4635189 January 1987 Kendall
4646229 February 1987 Boyle
4648036 March 1987 Gallant
4714992 December 1987 Gladney et al.
4853843 August 1989 Ecklund et al.
4875159 October 1989 Cary et al.
5544347 August 1996 Yanai et al.
5737601 April 1998 Jain et al.
5764903 June 1998 Yu
5828820 October 1998 Onishi et al.
6023584 February 2000 Barton et al.
6167531 December 2000 Sliwinski
6298425 October 2001 Whitaker et al.
6389459 May 2002 McDowell
6405284 June 2002 Bridge
6530035 March 2003 Bridge
6606694 August 2003 Carteau
6654862 November 2003 Morris
6681290 January 2004 Brower et al.
6694406 February 2004 Kodama et al.
6792486 September 2004 Hanan et al.
6801914 October 2004 Barga et al.
6801921 October 2004 Tsuchida et al.
2001/0051955 December 2001 Wong
2002/0144068 October 2002 Ohran
2002/0156971 October 2002 Jones et al.
Primary Examiner: Beausoliel; Robert
Assistant Examiner: Manoskey; Joseph D
Attorney, Agent or Firm: Hamilton, Brook, Smith & Reynolds, P.C.

Parent Case Text



RELATED APPLICATION(S)

This application claims the benefit of U.S. Provisional Application No. 60/411,743, filed on Sep. 18, 2002. The entire teachings of the above application are incorporated herein.
Claims



What is claimed is:

1. A disk mirroring apparatus comprising two or more processing assemblies, each processing assembly further comprising: a processing unit, further comprising one or more general purpose processors, a memory, one or more disk controllers and a network interface; and a disk coupled to the processing unit, the storage of the disk being logically divided into at least two data segments, wherein a first data segment of a first disk in a first processing assembly is mirrored by a secondary data segment of a second disk in a second processing assembly; and wherein the disk mirror apparatus further comprises: a spare processing assembly, activated by a system manager upon detecting failure of the disk or the processing unit of one of the plurality of processing assemblies; and wherein the system manager creates the spare processing assembly by redistributing data stored on the disk of a first processing assembly among the disks of a subset of the plurality of processing assemblies, not including the first processing assembly, after which the first processing assembly can serve as the spare processing assembly.

2. The disk mirroring apparatus of claim 1 in which the processing assemblies further comprise additional disks coupled to the processing unit of the processing assembly, each disk logically divided into at least two segments, wherein a first data segment of each of the plurality of additional disks in a first processing assembly is mirrored by a secondary data segment of one of the plurality of additional disks in a second processing assembly.

3. The disk mirroring apparatus of claim 1 in which the disk is logically divided into at least three data segments, wherein data in a third data segment is not mirrored.

4. The disk mirroring apparatus of claim 3 in which a choice can be made about whether a new data item is to be mirrored or not mirrored by specifying whether the new data item is to be stored in a first data segment or a third data segment.

5. The disk mirroring apparatus of claim 1 further comprising: a plurality of host computers that request modifications to data stored on the disk of a processing assembly by communicating with the processing unit of the processing assembly via its network interface.

6. The disk mirroring apparatus of claim 5 wherein the modifications requested by one of the host computers to data on the disk of a first processing assembly are also automatically performed on the disk of a second processing assembly, without intervention from the host computer.

7. The disk mirroring apparatus of claim 6 wherein the processing unit of the first processing assembly stores the data requested by the host computer in the first data segment of its disk and forwards the data to the processing unit of the second processing assembly to mirror the data on the secondary data segment of its disk.

8. The disk mirroring apparatus of claim 7 wherein the processing unit of the second processing assembly receives the data that was forwarded by the processing unit of the first processing assembly, and writes that data to the secondary data segment of its disk.

9. The disk mirroring apparatus of claim 7, wherein the processing unit of the second processing assembly may defer writing data to the secondary data segment of its disk until it receives a commit command.

10. The disk mirroring apparatus of claim 1, wherein the processing unit further comprises: a mirror manager that manages mirroring of a block of data in the first segment of its disk into a secondary data segment of a disk of a second processing assembly.

11. The disk mirroring apparatus of claim 10, wherein the mirror manager manages mirroring of one or more database records in the first data segment of its disk into the secondary data segment of the disk of the second processing assembly.

12. The disk mirroring apparatus of claim 10 wherein the mirror manager operates autonomously from any host computer.

13. The disk mirroring apparatus of claim 1, wherein a processing unit further comprises: a storage manager, wherein the storage manager assigns logical blocks which map to disk tracks having the fastest data transfer time to the first data segment of its disk and also assigns logical blocks which map to disk tracks having a slower data transfer time to the secondary data segment of its disk.

14. The disk mirroring apparatus of claim 13 wherein the assignment of blocks to segments is made once by the storage manager when the processing assembly is first made available for use.

15. The disk mirroring apparatus of claim 13 wherein the assignment of blocks to segments made by the storage manager may occur dynamically in response to data storage requests, such that blocks are allocated in the first data segment in response to a request to store data in the first data segment and blocks are allocated in the secondary data segment in response to a request to store a mirror copy of the data in the secondary data segment.

16. The disk mirroring apparatus of claim 1, wherein the secondary data segment of the disk of the second processing assembly is a logical mirror of the first data segment of the disk of the first processing assembly.

17. The disk mirroring apparatus of claim 1 further comprising: a system manager, wherein the system manager controls a distribution map so as to evenly distribute data between the disks.

18. The disk mirroring apparatus of claim 17 wherein the system manager runs on a plurality of host computers.

19. The disk mirroring apparatus of claim 1 wherein a first segment of the disk of the second processing assembly is mirrored in a secondary data segment of the disk of the first processing assembly.

20. The disk mirroring apparatus of claim 1 wherein a first segment of the disk of the second processing assembly is mirrored in a secondary data segment of a disk of a third processing assembly.

21. The disk mirroring apparatus of claim 1 wherein the first data segment of the disk of the first processing assembly is mirrored in secondary data segments of the disks of two or more other processing assemblies.

22. The disk mirroring apparatus of claim 1 wherein a second processing unit transmits to the processing unit of the spare processing assembly data stored in a secondary data segment of the disk coupled to the second processing unit, this secondary data segment having been a mirror of a first data segment of the disk of the failed processing assembly; and the processing unit of the spare processing assembly stores the data it receives from the second processing unit in the first data segment of the disk of the spare processing assembly.

23. The disk mirroring apparatus of claim 1 wherein a processing unit of a first processing assembly transmits to the processing unit of the spare processing assembly the data stored in a first data segment of the disk coupled to the first processing unit, this first data segment having been mirrored by the secondary data segment of the disk of the failed processing assembly; and the processing unit of the spare processing assembly stores the data it receives from the first processing unit in the secondary data segment of the disk of the spare processing assembly.

24. The disk mirroring apparatus of claim 1 wherein the step of redistributing data from the first disk further comprises reassigning blocks in a distribution map.

25. The disk mirroring apparatus of claim 1 wherein the plurality of processing assemblies is subdivided into at least two sets.

26. The disk mirroring apparatus of claim 25 wherein first segments of disks in a first set of processing assemblies are mirrored in secondary data segments of disks in a second set of processing assemblies, and wherein first data segments of the disks in the second set of processing assemblies are mirrored in secondary data segments of the disks in the first set of processing assemblies.

27. The disk mirroring apparatus of claim 26 wherein the probability of a double failure of both a processing assembly in the first set and a processing assembly in the second set within a given period of time is less than the probability of a double failure of two processing assemblies in the first set or the probability of a double failure of two processing assemblies in the second set within the same period of time.

28. The disk mirroring apparatus of claim 26 wherein the processing assemblies in the different sets are powered separately.

29. The disk mirroring apparatus of claim 26 wherein each set of processing assemblies is served by a separate network switch to which the network interfaces of the processing units of the processing assemblies in that set are coupled.

30. A method for disk mirroring in a system of multiple disks coupled to multiple processing units, said method comprising: writing, by a first processing unit, first data to a first segment of a first disk coupled to the first processing unit; forwarding, by the first processing unit, the first data to a second processing unit; writing, by the second processing unit, the first data to a secondary segment of a second disk coupled to the second processing unit; rebuilding, in case of a disk failure, data on a spare disk; and creating a spare disk by redistributing data stored on the first disk among a subset of the plurality of disks.

31. A method for disk mirroring of claim 30 wherein the step of creating a spare disk further comprises reassigning blocks in a distribution map.

32. A method for disk mirroring in a system of multiple disks coupled to multiple processing units, said method comprising: writing, by a first processing unit, first data to a first segment of a first disk coupled to the first processing unit; forwarding, by the first processing unit, the first data to a second processing unit; writing, by the second processing unit, the first data to a secondary segment of a second disk coupled to the second processing unit; and modifying, by a system manager, a mirroring topology in case of a topology-affecting event.

33. A method for disk mirroring of claim 32 further comprising: writing, by the second processing unit, second data to a first segment of the disk coupled to the second processing unit; forwarding, by the second processing unit, the second data to an other processing unit; and writing, by the other processing unit, the second data to a secondary segment of a disk coupled to the other processing unit.

34. A method for disk mirroring of claim 32 wherein mirroring is performed by the multiple processing units under direction of a mirror manager.

35. A method for disk mirroring of claim 32 further comprising: assigning, by a storage manager, logical blocks which map to disk tracks having the fastest data transfer time to the first segment of the first disk; and assigning, by the storage manager, logical blocks which map to disk tracks having a slower data transfer time to the secondary segment of the first disk.

36. A method for disk mirroring of claim 32 wherein the secondary segment of the second disk is a logical mirror of the first segment of the first disk.

37. A method for disk mirroring of claim 32 further comprising: distributing, by a system manager, data between disks.

38. A method for disk mirroring of claim 37 wherein the step of distributing data comprises reassigning blocks in a distribution map.

39. A method for disk mirroring of claim 37 wherein the step of distributing data between disks is performed in case of a fail-over.

40. A method for disk mirroring of claim 32 further comprising: forwarding, by the first processing unit, the first data to a third processing unit; and writing, by the third processing unit, the first data to a secondary segment of a disk coupled to the third processing unit.

41. A method for disk mirroring of claim 32 further comprising: rebuilding, in case of a processing unit failure, data on a disk associated with the failed processing unit using a spare processing unit.

42. A method for disk mirroring of claim 41 wherein the step of rebuilding further comprises: rebuilding data stored on a first segment of the disk coupled to the failed processing unit, using a secondary data segment corresponding to the first data segment of the disk coupled to the failed processing unit.

43. A method for disk mirroring of claim 41 wherein the step of rebuilding further comprises rebuilding data stored on a secondary segment of the disk coupled to the failed processing unit using a primary data segment corresponding to the secondary data segment of the disk coupled to the failed processing unit.

44. A method for disk mirroring of claim 32 further comprising: rebuilding, in case of a disk failure, data on a spare disk.

45. A method for disk mirroring of claim 44 wherein the step of rebuilding further comprises: rebuilding data on a first segment of the spare disk using a secondary data segment corresponding to the first data segment of the failed disk.

46. A method for disk mirroring of claim 44 wherein the step of rebuilding further comprises: rebuilding data on a secondary segment of the spare disk using a first data segment corresponding to the secondary data segment of the failed disk.

47. A method for disk mirroring of claim 32 further comprising: subdividing the plurality of disks into at least two sets of disks, wherein first segments of disks in a first set of disks are mirrored in secondary segments of disks in a second set of disks, and wherein first segments of the disks in the second set of disks are mirrored in secondary segments of the disks in the first set of disks.

48. A method for disk mirroring of claim 32 further comprising: rebuilding, in case of a sector failure on the first segment of the first disk, the failed sector using a corresponding sector on the secondary segment of the second disk.

49. A method for disk mirroring of claim 32 further comprising: rebuilding, in case of a sector failure on the secondary sector of the second disk, the failed sector using a corresponding sector on the first segment of the first disk.

50. A method for disk mirroring of claim 32 wherein the topology-affecting event is at least one of: a disk failure and a processing unit failure.

51. A method for disk mirroring of claim 32 wherein the topology-affecting event is an addition of a new disk to the plurality of disks.

52. A method for disk mirroring of claim 32 wherein the topology-affecting event is a removal of one disk from the plurality of disks.

53. A method for disk mirroring of claim 32 wherein the topology-affecting event is partitioning of the plurality of disks into two or more sets of disks.

54. A method for fail-over processing in a system of multiple disks coupled to multiple processing units, where each of the multiple disks is logically divided into two or more segments, said method comprising: maintaining, by a first and a second processing unit, a mirror of a first segment of a first disk in a secondary segment of a second disk; swapping, in case of a failure of the first disk or the first processing unit, data in a distribution map pointing to the first segment of the first disk and the secondary segment of the second disk; responding, by the second processing unit, to commands directed to data stored on the first segment of the first disk, using data stored in the secondary segment of the second disk; broadcasting a command directed to data stored on first segments of the multiple disks to the multiple processing units; and responding twice, by the second processing unit, to the broadcasted command, one response involving the data stored on a first segment of the second disk and another response involving the data stored on the secondary segment of the second disk.

55. A method for regenerating a disk mirror in case of a disk failure in a system of multiple disks coupled to multiple processing units, said method comprising: forwarding, by a first processing unit coupled to a disk containing a mirror of a first segment of the failed disk, data from a secondary segment of the coupled disk to a spare processing unit coupled to a spare disk; writing, by the spare processing unit coupled to the spare disk, data received from the first processing unit to a first segment of the spare disk; forwarding, by a second processing unit coupled to a disk containing a first segment mirrored by a secondary segment of the failed disk, data from the first segment of the coupled disk to the spare processing unit coupled to the spare disk; writing, by the spare processing unit coupled to the spare disk, data received from the second processing unit to a secondary segment of the spare disk; and generating a spare disk by redistributing data from a third disk to a subset of the multiple of disks, after which the third disk can serve as the spare disk.

56. A method for regenerating a disk mirror of claim 55 wherein the step of redistributing data from the first disk further comprises reassigning blocks in a distribution map.
Description



BACKGROUND OF THE INVENTION

A Redundant Array of Inexpensive Disks (RAID) provides highly available data by distributing data amongst a plurality of disks using a method defined by one of a plurality of RAID levels. In a system implementing RAID level 1, each disk in the system has an associated mirror disk, all data written to the primary disk being replicated on the mirror disk.

SUMMARY OF THE INVENTION

RAID systems typically have a single hardware controller controlling which data goes to which disks. The topology of the disk mirroring is pre-set and, in case of a fault, a replacement disk needs to be connected in order to provide the same level of fault-tolerance. In case of a RAID controller failure, the whole system may be inaccessible while the controller is being replaced. There is a need for a system that adaptively modifies the mirroring topology and can recover from faults autonomously and transparently to a host computer.

RAID systems typically configure the mirroring architecture without considering the arrangement of the data on the drive. Data is read from and written to the disk at a constant bit rate. However, the tracks of a disk are not all the same size. Inner tracks can be a few inches smaller than outer tracks. If one rotation of the disk takes N seconds, more data is read from or written to the longer track per second than from or to the shorter track. The data transfer time therefore depends on the physical location of the data on the disk. Placing the mirror segments on the shorter tracks, according to one embodiment of the present invention, may thus increase the general speed of the system.
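The track-length argument can be made concrete with a small Python sketch. It assumes, purely for illustration, a constant linear bit density along every track and a fixed rotation period. The radii, density, and rotation speed are hypothetical values and are not taken from this specification.

```python
import math

def track_transfer_rate(radius_mm, bits_per_mm, rotation_s):
    """Bits per second delivered by one track.

    Assumes constant linear bit density (an illustrative assumption).
    """
    circumference_mm = 2 * math.pi * radius_mm
    bits_per_track = circumference_mm * bits_per_mm
    # One full track passes under the head in one rotation.
    return bits_per_track / rotation_s

# Hypothetical figures, chosen only for illustration.
ROTATION_S = 60 / 7200   # rotation period of a 7200 rpm disk
BITS_PER_MM = 20_000

outer = track_transfer_rate(45.0, BITS_PER_MM, ROTATION_S)
inner = track_transfer_rate(20.0, BITS_PER_MM, ROTATION_S)
ratio = outer / inner
```

Under this simplified model the ratio of transfer rates equals the ratio of the track radii, so data held on the longer outer track streams faster than data held on the shorter inner track.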

A method and apparatus are provided for mirroring data. The disk mirror apparatus includes a plurality of processing assemblies, each consisting of one or more disks and a processing unit. Each disk has at least two data segments, a first data segment and one or more secondary data segments, and may have one or more system segments. Each processing unit is coupled to one or more of the plurality of disks. A processing unit may consist of a general-purpose processor, a memory, a network interface and a disk controller.

A processing unit that receives a request to write data to a first disk writes the data to the first data segment of the first disk and forwards the data to another processing unit. The data set on the first disk is referred to as the primary data segment or the primary data slice. The other processing unit writes a copy of the data to a secondary data segment of a second disk coupled to the other processing unit. The data set copied to the second disk is referred to as the mirror data set or the mirror data slice. The disk storing the primary data set in its first segment is referred to as the primary disk for that set, and the disk storing the mirror data set in its secondary segment is referred to as the mirror disk for that set. The secondary data segment corresponding to the first data segment is a logical mirror of the first data segment.

The first data segment includes fast tracks of the disk and the secondary data segment includes slow tracks of the disk. The fast tracks may be the outer tracks of the disk and the slow tracks may be the inner tracks of the disk.

When data is written or updated on a given primary data slice, the first processing unit may forward the data to the other processing unit to be stored in the secondary data segment. The disks need not mutually mirror each other. That is, there may be multiple disks and a first segment of a first disk may be mirrored on a second disk, while a first segment of a second disk may be mirrored on a third disk. Multiple topological mirroring arrangements may be employed, taking into account reliability and space availability concerns. One or more primary segments may be mirrored in more than one secondary segment to provide further reliability.

The disk mirror apparatus can also include a database manager that issues the request to write data to the first disk. The database manager may run on a host computer. The data forwarded to the other processing unit to be stored in the secondary data segment may be an update data set consisting of at least one complete database record.

Some data may not need to be mirrored. Data may be excluded from mirroring by not forwarding it to the secondary processing unit. In an alternative embodiment of the invention, all data stored in the primary segment may be mirrored. In this embodiment, data not to be mirrored may be stored outside the primary segment.

The disk mirror apparatus may also include one or more spare processing units and/or disks activated when failure of one of the plurality of processing units or disks is detected. A spare processing unit may rebuild the data stored on a failed disk, or on the disks coupled to a failed processing unit, using the secondary data segment corresponding to the first data segment of the failed disk and the first data segment corresponding to the secondary data segment of the failed disk. The processing unit coupled to the disk containing the secondary data segment corresponding to the first data segment of the failed disk reads data stored in the secondary data segment and forwards the secondary data segment data to the spare processing unit. The processing unit coupled to the disk containing the first data segment corresponding to the secondary data segment of the failed processing unit reads data stored in the first data segment and forwards the first data segment data to the spare processing unit.

The disk mirror apparatus may also include a system manager, which may create a spare processing unit by redistributing data stored on a processing unit that is actively in use, among a subset of the plurality of other processing units. The system manager may redistribute the data by reassigning blocks in a distribution map. The system manager may also include a background task that performs the redistribution of the data. The system manager may also monitor all processing units and redistribute data among the disks coupled to the processing units upon detecting a change in network topology.
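The redistribution step can be illustrated with a minimal sketch. It assumes the distribution map is a simple block-to-disk dictionary and that blocks are moved to the least-loaded remaining disk; the function name `free_disk` and the disk labels are hypothetical and do not come from the specification.

```python
from collections import Counter


def free_disk(distribution_map, disk_to_free, targets):
    """Return a new block -> disk map with no blocks left on disk_to_free.

    Each block that lived on disk_to_free is reassigned to the currently
    least-loaded target disk; all other assignments are left unchanged.
    """
    if disk_to_free in targets:
        raise ValueError("the disk being freed cannot also be a target")
    # Count how many blocks each target disk already holds.
    load = Counter({t: 0 for t in targets})
    for disk in distribution_map.values():
        if disk in load:
            load[disk] += 1
    new_map = dict(distribution_map)
    for block in sorted(distribution_map):
        if distribution_map[block] == disk_to_free:
            target = min(targets, key=lambda t: (load[t], t))
            new_map[block] = target
            load[target] += 1
    return new_map


# Six blocks striped over three disks; disk "d2" is to become the spare.
dmap = {0: "d0", 1: "d1", 2: "d2", 3: "d0", 4: "d1", 5: "d2"}
new_dmap = free_disk(dmap, "d2", ["d0", "d1"])
```

Only the map entries change in this sketch; in a real system the block data would also have to be copied, which the text suggests may be done by a background task.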

Each first data segment may have a plurality of corresponding secondary data segments, each of the secondary data segments being on different disks. The system manager switches access to one of the secondary data segments upon detecting failure of the disk storing the first data segment. A mirror manager in each of the plurality of processing units shares the rebuilding of the failed partitions.

The system manager may dynamically alter the topology of the mirroring configuration. An alteration may be needed after an occurrence of a topology-affecting event. Such topology-affecting events may be failures of a disk or a processing unit, an introduction of a new disk or a processing unit into the system, or a removal of one or more components. In addition, the mirroring topology may be altered if the disks are subdivided into two or more sets, with disks in one set mirroring the disks in the other set.

The apparatus may also include a distribution map that stores a striping scheme for a table. The striping scheme may distribute the table across a plurality of disks or may stripe the table on one of the plurality of disks.

BRIEF DESCRIPTION OF THE DRAWINGS

The foregoing and other objects, features and advantages of the invention will be apparent from the following more particular description of preferred embodiments of the invention, as illustrated in the accompanying drawings in which like reference characters refer to the same parts throughout the different views. The drawings are not necessarily to scale, emphasis instead being placed upon illustrating the principles of the invention.

FIG. 1 is a block diagram of a database appliance according to the principles of the present invention;

FIG. 2 is a block diagram of one of the Snippet Processing Units shown in FIG. 1;

FIG. 3 is a block diagram illustrating data stored in the first and secondary data segments;

FIG. 4 is a block diagram of software modules included in the host for managing the SPUs in the data appliance shown in FIG. 1;

FIG. 5 is a block diagram of software modules included in each of the SPUs for managing mirroring of data in the data appliance shown in FIG. 1;

FIG. 6 is a block diagram illustrating two racks of SPUs for mirroring data in the database appliance shown in FIG. 1;

FIG. 7 is a block diagram of another partitioning scheme for assigning first and secondary data segments on the disks shown in FIG. 6;

FIG. 8 is a block diagram of the SPA table and an SPU table in the system manager shown in FIG. 4;

FIG. 9 illustrates a packet of database requests sent by the mirror manager in the primary SPU to the secondary SPU;

FIG. 10 is a flow diagram illustrating the method for mirroring data implemented in any one of the SPUs shown in FIG. 7;

FIG. 11 is a block diagram illustrating access to the database;

FIG. 12 illustrates failover in the database appliance upon detection of the failure of one of the SPUs;

FIG. 13 illustrates regeneration of the first data segment of the failed SPU on a spare SPU;

FIG. 14 illustrates regeneration of the secondary data segment of the failed SPU on a spare SPU; and

FIG. 15 is a flow diagram illustrating the method for creating a new spare SPU.

DETAILED DESCRIPTION OF THE INVENTION

A description of preferred embodiments of the invention follows.

FIG. 1 is a block diagram of a database appliance 100 according to the principles of the present invention. The database appliance 100 includes a host 106 for processing database requests received from a client 102 and a plurality of disk drives 108-1, . . . , 108-n storing the database. Each of the plurality of disk drives 108-1, . . . 108-n is coupled to a respective controller 122-1, . . . 122-n. Each Snippet Processing Unit (SPU) 110-1, . . . 110-n forms a processing assembly that includes a respective controller 122-1, . . . 122-n and at least one disk drive. In the embodiment shown, controllers 122-2, 122-3 and 122-n are each coupled to one disk drive and controller 122-1 is coupled to two disk drives. Each SPU is coupled to the host 106 through a data communication network 112. The SPU performs the primitive functions of a query to the database, controlling all aspects of reading from and writing to a disk.

The host 106 manages descriptions of tables for the database stored in the plurality of disk drives. Routines for managing and accessing records stored in the database are available to the host 114 and portions of the database can be copied from the disk drives and stored in host memory. The host receives database queries from the client 102 transmitted over a network 112. A network interface component 116 in the host receives the database queries. The network interface component 116 may be a network interface card, switch or router, Fibre Channel transceiver, InfiniBand-enabled device, or other device programmed to transmit and receive messages according to standardized data network protocols. A central processing unit (CPU) 120 in the host processes a received database query by forwarding pieces of the query through the network interface component 116 over the data communications network 112 to the SPU storing the requested record. The piece of the query forwarded to the SPU for processing is referred to as a "snippet". The snippet can include a set of database operations such as join, sort, aggregate, restrict, project, expression evaluation, statistical analysis or other operations. Database requests can be processed more efficiently by off-loading some of the processing from the host to the SPU.

FIG. 2 is a block diagram of one of the Snippet Processing Units 110-1 shown in FIG. 1. The controller 122-1 includes memory 206, a central processing unit 202, a network controller 204 coupled to the data communication network and an IDE controller 208 coupled to a disk drive controller 200. The controller 122-1 is coupled to the disk drive controller 200 through a disk controller interface. The disk controller interface includes a connector interface 212 that couples connector 210A on the controller to connector 210B on the disk drive controller 200. In one embodiment, the connector interface is the American National Standards Institute ("ANSI") AT Attachment interface (ATA), commonly referred to as the Integrated Drive Electronics ("IDE") interface. Although this description may only refer to ATA interfaces and connectors throughout, it is understood that such connectors can be IDE, Small Computer Systems Interface ("SCSI"), Serial ATA, Fibre Channel Arbitrated Loop (FC-AL) (optical), or any other Hard Disk Drive ("HDD") connector. Possible embodiments of the connector interface include a printed circuit board or a cable.

A conventional disk drive 108 includes a plurality of cylinders, heads, and sectors. A physical sector on the disk is specified by a cylinder, head, and sector (CHS) address. The head specifies a track within the cylinder of tracks. The sector specifies the location of a physical sector of data within the specified track. The disk drive 220 maps Logical Block Addresses (LBAs) to physical blocks or sectors on the disk. LBAs comprise an abstraction layer above the physical disk. The disk controller 208 in the SPU 110-1 forwards an LBA to the disk drive controller 200 in the disk drive 220. The disk drive controller 200 translates the LBA to a physical cylinder, head, and sector address to locate the corresponding physical sector on the disk.

The disk drive controller 200 may automatically remap an LBA to a new physical cylinder, head and sector address should the original physical sector become unreadable. The disk drive controller 200 may maintain a list of spare sectors for this purpose. When the disk drive controller 200 encounters an error reading a designated sector, it remembers that LBA, indicating that it should be remapped. When a subsequent attempt to write data to that LBA is received, the disk drive controller automatically remaps the LBA to a spare sector. This capability is exploited by the invention to implement a micro-mirroring capability, in which the data that existed in a physical sector that has become unreadable is retrieved from a mirrored copy and rewritten to the same LBA, relying on the disk drive controller 200 to remap the LBA to a spare sector.
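The micro-mirroring behavior described above can be sketched with a toy drive model. Everything here is an illustrative assumption: the class `SimulatedDrive`, the helper `read_with_repair`, and the use of a dictionary keyed by the same address as a stand-in for the mirrored copy are hypothetical and are not taken from any real drive firmware or from the specification.

```python
class SimulatedDrive:
    """Toy drive that remaps an LBA to a spare sector on the first write
    that follows a read error on that LBA."""

    def __init__(self, num_sectors, num_spares):
        self.lba_to_phys = {lba: lba for lba in range(num_sectors)}
        self.media = {}                  # physical sector -> data
        self.bad_phys = set()            # unreadable physical sectors
        self.pending_remap = set()       # LBAs flagged after a read error
        self.spares = list(range(num_sectors, num_sectors + num_spares))

    def read(self, lba):
        phys = self.lba_to_phys[lba]
        if phys in self.bad_phys:
            self.pending_remap.add(lba)  # remember the LBA for remapping
            raise IOError(f"unreadable sector at LBA {lba}")
        return self.media.get(phys)

    def write(self, lba, data):
        if lba in self.pending_remap:
            if not self.spares:
                raise IOError("no spare sectors left")
            self.lba_to_phys[lba] = self.spares.pop(0)
            self.pending_remap.discard(lba)
        self.media[self.lba_to_phys[lba]] = data


def read_with_repair(primary, mirror_read, lba):
    """Read an LBA; on error, fetch the mirror copy and rewrite the same LBA."""
    try:
        return primary.read(lba)
    except IOError:
        data = mirror_read(lba)          # copy held in the mirror segment
        primary.write(lba, data)         # the drive remaps to a spare sector
        return primary.read(lba)


drive = SimulatedDrive(num_sectors=8, num_spares=2)
drive.write(3, b"row-1")
mirror = {3: b"row-1"}                   # stand-in for the mirrored copy
drive.bad_phys.add(3)                    # physical sector 3 becomes unreadable
recovered = read_with_repair(drive, mirror.__getitem__, 3)
```

The caller never changes the LBA it uses; only the drive's internal LBA-to-sector mapping changes, which is the property the text relies on.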

Returning to FIG. 1, each disk 108 is "partitioned" into at least two segments, a first data segment P and a secondary data segment M. In one embodiment, typical hard drive partitions are used for the segments. A partition is typically defined in terms of a start LBA and a length in sectors. In one embodiment, a partition is a logically contiguous, but not necessarily physically contiguous, portion of a disk. A partition is not aligned to cylinder/head boundaries. In one embodiment, low numbered LBAs may be located on the outer tracks of a disk and the LBA numbers increase towards the inner tracks.

The invention provides an additional layer of abstraction, which maps database table-relative logical blocks to LBAs. In one embodiment, a logical block corresponds to 256 disk blocks. This additional layer of abstraction allows logical blocks to be moved to new disk locations--to facilitate defragmentation to improve performance or repartitioning to adapt to the amount of data being mirrored--without changing the logical address of a logical block used by a database application. Unless otherwise indicated herein, the term "logical" refers to this database application logical mapping to the LBA layer. A database logical block comprises one or more sectors addressed via an LBA. A database logical address is a pointer to a database logical block. Therefore changing LBAs is transparent to the database application.

The LBA location of the data mirrored in a secondary data segment may differ from the LBA location of the data in the first data segment in the preferred embodiment. However, the logical address can be used to access the data stored in both the first data segment and the secondary data segment. Each partition includes a plurality of logical blocks. In one embodiment, a logical block includes an integral number of database records, but records may cross disk block boundaries. In a preferred embodiment, a logical block is the unit of transfer to and from the disk. Alternative embodiments for the unit of transfer include individual records, groups of records, or other relations such as tables, views, and indices. Each logical block of a secondary data segment on a disk contains all the data that is contained in the corresponding logical block in the first data segment on another disk. Thus, if an indexed database record comprising row 1 of a table is stored in logical block 48 in the first data segment, it is also stored in logical block 48 in its corresponding secondary data segment.

In one embodiment of the invention, the disk 108 has exactly one first data segment and at least one secondary data segment. There may be more than one secondary data segment. Each secondary data segment stores a logical mirror of the data slice stored in the first data segment of another physical disk. The logical mirror may contain only data that is marked for mirroring, rather than the entire content of the respective first data segment. Thus, upon failure of any disk, a logical mirror of a data slice stored in the first data segment of that disk is available from at least one secondary data segment on another disk. As shown in FIG. 1, P4 is the first data segment of disk 108-n. M4 is the secondary data segment of disk 108-1. The secondary data segment M4 is used to mirror first data segment P4, such that if disk 108-n fails, the data stored in first data segment P4 can be accessed from its logical mirror as stored in the secondary data segment M4 on disk 108-1.

As shown in FIG. 1, both secondary data segment M4 on disk 108-1 and first data segment P4 on disk 108-n include logical blocks 1-4. In this example, logical blocks 1-4 are LBA contiguous in first data segment P4, but in secondary data segment M4 only logical blocks 1-3 are LBA contiguous. Each SPU performs a logical mapping so that even though the logical blocks 1-4 are not stored in the same LBA locations and are non-contiguous, logical block 4 on both secondary data segment M4 and first data segment P4 stores the same indexed data.

FIG. 3 is a block diagram illustrating data stored in the first data segment 252 and secondary data segment 254 of a disk 250. The start LBA registers 268, 272 and number of sectors registers 270, 274 define the respective partitions. The LBA address stored in the first start LBA register 268 is the address of the first sector in the first data segment. The LBA address stored in the secondary start LBA register 272 is the address of the first sector in the secondary data segment. Each data segment is subdivided into a number of fixed-length extents 258. Each extent 258 has a fixed number of sectors 260. Thus, the address of the first sector in each extent can be computed knowing the number of the extent in the data segment and the number of sectors per extent. A plurality of extents are allocated for a table. For example, FIG. 3 shows Table A 256 having a plurality of extents 258 with each extent having a plurality of sectors 260. A mirror of Table A 262 is stored in the secondary data segment. The mirror of Table A is stored at the same extent location in the secondary data segment as Table A in the first data segment. Thus, a record stored in the nth sector in the mth extent in Table A in the first data segment is also stored in the nth sector in the mth extent in the mirror of Table A in the secondary data segment even though the sectors have different LBA addresses on the disk.

The logical SPU block, for example, the nth block in the mth extent, is converted to a logical block address for a sector on the disk storing the block in both the first data segment and the secondary data segment. The logical block address for the logical SPU address is computed knowing the start LBA for the data segment, the extent number and the number of sectors per extent.
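The address computation described above amounts to one line of arithmetic. The sketch below assumes zero-based extent and sector numbering; the function name `sector_lba`, the extent size of 256 sectors, and the two start LBAs are hypothetical values chosen only for illustration.

```python
def sector_lba(segment_start_lba, extent_number, sector_in_extent,
               sectors_per_extent):
    """Map (extent, sector) within a data segment to an LBA on the disk.

    Assumes extents and sectors are numbered from zero and that extents
    are laid out back to back from the segment's start LBA.
    """
    if not 0 <= sector_in_extent < sectors_per_extent:
        raise ValueError("sector offset lies outside the extent")
    return (segment_start_lba
            + extent_number * sectors_per_extent
            + sector_in_extent)


SECTORS_PER_EXTENT = 256       # hypothetical extent size
FIRST_START_LBA = 0            # hypothetical start of the first data segment
SECONDARY_START_LBA = 1_000_000  # hypothetical start of the secondary segment

# The same (extent, sector) pair resolves to a different LBA in each segment.
first_lba = sector_lba(FIRST_START_LBA, 2, 5, SECTORS_PER_EXTENT)
secondary_lba = sector_lba(SECONDARY_START_LBA, 2, 5, SECTORS_PER_EXTENT)
```

Because only the segment start differs between the two calls, a record keeps the same logical position in the first data segment and in its mirror even though the LBAs differ.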

FIG. 4 is a block diagram of software modules included in the host 106 for managing SPUs in the data appliance shown in FIG. 1. The host includes a system manager 300, a system table 302, and a communications module 304.

At system startup, the system manager 300 assigns a logical identifier (LID) to each SPU in the data appliance. For example, for the configuration shown in FIG. 1, the system manager assigns LID 0 to disk 108-1, LID 1 to disk 108-2, LID 3 to disk 108-3, LID 5 to disk 108-4 and LID 4 to disk 108-n. The system manager 300 in the host ("host system manager") creates a logical identifier map mapping logical identifiers to addresses on the storage network. The host system manager 300 provides a copy of the logical identifier map to each SPU in the system by forwarding the map over the network 112 (FIG. 1) through the communications module 304. The logical identifier map includes the logical identifier of the disk containing the secondary data segment corresponding to the first data segment on the disk coupled to the SPU, the logical identifier assigned to the disk containing the first data segment corresponding to the secondary data segment on the disk coupled to the SPU, and the addresses associated with each SPU. The logical identifier map is maintained by the host system manager 300 in the system table 302 and a new copy is forwarded to each SPU upon any change in the configuration.

In a typical commodity disk drive, the disk rotational velocity of a disk remains constant, but the relative velocity between the disk and the head varies with disk radius. Data is read from and written to the disk at a constant bit rate. However, all of the tracks are not the same size. In one embodiment, inner tracks can be a few inches smaller than outer tracks. If one rotation of the disk takes N seconds, more data is read from or written to the longer track per second than from or to the shorter track. Thus, the data transfer time differs dependent on the physical location of the data on the disk.

The storage manager 404 in the SPU 110 takes advantage of these physical properties when it defines first data segments and secondary data segments for its disk. To increase the data transfer rate, the first data segment includes logical blocks that map to tracks on the disk having the fastest data transfer rate. The secondary data segment includes logical blocks that map to tracks on the disk which have a slower data transfer rate than the sectors in the first data segment. Tracks with the slower access rate are allocated to the secondary data segment because data is only read from the secondary data segment in the event of failure of the corresponding first data segment. However, in an alternate embodiment the sectors with the slowest data transfer times may be located on the outer tracks of the disk.
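The effect of track length on transfer rate can be illustrated with a small calculation. The sketch assumes a constant rotation period and a uniform linear bit density along every track, which the text does not state explicitly; the radii, density and rotation period below are hypothetical figures, not measurements of any drive.

```python
import math


def track_transfer_rate(radius_mm, bits_per_mm, rotation_seconds):
    """Bits per second read from one track, assuming the whole track passes
    under the head once per rotation at a uniform linear bit density."""
    bits_per_track = 2 * math.pi * radius_mm * bits_per_mm
    return bits_per_track / rotation_seconds


ROTATION_SECONDS = 1 / 120     # hypothetical: 7200 revolutions per minute
BITS_PER_MM = 20_000           # hypothetical linear density

outer_rate = track_transfer_rate(45.0, BITS_PER_MM, ROTATION_SECONDS)
inner_rate = track_transfer_rate(20.0, BITS_PER_MM, ROTATION_SECONDS)
ratio = outer_rate / inner_rate
```

Under these assumptions the rate scales linearly with radius, so the ratio of the two rates equals the ratio of the two radii. This is the motivation for placing the frequently read first data segment on the longer tracks.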

In one embodiment, the storage manager 404 defines the start address and final size of the first data segment and the second data segment when the disk is initialized, after which no changes are made to the size or location of the two data segments. When the sizes of the two segments are the same, this method ensures that there will be enough disk space to mirror all of the data on each first data segment.

In another embodiment, the storage manager 404 defines the start address and initial size of the first data segment and the second data segment when the disk is initialized, but allows the size of each segment to grow dynamically in response to demand. When a request is received to store new data in a first data segment, the storage manager 404 expands the size of the first data segment as necessary to accommodate the new data. When a mirroring request is received to store a mirror copy of data in a second data segment, the storage manager 404 expands the size of the second data segment as necessary to accommodate the mirrored data. When the distribution of data across the first data segments of the plurality of disks is uneven, this method makes better use of disk space.

The system manager 300 takes other factors into account when configuring the mirroring topology, such as the capabilities of the local controllers to process the data, the complexity of the data, and usage patterns.

In the embodiment shown, the system manager is executed in the host. However, in alternate embodiments, the system manager may be executed in one or more of the SPUs.

FIG. 5 is a block diagram of software modules included in each of the SPUs for managing the mirroring of data in the data appliance shown in FIG. 1. The SPU includes an SPU system manager 402, a mirror manager 400, and a storage manager 404.

While the system manager 300 controls the mirroring topology, the storage manager 404 controls partition size and which data should be mirrored, and the mirror managers 400 in the controllers 122 coupled to the disks containing the corresponding first and secondary data segments coordinate the actual mirroring independently. In such a way, mirroring is autonomous from and transparent to the host.

Mirroring is optional for any given data object, but for those objects mirrored, the mirror manager 400 handles mirroring. In such a case, the mirror manager 400 maintains at least one redundant and consistent copy of all indexed data stored on the first data segment of the SPU by communicating any change in that data to the mirror manager in at least one other SPU. The sending SPU communicates any modification to a receiving SPU over the data communication network. The mirror manager in a receiving SPU receives the data to be written in a secondary data segment and calls its respective storage manager 404 to perform the write operation to write the data to the secondary data segment. In one embodiment of the invention, the system may be database-aware. That is, data may be written by the storage manager 404 only in response to a commit command, thus saving a number of writes. Similarly, after the commit request is sent to the primary SPU, the receiving SPU may acknowledge the commit to the host for the sending SPU, thus reducing the number of required acknowledgements.
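The forwarding behavior of the mirror manager can be sketched as follows. This is a simplified model under stated assumptions: each SPU has a single mirror peer, writes are buffered until a commit, and a direct method call stands in for a message on the data communication network. The class `SPU` and its attribute names are hypothetical.

```python
class SPU:
    """Toy SPU holding a first data segment, a secondary data segment,
    and a reference to the peer SPU that mirrors its first data segment."""

    def __init__(self, name):
        self.name = name
        self.first_segment = {}      # logical block -> data
        self.secondary_segment = {}  # logical block -> mirrored data
        self.mirror_peer = None
        self.pending = {}            # uncommitted writes
        self.messages_sent = 0       # forwarding messages sent to the peer

    def write(self, block, data, mirrored=True):
        # Buffer the change; nothing reaches a segment before commit.
        self.pending[block] = (data, mirrored)

    def commit(self):
        to_forward = {}
        for block, (data, mirrored) in self.pending.items():
            self.first_segment[block] = data
            if mirrored:             # mirroring is optional per data object
                to_forward[block] = data
        self.pending.clear()
        if to_forward and self.mirror_peer is not None:
            self.mirror_peer.receive_mirror(to_forward)
            self.messages_sent += 1

    def receive_mirror(self, blocks):
        # Same logical block numbers are used in the secondary segment.
        self.secondary_segment.update(blocks)


sending = SPU("spu-1")
receiving = SPU("spu-2")
sending.mirror_peer = receiving
sending.write(48, "row 1")
sending.write(49, "row 2")
sending.write(50, "scratch", mirrored=False)
sending.commit()
```

Batching on commit means the three writes above produce a single forwarding message, which is one way to read the text's remark about saving writes.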

The storage manager 404 processes all requests to read or write data to a first or secondary data segment of a disk. If the storage manager 404 encounters a single-sector read error, that sector is invalidated and the data is re-written to that sector from the secondary data segment of the drive mirroring the data associated with that sector. In one embodiment, normal requests to read and write that sector are delayed until the data is re-written. If the mirror copy of the sector's data is successfully re-written, then pending requests are executed in order, starting with the original read request. If the mirror copy of the data cannot be read or if the sector cannot be re-written, the SPU system manager 402 treats the disk as having failed, and contacts the host system manager 300 to initiate failover processing. A further advantage the storage manager leverages is that the logical addresses are the same for both a primary data slice and its mirror data slice. Consistent logical addresses allow only the disk ID and segment ID to change while the logical addresses remain the same.

When an SPU fails to respond, the system manager 300 (FIG. 4) in the host performs a failover operation by switching all requests for data stored on the first data segment of a disk associated with the failed SPU to an SPU storing a logical mirror of the failed first data segment. There can be different failover modes depending on the SPU configuration. For example, where a controller is associated with more than one disk, and the controller itself fails, requests for data stored on the first data segment of each associated disk are switched to one or more SPUs containing the corresponding mirrors. Similarly, in the instance where a controller is associated with multiple disks and one of the disks fails, only requests for data stored on the first data segment of that disk are necessarily rerouted to the SPU containing its mirror.
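The request switching performed during failover can be sketched as a lookup. The sketch assumes the system manager keeps two tables, one naming the disk that holds each primary data slice and one listing the disks that hold its mirrors; the function `route_request` and both table names are hypothetical.

```python
def route_request(data_slice, primary_disk, mirror_disks, failed_disks):
    """Return (disk, segment kind) that should serve a data slice.

    The first data segment is used while its disk is healthy; otherwise
    the request is switched to the first healthy disk holding a mirror.
    """
    disk = primary_disk[data_slice]
    if disk not in failed_disks:
        return disk, "first"
    for candidate in mirror_disks[data_slice]:
        if candidate not in failed_disks:
            return candidate, "secondary"
    raise RuntimeError(f"no surviving copy of {data_slice}")


# Two disks that mirror each other's first data segments.
primary_disk = {"P0": "108-1", "P5": "108-6"}
mirror_disks = {"P0": ["108-6"], "P5": ["108-1"]}

normal = route_request("P0", primary_disk, mirror_disks, failed_disks=set())
failover = route_request("P0", primary_disk, mirror_disks,
                         failed_disks={"108-1"})
```

A controller failure can be modeled in the same way by adding every disk coupled to that controller to `failed_disks`.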

When failover is detected and is due to an SPU failure, if a spare SPU is available then the system manager 300 (FIG. 4) issues a request to regenerate the first data segment of the disk associated with the failed SPU on the disk associated with the spare SPU using the secondary data segment mirror for the first data segment associated with the failed SPU.

After successfully regenerating the primary data slice on a disk associated with the spare SPU (or, in the case of a disk failure, on a spare disk associated with the same SPU), the mirror manager 400 restores normal operation by redirecting new requests for data from the disk containing the mirror data slice to the disk containing the new primary data slice. In the case of a failed SPU, the mirror manager in the spare SPU can also regenerate the mirror data slice of a disk associated with the failed SPU on the disk associated with the spare SPU. Similarly, in the case of a failed disk, the mirror manager in the original SPU can regenerate the mirror data slice on a spare disk.
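Regeneration onto a spare can be sketched as two copies followed by an update of the mirroring map. The sketch assumes every disk has one first and one secondary data segment and that `mirror_disk[d]` names the disk whose secondary data segment mirrors the first data segment of disk `d`; these names, and the function `regenerate_on_spare`, are hypothetical.

```python
def regenerate_on_spare(failed, spare, disks, mirror_disk):
    """Rebuild both segments of a failed disk on a spare disk.

    disks:       disk name -> {"first": {...}, "secondary": {...}}
    mirror_disk: disk name -> disk whose secondary segment mirrors it
    Returns the updated mirroring map with the spare in place of `failed`.
    """
    # Disk whose secondary segment holds the mirror of the failed first segment.
    holder_of_mirror = mirror_disk[failed]
    # Disk whose first segment was mirrored on the failed disk.
    mirrored_here = next(d for d, m in mirror_disk.items() if m == failed)

    disks[spare] = {
        "first": dict(disks[holder_of_mirror]["secondary"]),
        "secondary": dict(disks[mirrored_here]["first"]),
    }
    new_map = {d: (spare if m == failed else m)
               for d, m in mirror_disk.items() if d != failed}
    new_map[spare] = holder_of_mirror
    return new_map


disks = {
    "A": {"first": {1: "a1"}, "secondary": {1: "c1"}},
    "B": {"first": {1: "b1"}, "secondary": {1: "a1"}},
    "C": {"first": {1: "c1"}, "secondary": {1: "b1"}},
}
mirror_disk = {"A": "B", "B": "C", "C": "A"}

del disks["A"]                       # disk A fails and its contents are lost
new_mirror_disk = regenerate_on_spare("A", "S", disks, mirror_disk)
```

Two different surviving disks supply the data in this example, one for each segment of the spare.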

FIG. 6 is a block diagram illustrating two racks of SPUs for mirroring data in the database appliance shown in FIG. 1. Each rack 502-1, 502-2 includes a respective Snippet Processing Array (SPA) 504-1, 504-2 and a respective network switch 500-1, 500-2. Each SPA 504-1, 504-2 includes a plurality of SPUs. SPA 504-1 includes SPUs 110-1, . . . 110-5 and SPA 504-2 includes SPUs 110-6, . . . 110-10. Each SPU 110-1, . . . 110-10 includes one of the disks 108-1, . . . 108-10. Each network switch 500-1, 500-2 is coupled to the SPU data communications network 112. In one embodiment, there are 14 SPUs per SPA and 9 SPAs per rack. Each rack has at least one switch, and preferably at least two for redundancy, coupled to each SPA.

As previously discussed, the first data segment of each disk is logically mirrored on a secondary data segment of another disk. In the embodiment shown, the first and secondary data segments are assigned such that the secondary data segment associated with a first data segment is located on a disk in another rack. For example, the secondary data segment M0 on disk 108-6 in rack 502-2 is the logical mirror of the first data segment P0 on disk 108-1 in rack 502-1.

Upon a failure of disk 108-6 in SPA 504-2, controller 122-1 accesses both secondary data segment M5 and first data segment P0. The system manager 300 regenerates both P5 and M0 on a spare disk because M0 and P5 are both on failed disk 108-6. The response time for requests for data stored in disk 108-1 is slower during the regeneration operation. Alternatively, upon failure of rack 502-1, for example, due to a failure in switch 500-1, the data slice stored in first data segment P0 is still accessible on secondary data segment M0 on disk 108-6.

In order to improve the response time for processing database requests upon failure of a first data segment, the segments are assigned so that if a first data segment and a secondary data segment are stored on a first disk, the mirror data slice for the first data segment and the primary data slice for the secondary data segment are not stored on the same disk. FIG. 7 is a block diagram of such a scheme for assigning first and secondary data segments on the disks shown in FIG. 6 to improve response time during a regeneration operation. The first data segment on disk 108-1 is P0 and the secondary data segment on disk 108-1 is M9. M0, the secondary data segment for P0 is stored on disk 108-7 and P9, the first data segment for secondary data segment M9 is stored on disk 108-10. Thus, in the case of a failure of disk 108-1, disk 108-7 is accessed to regenerate P0 and disk 108-10 is accessed to regenerate P9 while disk 108-8 acts as the first data segment for P0 data. Thus, data for the regeneration operation is stored on two disks 108-1, 108-8 increasing the availability of each disk to perform database operations.
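One way to generate an assignment with the property described above is to shift each mirror by one slot into the other rack. The sketch below is an illustrative construction, not the scheme of FIG. 7 itself; it assumes two racks of equal size and zero-based disk indices, so that index 0 stands for disk 108-1 and index 6 for disk 108-7. The function name `assign_mirrors` is hypothetical.

```python
def assign_mirrors(disks_per_rack):
    """Return {i: j} meaning the first data segment of disk i is mirrored
    in the secondary data segment of disk j.

    Each mirror lands in the other rack, shifted by one slot, so that no
    two disks mirror each other (requires at least 3 disks per rack).
    """
    if disks_per_rack < 3:
        raise ValueError("need at least 3 disks per rack to avoid mutual mirrors")
    mirror_disk = {}
    for i in range(2 * disks_per_rack):
        rack = i // disks_per_rack
        slot = i % disks_per_rack
        other_rack = 1 - rack
        mirror_disk[i] = other_rack * disks_per_rack + (slot + 1) % disks_per_rack
    return mirror_disk


mirror_disk = assign_mirrors(5)

# For every disk, the mirror holder and the disk mirrored here must differ.
no_mutual_pairs = all(mirror_disk[mirror_disk[i]] != i for i in mirror_disk)
cross_rack = all((i // 5) != (j // 5) for i, j in mirror_disk.items())
```

With five disks per rack this construction places the mirror of P0 on index 6 and the mirror of P9 on index 0, which matches the two pairings the text names for disk 108-1 under the zero-based numbering assumed here.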

Each disk can be further segmented to provide a plurality of secondary data segments and one first data segment on each disk. Data redundancy is increased by storing a plurality of `logical mirrors` of the first data segment, with each `logical mirror` being stored in a secondary data segment of a different disk. The system manager switches access to one of the secondary data segments upon detecting failure of the disk storing the first data segment. The presence of multiple mirrors of a data slice provides redundancy in the case of multiple failures. While in failover mode, several options are available. In one embodiment, one of the multiple mirrors is chosen as the replacement. In another embodiment, the multiple mirrors share the load of providing the data from the primary data slice lost in the failure. To minimize the performance degradation that can occur during failover, the rebuild process can be shared among the mirror managers in the SPUs having a secondary data segment for the failed first data segment.

Multiple mirrors also provide the benefit of shared load in rebuilding a failed disk. As a spare SPU is allocated, the data used to recreate the failed disk is pulled from each of the multiple mirrors in parallel, rather than one single mirror bearing the entire load.

FIG. 8 is a block diagram of an SPA table 700 and an SPU table 702 in the system manager 300 shown in FIG. 4. The system manager manages the SPA table and the SPU table. The SPA table includes an SPA entry 704 for each SPA in the database appliance 100. Each SPA entry 704 includes a physical SPA identifier 706, an SPU count field 708, and a state field 710. The SPU count 708 stores a number indicating the number of SPUs installed in the SPA. The state field 710 indicates the current state of the SPA which can be `in use`, `spare`, `recovering`, `damaged` or `dead`. An SPA is `in use` if it is currently working and in use by the database appliance. The SPA is `spare` if it is working but not in use. An SPA is `recovering` if it is in the middle of a recovery operation. An SPA is `damaged` if it is not functional but potentially recoverable. An SPA is `dead` if it is completely non-functional and in need of replacement. The physical SPA identifier 706 stores a number assigned to the SPA which can be assigned by means of an external switch connected to the SPA.
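The SPA entry described above maps naturally onto a small record type. The sketch below is one possible in-memory representation; the class names `SpaState` and `SpaEntry`, the helper `is_working`, and the sample values are hypothetical, while the three fields and five states follow the text.

```python
from dataclasses import dataclass
from enum import Enum


class SpaState(Enum):
    IN_USE = "in use"          # working and in use by the appliance
    SPARE = "spare"            # working but not in use
    RECOVERING = "recovering"  # in the middle of a recovery operation
    DAMAGED = "damaged"        # not functional but potentially recoverable
    DEAD = "dead"              # non-functional, needs replacement


@dataclass
class SpaEntry:
    physical_spa_id: int       # number assigned to the SPA
    spu_count: int             # number of SPUs installed in the SPA
    state: SpaState


def is_working(entry):
    """An SPA is working if it is either in use or held as a spare."""
    return entry.state in (SpaState.IN_USE, SpaState.SPARE)


spa_table = [
    SpaEntry(physical_spa_id=0, spu_count=14, state=SpaState.IN_USE),
    SpaEntry(physical_spa_id=1, spu_count=14, state=SpaState.SPARE),
    SpaEntry(physical_spa_id=2, spu_count=12, state=SpaState.DAMAGED),
]

spare_ids = [e.physical_spa_id for e in spa_table if e.state is SpaState.SPARE]
working_spus = sum(e.spu_count for e in spa_table if is_working(e))
```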

The SPU table 702 includes an SPU en

