Senior Fitness - Exercise and Nutrition for Aging Men and Women
FREE Article Feed for your website.
Home Ownership Magazine
Party Planning Information
Article Marketing Resources
Bio-Medical Research Article Database
Informative Articles on Life, Love and Happiness
Tutorials on Business to Writing
Famous Quotes from Famous People
Song Lyric Information
New US Patent Information
Comprehensive List of Content by Category
Online Auctions and Shopping Related Articles
Article Search
Most Recent Articles
 

Find a Fire Extinguisher
Category:
Home And Family  

The History of Hilton Hotels
Category:
Travel  

Don t Make These Mistakes With Your LLC or Corporation
Category:
Business  

No Deposit Casinos
Category:
Computers  

What Affiliate Marketing Mentors to Follow and Why
Category:
Marketing  

Blink 182 and Selling Out
Category:
Entertainment / Television  

When you think you may be pregnant
Category:
Home And Family  

Foreclosure is a compound yet very effective recovery system
Category:
Business  

Amazing Antioxidants
Category:
Health / Fitness  

Amazing Antioxidants
Category:
Health / Fitness  

Avoiding Resume Elimination at the Initial Scanning Scan is Vita...
Category:
Business  

How To Determine Which Cell Service Is Best For You
Category:
Business  

A Short History of the Wristwatch
Category:
Business  

Growing Your Own Herbs
Category:
Home And Family  

Herbal Acne Home Cures
Category:
Health / Fitness  

Creating Fresh Content for Search Engines
Category:
Marketing  

That Talking Thing will either make or break a relationship
Category:
Home And Family  

Avoid the Most Common Mistakes in Affiliate Marketing
Category:
Business  

Know the Signs of Childhood Asthma
Category:
Health / Fitness  

The Easiest Weight Loss Program Ever
Category:
Health / Fitness  

How to Expand your Business by Leaps and Bounds
Category:
Business  

Personal Accident Claim The Successful Route
Category:
Business  

Free Advertising
Category:
Marketing  

Free Advertising
Category:
Marketing  

Chicken and the Egg
Category:
Business  

Herbs for hair growth
Category:
Health / Fitness  

Organic Gardening
Category:
Home And Family  

Does Your Cleaning Business Have a Mission Statement
Category:
Business  

Internet Banking Are you online
Category:
Finance / Investment  

3 Things All Affiliate Marketers Need To Survive Online
Category:
Marketing  

How to use your subject to grab the attention of your optin news...
Category:
Marketing  

Choosing the Right Network Marketing Company 4 surprising steps
Category:
Marketing  

Diabetic diet plan guide
Category:
Health / Fitness  

6 POWERFUL VRE Business Models You Can Start Building In 2006 Us...
Category:
Business  

Free Cell Phone Ring Tones Jingling Vibes For Any Occasion
Category:
Entertainment / Television  

Free Ringtone Downloads Dazzling Tunes For Your Pleasure
Category:
Entertainment / Television  

Why choose MLM Leads
Category:
Business  

Vending Machines provide an excellent income
Category:
Business  

Discovers The Secret To The Most Popular Way Of Making Money
Category:
Business  

Internet Marketing Information Overload
Category:
Marketing  

Your New Cat Why Are the First 24 Hours So Important Part 3
Category:
Home And Family  

SearchInform 3 0 Consolidating information from various sources
Category:
Computers  

Health Insurance How to Find An Affordable Quote
Category:
Home And Family  

Brand You The Top Five Ways To Build Your Brand Online
Category:
Marketing  

Bath Salts Some that you can make at home
Category:
Health / Fitness  

Acne Treatment
Category:
Health / Fitness  

Home Business Entrepreneurs Banking On Increased Income
Category:
Business  

Hypnotherapy in Bedfordshire
Category:
Health / Fitness  

An Alaska Cruise Offers Unlimited Fun
Category:
Travel  

Guide To Ceiling Fan Blades
Category:
Home And Family  

Personal Injury Specialist No Win No Fee
Category:
Finance / Investment  

reduce tension
Category:
Business  

How to Use Free Articles to Create Massive Traffic Within Minute...
Category:
Marketing  

LASIK a Cure for Blurry Vision
Category:
Health / Fitness  

The Truth About Debt Consolidation
Category:
Business  

Don t Wait for a Mate Feather Your Nest Now Part 2
Category:
Home And Family  

Camping Water Filters A Vital Necessity
Category:
Health / Fitness  

Hawaii Vacation Accommodation and Holiday Homes in Oahu Maui Kau...
Category:
Travel  

Mortgage Lenders Making The Right Choice
Category:
Business  

Hawaii Vacation Accommodation and Holiday Homes in Oahu Maui Kau...
Category:
Travel  

Changing Face Of Holidays In The UK
Category:
Travel  

Make Your Business Memorable with Business Cards
Category:
Marketing  

Network Marketing The Organic Way
Category:
Marketing  

Finally Revealed The Secret To Explode Your Home Based Business
Category:
Business  

8 Ways to Grow Your Business During a Summer Lull
Category:
Marketing  

Benefits of Being an Affiliate Marketer
Category:
Marketing  

You Don t Need to be a Computer Scientist to Profit Online
Category:
Marketing  

Information Retrieval Systems IRS and Search Engines SEO
Category:
Marketing  

ADHD Treatments
Category:
Health / Fitness  

Getting Started Online 101
Category:
Marketing  

What To Look For In An Instant Approval Credit Card
Category:
Business  

Home Business System
Category:
Business  

Top Tips to Dramatically Increase Traffic to Your Website
Category:
Business  

Selecting The Right Home Builder
Category:
Home And Family  

Plasma TVs are Hot
Category:
Computers

Method, apparatus, program and recording medium for memory access serialization and lock management Number:6,938,131 from the United States Patent and Trademark Office (PTO) owispatent

Home    Author Login    Submit Article    Article Search    Add Your Link    Edit Your Link    Contact Us    Advertising    Disclaimer

   

 
Web LinkGrinder.com

Top Breaking News
     Greek, Cypriot Leaders Resume Unification Talks in Nicosia by Nathan Morley
     Indonesia Tobacco Sales Grow, Raising Health Fears
     South Korea Allows Top Defector to Travel Overseas by VOA News

Title: Method, apparatus, program and recording medium for memory access serialization and lock management

Abstract: To improve the processing speed in a case where a program requiring memory access serialization is executed in a multiprocessor environment with a load instruction out-of-order execution function.Each of CPUs has the function of performing store forwarding (SF) when the address regions of a pair of store and load instructions coincide with each other. Each CPU stops SF and performs store forwarding avoidance (SFA) when the two address regions do not coincide with each other but have an overlap therebetween. Each of SF and SFA is executed with priority over out-of-order execution. Each CPU performs SFA effective in limiting out-of-order execution with respect to a predetermined store instruction on a program and a load instruction mated to the store instruction and given after the store instruction, and ensures that data relating to the store instruction can be observed from other CPUs.

Patent Number: 6,938,131 Issued on 08/30/2005 to Ogasawara


Inventors: Ogasawara; Takeshi (Hachioji, JP)
Assignee: International Business Machines Corporation (Armonk, NY)
Appl. No.: 413850
Filed: April 15, 2003

Foreign Application Priority Data

Apr 26, 2002[JP]2002-125343

Current U.S. Class: 711/151
Intern'l Class: G06F 012/06
Field of Search: 711/150,151,152


References Cited [Referenced By]

U.S. Patent Documents
5968135Oct., 1999Teramoto et al.
6006325Dec., 1999Burger et al.
6466988Oct., 2002Sukegawa et al.
6594736Jul., 2003Parks.
6708269Mar., 2004Tiruvallur et al.


Other References

Bacon et al., "Thin Locks: Featherweight Synchronization for Java", IBM Research Center, 1998, pp. 258-268.
Hoare, "Monitors: An Operating System Structuring Concept", Communications of the ACM, vol. 17, No. 10,1974, pp. 549-557.

Primary Examiner: Ellis; Kevin L.
Attorney, Agent or Firm: McGinn & Gibb, PLLC, Jennings; Derek S.

Claims



1. A method of memory access serialization in a computer having a plurality of CPUs and a memory shared by the plurality of CPUs, each CPU having an out-of-order execution slop mechanism, in which, if address regions on the memory relating to a pair of a store instruction and a load instruction in preceding and subsequent positions, respectively, in written order of a program executed by each of the CPUs have an overlap therebetween, the out-of-order execution stop mechanism stops our-of-order execution of the load instruction and enables data relating to the store instruction to be observed from ether CPUs before execution of the load instruction, said method comprising:

a detection step of enabling each CPU to detect the pair of the store instruction and the load instruction, which need to be serialized in a before-after relationship with respect to memory access, which are in the preceding and subsequent positions in written order on the program, and which are placed on the program by being set so that their address regions have an overlap therebetween and respectively include address regions for data which needs to be stored and loaded; and

an activation step of activating, with respect to a detected pair of instructions, the out-of-order execution stop mechanism of one of the CPUs executing the program relating to the detected pair of instructions.

2. The method according to claim 1, wherein each CPU has a store buffer for writing data to the memory according to the store instruction of the program;

wherein the out-of-order execution stop mechanism has a store forwarding mechanism which stops out-of-order execution of the load instruction in the pair of

instructions and notifies, as data B, of data A written to the store buffer according to the store instruction in the pair of instructions from the store buffer to the program if the address regions on the memory relating to the pair of instructions coincide with each other, and a store forwarding avoidance mechanism which stops out-of-order execution of the load instruction in the pair of instructions and performs, instead of store forwarding, storing of data A from the store buffer into a memory hierarchy and notifying of data B from the memory hierarchy to the program on the basis of the load instruction in the pair of instructions if the address regions on the memory relating to the pair of instructions do not coincide with each other but overlap each other; and

wherein, in said activation step, with respect to the detected pair of instructions, the store forwarding avoidance mechanism in the out-of-order execution stop mechanism of one of the CPUs executing the program relating to the detected pair of instructions is activated.

3. The method according to claim 1, further comprising:

a load instruction execution step of executing load of data on the basis of the load instruction in the pair of instructions, the address region of the store instruction being set so as to store only data which essentially needs to be stored, the address region of the load instruction being set so as to load data which does not need to be loaded as well as data which needs to be loaded; and

an extraction step of extracting the data which needs to be loaded from the data loaded by the load instruction.

4. The method according to claim 1, wherein the program includes first and second program portions relating to

first and second processings, the first and second processings including a store processing portion, an observation processing portion for observing data that might have been stored by the other processing after execution of the store processing portion, and a branching processing portion for branching to a corresponding process according to a result of observation by the observation processing portion; and

wherein the pair of the store instruction and the load instruction are included in the store processing portion and the observation processing portion in the first and second processings.

5. The method according to claim 4, wherein the first and second processings are processing relating to object lock release and processing relating to object acquisition, respectively, in an object management method.

6. A method of lock management executed by a computer having a plurality of CPUs and a memory shared by the plurality of CPUs, each CPU having an out-of-order execution stop mechanism, in which, if address regions on the memory relating to a pair of a store instruction and a load instruction in preceding and following positions, respectively, in written order of a program executed by each of the CPUs have an overlap therebetween, the out-of-order execution stop mechanism stops out-of-order execution of the load instruction and enables data relating to the store instruction to be observed from other CPUs before execution of the load instruction, said method comprising:

managing a lock on an object by preparing, in a storage area provided in correspondence with the object, a field for storing a bit indicating the kind of lock and an identifier for a thread acquiring a lock according to a first kind of lock or an identifier relating to a second kind of lock, in a situation where a plurality of threads exist, said method also comprising:

an object acquisition method of enabling a second thread to obtain a lock on a certain object held by a first thread, and a lock release method of enabling the first thread to release the lock from the certain object, said object acquisition method including:

(a1) a step of determining whether or not the bit indicating the kind of lock on the certain object indicates the first kind of lock; and

(a2) a step of setting a contention bit when the first kind of lock is indicated, said lock release method including:

(b1) a step of determining whether or not the bit indicating the kind of lock indicates the first kind of lock;

(b2) a step of storing, in the storage area, information indicating that no thread holding the lock on the certain abject exists;

(b3) a step of determining whether or not the contention bit is set when the reading of the bit indicating the kind of lock indicates the first kind of lock; and

(b4) a step of terminating processing if it is determined that the contention bit is not set,

wherein address regions on the memory relating to a store instruction S corresponding to storage in said step (b2) and a load instruction L corresponding to reading of the bit in said step (b3) are set so that the out-of-order execution stop mechanism is activated with respect to the store instruction S and the load instruction L; and

wherein said steps (b2) and (b3) are executed on the basis of said setting of the address regions.

7. The method according to claim 6, wherein each CPU has a store buffer for writing data to the memory according to the stare instruction of the program; wherein the out-of-order execution stop mechanism has a store forwarding mechanism which stops out-of-order execution of the load instruction in the pair of instructions and notifies, as data B, of data A written to the store buffer according to the store instruction in the pair of instructions from the store buffer to the program if the address regions on the memory relating to the pair of instructions coincide with each other, and a store forwarding avoidance mechanism which stops out-of-order execution of the load instruction in the pair of instructions and performs, instead of store forwarding, storing of data A from the store buffer into a memory hierarchy and notifying of data B from the memory hierarchy to the program on the basis of the load instruction in the pair of instructions if the address regions on the memory relating to the pair of instructions do not coincide with each other but overlap each other; and

wherein the address regions on the memory relating to the store instruction S and the load instruction L are set to activate the store forwarding avoidance mechanism; and wherein said steps (b2) and (b3) are executed on the basis of setting of the address regions.

8. The method according to claim 6, wherein the store instruction S corresponding to the storage in said step (b2) is set with its address region selected as an address region A1 relating to the storage region in which information indicating that no thread holding the lock on the certain object exists is stored;

wherein the load instruction L corresponding to the read of the bit in said step (b3) is set with its address region selected as an address region A2 containing the entire address region relating to the bit and at least a portion or the address region A1; and

wherein said steps (b2) and (b3) are executed on the basis of setting of the address regions.

9. The method according to claim 6, wherein said lock release method further includes:

(b5) a step of enabling the first thread to enter a state under exclusive control performed by a mechanism when it is determined that the contention bit is set, the mechanism enabling exclusive control of access to the object and, if a predetermined condition is satisfied, a waiting operation of the thread and a notification operation to the thread in the waiting state;

(b6) a step of executing a notification operation to the thread in the waiting state; and

(b7) a step of enabling the first thread to exit from the state under exclusive control.

10. A method of lock management executed by a computer having a plurality of CPUs and a memory shared by the plurality of CPUs, each CPU having an out-of-order execution stop mechanism, in which, if address regions on the memory relating to a pair of a store instruction and a load instruction in preceding and following positions, respectively, in written order of a program executed by each of the CPUs have an overlap therebetween, the out-of-order execution stop mechanism stops out-of-order execution of the load instruction and enables data relating to the store instruction to be observed from other CPUs before execution of the load instruction, said method comprising:

managing a lock on an object by preparing, in a storage area provided in correspondence with the object, a field for storing a bit indicating the kind of lock and an identifier for a thread acquiring a lock according to a first kind of lock or an identifier relating to a second kind of lock which is a lock method of managing the thread making access to the object by wing a queue, in a situation where a plurality of threads exist, said method also comprising:

(c1) a step of enabling a first thread to enter a state under exclusive control performed by a mechanism enabling exclusive control of access to the object and, if a predetermined condition is satisfied, a waiting operation of the thread and a notification operation to the thread in the waiting state;

(c2) a step of determining whether or not the bit indicating the kind of lock on a certain object indicates the first kind of lock;

(c3) a step of setting a contention bit when the bit indicating the kind of lock on the certain object indicates the first kind of lock;

(c4) a step of determining, after said step (c3), whether or not the first thread can acquire the first kind of lock on the basis of the contents of the storage area provided in correspondence with the certain object; and

(c5) a step of storing the bit indicating the second kind of lock and the identifier relating to the second kind of lock in the storage area provided in correspondence with the certain kind of object when the first thread can acquire the first kind of lock,

wherein the first thread exits from the state under exclusive control after the completion of necessary processing on the certain object;

wherein address regions on the memory relating to a store instruction S corresponding to the setting of the contention bit in said step (c3) and a load instruction L corresponding to react of the contents of the storage area in said step (c4) are set so that the out-of-order execution stop mechanism is activated with respect to the store instruction S and the load instruction L; and

wherein said steps (c3) and (c4) are executed on the basis of setting of the address regions.

11. The method according to claim 10, wherein each CPU has a store buffer for writing data to the memory according to the store instruction of the program;

wherein the out-of-order execution stop mechanism has a store forwarding mechanism which stops out-of-order execution of the load instruction in the pair of instructions and notifies, as data B, of data A written to the store buffer according to the store instruction in the pair of instructions from the store buffer to the program if the address regions on the memory relating to the pair of instructions coincide with each other, and a store forwarding avoidance mechanism which stops out-of-order execution of the load instruction in the pair of instructions and performs, instead of store forwarding, storing of data A from the store buffer into a memory hierarchy and notifying of data B from the memory hierarchy to the program on the basis of the load instruction in the pair of instructions if the address regions on the memory relating to the pair of instructions do not coincide with each other but overlap each other; and

wherein the address regions on the memory relating to the store instruction S and the load instruction L are set to activate the store forwarding avoidance mechanism; and

wherein said steps (c3) and (c4) are executed on the basis of said setting of the address regions.

12. The method according to claim 10, wherein the store instruction S corresponding to the setting of the contention bit in said step (c3) is set with its address region selected as an address region A1 relating to the contention bit;

wherein the load instruction L corresponding to the read of the contents of the storage area in said step (c4) is set with its address region selected as an address region A2 containing the entire address region of the storage area and at least a portion of the address region A1; and

wherein said steps (c3) and (c4) are executed on the basis of said setting of the address regions.

13. An apparatus which performs memory access serialization in a computer having a plurality of CPUs and a memory shared by the plurality of CPUs, each CPU having an out-of-order execution stop mechanism, in which, if address regions on the memory relating to a pair of a store instruction and a load instruction in preceding and subsequent positions, respectively, in written order of a program executed by each of the CPUs have an overlap therebetween, the out-of-order execution stop mechanism stops out-of-order execution of the load instruction and enables data relating to the store instruction to be observed from the other CPUs before execution of the load instruction, said apparatus comprising:

wherein each CPU is adapted to detect the pair of the store instruction and the load instruction, which need to be serialized in a before-after relationship with respect to memory access, which are in the preceding and subsequent positions in written order on the program, and which are placed on the program by being set so that their address regions have an overlap therebetween and respectively include address regions for data which needs to be stored and loaded; and

wherein each of said CPUs is adapted to activate, with respect a the detected pair of instructions, the out-of-order execution stop mechanism of one of the CPUs executing the program relating to the detected pair of instructions.

14. The apparatus according to claim 13, further comprising a plurality of CPUs and a memory shared by the plurality of CPUs;

wherein each CPU has a store buffer for writing data to the memory according to the store instruction of the program;

wherein the out-of-order execution stop mechanism has a store forwarding mechanism which stops out-of-order execution of the load instruction in the pair of instructions and notifies, as data B, of data A written to the store buffer according to the store instruction in the pair of instructions from the store buffer to the program if the address regions on the memory relating to the pair of instructions coincide with each other, and a store forwarding avoidance mechanism which stops out-of-order execution of the load instruction in the pair of instructions and performs, instead of store forwarding, staring of data A from the store buffer into a memory hierarchy and notifying of data B from the memory hierarchy to the program on the basis of the load instruction in the pair of instructions if the address regions on the memory relating to the pair of instructions do not coincide with each other but overlap each other; and

wherein said CPUs activate, with respect to the detected pair of instructions, the store forwarding avoidance mechanism in the out-of-order execution stop mechanism of one or the CPUs executing the program relating to the detected pair of instructions.

15. The apparatus according to claim 13, further comprising:

wherein each of said CPUs is adapted to load data on the basis of the load instruction in the pair of instructions, the address region of the store instruction being set so as to store only data which essentially needs to be stored, the address region of the load instruction being set so as to load data which does not need to be loaded as well as data which needs to be loaded; and

wherein each of said CPUs is adapted to extract the data which needs to be loaded from the data loaded by the load instruction.

16. The apparatus according to claim 13, wherein each of said CPUs is adapted to perform first and second program portions relating to first and second processings, the first and second processings including a store processing portion, an observation processing portion for observing data that might have been stored by the other processing after execution of the store processing portion, and a branching processing portion for branching to a corresponding process according to a result of observation by the observation processing portion; and

wherein the pair of the store instruction and the load instruction are included in the store processing portion and the observation processing portion in the first and second processing.

17. The apparatus according to claim 16, wherein the first and second processings are processing relating to object lock release and processing relating to object acquisition, respectively, in an object management apparatus.

18. An apparatus which executes lock management with a computer having a plurality of CPUs and a memory shared by the plurality of CPUs, each CPU having an out-of-order execution stop mechanism, in which, if address regions on the memory relating to a pair of a store instruction and a load instruction in preceding and following positions, respectively, in written order of a program executed by each of the CPUs have an overlap therebetween, the out-of-order execution stop mechanism stops out-of-order execution of the load instruction and enables data relating to the store instruction to be observed from other CPUs before execution of the load instruction, said apparatus managing a look on an object by preparing, in a storage area provided in correspondence with the object, a field for storing a bit indicating the kind of lock and an identifier for a thread acquiring a lock according to the first kind of lock or an identifier relating to the second kind of lock, in a situation where a plurality of threads exist, said apparatus comprising:

an object acquisition device which enables a second thread to obtain a lock on a certain object held by a first thread, and a lock release device which enables the first thread to release the lock from the certain object, wherein each of said CPUs is adapted to:

(a1) determine whether or not the bit indicating the kind of lock on the certain object indicates the first kind of lock; and

(a2) set a contention bit when the first kind of lock is indicated, said lock release device including:

(b1) determine whether or not the bit indicating the kind of lock indicates the first kind of lock;

(b2) store in the storage area, information indicating that no thread holding the lock on the certain object exists;

(b3) determine whether or not the contention bit is set when the bit indicating the kind of lock indicates the first kind of lock; and

(b4) terminate processing if it is determined that the contention bit is not set,

wherein address regions on the memory relating to a store instruction S corresponding to storage performed in (b2) and a load instruction L corresponding to read of the bit performed in (b3) are set so that the out-of-order execution stop mechanism is activated with respect to the store instruction S and the load instruction L; and

wherein (b2) and (b3) perform processing on the basis of setting of the address regions.

19. The apparatus according to claim 18, wherein each CPU has a store buffer for writing data to the memory according to the store instruction of the program;

wherein the out-of-order execution stop mechanism has a store forwarding mechanism which stops out-of-order execution of the load instruction in the pair of instructions and notifies, as data B, of data A written to said store buffer according to the store instruction in the pair of instructions from said store buffer to the program if the address regions on the memory relating to the pair of instructions coincide with each other, and a store forwarding avoidance mechanism which stops out-of-order execution of the load instruction in the pair of instructions and performs, instead of store forwarding, storing of data A front the store buffer into a memory hierarchy and notifying of data B from the memory hierarchy to the program on the basis of the load instruction in the pair of instructions if the address regions on the memory relating to the pair of instructions do not coincide with each other but overlap each other and

wherein the address regions on the memory relating to the store instruction S and the load instruction L are set to activate said stare forwarding avoidance mechanism; and

wherein (b2) and (b3) perform processing on the basis of said setting of the address regions.

20. The apparatus according to claim 18, wherein the store instruction S corresponding to the storage performed by (b2) is set with its address region selected as an address region A1 relating to the storage region in which information indicating that no thread holding the lock on the certain object exists is stored;

wherein the load instruction L corresponding to the read of the bit performed by (b3) is set with its address region selected as an address region A2 containing the entire address region relating to the bit and at least a portion of the address region A1; and

wherein (b2) and (b3) perform processing on the basis of said setting of the address regions.

21. The apparatus according to claim 18, wherein each of said CPUs is adapted to:

(b5) enable the first thread to enter a state under exclusive control performed by a mechanism when it is determined that the contention bit is set, the mechanism enabling exclusive control of access to the object and, if a predetermined condition is satisfied, a waiting operation of the thread and a notification operation to the thread in the waiting state;

(b6) execute a notification operation to the thread in the waiting state; and

(b7) enable the first thread to exit from the stare under exclusive control.

22. An apparatus which performs lock management with a computer having a plurality of CPUs and a memory shared by the plurality of CPUs, each CPU having an out-of-order execution stop mechanism, in which, if address regions on the memory relating to a pair of a store instruction and a load instruction in preceding and following positions, respectively, in written order of a program executed by each of the CPUs have an overlap therebetween, the out-of-order execution stop mechanism stops out-of-order execution of the load instruction and enables data relating to the store instruction to be observed from other CPUs before execution of the load instruction, said apparatus managing a lock on an object by preparing, in a storage area provided in correspondence with the object, a field for storing a bit indicating the kind of lock and an identifier for a thread acquiring a lock according to the first kind of lock or an identifier relating to the second kind of lock which is a lock method of managing the thread making access to the object by using a queue, in a situation where a plurality of threads exist, wherein each of said CPUs is adapted to:

(c1) enable a first thread to enter a state under exclusive control performed by a mechanism enabling exclusive control of access to the object and, if a predetermined condition is satisfied, a waiting operation of the thread and a notification operation to the thread in the waiting state;

(c2) determine whether or not the bit indicating the kind of lock on a certain object indicates the first kind of lock;

(c3) set a contention bit when the bit indicating the kind of lock on the certain object indicates the first kind of lock;

(c4) determine after the operation (c3), whether or not the first thread can acquire the first kind of lock on the basis of the contents of the storage area provided in correspondence with the certain object; and

(c5) store the bit indicating the second kind of lock and the identifier relating to the second kind of lock in the storage area provided in correspondence with the certain kind of object when the first thread can acquire the first kind of lock, wherein the first thread exits from the state under exclusive control after the completion of necessary processing on the certain object;

wherein address regions on the memory relating to a store instruction S corresponding to the setting of the contention bit made by (c3) and a load instruction L corresponding to read of the contents of the storage area performed by (c4) are set so that the out-of-order execution stop mechanism is activated with respect to the store instruction S and the load instruction L; and

wherein (c3) and (c4) perform processing on the basis of setting of the address regions.

23. The apparatus according to claim 22, wherein each CPU has a store buffer for writing data to the memory according to the store instruction of the program;

wherein the out-of-order execution slop mechanism has a store forwarding mechanism which stops out-of-order execution of the load instruction in the pair of instructions and notifies, as data B, of data A written to said store buffer according to the store instruction in the pair of instructions from said store buffer to the program if the address regions on the memory relating to the pair of instructions coincide with each other, and a store forwarding avoidance mechanism which stops out-of-order execution of the load instruction in the pair of instructions and performs, instead of store forwarding, storing of data A from the store buffer into a memory hierarchy and notifying of data B from the memory hierarchy to the program on the basis of the load instruction in the pair of instructions if the address regions on the memory relating to the pair of instructions do not coincide with each other but overlap each other; and

wherein the address regions on the memory relating to the store instruction S and the load instruction L are set to activate the store forwarding avoidance mechanism; and

wherein (c3) and (c4) perform processing on the basis of said setting of the address regions.

24. The apparatus according to claim 22, wherein the store instruction S corresponding to the setting of the contention bit made by (c3) is set with its address region selected as an address region A1 relating to the contention bit;

wherein the load instruction L corresponding to the read of the contents of the storage area performed by (c4) is set with its address region selected as an address region A2 containing the entire address region of the storage area and at least a portion of the address region A1; and

wherein (c3) and (c4) perform processing on the basis of setting of the address regions.

25. A program storage device readable by machine, tangibly embodying a program of instructions executable by the machine to perform a method of memory access serialization in a computer having a plurality of CPUs and a memory shared by the plurality of CPUs, each CPU having an out-of-order execution stop mechanism, in which, if address regions on the memory relating to a pair of a store instruction and a load instruction in preceding and subsequent positions, respectively, in written order of a program executed by each of the CPUs have an overlap therebetween, the out-of-order execution stop mechanism stops out-of-order execution of the load instruction and enables data relating to the store instruction to be observed from the other CPUs before execution of the load instruction, said memory access serialization method including:

a detection step of enabling each CPU to detect the pair of the store instruction and the load instruction, which need to be serialized in a before-after relationship with respect to memory access, which are in the preceding and subsequent positions in written order on the program, and which are placed on the program by being set so that their address regions have an overlap therebetween and respectively include address regions for data which needs to be stored and loaded; and

an activation step of activating, with respect to the detected pair of instructions, the out-of-order execution stop mechanism of one of the CPUs executing the program relating to the detected pair of instructions.

26. A program storage device readable by machine, tangibly embodying a program of instructions executable by the machine to perform, method of lock management executed by a computer having a plurality of CPUs and a memory shared by the plurality of CPUs, each CPU having an out-of-order execution stop mechanism, in which, if address regions on the memory relating to a pair of a store instruction and a load instruction in preceding and following positions, respectively, in written order of a program executed by each of the CPUs have an overlap therebetween, the out-of-order execution stop mechanism stops out-of-order execution of the load instruction and enables data relating to the store instruction to be observed from other CPUs before execution of the load instruction, said lock management method including managing a lock on an object by preparing, in a storage area provided in correspondence with the object, a field for storing a bit indicating the kind of lock and an identifier for a thread acquiring a lock according to the first kind of lock or an identifier relating to the second kind of lock, in a situation where a plurality of threads exist, said lock management method also including an object acquisition method of enabling a second thread to obtain a lock on a certain object held by a first thread, and a lock release method of enabling the first thread to release the lock from the certain object, said object acquisition method including:

(a1) a step of determining whether or not the bit indicating the kind of lock on the certain object indicates the first kind of lock; and

(a2) a stop of setting a contention bit when the first kind of lock is indicated, said lock release method including:

(b1) a stop of determining whether or not the bit indicating the kind of lock indicates the first kind of lock;

(b2) a step of storing, in the storage area, information indicating that no thread holding the lock on the certain object exists;

(b3) a stop of determining whether or not the contention bit is set when the reading of the bit indicating the kind or lock indicates the first kind of lock; and

(b4) a step of terminating processing if it is determined that the contention bit is not set, address regions on the memory relating to a store instruction S corresponding to storage in said step (b2) and a load instruction L corresponding to reading of the bit in said step (b3) being set so that the out-of-order execution stop mechanism is activated with respect to the store instruction S and the load instruction L, said steps (b2) and (b3) being executed on the basis of said setting of the address regions.

27. A program storage device readable by machine, tangibly embodying a program of instructions executable by the machine to perform a method of lock management executed by a computer having a plurality of CPUs and a memory shared by the plurality of CPUs, each CPU having an out-of-order execution stop mechanism, in which, if address regions on the memory relating to a pair of a store instruction and a load instruction in preceding and following positions, respectively, in written order of a program executed by each of the CPUs have an overlap therebetween, the out-of-order execution stop mechanism stops out-of-order execution of the load instruction and enables data relating to the stare instruction to be observed from other CPUs before execution of the load instruction, said lock management method including managing a lock on an object by preparing, in a storage area provided in correspondence with the object, a field for storing a bit indicating the kind of lock and an identifier for a thread acquiring a lock according to a first kind of lock or an identifier relating to a second kind of lock which is a lock method of managing the thread making access to the object by using a queue, in a situation where a plurality of threads exist, said lock management method also including:

(c1) a step of enabling a first thread to enter a state under exclusive control performed by a mechanism enabling exclusive control of access to the object and, if a predetermined condition is satisfied, a waiting operation of the thread and a notification operation to the thread in the waiting state;

(c2) a step of determining whether or not the bit indicating the kind of lock on a certain object indicates the first kind of lock;

(c3) a step of setting a contention bit when the bit indicating the kind of lock on the certain object indicates the first kind of lock;

(c4) a step of determining, after said step (c3), whether or not the first thread can acquire the first kind or lock on the basis of the contents of the storage area provided in correspondence with the certain object; and

(c5) a step of storing the bit indicating the second kind of lock and the identifier relating to the second kind of lock in the storage area provided in correspondence with the certain kind of object when the first thread can acquire the first kind of lock, the first thread exiting from the state under exclusive control after the completion of necessary processing on the certain object, address regions on the memory relating to a store instruction S corresponding to the setting of the contention bit in said step (c3) and a load instruction L corresponding to read of the contents of the storage area in said step (c4) being set so that the out-of-order execution stop mechanism is activated with respect to the store instruction S and the load instruction L, and said steps (c3) and (c4) being executed on the basis of setting of the address regions.
Description



DETAILED DESCRIPTION OF THE INVENTION

1. Field of the Invention

The present invention relates to a method, an apparatus, a program and a recording medium for memory access serialization and lock management and, more particularly, to a method, an apparatus, a program and a recording medium for performing memory access serialization and lock management so as to ensure the desired reliability with which store instructions and load instructions relating to memory access are serialized no matter what the load instruction out-of-order execution function of the CPU.

2. Background Art

Software for servers uses multiple threads for processing requests from a plurality of clients. In general, such thread-parallel software enables parallel processing using multiple processors, i.e., simultaneous execution of a number of processings. In multithread processing, updating of the same data through a plurality of threads can occur frequently. In such a situation, a synchronization mechanism may be used for ensuring data consistency. In Java® (a trademark of Sun Microsystems, Inc.) that is multithread-capable on language level, two kinds of synchronization mechanisms are supported: the synchronized method and the synchronized block. Today, many software programs for servers such as middleware are written in Java®. Therefore the performance of the two synchronization mechanisms in Java® greatly influences the performance of servers.

The applicant of the present invention proposed the tasuki lock as an algorithm for an improved synchronization mechanism in an object management system (Published Unexamined Patent Application No. 2000-76086). The tasuki lock is characterized in that spin-wait, which is a weak point in a thin lock, is eliminated while the advantage of the thin lock, i.e., the advantage that only one atomic instruction suffices, is maintained. The tasuki lock is the best algorithm among those presently conceivable, and Java® virtual machines in which the tasuki lock is implemented are well known.

The tasuki lock in an object management system will be described along with other lock methods. To synchronize operations for access to an object in a program in which a plurality of threads are operated, codes of the programs are formed in such a manner that the object is locked before access, access is then made, and the object unlocked after the access. As a method of implementation of this object locking, a spin lock and a queue lock are well known. A combination of these methods (hereinafter referred to as a composite lock) has also been proposed.

(1) Spin Lock

A spin lock is a lock method in which the state of a lock is managed by storing identifiers for threads executing locking on an object in correspondence with the object. In the spin lock, when a thread T fails to acquire a lock on an object o, that is, if another thread S has already locked the object o, locking is repeatedly attempted until success is attained in locking. Typically, locking or unlocking is performed as described below by using an atomic machine instruction such as compare_and_swap.
TABLE 1
10 /* lock */
20 while (compare_and_swap(&o->lock, 0, thread_id ( ))= =0)
30  yield( );
40 /* access to o */
. . .
50 /* unlock*/
60 o->lock=0;


Referring to Table 1, locking is performed at the twentieth and thirtieth lines. Yield( ) is effected until a lock is acquired. Yield( ) comprises stopping the execution of the current thread and handing over control to the scheduler. Ordinarily, the scheduler selects one of the other executable threads and makes the selected thread run. Thereafter, the scheduler again makes the preceding thread run, and the execution of the while statement is repeated until success is attained in lock acquisition. In the case where "yield" exists, not only wasting of CPU resources but also a dependence of the implementation on the scheduling method in the platform cannot be avoided and, therefore, it is difficult to write a program so that the program operates as expected. The condition compare_and_swap in the while statement at the twentieth line is such that the content of the field o->lock prepared for the object o and 0 is compared and, if the result of comparison is true, the ID (thread_id( )) of the thread is written to the field. The state where 0 is stored in the field prepared for the object o indicates that there is no thread locking the object. Therefore, when unlocking is performed at the sixtieth line, 0 is stored in o->lock. This field is, for example, a filed of one word for, but may have any number of bits sufficient for storing the identifier of the thread.

(2) Queue Lock

A queue lock is a lock method in which threads executing access to an object are managed by using a queue. In the queue lock, when a thread T fails to lock an object o, it suspends by setting itself in a queue of the object o. An unlocking code includes a code for checking whether or not the queue is empty. If the queue is not empty, one thread is taken out from the queue and is resumed. Such a queue lock is implemented integrally with the scheduling mechanism of an operating system (OS) and is provided as an application programming interface (API) of the OS. Semaphore and Mutex are a typical example of a means for such a queue lock. In a queue lock, one word does not suffice as a space overhead. Ordinary, several ten kilobytes are required for a space overhead. It is also to be noted that some lock is acquired or released in a locking or unlocking function since the queue, which is a common resource, is operated.

(3) Composite Lock

A multithread-compatible program is written so that access to a common resource is protected with a lock, by considering execution by multiple threads. However, a case is conceivable in which a multithread-compatible library is used by a single-thread program. A case is also possible in which the occurrence of lock contention is substantially zero even at the time of execution by multiple threads. In fact, a report according to an execution record of Java® (a trademark of Sun Microsystems, Inc.) says that there have been substantially no instances of contention in access to objects in many applications.

Therefore "locking an unlocked object, obtaining access, and unlocking the object" is considered a path executed with increased frequency. This path is executed with markedly high efficiently in the case of a spin lock, but is executed with low efficiency both in terms of time and in terms of space in the case of a queue lock. On the other hand, a CPU resource is wastefully consumed in the case of a spin lock when contention occurs actually, although the frequency of contention is low. There is no such problem with a queue lock.

The basic idea of a composite lock resides in using a suitable combination of a lock using simple processing such as spin lock (referred to as "thin lock") and a lock using complicated processing such as a queue lock (referred to as "fat lock") to execute the above-described high-frequency path while maintaining the efficiency at the time of contention. More specifically, locking by a thin lock is first attempted, a transition to locking by a fat lock is made if contention occurs when locking by the thin lock is attempted, and the fat lock is thereafter used.

This composite lock has a field for locking in an object, as does a spin lock. The value of a "thread identifier" or a "fat lock identifier" and a Boolean value indicating which one of the thread identifier value and the fat lock identifier value is stored are stored in the field.

The procedure of locking is as described below.

1) Thin lock acquisition is attempted by an atomic instruction (e.g., compare_and_swap). If success is attained in acquiring the thin lock, access to the object is executed. In the case of failure, it can be found that transition to the fat lock has already been made, or that the thin lock is being maintained by some other thread.

2) If transition to the fat lock has already been made, the fat lock is acquired.

3) In a case where contention occurs in locking by the thin lock, the thin lock is acquired, transition to the fat lock is made and the fat lock is acquired (executed by an inflate function according to a description made below).

The composite lock is implemented in one of two ways according to whether or not yield is effected in "acquisition of thin lock" in the procedure 3, as described below in detail. It is assumed here that the field for locking is one word and, for further simplification, that "thread identifier" or "fat lock identifier" is always an even number other than 0; "thread identifier" is stored if the least significant bit in the locking field is 0; and "fat lock identifier" is stored if the least significant bit in the locking field is 1.

Example 1 of Composite Lock

A case of the composite lock in which yield is effected in "acquisition of thin lock". A locking function can be written in accordance with the above-described procedure, as shown below.

TABLE 2
10 :void lock(o) {
20 :   if (compare_and_swap(&o->lock, 0, thread_id( ))
30 :    return;
40 :   while (! (o->lock & FAT_LOCK)) {
50 :    yield( );
60 :    if (compare_and_swap (&o->lock, 0, thread_id( ))){
70 :     inflate(o);
80 :     return;
90 :    }
100:   }
110:    fat_lock(o->lock)
120:    return;
130: }
150: void unlock (o){
160:   If (o->lock= =thread_id( ))
170:    o->lock=0;
180:   else
190:     fat_unlock(o->lock);
200: }
220: void inflate(o){
230;   o->lock=alloc_fat_lock( ) | FAT_LOCK;
240:   fat_lock(o->lock);
250: }


In Table 2, pseudo codes at the lines 10 to 130 represent a locking function, pseudo codes at the lines 150 to 200 represent an unlocking function, and pseudo codes at the lines 220 to 250 represent an inflate function used in the locking function. In the locking function, locking by the thin lock is attempted at the line 20. If the lock is acquired, access to the object is executed. Unlocking with respect to a thread identifier input at the line 160 to the field for locking of the object is performed by inputting 0 to the field at the line 170. Thus, the high-frequency path can be executed at a high speed as in the case of a spin lock. On the other hand, if the lock cannot be acquired at the line 20, a determination is made at the line 40 as to whether a condition defined by a while statement, i.e., the result of bit-by-bit AND operation on the FAT_LOCK bit, which is the least significant bit of the locking field, and the bits in the locking field, is zero, that is, whether the FAT_LOCK bit is zero (more specifically, whether locking by the thin lock is designated). If this condition is satisfied, yield is effected until the thin lock is acquired at the line 60. When the thin lock is acquired, the inflate function from the line 220 is executed. In the inflate function, a fat lock identifier and the logical value 1 FAT_LOCK bit are input to the locking field o->lock (line 230). The fat lock is then acquired (line 240). If the FAT_LOCK bit is already 1 at the line 40, the fat lock is immediately acquired (line 110). Unlocking of the fat lock is performed at the line 190. Acquisition and unlocking of the fat lock are not particularly specified in the present invention and, therefore, will not be described.

It is to be noted here that only the thread maintaining the thin lock rewrites the locking field. The same can be said with respect to unlocking. Yield occurs only at the time of contention for the thin lock.

Example 2 of Composite Lock

Another example of the composite lock without yield in acquisition of the thin lock will be described. A wait is required in the case of contention in attempting locking by the thin lock. When the thin lock is released, it is necessary to notify the waiting thread of the release of the thin lock. For this wait and notification, a condition variable, a monitor or Semaphore is required. A description will be made below with respect to a case of using a monitor.

TABLE 3
10 :void lock (o) {
20 :   if (compare_and_swap (&o->lock, 0, thread_id( ))
30 :    return; 40 : monitor_enter (o);
50 :    while (! (o->lock, & FAT_LOCK)){
60 :     if (compare_and_swap(&o->lock, 0, thread_id( )){
70 :      inflate(o);
80 :      monitor_exit(o);
90 :      return;
100:     } else
110:       monitor_wait(o);
120:   }
130:   monitor_exit(o);
140:   fat_lock(o->lock);
150:   return;
160: }
180: void unlock (o) {
190:   if (o->lock = = thread_id( )) {
200:    o->lock=0;
210:    monitor_enter(o);
220:    monitor_notify(o);
230:    monitor_exit(o);
240:   } else
250:    fat_unlock(o->lock);
260: }
280: void inflate (o) {
290:   o->lock = alloc_fat_lock( ) | FAT_LOCK
300:   fat_lock(o->lock);
310:   monitor_notify_all(o);
320: }


The monitor is a sync mechanism devised by Hoare to enable exclusive control of access to an object (enter and exit), a waiting operation of a thread when a predetermined condition is satisfied (wait), an operation for notifying the waiting thread (notify and notify_all) (see Hoare, C. A. R. Monitors: An operating system structuring concept. Communications of ACM 17, 10 (October 1974), 549-557). One thread at most is permitted to enter the monitor. When a thread T is about to enter a monitor m, it is made to wait if some other thread S has already entered. The thread T waits at least before the thread S exits from the monitor m. Exclusive control is thus performed. The thread T having entered the monitor m can wait at the monitor m for a situation in which a certain condition is satisfied. More specifically, the thread T implicitly exits from the monitor m and suspends. It is to be noted here that after the thread T has implicitly exited from the monitor m, another thread can enter the monitor m. On the other hand, the thread S having its entry in the monitor m can give a notice to the monitor m of the entry after satisfying a certain condition. Then, for example, one thread U in threads waiting at the monitor m is waked up. The thread U resumes thereby and attempts to implicitly enter the monitor m. It is to be noted that since the thread A has its entry in the monitor m, the thread U is made to wait at least before the thread S exits from the monitor m. If there is no thread waiting a the monitor m, nothing happens. Instruction "notify_all" is the same as "notify" except for waking up all of waiting threads.

Referring to Table 3, the lines 10 to 160 represent a locking function, the lines 180 to 260 represent an unlocking function, and the lines 280 to 320 represent an inflate function. The difference from the composite lock example 1 with respect to the locking function resides in entering the monitor at the line 40, waiting without effecting yield in the case of contention for the thin lock (line 110), and exiting from the monitor when transition to the fat lock is made (line 80) and when transition to the fat lock is confirmed. It is to be noted here that an exit from the monitor is made at the line 130 and the fat lock is acquired at the line 140.

The difference from the composite lock example 1 with respect to the unlocking function resides in processing for entering the monitor at the lines 210 to 230, giving a notice to the monitor, and exiting from the monitor. This is because yield is not effected and waiting at the monitor is performed instead of yield. In the inflate function, notify_all is added. This is also because yield is not effected and waiting at the monitor is performed instead of yield. The line 290 represents an OR operation on the fat lock identifier obtained by alloc_fat_lock( ) and FAT_LOCK bit set to a logical value 1, and an operation for inputting the result of the OR operation.

Referring to Table 3, while yield is removed, an operation for notification (notify) is provided since there is a possibility of existence of a thread waiting at the time of unlocking, resulting in a reduction in the performance of the high-frequency path. Also, for spatial efficiency, the monitor or an additional function equivalent to the monitor is required. However, the monitor or the additional function become unnecessary after transition to the fat lock. In other words, it is necessary to separately prepare the monitor and the fat lock.

Example 3 of Composite Lock

This example of the composite lock, unlike Example 1, does not have the fat lock and the monitor separately provided. In this example, FAT_LOCK bit indicates transition to the fat lock and, after entry in the monitor, processing is performed by assuming that the fat lock has been acquired. See, for example, David F. Bacon, Ravi Konuru, Chet Murthy, and Mauricio Serrano. Thin


Free Web Sudoku Puzzles.
Solve with your browser.
      6 8   2    
          2 1 8  
  7     1     5  
1     9          
  5 3       7 2  
          1     4
  6     3     4  
  2 1 4          
    8   5 7      
What is it?



Add Your Site · Terms Of Service · Privacy Policy


DISCLAIMER
Linkgrinder is a free service that searches the Internet and indexes all files found so that you may search quickly and easily for shared files. These files are created and made available individually by users whose identity we are not aware of and who we have no control over. In essence we function like a search engine tool; these files ARE NOT STORED OR SERVED BY OUR NETWORK. We are not responsible for any materials obtained by using our service. We do not monitor any of the contents of these files. These files may contain viruses, illegal materials, materials inappropriate for minors, offensive files and the like. BY USING OUR SERVICE, YOU ASSUME FULL RESPONSIBILITY FOR DOWNLOADING THESE MATERIALS AND WILL INDEMNIFY US FOR ANY DAMAGES THAT MAY BE INCURRED.

For More Specific Information VIEW OUR TERMS OF SERVICE.

Thank you and Enjoy!