Method and apparatus for detecting a cache wrap condition, Patent Number 7,386,684, from the United States Patent and Trademark Office (PTO)

Title: Method and apparatus for detecting a cache wrap condition

Abstract: A method and apparatus for detecting a cache wrap condition in a computing environment having a processor and a cache. A cache wrap condition is detected when the entire contents of a cache have been replaced, relative to a particular starting state. A set-associative cache is considered to have wrapped when all of the sets within the cache have been replaced. The starting point for cache wrap detection is the state of the cache sets at the time of the previous cache wrap. The method and apparatus is preferably implemented in a snoop filter having filter mechanisms that rely upon detecting the cache wrap condition. These snoop filter mechanisms requiring this information are operatively coupled with cache wrap detection logic adapted to detect the cache wrap event, and perform an indication step to the snoop filter mechanisms. In the various embodiments, cache wrap detection logic is implemented using registers and comparators, loadable counters, or a scoreboard data structure.

Patent Number: 7,386,684 Issued on 06/10/2008 to Blumrich, et al.


Inventors: Blumrich; Matthias A. (Ridgefield, CT), Gara; Alan G. (Mount Kisco, NY), Giampapa; Mark E. (Irvington, NY), Ohmacht; Martin (Yorktown Heights, NY), Salapura; Valentina (Chappaqua, NY)
Assignee: International Business Machines Corporation (Armonk, NY)
Appl. No.: 11/093,132
Filed: March 29, 2005


Current U.S. Class: 711/146 ; 711/100; 711/118; 711/154
Field of Search: 711/128,133,135,144,154,159,57


References Cited [Referenced By]

U.S. Patent Documents
5572701 November 1996 Ishida et al.
5737748 April 1998 Shigeeda
5752261 May 1998 Cochcroft, Jr.
5829030 October 1998 Ishida et al.
5860153 January 1999 Matena et al.
5966729 October 1999 Phelps
6295582 September 2001 Spencer
6389517 May 2002 Moudgal et al.
6405287 June 2002 Lesartre
6490654 December 2002 Wickeraad et al.
6640286 October 2003 Kawamoto et al.
6704845 March 2004 Anderson et al.
7117312 October 2006 Cypher
2003/0065843 April 2003 Jones et al.
2003/0070016 April 2003 Jones et al.
2003/0084250 May 2003 Gaither et al.
2003/0135696 July 2003 Rankin et al.
2004/0003184 January 2004 Safranek et al.
2005/0144391 June 2005 Hassane

Other References

Moshovos, et al., "JETTY: Filtering Snoops for Reduced Energy Consumption in SMP Servers", The Proceedings of the 7th International Symposium on High-Performance Computer Architecture. cited by other.

Primary Examiner: Thai; Tuan V.
Attorney, Agent or Firm: Scully, Scott, Murphy & Presser, P.C. Morris, Esq.; Daniel P.

Claims



The invention claimed is:

1. A method for detecting when entire contents of a cache memory device associated with a processor device in a computing environment have been replaced relative to an identified starting state, said cache memory device comprising an N-way set associative cache, said cache wrap detection apparatus comprising: monitoring signals asserted by said processor device when performing processor cache updates, said signals including update indicator signals to indicate that a cache update is occurring, an update indicator signal for each cache update being associated with a particular cache line in a set i, where i ∈ {1, ..., N} within said N-way set-associative cache; for each set i, responding to said update indicator signals for detecting a cache wrap condition in said set i, and, asserting a set_wrap(i) signal when all lines within that set have been replaced; and, receiving each said asserted set_wrap(i) signal and generating a cache wrap detection signal when a cache wrap has occurred for all sets of said N-way set associative cache and all lines of the cache memory device have been replaced relative to said identified starting state; and wherein said monitored signals further comprise a way signal indicating a current cache line being replaced in a particular set i, said method further comprising: loading a register device with data indicating a way that must be updated to complete a set wrap in said set i; and, comparing said received way signals against said data and asserting a signal when a received way signal matches the loaded data content of the register device.

2. The method as claimed in claim 1, further comprising the step of: providing a device for responding to said output signal from said comparator and asserting said set_wrap(i) signal when said comparator indicates an exact match.

3. The method as claimed in claim 2, further comprising responding to said asserted set_wrap(i) signal for subsequently tracking said cache way updates of the wrapped set by storing data indicating a current way signal input thereto.

4. The method as claimed in claim 3, further comprising the step of: resetting said responding device to no longer assert said set_wrap(i) signal upon detection of a cache wrap condition.

5. The method as claimed in claim 1, wherein each said cache set is defined as comprising one or more partitionable subsets of contiguous cache ways, said method further comprising: detecting when a cache update falls within a partition that is being monitored for wrapping, said detecting including determining when a received way signal falls between an upper way and lower way partition specification.

6. The method as claimed in claim 5, wherein said cache update detecting step for a partition is performed for each set i of said cache.

7. The method as claimed in claim 1, further comprising the steps of: loading a counter means with data indicating the number of unique lines in a cache set i that must be updated to complete a cache wrap condition in said set i, said counter means responding to received update indicator signals for asserting a set_wrap(i) signal when all lines within a set have been replaced; and, receiving each said asserted set_wrap(i) signal and generating said cache wrap detection signal when a cache wrap has occurred for all sets of said cache.

8. The method as claimed in claim 7, wherein said counter means comprises a loadable count down counter, said counter counting down when each update indicator signal is received, and asserting a set_wrap(i) signal when said counter reaches zero.

9. The method as claimed in claim 8, further comprising the step of: reloading said loadable count down counter with data indicating the unique number of lines in a cache set i that must be updated to complete a cache wrap condition in response to said cache wrap detection signal.

10. The method as claimed in claim 7, wherein said monitored signals further comprise a way signal indicating a current cache line being replaced in a particular set i, and, each said cache set is defined as comprising one or more partitionable subsets of contiguous cache lines, said method further comprising: detecting when a cache update falls within a partition that is being monitored for wrapping, said detecting including determining when a received way signal falls between an upper way and lower way partition specification.

11. The method as claimed in claim 10, wherein said cache update detecting step for a partition is performed for each set i of said cache.

12. The method as claimed in claim 10, further comprising the step of: loading a counter means with data indicating a partition size of a cache set i.

13. The method as claimed in claim 1, further comprising the step of: providing a scoreboard data structure having bit locations adapted to track data indicating each cache way being replaced; setting said bit locations in said scoreboard data structure for each cache way update; and, determining when all scoreboard bits have been set, indicating that a cache wrap has occurred.

14. The method as claimed in claim 13, wherein said determining when all scoreboard bits have been set comprises: implementing a counter means for tracking a number of times that a scoreboard data structure bit was first set after a reset, said counter means counting a number of writes to said scoreboard data structure.

15. The method as claimed in claim 14, wherein said counter means comprises a loadable count down counter, said counter step including counting down when each update indicator signal is received, and asserting said cache wrap detection signal when said counter reaches zero.

16. The method as claimed in claim 15, further comprising the step of: resetting said loadable count down counter to the number of cache lines in the cache upon detection of a cache wrap condition.

17. The method as claimed in claim 16, wherein said monitored signals further comprise a way signal indicating a current cache line being replaced in a particular set i, and said cache is partitionable into one or more subsets of contiguous cache lines, said method further comprising: detecting when a cache update falls within a partition that is being monitored for wrapping by determining when a received way signal falls between an upper way and lower way partition specification.

18. The method as claimed in claim 1, wherein said computing environment is a multiprocessor system having plural processing units each having an associated cache memory operatively associated therewith, said method implemented in a snoop filter apparatus each associated with a single processing unit of said multiprocessor system, wherein each snoop filtering apparatus is operative for supporting cache coherency in said computing environment.

Description



CROSS-REFERENCE TO RELATED APPLICATIONS

This application relates to commonly-owned, co-pending U.S. patent application Ser. Nos. 11/093,130; 11/093,131; 11/093,154; 11/093,152; 11/093,127; and 11/093,160 all filed on even date herewith and incorporated by reference as if fully set forth herein.

BACKGROUND OF THE INVENTION

1. Field of the Invention

The present invention generally relates to computer systems having multiprocessor architectures and, more particularly, to a novel multi-processor computer system for processing memory access requests and the implementation of cache coherence in such multiprocessor systems.

2. Description of the Prior Art

To achieve high performance computing, multiple individual processors have been interconnected to form a multiprocessor computer system capable of parallel processing. Multiple processors can be placed on a single chip, or several chips, each containing one or several processors, can be interconnected into a multiprocessor computer system.

Processors in a multiprocessor computer system use private cache memories because of their short access time (a cache is local to a processor and provides fast access to data) and to reduce the number of memory requests to the main memory. However, managing caches in a multiprocessor system is complex. Multiple private caches introduce the multi-cache coherency problem (or stale data problem) due to multiple copies of main memory data that can concurrently exist in the multiprocessor system.

Small scale shared memory multiprocessing systems have processors (or groups thereof) interconnected by a single bus. However, with the increasing speed of processors, the feasible number of processors which can share the bus effectively decreases.

The protocols that maintain the coherence between multiple processors are called cache coherence protocols. Cache coherence protocols track any sharing of data blocks between the processors. Depending upon how data sharing is tracked, cache coherence protocols can be grouped into two classes: 1) Directory based and 2) Snooping.

In a directory based approach, the sharing status of a block of physical memory is kept in just one location called the coherency directory. Coherency directories are generally large blocks of memory which keep track of which processor in the multiprocessor computer system owns which lines of memory. Disadvantageously, coherency directories are typically large and slow. They can severely degrade overall system performance since they introduce additional latency for every memory access request by requiring that each access to the memory go through the common directory.

FIG. 1 illustrates a typical prior art multiprocessor system 10 using the coherence directory approach for cache coherency. The multiprocessor system 10 includes a number of processors 15a, . . . , 15d interconnected via a shared bus 24 to the main memory 20a, 20b via memory controllers 22a, 22b, respectively. Each processor 15a, . . . , 15d has its own private cache 17a, . . . , 17d, respectively, which is N-way set associative. Each request to the memory from a processor is placed on the processor bus 24 and directed to the coherency directory 26. Frequently, the coherency controller contains a module which tracks the location of cache lines held in particular subsystems to eliminate the need to broadcast unneeded snoop requests to all caching agents. This unit is frequently labeled "snoop controller" or "snoop filter". All memory access requests from the I/O subsystem 28 are also directed to the coherency controller 26. Instead of the main memory, a secondary cache connected to the main memory can be used. Processors can be grouped into processor clusters, where each cluster has its own cluster bus, which is then connected to the coherency controller 26. As each memory request goes through the coherence directory, additional cycles are added to each request for checking the status of the requested memory block.

In a snooping approach, no centralized state is kept, but rather each cache keeps the sharing status of each data block locally. The caches are usually on a shared memory bus, and all cache controllers snoop (monitor) the bus to determine whether they have a copy of the data block requested. A commonly used snooping method is the "write-invalidate" protocol. In this protocol, a processor ensures that it has exclusive access to data before it writes that data. On each write, all other copies of the data in all other caches are invalidated. If two or more processors attempt to write the same data simultaneously, only one of them wins the race, causing the other processors' copies to be invalidated.

To perform a write in a write-invalidate protocol based system, a processor acquires the shared bus, and broadcasts the address to be invalidated on the bus. All processors snoop on the bus, and check to see if the data is in their cache. If so, these data are invalidated. Thus, use of the shared bus enforces write serialization.
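The write-invalidate sequence described above can be sketched in software as follows. This is only an illustrative model, not part of the disclosure: the names `PrivateCache`, `SharedBus`, `snoop_invalidate`, `read` and `write` are hypothetical, and the write-through update of main memory is an assumption made for brevity.

```python
class PrivateCache:
    """Hypothetical model of one processor's private cache (address -> value)."""

    def __init__(self, name):
        self.name = name
        self.lines = {}

    def snoop_invalidate(self, address):
        """Drop the local copy of `address`; report whether one was held."""
        if address in self.lines:
            del self.lines[address]
            return True
        return False


class SharedBus:
    """Hypothetical shared bus; one write at a time, so writes are serialized."""

    def __init__(self):
        self.caches = []
        self.memory = {}

    def attach(self, cache):
        self.caches.append(cache)

    def read(self, reader, address):
        # On a miss, fill the reader's cache from main memory.
        if address not in reader.lines:
            reader.lines[address] = self.memory.get(address, 0)
        return reader.lines[address]

    def write(self, writer, address, value):
        # Broadcast the address; every other cache snoops and invalidates.
        invalidated = [c.name for c in self.caches
                       if c is not writer and c.snoop_invalidate(address)]
        writer.lines[address] = value
        self.memory[address] = value  # write-through, an assumption for brevity
        return invalidated


bus = SharedBus()
cache_a, cache_b, cache_c = PrivateCache("A"), PrivateCache("B"), PrivateCache("C")
for cache in (cache_a, cache_b, cache_c):
    bus.attach(cache)

bus.read(cache_a, 0x40)
bus.read(cache_b, 0x40)
invalidated = bus.write(cache_c, 0x40, 7)
print(invalidated)
```

In this model every write costs one snoop lookup in every other cache, whether or not that cache holds the line, which is the overhead that snoop filtering aims to reduce.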

Disadvantageously, every bus transaction in the snooping approach has to check the cache address tags, which could interfere with CPU cache accesses. In most recent architectures, this interference is typically reduced by duplicating the address tags, so that the CPU and the snooping requests may proceed in parallel. An alternative approach is to employ a multilevel cache with inclusion, so that every entry in the primary cache is duplicated in the lower level cache. Then, snoop activity is performed at the secondary level cache and does not interfere with the CPU activity.

FIG. 2 illustrates a typical prior art multiprocessor system 50 using the snooping approach for cache coherency. The multiprocessor system 50 contains a number of processors 52a, . . . , 52c interconnected via a shared bus 56 to the main memory 58. Each processor 52a, . . . , 52c has its own private cache 54a, . . . , 54c which is N-way set associative. Each write request to the memory from a processor is placed on the processor bus 56. All processors snoop on the bus, and check their caches to see if the address written to is also located in their caches. If so, the data corresponding to this address are invalidated. Several multiprocessor systems add a module locally to each processor to track if a cache line to be invalidated is held in the particular cache, thus effectively reducing the local snooping activity. This unit is frequently labeled "snoop filter". Instead of the main memory, a secondary cache connected to the main memory can be used.

With the increasing number of processors on a bus, snooping activity increases as well. Unnecessary snoop requests to a cache can degrade processor performance, and each snoop request accessing the cache directory consumes power. In addition, duplicating the cache directory for every processor to support snooping activity significantly increases the size of the chip. This is especially important for systems on a single chip with a limited power budget.

What now follows is a description of prior art references that address the various problems of conventional snooping approaches found in multiprocessor systems.

Particularly, U.S. Patent Application US2003/0135696A1 and U.S. Pat. No. 6,704,845B2 both describe replacement policy methods for replacing entries in the snoop filter for a coherence directory based approach including a snoop filter. The snoop filter contains information on cached memory blocks: where the cache line is cached and its status. U.S. Patent Application US2004/0003184A1 describes a snoop filter containing sub-snoop filters for recording even and odd address lines which record local cache lines accessed by remote nodes (the sub-filters use the same filtering approach). None of these disclosures teaches or suggests a system and method for locally reducing the number of snoop requests presented to each cache in a multiprocessor system. Nor do they teach or suggest coupling several snoop filters with various filtering methods, nor do they teach or suggest providing point-to-point interconnection of snooping information to caches.

U.S. Patent Applications US2003/0070016A1 and US2003/0065843A1 describe a multi-processor system with a central coherency directory containing a snoop filter. The snoop filter described in these applications reduces the number of cycles needed to process a snoop request, but does not reduce the number of snoop requests presented to a cache.

U.S. Pat. No. 5,966,729 describes a multi-processor system sharing a bus using a snooping approach for cache coherence and a snoop filter associated locally with each processor group. To reduce snooping activity, a list of remote processor groups "interested" and "not-interested" in a particular cache line is kept. Snoop requests are forwarded only to the processor groups marked as "interested", thus reducing the number of broadcasted snoop requests. It does not describe how to reduce the number of snoop requests to a local processor, but rather how to reduce the number of snoop requests sent to other processor groups marked as "not interested". This solution requires keeping a list with information on interested groups for each line in the cache for a processor group, which is comparable in size to duplicating the cache directories of each processor in the processor group, thus significantly increasing the size of the chip.

U.S. Pat. No. 6,389,517B1 describes a method for snooping cache coherence to allow for concurrent access on the cache from both the processor and the snoop accesses having two access queues. The embodiment disclosed is directed to a shared bus configuration. It does not describe a method for reducing the number of snoop requests presented to the cache.

U.S. Pat. No. 5,572,701 describes a bus-based snoop method for reducing the interference of a low speed bus to a high speed bus and processor. The snoop bus control unit buffers addresses and data from the low speed bus until the processor releases the high speed bus. Then it transfers data and invalidates the corresponding lines in the cache. This disclosure does not describe a multiprocessor system where all components communicate via a high-speed bus.

A. Moshovos, G. Memik, B. Falsafi and A. Choudhary, in a reference entitled "JETTY: filtering snoops for reduced energy consumption in SMP servers" ("Jetty"), describe several proposals for reducing snoop requests using hardware filters. The reference describes a multiprocessor system where snoop requests are distributed via a shared system bus. To reduce the number of snoop requests presented to a processor, one or several snoop filters of various kinds are used.

However, the system described in Jetty has significant limitations in performance, in the systems it supports (more specifically, the interconnect architectures), and in its lack of support for multiporting. More specifically, the approach described in Jetty is based on a shared system bus which establishes a common event ordering across the system. While such global time ordering is desirable to simplify the filter architecture, it limits the possible system configurations to those with a single shared bus. Shared bus systems are known to be limited in scalability due to contention for the single global resource. In addition, global buses tend to be slow, due to the high load of multiple components attached to them, and are inefficient to place in chip multiprocessors.

Thus, in a highly optimized high-bandwidth system, it is desirable to provide alternate system architectures, such as star or point-to-point implementations. These are advantageous because each link has only a single sender and a single receiver, which reduces the load, allows the use of high speed protocols, and simplifies floor planning in chip multiprocessors. Using point-to-point protocols also allows several transmissions to be in progress simultaneously, thereby increasing the data transfer parallelism and overall data throughput.

Other limitations of Jetty include the inability to perform snoop filtering on several requests simultaneously: in Jetty, simultaneous snoop requests from several processors have to be serialized by the system bus. Allowing the processing of several snoop requests concurrently would provide a significant increase in the number of requests which can be handled at any one time, and thus increase overall system performance.

Having set forth the limitations of the prior art, it is clear that what is required is a system incorporating snoop filters to increase overall performance and power efficiency without limiting the system design options, and more specifically, methods and apparatus to support snoop filtering in systems not requiring a common bus.

Furthermore, there is a need for a snoop filter architecture supporting systems using point-to-point connections to allow the implementation of high performance systems using snoop filtering.

There is a further need for the simultaneous operation of multiple snoop filter units to concurrently filter requests from multiple memory writers to increase system performance.

There is further a need to provide novel, high performance snoop filters which can be implemented in a pipelined fashion to enable high system clock speeds in systems utilizing such snoop filters.

There is an additional need for snoop filters with high filtering efficiency transcending the limitations of prior art.

SUMMARY OF THE INVENTION

It is therefore an object of the present invention to provide a simple method and apparatus for reducing the number of snoop requests presented to a single processor in cache coherent multiprocessor systems.

It is a further object of the present invention to provide a method and apparatus for detecting a cache wrap condition in a computing environment having a processor and a cache. A cache wrap condition is detected when the entire contents of a cache have been replaced, relative to a particular starting state. A set-associative cache is considered to have wrapped when all of the sets within the cache have been replaced.

Thus, according to a first aspect of the invention, there is provided a cache wrap detection apparatus for detecting when entire contents of a cache memory device associated with a processor device in a computing environment have been replaced relative to an identified starting state, said cache wrap detection apparatus comprising:

an interface for monitoring signals asserted by said processor device when performing processor cache updates, said signals including an update indicator signal for each cache update occurring; and,

a cache wrap detection logic means responsive to said update indicator signals for detecting a cache wrap condition, and, generating a cache wrap detection signal when all lines have been overwritten relative to said identified starting state.

According to this first aspect of the invention, it is possible to detect when entire contents of an N-way set associative cache memory device associated with a processor device in a computing environment have been replaced relative to an identified starting state.

The starting point for cache wrap detection is the state of the cache sets at the time of the previous cache wrap.
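The relationship between per-set wraps and the overall cache wrap, and the way each wrap becomes the starting state for the next, might be modeled as in the minimal sketch below. The class name `CacheWrapAggregator`, the method `assert_set_wrap`, and the zero-based set indices are assumptions introduced for illustration only.

```python
class CacheWrapAggregator:
    """Hypothetical model: combines per-set wrap indications into a cache wrap."""

    def __init__(self, num_sets):
        # One flag per set, modeling the set_wrap(i) signals.
        self.set_wrapped = [False] * num_sets

    def assert_set_wrap(self, set_index):
        """Record that set `set_index` has wrapped; return True on a cache wrap."""
        self.set_wrapped[set_index] = True
        if all(self.set_wrapped):
            # The state at this wrap is the starting state for the next one.
            self.set_wrapped = [False] * len(self.set_wrapped)
            return True
        return False


aggregator = CacheWrapAggregator(num_sets=3)
events = [aggregator.assert_set_wrap(i) for i in (0, 2, 0, 1)]
print(events)
```

The cache wrap is reported only on the last call, once every set has reported its own wrap, and the flags are cleared at that moment so detection restarts from the new state.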

Further to this first aspect of the invention, the monitored signals further comprise a way signal indicating a current cache line being overwritten in a particular set i. The cache wrap detection logic means comprises a register device loaded with data indicating a way that must be updated to complete a set wrap in the set i; and, comparator means for receiving the way signals and asserting a signal when a received way signal matches the loaded data content of the register device.
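A software model of the register and comparator variant for a single set might look like the sketch below. It assumes a round-robin replacement policy, which this passage does not state, so that updating the way held in the register implies every other way has already been replaced. The tracking of the most recent way after a set wrap follows claims 3 and 4; the names `RegisterSetWrapDetector`, `update` and `reset` are hypothetical.

```python
class RegisterSetWrapDetector:
    """Hypothetical model of the register/comparator logic for one set."""

    def __init__(self, num_ways, next_victim=0):
        # ASSUMPTION: round-robin replacement starting at `next_victim`, so the
        # last way to be replaced is the one just before it.
        self.register = (next_victim - 1) % num_ways
        self.set_wrap = False

    def update(self, way):
        """Process one cache update to `way`; return the set_wrap signal."""
        if self.set_wrap:
            # Set already wrapped: keep tracking the latest way (claim 3), so
            # the register is correct when detection restarts.
            self.register = way
        elif way == self.register:
            # Comparator match: every way in the set has been replaced.
            self.set_wrap = True
        return self.set_wrap

    def reset(self):
        """Called on a cache wrap: stop asserting set_wrap (claim 4)."""
        self.set_wrap = False


detector = RegisterSetWrapDetector(num_ways=4)
first_pass = [detector.update(w) for w in (0, 1, 2, 3)]
detector.update(0)
detector.update(1)            # still wrapped; register now holds way 1
detector.reset()              # overall cache wrap detected elsewhere
second_pass = [detector.update(w) for w in (2, 3, 0, 1)]
print(first_pass, second_pass)
```

The appeal of this variant is its cost: one small register and one comparator per set, at the price of depending on a predictable replacement order.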

Alternatively, the cache wrap detection logic means comprises a counter means loaded with data indicating the number of lines in a cache set(i) that must be updated to complete a set wrap in said set i; the counter means, e.g., a count-down counter, is responsive to received update indicator signals, and asserts a set_wrap(i) signal when all lines within a set have been overwritten.
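The count-down counter variant can be sketched in the same style. The model below assumes that each update indicator corresponds to a distinct line of the set (as it would under round-robin replacement); otherwise counting updates would overestimate the number of lines replaced. The names `CounterSetWrapDetector`, `update` and `reload` are hypothetical.

```python
class CounterSetWrapDetector:
    """Hypothetical model of a loadable count-down counter for one set."""

    def __init__(self, lines_per_set):
        self.reload_value = lines_per_set
        self.counter = lines_per_set   # lines still to be replaced
        self.set_wrap = False

    def update(self):
        """Process one update indicator; return the set_wrap signal."""
        # ASSUMPTION: every update replaces a line not yet replaced.
        if self.counter > 0:
            self.counter -= 1
        if self.counter == 0:
            self.set_wrap = True
        return self.set_wrap

    def reload(self):
        """Called on a cache wrap: reload the counter and clear set_wrap."""
        self.counter = self.reload_value
        self.set_wrap = False


counter_detector = CounterSetWrapDetector(lines_per_set=4)
signals = [counter_detector.update() for _ in range(5)]
counter_detector.reload()
after_reload = counter_detector.update()
print(signals, after_reload)
```

Unlike the register variant, this one does not need the way signal at all, only the update indicator.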

Alternatively, the cache wrap detection logic means comprises a scoreboard data structure having bit locations adapted to track data indicating each cache way being overwritten; a means for setting the bit locations in the scoreboard data structure for each cache way update; and, means for determining when all scoreboard bits have been set, indicating that a cache wrap has occurred.
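A scoreboard can be modeled as one bit per cache line, indexed by set and way. The sketch below also keeps a count of bits still clear, in the spirit of claim 14, so that the "all bits set" check does not need to scan the scoreboard. The names `ScoreboardWrapDetector` and `update` are hypothetical, as is the choice to clear the scoreboard immediately on a wrap.

```python
class ScoreboardWrapDetector:
    """Hypothetical model of scoreboard-based cache wrap detection."""

    def __init__(self, num_sets, num_ways):
        self.num_sets = num_sets
        self.num_ways = num_ways
        self._clear()

    def _clear(self):
        # One bit per (set, way); `remaining` counts the bits still clear.
        self.bits = [[False] * self.num_ways for _ in range(self.num_sets)]
        self.remaining = self.num_sets * self.num_ways

    def update(self, set_index, way):
        """Process one cache update; return True when the cache has wrapped."""
        if not self.bits[set_index][way]:
            # First replacement of this line since the last wrap.
            self.bits[set_index][way] = True
            self.remaining -= 1
        if self.remaining == 0:
            self._clear()   # ASSUMPTION: restart detection immediately
            return True
        return False


scoreboard = ScoreboardWrapDetector(num_sets=2, num_ways=2)
updates = [(0, 0), (0, 0), (0, 1), (1, 1), (1, 0), (1, 0)]
wraps = [scoreboard.update(s, w) for s, w in updates]
print(wraps)
```

Because each line is tracked individually, this variant makes no assumption about the replacement policy: repeated updates to the same line, such as the second `(0, 0)` above, are simply ignored.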

A second aspect of the invention is directed to a method for detecting when entire contents of a cache memory device associated with a processor device in a computing environment have been replaced relative to an identified starting state, the method comprising:

monitoring signals asserted by said processor device when performing processor cache updates, said signals including an update indicator signal to indicate that a cache update is occurring;

responding to said update indicator signals for detecting a cache wrap condition; and,

generating a cache wrap detection signal when all lines have been overwritten relative to said identified starting state.

Likewise, according to this second aspect of the invention, it is possible to detect when entire contents of an N-way set associative cache memory device associated with a processor device in a computing environment have been replaced relative to an identified starting state.

In accordance with the first and second aspects of the invention, each cache set is defined as comprising one or more partitionable subsets of contiguous cache lines. The cache wrap detection apparatus and method thus implement a partition detecting logic means that detects when a cache update falls within a partition being monitored for wrapping, by determining whether a received way signal falls between an upper way and a lower way partition specification.
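The partition check amounts to a bounds comparison on the way signal. A small sketch, assuming inclusive lower and upper bounds (the text does not say whether the bounds are inclusive) and a hypothetical function name:

```python
def in_monitored_partition(way: int, lower_way: int, upper_way: int) -> bool:
    """Return True when the updated way lies inside the monitored partition.

    Assumption: both partition bounds are inclusive.
    """
    return lower_way <= way <= upper_way


# An 8-way set whose monitored partition covers ways 2 through 5.
updates = [1, 2, 5, 6]
counted = [in_monitored_partition(way, lower_way=2, upper_way=5) for way in updates]
```

Only the updates for which the check succeeds would be passed on to the wrap detection logic for that partition.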

Preferably, the system and method of the invention is implemented in a snoop filter apparatus associated with a single processing unit of a multiprocessor system having plural processing units each having an associated cache memory operatively associated therewith. The cache wrap detection apparatus and implemented logic enables each snoop filtering apparatus to more efficiently support cache coherency in the multiprocessor system. In such an application, the snoop filter implements filter mechanisms that rely upon detecting the cache wrap condition. These snoop filter mechanisms requiring this information are operatively coupled with cache wrap detection logic adapted to detect the cache wrap event, and perform an indication step to the snoop filter mechanisms requiring the information.

BRIEF DESCRIPTION OF THE DRAWINGS

The objects, features and advantages of the present invention will become apparent to one skilled in the art, in view of the following detailed description taken in combination with the attached drawings, in which:

FIG. 1 depicts a base multiprocessor architecture with the coherence directory for cache coherency according to the prior art;

FIG. 2 depicts a base multiprocessor system using snooping approach for cache coherency according to the prior art;

FIG. 3 depicts a base multiprocessor system using snooping approach for cache coherency using a point-to-point connection described according to the present invention;

FIG. 4 illustrates an alternative embodiment base multiprocessor system using snooping approach for cache coherency using point-to-point connection where snoop filter is placed between the L2 cache and the main memory;

FIG. 5 depicts a high level schematic of a snoop filter block in accordance with a preferred embodiment of the invention;

FIG. 6 is a high level schematic of the snoop block containing multiple snoop filters according to the present invention;

FIG. 7 illustrates a high level schematic of a single snoop port filter according to the present invention;

FIGS. 8(a) and 8(b) depict high level schematics of two alternative embodiments of the snoop block according to the present invention;

FIG. 9 is a high level schematic of the snoop block including multiple port snoop filters according to a further embodiment of the present invention;

FIG. 10 depicts the control flow for the snoop filter implementing snoop cache for a single snoop source according to the present invention;

FIG. 11 depicts a control flow logic for adding a new entry to the port snoop cache in accordance with the present invention;

FIG. 12 depicts a control flow logic for removing an entry from the snoop cache in accordance with the present invention;

FIG. 13 depicts a block diagram of the snoop filter implementing stream registers in accordance with the present invention;

FIG. 14 depicts another embodiment of the snoop filter implementing stream registers filtering approach in accordance with the present invention;

FIG. 15 is a block diagram depicting the control flow for the snoop filter using paired stream registers and masks sets according to the invention;

FIG. 16 is a block diagram depicting the control flow for updating two stream register sets and the cache wrap detection logic for the replaced cache lines according to the invention;

FIG. 17 illustrates a block diagram of signature filters to provide additional filtering capability to stream registers;

FIG. 18 is a block diagram of a filtering mechanism using signature files in accordance with the present invention;

FIGS. 19(a) and 19(b) depict exemplary cache wrap detection logic circuitry (registers and comparator) for an N-way set-associative cache;

FIG. 20 depicts an exemplary cache wrap detection logic circuitry for an N-way set-associative cache according to a second embodiment of the invention that is based on a loadable counter; and,

FIG. 21 depicts an exemplary cache wrap detection logic circuitry for an N-way set-associative cache according to a third embodiment of the invention that is based on a scoreboard register.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS

Referring now to the drawings, and more particularly to FIG. 3, there is shown the overall base architecture of the multiprocessor system with the use of a snooping approach for cache coherency. In the preferred embodiment, the multiprocessor system is composed of N processors 100a, . . . , 100n (or CPUs labeled DCU_1 to DCU_N) with their local L1 data and instruction caches, and their associated L2 caches 120a, . . . , 120n. The main memory 130 is shared and can be implemented on-chip or off-chip. In an alternative embodiment, instead of main memory, a shared L3 cache with access to main memory can be used. In the preferred embodiment, the processor cores 100a, . . . , 100n are PowerPC cores such as PPC440 or PPC405, but any other processor core can be used, or some combination of various processors in a single multiprocessor system can be used without departing from the scope of this invention. The processor cores 100a, . . . , 100n are interconnected by a system local bus 150.

To reduce the number of snoop requests presented to a processor, and thus to reduce the impact of snooping on processor and system performance, and to reduce power consumed by unnecessary snoop requests, a snoop filter 140a, . . . , 140n is provided for each respective processor core 100a, . . . , 100n in the multiprocessor system 10. For transferring snooping requests, the preferred embodiment does not use the system bus 150, as typically found in prior art systems, but rather implements a point-to-point interconnection 160 whereby each processor's associated snoop filter is directly connected with each snoop filter associated with every other processor in the system. Thus, snoop requests are decoupled from all other memory requests transferred via the system local bus, reducing the congestion of the bus which is often a system bottleneck. All snoop requests to a single processor are forwarded to the snoop filter 140a, . . . , 140n, which comprises several sub-filters with the same filtering method, or with several different filtering methods, or any combination of the two, as will be described in greater detail herein. The snoop filter processes each snoop request, and presents only a fraction of all requests which are possibly in the processor's cache to the processor.

For each processor, snoop requests are connected directly to all other processors' snoop filters using a point-to-point interconnection 160. Thus, several snoop requests (resulting from write and invalidate attempts) from different processors can occur simultaneously. These requests are no longer serialized, as in the typical snooping approach using the system bus, where this serialization is performed by the bus. That is, multiple snoop requests can be processed in the snoop filter concurrently, as will be described herein in further detail. As a processor has only one snoop port, the snoop requests not filtered out by a snoop filter will be serialized in a queue to be presented to the processor. However, the number of requests passed to the processor is much less than the pre-filtered number of all snoop requests, reducing the impact of cache coherence implementation on system performance.

To prevent overflow of the queues contained in the snoop filter block, a token-based flow control system is implemented for each point-to-point link to limit the number of simultaneously outstanding requests. According to the token-based flow control, each memory writer can send the next write request--which also initiates snoop requests to all other processor units and their accompanying snoop filter blocks--only if it has tokens available for all ports of the snoop filter blocks to which it has a direct point-to-point connection. If no token is available from at least one of the remote ports to which it is connected, no snoop requests can be sent out from this memory writer until at least one token from said snoop filter port becomes available again.
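The token rule can be illustrated with a short software model. This is a sketch under stated assumptions, not the hardware mechanism: the class name, the port labels, and the choice of one token per port are hypothetical.

```python
class TokenFlowControl:
    """Tracks the tokens one memory writer holds for each remote snoop filter port."""

    def __init__(self, remote_ports, tokens_per_port: int):
        self.tokens = {port: tokens_per_port for port in remote_ports}

    def try_send_write(self) -> bool:
        """Send a write (and its snoop requests) only if every port has a token."""
        if any(count == 0 for count in self.tokens.values()):
            return False
        for port in self.tokens:
            self.tokens[port] -= 1   # one token consumed per remote port
        return True

    def return_token(self, port) -> None:
        """A remote snoop filter port hands a token back to this writer."""
        self.tokens[port] += 1


writer = TokenFlowControl(["filter_b", "filter_c"], tokens_per_port=1)
first = writer.try_send_write()    # tokens available on both ports
second = writer.try_send_write()   # blocked: no tokens left
writer.return_token("filter_b")
third = writer.try_send_write()    # still blocked: filter_c has no token
writer.return_token("filter_c")
fourth = writer.try_send_write()   # tokens available on both ports again
```

A single port without a token is enough to stall the writer, which is what bounds the number of outstanding requests per link.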

FIG. 4 illustrates an alternative embodiment of this invention, with a base multiprocessor system using a snooping approach for cache coherency with point-to-point interconnection for snooping requests, wherein the snoop filter is placed between the L2 cache and the main memory 230. The multiprocessor system according to this embodiment thus comprises N processors 200a, . . . , 200n (or CPUs labeled DCU_1 to DCU_N) with their local L1 data and instruction caches, and their associated L2 caches 220a, . . . , 220n. The main memory 230 is shared and can be implemented on-chip or off-chip. In an alternative embodiment, instead of main memory, a shared L3 cache with access to main memory can be used. All memory access requests from processors 200a, . . . , 200n are transferred via a system local bus 250. In the embodiment depicted in FIG. 4, each of the processors in the multiprocessor system is paired with a respective snoop filter 240a, . . . , 240n. The point-to-point interconnection 260 is used to transfer snoop requests in the preferred embodiment in order to reduce the congestion of the system bus. In this point-to-point connection scheme 260, each processor's associated snoop filter is directly connected with each snoop filter associated with every other processor in the system. All snoop requests to a single processor are forwarded to its snoop filter, which processes each snoop request, and forwards only an appropriate fraction of all requests to the processor. In this embodiment, the snoop requests are filtered at the L2 cache level (not at L1, as in the previous embodiment illustrated in FIG. 3), but the present invention is applicable to any cache level, and can be used for other levels of the cache hierarchy without departing from the scope of the invention.

Referring now to FIG. 5, there is depicted a high level block diagram of the snoop filter device according to the present invention. Snoop requests from all other processors 1 to N in a multiprocessor system are forwarded to the snoop block 310 via dedicated point-to-point interconnection inputs 300a, . . . , 300n. The snoop block 310 filters the incoming snoops and forwards the appropriate subset to the processor 320 via the processor snoop interface 340. In addition, the snoop block 310 monitors all memory access requests from the processor and L1 data cache block 320 to the L2 cache 330. These are only the requests that miss in the L1 cache. The snoop block monitors all read address and control signals 360 and 362 to update its filters accordingly.

FIG. 6 depicts a high level schematic of the snoop block 310 depicted in FIG. 5. As shown in FIG. 6, the snoop block 310 includes multiple ("N") port snoop filters 400a, . . . , 400n that operate in parallel, with each dedicated only to one source of N memory writers (processors or a DMA engine sub-system, etc.). Each of the port snoop filters 400a, . . . , 400n receives on its dedicated input 410a, . . . , 410n snoop requests from a single source which is directly connected point-to-point. As will be described herein, a single port snoop filter may include a number of various snoop filter methods. The snoop block 310 additionally includes a stream register block 430 and snoop token control block 426. In addition, each port snoop filter 400a, . . . , 400n monitors all memory read access requests 412 from its associated processor which miss in the processor's L1 level cache. This information is also provided to the stream register block 430 for use as will be described in greater detail herein.

In operation, the port snoop filters 400a, . . . , 400n process the incoming snoop requests and forward a subset of all snoop requests to a respective snoop queue 420a, . . . , 420n having one queue associated with each snoop port. A queue arbitration block 422 is provided that arbitrates between all the snoop queues 420 and serializes all snoop requests from the snoop queues 420 fairly. Logic is provided to detect a snoop queue overflow condition, and the status of each queue is an input to a snoop token control unit 426 that controls flow of snoop requests from the remote memory writers. A memory writer--being a processor or a DMA engine--can submit a write to the memory and a snoop request to all snoop filters only if it has a token available from all snoop filters. The only snoop filter from which a processor does not need a token available to submit a write is its own local snoop filter. This mechanism ensures that the snoop queues do not overflow. From the snoop queue selected by arbiter 422, snoop requests are forwarded to the processor via a processor snoop interface 408.

FIG. 7 illustrates a high level schematic of a single snoop port filter 400. The snoop port filter block 400 includes multiple filter units which implement various filtering algorithms. In the preferred embodiment, three snoop filter blocks 440, 444, and 448 operate in parallel, each implementing a different snoop filter algorithm. The snoop filter blocks are labeled snoop cache 440, stream register check unit 444, and range filter 448. In one embodiment, each of the parallel snoop filter blocks receives on its input an identical snoop request 410 from a single source simultaneously. In addition, the snoop cache 440 monitors all memory read access requests 412 from the processor which miss in the L1 level cache, and stream registers check unit 444 receives status input 432 from the stream register unit 430 depicted in FIG. 6.

According to the preferred embodiment, the snoop cache block 440 filters the snoop requests 410 using an algorithm which is based on the temporal locality property of snoop requests, meaning that if a single snoop request for a particular location was made, it is probable that another request to the same location will be made soon. The snoop cache monitors every load made to the local cache, and updates its status, if needed. The stream register check block 444 filters snoop requests 410 using an algorithm that determines a superset of the current local cache content. The approximation of cache content is included in the stream registers block 430 (FIG. 6), and the stream register status 432 is forwarded to each snoop port filter 400. Based on this status, for each new snoop request 410, a decision is made as to whether the snoop address can possibly be contained in the local cache. The third filtering unit in the snoop port filter is the range filter 448. For this filtering approach, two range addresses are specified, the minimum range address and the maximum range address. The filtering of a snoop request is performed by first determining if the snoop request is within the address range determined by these two range addresses. If this condition is met, the snoop request is discarded; otherwise, the snoop request is forwarded to the decision logic block 450. Conversely, the request can be forwarded when it falls within the address range and discarded otherwise, without departing from the scope of the invention. Particularly, the decision logic block 450 receives results 456 of all three filter units 440, 444 and 448 together with the control signals 454 which enable or disable each individual snoop filter unit. Only results of snoop filter units for which the corresponding control signals are enabled are considered in each filtering decision. If any one of the filtering units 440, 444 or 448 decides that a snoop request 410 should be discarded, the snoop request is discarded. The resulting output of this unit is either to add the snoop request to the corresponding snoop queue 452, or to discard the snoop request and return a snoop token 458 to the remote processor or DMA unit that initiated the discarded snoop request.
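The decision rule described above (discard when any enabled filter votes to discard) can be sketched as follows. The function names, the dictionary-based interface, the example addresses, and the inclusive range bounds are assumptions made for illustration; the range filter shown uses the first of the two polarities described.

```python
def range_filter_discards(addr: int, min_addr: int, max_addr: int) -> bool:
    """Range filter vote: discard when the address lies within the range.

    Assumption: both range bounds are inclusive.
    """
    return min_addr <= addr <= max_addr


def decide(discard_votes: dict, enabled: dict) -> str:
    """Combine filter results; only enabled filters are considered."""
    for name, votes_discard in discard_votes.items():
        if enabled.get(name, False) and votes_discard:
            return "discard"     # the snoop token would be returned to the sender
    return "enqueue"             # the request would be added to the snoop queue


votes = {
    "snoop_cache": False,
    "stream_registers": False,
    "range_filter": range_filter_discards(0x1000, min_addr=0x0800, max_addr=0x1FFF),
}
all_enabled = {"snoop_cache": True, "stream_registers": True, "range_filter": True}
range_disabled = dict(all_enabled, range_filter=False)

with_range = decide(votes, all_enabled)
without_range = decide(votes, range_disabled)
```

With the range filter enabled the request is discarded; with it disabled, its vote is ignored and the request is queued.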

In the preferred embodiment, only the three filtering units implementing the algorithms above described are included in a port snoop filter, but one skilled in the art will appreciate that any other number of snoop filter units can be included in a single port snoop filter, or that some other snoop filter algorithm may be implemented in the port snoop filter, or a combination of snoop algorithms can be implemented, without departing from the scope of the invention.

FIGS. 8(a) and 8(b) depict high level schematics of two alternative embodiments of the snoop filter block 310 of FIG. 6. As described herein with respect to FIG. 6, the snoop block may include multiple snoop filters that can use various filtering approaches, the same filtering approach, or a combination of the two. As shown in FIG. 8(a), N port snoop filters 460a, . . . , 460n operate in parallel, one for each of N remote memory writers. Each of the port snoop filters 460a, . . . , 460n receives on its respective input 462a, . . . , 462n snoop requests from a single dedicated source which is connected point-to-point. In addition, each snoop filter 460a, . . . , 460n monitors all of the local processor's memory load requests 464 which have missed in the L1 level cache. Other signals from other units of the snoop block may also need to be supplied to the port snoop filters, if required by the filter algorithm implemented. The exact signals needed are determined by the one or more snoop filter algorithms implemented in a single port snoop filter 460. Additionally, it should be understood that not all port snoop filters have to implement the same set of filtering algorithms.

The port snoop filters 460a, . . . , 460n filter the incoming snoops and forward the appropriate unfiltered subset of snoop requests into the respective queues 466a, . . . , 466n and the queue arbitration block 468. Here, the snoop requests are serialized and presented to a next snoop filter 470, which handles inputs from all remote memory writers. This shared snoop filter 470 processes all snoop requests presented and forwards a subset of all requests to the snoop queue 472. From the snoop queue 472, snoop requests are forwarded to the processor via the processor snoop interface 474. It should be understood that it is possible to have multiple or no shared snoop filters 470 instead of the configuration shown in FIG. 8(a). In the case of multiple shared filters, the filters may be arranged in parallel or in series (in which case the output of one filter is the input to the next, for example). If a filter has inputs from more than one source (i.e., is shared between multiple sources), it has to have its own input queue and an arbiter to serialize snoop requests. A final ordered subset of all snoop requests is placed in the snoop queue 472, and snoop requests are forwarded to the processor via the processor snoop interface 474. Optionally, a snoop queue full indication signal 476 is provided that indicates when the snoop queue is full in order to stop some or all remote memory writers from issuing further snoop requests until the number of snoops in the snoop queue falls below a predetermined level.

Similarly, FIG. 8(b) illustrates another embodiment with an alternative organization of the snoop filters in the snoop block 310. N port snoop filters 480a, . . . , 480n, each receiving only snoop requests from one of N remote memory writers (i.e., excluding the processor where the snoop filter is attached), operate in parallel. Each port snoop filter 480a, . . . , 480n receives on its respective input snoop requests 482a, . . . , 482n from only a single source, respectively. A shared snoop filter 484 is connected in parallel with the port snoop filter devices 480a, . . . , 480n. In an alternative embodiment, more than one shared snoop filter can be attached in parallel. The shared snoop filter 484 handles inputs from all N remote memory writers. Having more than one input, the shared filter 484 has its own input queues 486 and a queue arbiter 488 for serializing snoop requests. Further in the embodiment depicted in FIG. 8(b), all port snoop filters 480a, . . . , 480n and the shared snoop filter 484 monitor all memory read access requests 490 from the local processor which miss in the L1 level cache. The snoop filters 480a, . . . , 480n and 484 filter the incoming snoop requests and forward the appropriate unfiltered subset to the input queue of the next shared snoop filter 492a, . . . , 492n. Here, the unfiltered snoop requests are serialized by the queue arbiter 494, and are forwarded to the processor via the processor snoop interface 496. If one of the snoop queue devices 492a, . . . , 492n or 486 is full, a snoop queue full indication 498 is activated to stop all (or some of) the remote memory writers from issuing further snoop requests until the number of snoops in the snoop queue falls below a predetermined level.

Referring now to FIG. 9, there is depicted a further embodiment of the snoop filter block 310. The block contains N port snoop filters 500a, . . . , 500n, corresponding to port snoop filters 400, 460a, . . . , 460n, and 480a, . . . , 480n (of FIGS. 8(a) and 8(b)). Each port snoop filter 500a, . . . , 500n includes a snoop cache device 502a, . . . , 502n, and a snoop check logic 504a, . . . , 504n. The snoop cache devices 502a, . . . , 502n implement a snoop filtering algorithm which keeps track of recent snoop requests from one source, where the source of snoop requests can be another processor, a DMA engine, or some other unit. For each new snoop request from a single source, the snoop request's address is checked against the snoop cache in the snoop check logic block 504. If the result of this comparison matches, i.e., the snoop request is found in the snoop cache, the snooped data is guaranteed not to be in the local L1 level cache of the processor. Thus, no snoop request is forwarded to the snoop queue 506 and the snoop queue arbiter 508. If no match is found in the snoop cache 502a, . . . , 502n for the current snoop request, the address of the snoop request is added to the snoop cache using the signals 514a, . . . , 514n. Concurrently, the snoop request is forwarded to the snoop queue 506.

All snoop cache devices 502a, . . . , 502n also receive read addresses and requests 512 from the local processor, and compare the memory read access addresses to the entries in the snoop cache 502a, . . . , 502n. If a request matches one of the entries in the snoop cache, this entry is removed from the snoop cache, as now the cache line is going to be located in the processor's first level cache. In the preferred embodiment, multiple snoop caches operating in parallel are used, each keeping track of snoop requests from a single remote memory writer. After filtering, a fraction of unfiltered snoop requests can be forwarded to the next port snoop filter, or they can be queued for one or more shared snoop filters, or they are placed in the snoop queue of the processor interface, depending on the embodiment.

It is understood that a single snoop cache device 502 includes an internal organization of M cache lines (entries), each entry having two fields: an address tag field, and a valid line vector. The address tag field of the snoop cache is typically not the same as the address tag of the L1 cache for the local processor, but it is shorter by the number of bits represented in the valid line vector. Particularly, the valid line vector encodes a group of several consecutive cache lines, all sharing the same upper bits represented by the corresponding address tag field. Thus, the n least significant bits from an address are used for encoding 2^n consecutive L1 cache lines. In the extreme case when n is zero, the whole entry in the snoop cache represents only one L1 cache line. In this case, the valid line vector has only one bit corresponding to a "valid" bit.

The size of the address tag field in the snoop cache is determined by the size of the L1 cache line and the number of bits used for encoding the valid line vector. In an example embodiment, for an address length of 32 bits (31:0), an L1 cache line being 32 bytes long, and a valid line vector of 32 bits, address bits (31:10) are used as the address tag field, (bit 31 being the most significant), address bits (9:5) are encoded in the valid line vector, and address bits (4:0) are ignored because they encode the cache line byte offset. As an illustration, three snoop caches for three different memory writers (N=3) are listed below, each snoop cache having M=4 entries, with address tag field to the left, and with 5 bits from the address used to encode the valid line vector to track 32 consecutive cache lines:

Snoop requests source 1
Entry 1: 01c019e 00000000000000000001000000000000
Entry 2: 01c01a0 00000000000000000000000100000000
Entry 3: 01c01a2 00000000000000000000000000010000
Entry 4: 01407ff 00000000000000000000000110000000

Snoop requests source 2
Entry 1: 01c01e3 00010000000000000000000000000000
Entry 2: 01c01e5 00000001000000000000000000000000
Entry 3: 01c01e7 00000000000100000000000000000000
Entry 4: 0140bff 00000000000000000000000110000000

Snoop requests source 3
Entry 1: 01c0227 00000000000000000001000000000000
Entry 2: 01c0229 00000000000000000000000100000000
Entry 3: 01c022b 00000000000000000000000000010000
Entry 4: 0140fff 00000000000000000000000110000000

In this example, entry 1 of the source 1 snoop cache has recorded the cache line address 38033cc hexadecimal (with the least significant 5 bits of the address, which hold the byte offset, removed). For this cache line address, the address tag is 01c019e. This cache line has been invalidated recently and cannot possibly be in the L1 cache. Therefore, the next snoop request to the same cache line will be filtered out (discarded). Similarly, entry 4 of the source 1 snoop cache will cause snoop requests for cache line addresses 280ffe7 (address tag 01407ff) and 280ffe8 to be filtered out.
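The address arithmetic of this example can be checked with a short script. It uses the field widths of the example embodiment (5 byte-offset bits, 5 valid-line-vector index bits). Treating the rightmost character of the printed vector as bit 0 is inferred from the addresses quoted above rather than stated in the text, and the function names are illustrative.

```python
LINE_OFFSET_BITS = 5     # address bits (4:0): byte offset within a 32-byte line
VECTOR_INDEX_BITS = 5    # address bits (9:5): position in the valid line vector


def split_address(byte_addr: int):
    """Split a 32-bit byte address into (address tag, valid line vector index)."""
    line_addr = byte_addr >> LINE_OFFSET_BITS
    tag = line_addr >> VECTOR_INDEX_BITS
    index = line_addr & ((1 << VECTOR_INDEX_BITS) - 1)
    return tag, index


def lines_in_entry(tag: int, vector_bits: str):
    """List the cache line addresses recorded by one snoop cache entry.

    Assumption: the rightmost character of vector_bits is bit 0.
    """
    vector = int(vector_bits, 2)
    base = tag << VECTOR_INDEX_BITS
    return [base | i for i in range(1 << VECTOR_INDEX_BITS) if (vector >> i) & 1]


entry1 = lines_in_entry(0x01C019E, "00000000000000000001000000000000")
entry4 = lines_in_entry(0x01407FF, "00000000000000000000000110000000")
tag, index = split_address(0x38033CC << LINE_OFFSET_BITS)

print([hex(a) for a in entry1])   # ['0x38033cc']
print([hex(a) for a in entry4])   # ['0x280ffe7', '0x280ffe8']
print(hex(tag), index)            # 0x1c019e 12
```

The printed line addresses agree with those given for entries 1 and 4 of the source 1 snoop cache.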

Referring now to FIG. 10, the control flow for the snoop filter implementing a snoop cache device for a single snoop source is shown. At the start of operation, all M lines in the snoop cache are reset as indicated at step 600. When a new snoop request from a snoop source i is received, the address of the snoop request is parsed into the "address tag" field 526 and into bits used for accessing the valid line vector 524. The valid line vector has one bit for each L1 cache line whose address bits match the address tag field; for a snoop request, only the bit for the requested line is set. This is performed in the step 602. In the step 604, the "tag" field of the snoop request is checked against all tag fields in the snoop cache associated with the snoop source i. If the snoop request address tag is the same as one of the address tags stored in the snoop cache, the address tag field has hit in the snoop cache. After this, the valid line vector of the snoop cache entry for which a hit was detected is compared to the valid line vector of the snoop request. If the bit of the valid line vector in the snoop cache line corresponding to the bit set in the valid line vector of the snoop request is set, the valid line vector has hit as well. In one preferred embodiment, the valid line vector check is implemented by performing a logical operation upon the bit operands. Thus, for example, the valid line vector check may be performed by AND-ing the valid line vector of the snoop request with the valid line vector of the snoop cache line, and checking whether the result is nonzero, a nonzero result indicating a hit. It is understood that other implementations may additionally be used without departing from the scope of this invention. It is further understood that checking for a valid line vector hit can be implemented in parallel with checking for an address tag hit.

At step 606, a determination is made as to whether both the "tag" field matches and the corresponding bit in the valid line vector is set. If so, the snooped cache line is guaranteed not to be in the cache. Thus, this snoop request is not forwarded to the cache; it is filtered out as indicated at step 608.

Otherwise, if the address "tag" field hits in the snoop cache but the bit in the valid line vector is not set or, alternately, if the tag does not hit in the snoop cache, this indicates that the line may be in the cache. Consequently, the snoop request is forwarded to the cache by placing it into a snoop queue as indicated at step 612. This snoop request is also added as a new entry to the snoop cache as shown at step 610.
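The filtering decision of FIG. 10 can be sketched in a few lines. The model below is a simplification under stated assumptions: the snoop cache is a plain dictionary from address tag to valid line vector with no capacity limit, so entry replacement is not modeled, and the function name and return strings are hypothetical.

```python
def filter_snoop(snoop_cache: dict, tag: int, index: int) -> str:
    """Apply the snoop cache check to one snoop request (steps 602 to 612)."""
    request_vector = 1 << index              # single bit set for the requested line
    entry_vector = snoop_cache.get(tag)      # None means the address tag missed

    # Steps 604/606: tag hit and the AND of the two vectors is nonzero.
    if entry_vector is not None and (entry_vector & request_vector):
        return "filtered"                    # step 608: the line cannot be in the cache

    # Step 610 (simplified): record the request in the snoop cache.
    snoop_cache[tag] = (entry_vector or 0) | request_vector
    return "forwarded"                       # step 612: placed into the snoop queue


cache = {}
first = filter_snoop(cache, tag=0x1C019E, index=12)    # unknown line: forwarded
second = filter_snoop(cache, tag=0x1C019E, index=12)   # same line again: filtered
third = filter_snoop(cache, tag=0x1C019E, index=13)    # tag hit, bit clear: forwarded
```

After these three requests the entry for tag 0x1c019e has bits 12 and 13 set.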

Referring now to FIG. 11, there are shown the details of step 610 (FIG. 10) describing the process of adding new information in the snoop cache. This is accomplished by several tasks, as will now be described. At step 614, a determination is first made as to whether the address tag is already stored in the snoop cache (i.e., the address tag was a hit). For this step, the information calculated in step 602 (FIG. 10) can be used. If the address tag check gave a hit, then the process proceeds to step 624, where the bit in the valid line vector of the selected snoop cache entry corresponding to the snoop request is set. If the address tag check gave a miss in step 614, a new snoop cache entry has to be assigned for the new address tag, and the process proceeds to step 616 where a determination is made as to whether there are empty entries available in the snoop cache. If it is determined that empty entries are available, then the first available empty entry is selected as indicated at step 626. Otherwise, if it is determined that there are no empty entries in the snoop cache, one of the active entries in the snoop cache is selected for replacement as indicated at step 618. The replacement policy can be round-robin, least-recently used, random, or any other replacement policy known to skilled artisans without departing from the scope of this invention. Continuing to step 622, the new address tag is then written in the selected snoop cache line and the corresponding valid line vector is cleared. Then, as indicated at step 624, the bit in the valid line vector of the selected snoop cache entry corresponding to the bit set in the valid line vector of the snoop request is set.
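The steps of FIG. 11 can be followed in a small software model. Round-robin is chosen here only because it is one of the replacement policies the text permits; the class layout and names are assumptions, not the circuit of the invention.

```python
class SnoopCache:
    """Fixed-size snoop cache; each entry is [address_tag, valid_line_vector] or None."""

    def __init__(self, num_entries: int):
        self.entries = [None] * num_entries
        self.next_victim = 0                      # round-robin replacement pointer

    def add(self, tag: int, index: int) -> None:
        # Step 614: is the address tag already stored?
        for entry in self.entries:
            if entry is not None and entry[0] == tag:
                entry[1] |= 1 << index            # step 624: set the bit
                return
        # Step 616: is an empty entry available?
        if None in self.entries:
            slot = self.entries.index(None)       # step 626: first empty entry
        else:
            slot = self.next_victim               # step 618: choose a victim
            self.next_victim = (self.next_victim + 1) % len(self.entries)
        self.entries[slot] = [tag, 0]             # step 622: write tag, clear vector
        self.entries[slot][1] |= 1 << index       # step 624: set the bit


cache = SnoopCache(num_entries=2)
cache.add(0xA, 3)    # empty entry 0 is used
cache.add(0xA, 4)    # tag hit: only the bit is set
cache.add(0xB, 0)    # empty entry 1 is used
cache.add(0xC, 1)    # cache full: entry 0 is replaced
```

After the last call, entry 0 holds tag 0xC with only bit 1 set, and entry 1 still holds tag 0xB.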

In yet another embodiment, the new information is not added into the snoop cache based on the hit or miss of a snoop request in the snoop cache only, but instead, the addition of new values--being whole snoop cache lines or only setting a single bit in a valid line vector--is based on the decision of the decision logic block 450 (FIG. 7). In this embodiment, the new information is added into the snoop cache only if the decision logic block does not filter out the snoop request. If any other filter in the snoop port filter block 400 (FIG. 7) filters out the snoop request (i.e., determines that the data are not in the local L1

